
Alexander Terekhov <terekhov <at> web.de> writes:
Hans Boehm wrote: [...]
For what it's worth, Sarita Adve is both an author of the report you cite
and
the original and perhaps strongest advocate for the "sequential consistency for data-race-free programs" programming model.
I'm not against the "sequential consistency for data-race-free programs" programming model for programs using locks. On PPC, for example, such programs don't even need hwsync. For programs with lock-free atomics, OTOH, the races (concurrent accesses to the same locations, with loads competing with concurrent stores) are a feature, not a bug, and SC is simply way too expensive (e.g. it needs hwsync on PPC) to be the default mode for lock-free atomics: C/C++ is "you don't pay for what you don't need".
regards, alexander.
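
[As a concrete illustration of the trade-off Alexander describes, here is a minimal C++11 sketch of a lock-free publication pattern; the variable names are invented for this example. With the explicit release/acquire orderings shown, a PPC compiler can use the cheaper lwsync instead of the full hwsync that the default seq_cst operations would require. The defaults would also be correct here, just more expensive on such targets.]

#include <atomic>

int payload;                       // ordinary data handed off between threads
std::atomic<bool> ready(false);    // lock-free atomic flag

void producer() {
    payload = 42;
    // Release store: orders the payload write before the flag update.
    // The default form, ready.store(true), is seq_cst and needs a full
    // barrier (hwsync) on PPC; release only needs lwsync there.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire load pairs with the release store above.
    while (!ready.load(std::memory_order_acquire))
        ;                          // spin until the flag is set
    return payload;                // guaranteed to read 42
}
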
The question is whether the "sequential consistency for data-race-free programs" guarantee should extend to programs using atomic load, store, and RMW operations. The C++ committee, including me, came to the conclusion that the answer needs to be yes: there are many cases in which the use of atomics is fairly straightforward and useful, and it should be possible to use them without leaving this relatively simple programming model. By doing so, you get a safe programming model by default.

Since we do have explicit ordering primitives, you still have the option of paying only for what you need. But 90%, or probably 99%, of programmers will not know what they need here. And that's fine.

This is entirely consistent with many other C++ design decisions. The default operator new allocates memory that can live as long as the process, even though that's more expensive than allocating memory local to the current stack frame or thread, and often one of those latter two options would be sufficient. But it would be nasty to make one of those the default behavior.

The overhead of enforcing sequential consistency is unfortunately, at the moment, very platform-specific. On x86, it's increasingly minor, since it's possible to confine the added cost to stores and, as far as I can tell, that added cost is becoming much less than the cost of a coherence miss. And if your performance is limited by the cost of stores to shared variables, you are fairly likely to see lots of coherence misses anyway, so there is a hand-wavy argument that this is likely to be a minor perturbation. On other architectures, the costs are unfortunately larger, but my impression is that they are decreasing everywhere, as architects pay more attention to synchronization costs.

Hans
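
[To make the x86 point concrete, here is a classic store-buffering sketch; the names x, y, r1, r2 are invented for this illustration. With the default seq_cst operations, the surprising outcome r1 == 0 && r2 == 0 is forbidden, matching the simple interleaving intuition, and typical x86 compilers pay for this only on the stores (xchg, or mov plus mfence) while the seq_cst loads remain ordinary mov instructions. With weaker orderings such as release/acquire, that outcome becomes possible, which is exactly the kind of surprise the default is meant to rule out.]

#include <atomic>

std::atomic<int> x(0), y(0);
int r1, r2;

void thread1() {
    x.store(1);        // default seq_cst store: xchg (or mov+mfence) on x86
    r1 = y.load();     // default seq_cst load: plain mov on x86
}

void thread2() {
    y.store(1);
    r2 = x.load();
}

// Under seq_cst, at least one thread must observe the other's store,
// so r1 == 0 and r2 == 0 cannot both hold after the two threads finish.
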