
on Sat Aug 06 2011, Gottlob Frege <gottlobfrege-AT-gmail.com> wrote:
On Mon, Aug 1, 2011 at 10:49 AM, Dave Abrahams <dave@boostpro.com> wrote:
on Mon Aug 01 2011, Gordon Woodhull <gordon-AT-woodhull.com> wrote:
On Jul 31, 2011, at 1:05 PM, Dave Abrahams wrote:
I don't get it. Please, show me an example.
I am definitely not an expert. I'll quote a diagram from this fascinating paper by Adve and Boehm [1], using Dekker's mutual exclusion algorithm as an example:
Initially X = Y = 0

Thread 1        Thread 2
X = 1;          Y = 1;
r1 = Y;         r2 = X;

Can r1 = r2 = 0?
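(As a concrete C++11 sketch of the same example - with relaxed atomics the 0/0 outcome is explicitly permitted, while the default sequentially consistent ordering rules it out:)

    #include <atomic>
    #include <thread>

    std::atomic<int> X{0}, Y{0};
    int r1, r2;

    void thread1() {
        X.store(1, std::memory_order_relaxed);
        r1 = Y.load(std::memory_order_relaxed);
    }

    void thread2() {
        Y.store(1, std::memory_order_relaxed);
        r2 = X.load(std::memory_order_relaxed);
    }

    int main() {
        std::thread t1(thread1), t2(thread2);
        t1.join(); t2.join();
        // r1 == r2 == 0 is allowed here: each store may be reordered with
        // (or simply not yet visible to) the other thread's load. Replacing
        // the relaxed orderings with the default std::memory_order_seq_cst
        // rules the 0/0 outcome out: some store must come first in the
        // single total order, so at least one load sees a 1.
    }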
Sure they can, that's a race condition: thread1 modifies X while thread2 reads it. Anything could happen. Why would anyone expect otherwise?
Most people are surprised that *both* r1 and r2 can be 0 at the same time.
"At the same time" is ill-defined until you put in some kind of synchronization. If you synchronize those two threads at the end I expect all their effects to have taken place. But they would still have a race condition so I wouldn't expect any particular result.
We have been trained that our code (single-threaded) happens sequentially. So

    Y = 1; r2 = X;

means "set Y **THEN** read X" - *in that order*.
But it doesn't mean that!
Now, really, we don't know or care about the true order (from the point of view of this CPU) until we later read it (from this CPU), but that's how we imagine it, in the single-threaded world we grew up in.
Who is "we," Kimosabe? I *really* don't think about code that way. I generally try to avoid writing code that makes me think about the order in which things happen.
So imagine the race you mention - thread1 is writing X while thread2 is reading it. In the traditional view - ie assuming a thread of code happens in the same order that it was written - this means:
A. thread1 has not yet read Y (thread1 is currently writing X, and reading Y happens next)
B. thread2 has finished setting Y (because it is currently reading X, which happens after setting Y)
----
C. therefore thread1 *must* read Y after it was set, therefore it will see Y == 1, therefore r1 == 1.
But my point is that even if this reasoning held, it is a hopelessly un-scalable way to think about parallel execution. I can't reason that way about the behavior of a complex system without having a nervous breakdown, and I don't believe it's tractable for most other people either. That's why I'm saying that sequential consistency is not a game-changer in terms of reasoning about parallelism.
If you grew up coding in different circumstances (you once mentioned doing lots of music/sound stuff early on - which tends to be very lock-free-ish), then maybe you don't make the traditional "N happens before N+1" assumptions.
I don't think I ever had to confront the issue.
Having said all that, it doesn't mean thinking about re-orderings helps anything. That I agree with. The key isn't "here's more to think about" (reordering), it is "here's less to think about" - ie less to assume - stop thinking that a single thread happens exactly one line after the other. But that seems to be hard to remember NOT to think about, since we've been thinking that way for years.
It is better to think about how writing shared variables exposes state, and whether that correctly exposes only invariants. You may already be thinking this way.
Yes, that's exactly what I'm saying.
eg: Imagine the invariant is assert(y >= x). Imagine p is local, and the invariant only needs to hold once p is exposed globally.
    p->x = p->y = 0;  // invariant OK
    ...
    p->x = 10;        // invariant temporarily broken, but local, so OK
    p->y = 20;        // invariant restored
    // can now expose:
    globalP = p;      // globally expose state here, after invariant is ensured
Thinking of "invariant exposure" makes it simple to realize that the globalP = p is the point of exposure (or "publication"). For threading and our memory model, this is also exactly where the barrier is needed (if you want to think about barriers) and exactly where the atomic operation is needed (or where the mutex can be unlocked).
Yes.
So just make globalP atomic. No need to think about reordering.
Only if the atomic comes with the barriers that ensure the writes to *p have happened before globalP is modified... which is what sequential consistency (the default for atomics in C++11) provides.
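(As a sketch of that pattern in C++11 - the Point struct here is just an assumed shape for the x/y fields in the example above:)

    #include <atomic>
    #include <cassert>

    struct Point { int x, y; };         // invariant once published: y >= x

    std::atomic<Point*> globalP{nullptr};

    void writer() {
        Point* p = new Point;
        p->x = p->y = 0;                // invariant OK
        p->x = 10;                      // invariant temporarily broken, but local
        p->y = 20;                      // invariant restored
        globalP = p;                    // publication point; plain assignment to a
                                        // std::atomic is a seq_cst store by default
    }

    void reader() {
        Point* p = globalP;             // seq_cst load by default
        if (p)                          // may not be published yet
            assert(p->y >= p->x);       // if we saw p, we see the writes to *p
    }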
But if people ask why the point of exposure is so important, or why globalP needs to be atomic or how it could possibly fail if globalP was just a pointer, you need to explain that p->y = 20 can happen (or be seen) after globalP = p. Which is like the r1 == r2 == 0 example - it goes against tradition.
No you don't. All you need to say is that *every* data structure (even a pointer like globalP) has an invariant that is broken during modification. The implication is that unless globalP is atomic, the other thread might see a broken value of it, and then all bets are off.
Most people would think that once globalP has been set, then globalP->y == 20, since that's the order the lines of code are written in. But that's only true if globalP is atomic.
You don't even need to think as far as globalP->y if you assume globalP can be totally broken unless access is synchronized.
Once they believe and/or understand that "invariant exposure happens on shared variables, therefore shared variables must be atomic", then reordering can be forgotten. You can then think of your code as blocks that happen between reads/writes of atomics (ie atomics are the sequence points) - just like critical sections are blocks of code. You have blocks defined by published-invariant sequence points. Or something like that.
Something like that. *If* programming with atomics is for mortals.
So I agree "sequential consistency" doesn't help much. Nor does reordering. Think about publishing invariants.
I think about "not exposing broken invariants."
(Actually what "sequential consistency" of the atomics means is that you can now reason about the large code blocks defined by those sequence points, ignoring their details. So that is useful.)
Tony
(For the pedantic: in this example, sequential consistency was overkill. We only needed "release" semantics on globalP = p.)
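(In C++11 spelling, reusing the names from the sketch above, that weaker version would be roughly:)

    // writer: publish with release ordering instead of the seq_cst default
    globalP.store(p, std::memory_order_release);

    // reader: a matching acquire load still guarantees the writes to *p
    // are visible once the pointer is seen
    Point* q = globalP.load(std::memory_order_acquire);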
Right.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com