Re: [boost] [thread] Some thoughts about "atomic"

1 Dec 2006

      Roland Schwarz <roland.schwarz@chello.at> writes:
...
Anthony Williams wrote:
...
Roland Schwarz <roland.schwarz@chello.at> writes:
...
In particular in 
presence of multiple processors. I.e. an atomic lib is primarily about 
performance.
Not just about performance. It also enables the construction of the
higher-level primitives.
As you might know, this was the route I am following. But the primitives 
are not necessarily exposed to the user. To be more precise: From a user 
perspective an atomic lib is primarily about performance. Better?
Maybe. I'm not sure.
...
...
I think that the memory barrier and acquire/release semantics are just two
ways of talking about the same thing.
This is a point I am still confused about. acquire/release are 
"one-way" ordering constraints, while memory barriers are "two-way".
acquire/release gets me in a twist. Re-reading my refs, you're right.
...
This is as I understand it:
*)    memory barriers are primitives which have no effect other than to 
order memory accesses, i.e. they do not store or load anything by themselves.
Also they affect only memory operations and have no effect on others.
Also they should affect the compiler, so as to disallow reordering across
the barrier.
There are 3: read_mb, write_mb and full_mb.
read_mb orders read access, i.e. no read from before may be moved after 
the barrier and no read from after may be moved before the barrier.
write_mb does the same for writes. full_mb disallows moving any read or 
write across the barrier, and so establishes total order.
Yes.
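
A minimal sketch of the three barriers described above, written against the
std::atomic facilities of later standard C++ (C++11 onward), which postdate
this thread. The names read_mb, write_mb and full_mb come from the discussion;
mapping them onto std::atomic_thread_fence is an assumption, and only an
approximation: an acquire fence is somewhat stronger than a pure read barrier,
and a release fence somewhat stronger than a pure write barrier. The function
barrier_handoff is a hypothetical helper for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical wrappers named after the barriers in the discussion.
// They store and load nothing themselves; they only constrain ordering.
inline void read_mb()  { std::atomic_thread_fence(std::memory_order_acquire); }
inline void write_mb() { std::atomic_thread_fence(std::memory_order_release); }
inline void full_mb()  { std::atomic_thread_fence(std::memory_order_seq_cst); }

// Hand a value from one thread to another. All memory accesses are relaxed,
// so the ordering comes only from the standalone barriers.
inline int barrier_handoff() {
    std::atomic<int> payload{0};
    std::atomic<bool> ready{false};

    std::thread producer([&] {
        payload.store(42, std::memory_order_relaxed);
        write_mb();  // the payload store may not move below this point
        ready.store(true, std::memory_order_relaxed);
    });

    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    read_mb();  // the payload load may not move above this point
    int seen = payload.load(std::memory_order_relaxed);

    producer.join();
    return seen;
}
```

Calling barrier_handoff() should return the value written by the producer,
because the write_mb/read_mb pair orders the payload accesses around the flag.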
...
*)    acquire/release semantics on the other hand establish a 
conceptually different model of ordering. acquire disallows moving any 
access (read/write) from after the primitive to occur before it, but 
still allows accesses from before to occur after it. This is kind of a 
one-way sign for accesses.
release semantics is the other way round.
Also acquire/release is bound to an operation, as a kind of attribute of 
the operation, while memory barriers are operations on their own.
Yes. Acquire semantics tend to apply only to read operations, and release
semantics to write operations.
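
To illustrate ordering as an attribute of the operation itself, here is a
hedged sketch using std::atomic from later standard C++ (C++11 onward, so not
something available when this was written). The release is attached to the
store and the acquire to the load; there is no separate barrier statement.
The function name acquire_release_handoff is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Pass plain (non-atomic) data between threads through a flag whose
// store carries release semantics and whose load carries acquire semantics.
inline int acquire_release_handoff() {
    int payload = 0;                 // ordinary data, no atomics
    std::atomic<bool> ready{false};  // the flag that carries the ordering

    std::thread producer([&] {
        payload = 42;
        // Release: earlier accesses may not move after this store.
        ready.store(true, std::memory_order_release);
    });

    // Acquire: later accesses may not move before this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = payload;

    producer.join();
    return seen;
}
```

Once the acquire load observes the value written by the release store, the
earlier write to payload is visible to the reading thread.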
...
So as I currently understand it, these two concepts are about the same 
issue, but are neither orthogonal nor can one be used to synthesize the 
other. I would be glad to be proved wrong.
You're right. I was getting confused.
...
Another observation: release/acquire semantics is closer to mutex 
behavior, since no harm is done when an operation from before mutex 
acquisition is moved inside the critical section. A barrier would not 
allow such an operation. True?
True.
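
A rough sketch of that mutex analogy, assuming std::atomic from later standard
C++ (C++11 onward). The class spin_mutex and the helper count_with_spin_mutex
are hypothetical names for illustration, not an existing Boost interface.
Locking is an acquire operation and unlocking is a release operation, so
accesses may move into the critical section but not out of it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin lock: acquire on lock, release on unlock.
class spin_mutex {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // Acquire: nothing inside the critical section may move above this.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: nothing inside the critical section may move below this.
        locked_.store(false, std::memory_order_release);
    }
};

// Increment a plain counter from several threads under the spin lock.
inline long count_with_spin_mutex(int threads, int iterations) {
    spin_mutex m;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&m, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                m.lock();
                ++counter;
                m.unlock();
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    return counter;
}
```

With the lock in place no increment is lost, so the result equals
threads * iterations.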
...
...
As I understand it, on x86, the SFENCE instruction is a "Store Fence", which
is a "Write Barrier", and has "Release Semantics". Any store instructions
which happen before it on this CPU are made globally visible afterwards. No
store instructions which occur afterwards on this CPU are permitted to be
globally visible beforehand.
It looks to me as if it is possible in the acquire/release model to 
separate the operation from the "fence". Or, viewed in the other 
direction, it is possible to optimize by attaching the (otherwise 
separate) fence to an instruction to save some CPU cycles. True?
This SFENCE instruction is just a store fence. Some other (non-fence)
instructions also have fence-like properties, but they tend to be full
barriers.
...
...
Again on x86, the LFENCE instruction is a "Load Fence", which is a "Read
Barrier", and has "Acquire Semantics". Any read instructions which happen
before it on this CPU must have already completed afterwards.
Are you really sure about this one?
Yes, apart from the "acquire semantics". The intel spec says:

    "Performs a serializing operation on all load-from-memory instructions
    that were issued prior the LFENCE instruction. This serializing operation
    guarantees that every load instruction that precedes in program order the
    LFENCE instruction is globally visible before any load instruction that
    follows the LFENCE instruction is globally visible."
...
...
No load
instructions which occur afterwards on this CPU are permitted to be executed
beforehand.
This part of the statement makes sense to me.
I omitted the rest of your post, since I think it depends on the 
acquire/release versus memory barriers getting clarified first.
...
The details of the memory model, atomics, and visibility, and how it applies
to C++, are under discussion amongst C++ standards committee members. I would
imagine that you'd be welcome to join such discussions.
Hmm, not sure how I could join other than posting to some lists. Do you 
mean comp.lang.c++.moderated?
No. There is a cpp-threads mailing list, and the C++ Standards committee
reflectors. 

Peter Dimov told me how to get on cpp-threads.

Ask your national body (here in the UK it's BSI, in Germany it's DIN, and in
the USA it's ANSI) about joining their C++ panel, and speak to Andy Koenig
about getting added to the committee reflectors. If your national body is
unhelpful, speak to Lois Goldthwaite (standards@accu.org), and she'll probably
let you join the BSI panel.

Anthony
-- 
Anthony Williams
Software Developer
Just Software Solutions Ltd
http://www.justsoftwaresolutions.co.uk