
Roland Schwarz <roland.schwarz@chello.at> writes:
Anthony Williams wrote:
Roland Schwarz <roland.schwarz@chello.at> writes:
In particular in presence of multiple processors. I.e. an atomic lib is primarily about performance.
Not just about performance. It also enables the construction of the higher-level primitives.
As you might know, this is the route I am following. But the primitives are not necessarily exposed to the user. To be more precise: from a user perspective, an atomic lib is primarily about performance. Better?
Maybe. I'm not sure.
I think that the memory barrier and acquire/release semantics are just two ways of talking about the same thing.
This is a point I am still confused about. Acquire/release are "one-way" ordering constraints, while memory barriers are "two-way".
acquire/release gets me in a twist. Re-reading my refs, you're right.
This is as I understand it:
*) Memory barriers are primitives which have no effect other than to order memory accesses, i.e. they do not store or load anything by themselves. They affect only memory operations and have no effect on others. They should also constrain the compiler, disallowing reordering across the barrier. There are three: read_mb, write_mb and full_mb. read_mb orders read accesses, i.e. no read from before may be moved after the barrier and no read from after may be moved before the barrier. write_mb does the same for writes. full_mb disallows moving any read or write across the barrier, and so establishes a total order.
Yes.
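As a rough illustration of barriers as standalone operations, here is a minimal sketch using C++11 fences. The names read_mb, write_mb and full_mb come from the discussion above and are not a real library API. Mapping them onto std::atomic_thread_fence is an assumption, and a conservative one: an acquire fence orders more than just reads, and a release fence more than just writes.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical wrappers named after the barriers in the discussion.
// Each is a conservative approximation built on C++11 fences (an assumption).
inline void read_mb()  { std::atomic_thread_fence(std::memory_order_acquire); }
inline void write_mb() { std::atomic_thread_fence(std::memory_order_release); }
inline void full_mb()  { std::atomic_thread_fence(std::memory_order_seq_cst); }

// Publish a payload through a flag. The loads and stores themselves are
// relaxed; all ordering comes from the standalone barriers.
inline int publish_with_barriers() {
    std::atomic<int> payload{0};
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload.store(42, std::memory_order_relaxed);
        write_mb();  // keeps the payload write ahead of the flag write
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        read_mb();   // keeps the flag read ahead of the payload read
        seen = payload.load(std::memory_order_relaxed);
    });

    producer.join();
    consumer.join();
    assert(seen != -1);  // the consumer has run by the time join returns
    return seen;
}
```

Calling publish_with_barriers() should return 42, because the consumer can only leave its loop after the flag has been set and the barrier pair orders the payload accesses around it.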
*) Acquire/release semantics, on the other hand, establish a conceptually different model of ordering. Acquire disallows moving any access (read/write) from after the primitive to before it, but still allows accesses from before to move after it. This is a kind of one-way sign for accesses. Release semantics are the other way round.
Also, acquire/release is bound to an operation, as a kind of attribute of the operation, while memory barriers are operations in their own right.
Yes. Acquire semantics tend to apply only to read operations, and release semantics only to write operations.
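To make "an attribute of the operation" concrete, here is a minimal sketch in C++11, where the ordering is passed as an argument to the load or store itself instead of being a separate statement. The function name is hypothetical, and std::atomic is used here only for illustration; it is not something the posters had available.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: release is attached to the store of the flag,
// acquire is attached to the load of the flag. No standalone barrier.
inline int publish_with_acquire_release() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Release: accesses before this store may not move after it.
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        // Acquire: accesses after this load may not move before it.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(seen != -1);  // the consumer has run by the time join returns
    return seen;
}
```

The store-release pairs with the load-acquire that observes it, so the plain write to payload is visible to the consumer and the function should return 42.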
So as I currently understand it, these two concepts are about the same issue, but are neither orthogonal nor can one be used to synthesize the other. I would be glad to be proved wrong.
You're right. I was getting confused.
Another observation: release/acquire semantics is closer to mutex behavior, since no harm is done when an operation from before mutex acquisition is moved inside the critical section. A barrier would not allow such an operation. True?
True.
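The mutex analogy can be sketched with a small spinlock in C++11: lock() uses acquire and unlock() uses release, so accesses inside the critical section cannot leak out, while accesses outside are still allowed to move in. SpinLock and count_with_spinlock are hypothetical names for illustration, not part of any library discussed here.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock built from one-way ordering constraints.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Acquire: nothing from inside the critical section moves above this.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: nothing from inside the critical section moves below this.
        flag_.clear(std::memory_order_release);
    }
};

// Increment a plain counter from several threads under the spinlock.
inline long count_with_spinlock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    long counter = 0;  // non-atomic, protected only by the lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter;
}
```

With 4 threads and 10000 iterations each, count_with_spinlock(4, 10000) should return 40000, since every increment happens inside the critical section.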
As I understand it, on x86, the SFENCE instruction is a "Store Fence", which is a "Write Barrier", and has "Release Semantics". Any store instructions which happen before it on this CPU are made globally visible afterwards. No store instructions which occur afterwards on this CPU are permitted to be globally visible beforehand.
It looks to me as if it is possible, in the acquire/release model, to separate the operation from the "fence". Or viewed in the other direction, it is possible to optimize by attaching the (otherwise separate) fence to an instruction to save some CPU cycles. True?
This SFENCE instruction is just a store fence. Some other (non-fence) instructions also have fence-like properties, but they tend to be full barriers.
Again on x86, the LFENCE instruction is a "Load Fence", which is a "Read Barrier", and has "Acquire Semantics". Any read instructions which happen before it on this CPU must have already completed afterwards.
Are you really sure about this one?
Yes, apart from the "acquire semantics". The intel spec says: "Performs a serializing operation on all load-from-memory instructions that were issued prior the LFENCE instruction. This serializing operation guarantees that every load instruction that precedes in program order the LFENCE instruction is globally visible before any load instruction that follows the LFENCE instruction is globally visible."
No load instructions which occur afterwards on this CPU are permitted to be executed beforehand.
This part of the statement makes sense to me.
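For reference, SFENCE and LFENCE can be reached from C++ through compiler intrinsics. The sketch below only shows how the instructions are issued; it makes no claim about what they guarantee beyond the Intel text quoted above. The wrapper names are hypothetical, and the non-x86 fallback to C++11 fences is an assumption added so the code compiles elsewhere.

```cpp
#include <atomic>
#include <cassert>
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>  // _mm_sfence, _mm_lfence
#endif

// Hypothetical wrapper: issues SFENCE on x86-64.
inline void store_fence() {
#if defined(__x86_64__) || defined(_M_X64)
    _mm_sfence();
#else
    // Portable stand-in (an assumption, not an exact equivalent).
    std::atomic_thread_fence(std::memory_order_release);
#endif
}

// Hypothetical wrapper: issues LFENCE on x86-64.
inline void load_fence() {
#if defined(__x86_64__) || defined(_M_X64)
    _mm_lfence();
#else
    // Portable stand-in (an assumption, not an exact equivalent).
    std::atomic_thread_fence(std::memory_order_acquire);
#endif
}

// Single-threaded round trip: store, fence, fence, load.
// This only demonstrates that the fences are standalone operations
// which neither store nor load anything themselves.
inline int fence_round_trip(int value) {
    std::atomic<int> slot{0};
    slot.store(value, std::memory_order_relaxed);
    store_fence();
    load_fence();
    int result = slot.load(std::memory_order_relaxed);
    assert(result == value);  // the fences do not change the stored data
    return result;
}
```

For example, fence_round_trip(7) returns 7. A single thread always sees its own stores, so this says nothing about cross-CPU visibility, which is the part under discussion.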
I omitted the rest of your post, since I think it depends on the acquire/release versus memory barriers getting clarified first.
The details of the memory model, atomics, and visibility, and how it applies to C++, are under discussion amongst C++ standards committee members. I would imagine that you'd be welcome to join such discussions.
Hmm, not sure how I could join other than posting to some lists. Do you mean comp.lang.c++.moderated?
No. There is a cpp-threads mailing list, and the C++ Standards committee reflectors. Peter Dimov told me how to get on cpp-threads. Ask your national body (here in the UK it's BSI, in Germany it's DIN, and in the USA it's ANSI) about joining their C++ panel, and speak to Andy Koenig about getting added to the committee reflectors. If your national body is unhelpful, speak to Lois Goldthwaite (standards@accu.org), and she'll probably let you join the BSI panel.

Anthony
--
Anthony Williams
Software Developer
Just Software Solutions Ltd
http://www.justsoftwaresolutions.co.uk