
Dave Abrahams wrote: [... memory model ...]
It's not really different than locking. If you want to write to shared data, you need some way of making it not-a-race. It's just that when the data structure is small enough (like an int) you can make it atomic instead of putting a lock around it.
No. See:

http://www.cl.cam.ac.uk/~pes20/cppppc/

Note that the proposed MM is still incomplete: it does not (currently) support atomic RMW operations (load-reserve/store-conditional), which are essential for locking.

regards,
alexander.

P.S. I don't like C++11 MM atomics. I think that atomic loads and stores ought to support the following 'modes':

- Whether the load/store is competing (the default) or not. A competing load means that there might be a concurrent store (to the same object). A competing store means that there might be a concurrent load or store. A non-competing load/store can be performed non-atomically.

- Whether a competing load/store needs remote write atomicity (the default is no remote write atomicity). A remote-write-atomicity-yes load triggers undefined behavior in the case of a concurrent remote-write-atomicity-no store.

- Whether the load/store has a specified reordering constraint (the default is no constraint specified), in terms of the following reordering modes:

  - Whether preceding loads (in program order) can be reordered across it (can by default).

  - Whether preceding stores (in program order) can be reordered across it (can by default).

  - Whether subsequent loads (in program order) can be reordered across it (can by default). For a load, the set of constrained subsequent loads can be limited to only dependent loads (aka 'consume' mode).

  - Whether subsequent stores (in program order) can be reordered across it (can by default). For a load, there is an implicit reordering constraint regarding dependent stores (no need to specify it).

A fence/barrier operation can be used to specify a reordering constraint using basically the same modes.

Re C++11 MM, I'm still missing more fine-grained memory order labels such as in the pseudo C++ example below.
(I mean mo::noncompeting, mo::ssb/ssb_t (sink store barrier, a release not affecting preceding loads), slb/slb_t (a release not affecting preceding stores) below, and somesuch for relaxed acquire)

    // Introspection (for bool argument below) aside for a moment
    template<typename T, bool copy_ctor_or_dtor_can_mutate_object>
    class mutex_and_condvar_free_single_producer_single_consumer {

      typedef isolated< aligned_storage< T > > ELEM;

      size_t m_size;            // > 1
      ELEM * m_elem;            // array of elements, init'ed by ctor
      atomic< ELEM * > m_head;  // initially == m_elem
      atomic< ELEM * > m_tail;  // initially == m_elem

      ELEM * advance(ELEM * elem) const {
        return (++elem < m_elem + m_size) ? elem : m_elem;
      }

    public:

      mutex_and_condvar_free_single_producer_single_consumer();  // ctor
      ~mutex_and_condvar_free_single_producer_single_consumer(); // dtor

      void producer(const T & value) {
        ELEM * tail = m_tail.load(mo::noncompeting); // may be nonatomic
        ELEM * next = advance(tail);
        while (next == m_head.load(mo::relaxed)) usleep(1000);
        new(tail) T(value); // placement copy ctor (make queued copy)
        m_tail.store(next, mo::ssb); // cheaper than mo::release
      }

      T consumer() {
        ELEM * head = m_head.load(mo::noncompeting); // may be nonatomic
        while (head == m_tail.load(mo::consume)) usleep(1000);
        T value(*head); // T's copy ctor (make a copy to return)
        head->~T(); // T's dtor (cleanup for queued copy)
        m_head.store(advance(head),
                     type_list< mo::slb_t, mo::rel_t >::
                       element<copy_ctor_or_dtor_can_mutate_object>::type());
        return value; // return copied T
      }

    };

Note also that, given that the example above presumes that no more than one thread can read from the relevant atomic locations while they are written concurrently, there is definitely no need to pay the price of remote write atomicity even if it is run on a 3+ way multiprocessor... IOW, hwsync is unneeded even if all mo::* above are changed to SC... but the upcoming C++11 MM doesn't allow expressing no-need-for-remote-write-atomicity for SC atomics.
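
For comparison, here is a minimal sketch of the same single-producer/single-consumer ring written with only the memory orders that C++11 actually standardizes. It is not the code from the post above. The name spsc_ring, the non-blocking try_push/try_pop interface, the use of indices instead of pointers, and the std::vector storage (which requires T to be default-constructible and copy-assignable) are all assumptions made to keep the example short. The proposed labels have no standard equivalent, so each is replaced by the nearest label that is at least as strong: mo::noncompeting becomes memory_order_relaxed (the load stays atomic), and mo::ssb and mo::slb both become memory_order_release. Two loads are also strengthened to memory_order_acquire: the consumer's mo::consume load of the tail, and the producer's mo::relaxed load of the head.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch only: bounded SPSC ring using standard C++11 orders.
// One slot is always left empty so that "full" and "empty" can be told apart.
template <typename T>
class spsc_ring {
    std::vector<T> m_elem;            // slots, default-constructed up front
    std::atomic<std::size_t> m_head;  // next slot to read; written by consumer only
    std::atomic<std::size_t> m_tail;  // next slot to write; written by producer only

    std::size_t advance(std::size_t i) const {
        return (i + 1 < m_elem.size()) ? i + 1 : 0;
    }

public:
    explicit spsc_ring(std::size_t size) : m_elem(size), m_head(0), m_tail(0) {
        assert(size > 1);
    }

    // Call from the single producer thread. Returns false if the ring is full.
    bool try_push(const T& value) {
        // Only the producer writes m_tail, so relaxed is enough here.
        // (Stands in for the proposed mo::noncompeting.)
        std::size_t tail = m_tail.load(std::memory_order_relaxed);
        std::size_t next = advance(tail);
        // acquire pairs with the consumer's release store to m_head, so the
        // slot is not overwritten before the consumer has finished with it.
        if (next == m_head.load(std::memory_order_acquire)) return false;
        m_elem[tail] = value;
        // release publishes the slot contents. (Stands in for mo::ssb.)
        m_tail.store(next, std::memory_order_release);
        return true;
    }

    // Call from the single consumer thread. Returns false if the ring is empty.
    bool try_pop(T& out) {
        // Only the consumer writes m_head, so relaxed is enough here.
        std::size_t head = m_head.load(std::memory_order_relaxed);
        // acquire pairs with the producer's release store to m_tail.
        // (Stands in for mo::consume.)
        if (head == m_tail.load(std::memory_order_acquire)) return false;
        out = m_elem[head];
        // release hands the slot back to the producer.
        // (Stands in for mo::slb / mo::rel.)
        m_head.store(advance(head), std::memory_order_release);
        return true;
    }
};
```

A ring constructed as spsc_ring<int> q(4) holds at most three elements at a time. A caller that wants the blocking behavior of producer()/consumer() above can loop on try_push/try_pop with a sleep in between.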