Date: Wed, 22 Feb 2006 10:31:06 -0600
From: David Greene
Subject: Re: [Boost-users] [statechart] Asynchronous Machines
To: boost-users@lists.boost.org
Message-ID: <43FC91CA.8060801@obbligato.org>
Content-Type: text/plain; charset=us-ascii

Gottlob Frege wrote:
Very long answer: More correctly, the problem isn't really 'cache coherence' in the traditional sense (that the cache for your CPU is consistent with main memory, etc.); it is the order of memory reads and writes. Mutexes are guaranteed to do whatever is necessary to make sure all queued reads are read before you get the mutex lock (i.e. they force a memory 'acquire' barrier), and they make sure all writes are written before the mutex is released (a 'release' memory barrier).
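To make the acquire/release point concrete, here is a minimal sketch. It assumes a C++11 toolchain with std::mutex and std::thread, which is not the library under discussion in this thread; the names SharedCounter, worker and run_counter are made up for illustration. The counter is a plain long, so the only thing making each thread's writes visible to the next lock holder is the mutex itself.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state: a plain (non-atomic) counter guarded by a mutex.
struct SharedCounter {
    std::mutex m;
    long value = 0;
};

// Each worker increments the counter `iterations` times under the lock.
void worker(SharedCounter& c, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // Locking has acquire semantics: writes made by the previous
        // holder before its unlock are visible from here on.
        std::lock_guard<std::mutex> lock(c.m);
        ++c.value;
        // The guard unlocks at the end of this iteration. Unlocking has
        // release semantics: the increment is published to the next holder.
    }
}

// Runs `thread_count` workers and returns the final counter value.
long run_counter(int thread_count, int iterations) {
    assert(thread_count > 0 && iterations >= 0);
    SharedCounter c;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t)
        threads.emplace_back(worker, std::ref(c), iterations);
    for (auto& th : threads)
        th.join();  // join also synchronizes with each thread's completion
    return c.value;
}
```

Calling run_counter(4, 10000) should give 40000; no increment is lost because every read-modify-write of value happens between a lock and the matching unlock.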
I understand what you're saying and agree that this is how hardware and software are currently implemented in the vast majority of cases.
However, the concepts of serializing access and maintaining memory consistency and coherence are orthogonal. There have been architectures (in academia, mostly) that require explicit software cache control, for example. One would have to include a cache flush in your examples. The theory is that by separating concerns, the programmer (or compiler) has more freedom to loosen up implementations based on the weaker requirements of the application, thereby gaining performance.
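One way to picture that separation on today's hardware is the sketch below. It assumes C++11 atomics, and it uses std::atomic_thread_fence as a stand-in for the explicit "make memory consistent" step; a machine with software-managed caches would need its own flush primitive there, which this sketch does not model. The function name message_pass is made up for illustration. The flag handles the hand-off with relaxed operations only, and the ordering of the plain payload is requested separately through the fences.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Passes `value` from a producer thread to a consumer thread and returns
// what the consumer observed.
int message_pass(int value) {
    int payload = 0;                 // plain data, not atomic
    std::atomic<bool> ready{false};  // hand-off flag, relaxed accesses only
    int seen = -1;

    std::thread producer([&] {
        payload = value;
        // Explicit ordering step: writes above may not move below the fence.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed))
            std::this_thread::yield();
        // Matching ordering step: reads below may not move above the fence.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(seen == value);  // holds because the two fences synchronize
    return seen;
}
```

Dropping either fence leaves the flag hand-off intact but makes the access to payload a data race, which is the sense in which the two concerns can be separated and then weakened independently.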
We're starting to see this much more in HPC systems, for example, where there are a multitude of synchronization primitives available with varying semantics that imply performance tradeoffs. Some machines cache remote memory (often under software control), others don't.
So I agree with you in the case of the typical machine architecture, but it won't necessarily hold in the future.
Obviously you understand the problems then. But I don't understand what you don't agree with - my explanation (obviously missing some details), or whether mutexes will work in the future? I'm saying that mutexes will always do whatever is necessary. For example, the pthreads spec tries very hard to describe itself in such a way that it will 'just work' regardless of the underlying problems. And any other mutex/thread library will also work around the problems (flushing caches, doing whatever is necessary) or else not be worth using.

P.S. Do you have any links to some of the more esoteric synchronization primitives? I've been looking for variations on the typical acquire/release barriers, as well as systems that don't have CAS, etc. All in the hopes of making at least a start on a useful atomics library.

-Dave

Thanks,
Tony.