
Gottlob Frege wrote:
I'm saying that mutexes will always do whatever is necessary. For example, the pthreads spec tries very hard to describe itself in such a way that it will 'just work' regardless of the underlying problems. And any other mutex/thread library will also work around the problems (flushing caches, doing whatever is necessary) or else not be worth using.
I agree that current mutex implementations will do this. I'm simply pointing out that the mutex concept (serializing access to a region of code) is orthogonal to making sure that code is correct in a memory consistency sense. The two concepts are obviously used together, but they can be usefully separated. An example: if I'm on a NUMA architecture, I might like to serialize the threads on my local node (which provides hardware coherence) to make sure updates are ordered. I don't want to pay the cost of an expensive global sync operation (flushing the caches of remote nodes, for example), because I, as the programmer, know that the data in question is only ever accessed locally.
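To make the separation concrete, here is a minimal sketch, assuming C++11 `std::atomic_flag` is available. The names `OrderedSpinLock`, `SyncLock`, and `count_with_lock` are hypothetical and chosen only for illustration. Mutual exclusion comes from the atomic test-and-set; the memory ordering is a separate template parameter layered on top.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock for illustration. Serialization comes from the
// atomic test_and_set; the memory ordering applied on lock and unlock
// is supplied separately as template parameters.
template <std::memory_order LockOrder, std::memory_order UnlockOrder>
class OrderedSpinLock {
public:
    void lock() {
        // Only one thread at a time can flip the flag from clear to set.
        while (flag_.test_and_set(LockOrder)) {
            std::this_thread::yield();
        }
    }
    void unlock() { flag_.clear(UnlockOrder); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// The conventional pairing: acquire on lock, release on unlock, so
// writes made inside the critical section are visible to the next owner.
using SyncLock = OrderedSpinLock<std::memory_order_acquire,
                                 std::memory_order_release>;

// Increment a plain (non-atomic) counter from several threads under the lock.
long count_with_lock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SyncLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by both exclusion and ordering
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Instantiating the same template with `std::memory_order_relaxed` for both parameters would still serialize entry to the critical section, but the C++ memory model would then treat the accesses to `counter` as a data race. That variant is only meaningful where something outside the language, such as the node-local hardware coherence in the NUMA case, supplies the visibility guarantee.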
P.S. Do you have any links to some of the more esoteric synchronization primitives? I've been looking for variations on the typical acquire/release barriers, as well as systems that don't have CAS, etc., all in the hopes of making at least a start on a useful atomics library.
I'm thinking mostly of some of the things multithreading researchers have been doing with software cache coherence, and of the instructions that vector machines such as the Cray X1 line provide for ordering references between the scalar and vector processors, for both local and remote memory. There are lots of variations to tune the operation to just what the programmer needs, and nothing more. -Dave