
Here is my implementation of Terekhov's *fair* read/write algorithm, with upgradable thrown in. This algorithm is just as you say: The OS alone decides who gets to run next.
http://home.twcny.rr.com/hinnant/cpp_extensions/upgrade_mutex.html
If you don't like the "upgradable" part, it strips out cleanly from the code. Just delete everything that refers to "upgradable", rename the mutex, and you've got a read/write mutex using Terekhov's algorithm.
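
For readers who want the shape of the stripped-down read/write mutex without following the link, here is a minimal sketch of the two-gate design as I understand it. It is not the code at the URL above: the class name rw_mutex_sketch and all member names are hypothetical, and it assumes std::mutex and std::condition_variable as the building blocks. New readers and writers queue at the first gate; a writer that has passed it waits at the second gate until the existing readers drain. When the writer unlocks, everyone at the first gate is woken, and the OS decides who gets in next.

```cpp
#include <climits>
#include <condition_variable>
#include <mutex>

// Hypothetical sketch of a two-gate fair read/write mutex.
// All names are illustrative; the upgradable parts are omitted.
class rw_mutex_sketch {
    std::mutex mut_;
    std::condition_variable gate1_;  // new readers and writers wait here
    std::condition_variable gate2_;  // a writer past gate1_ waits here for readers to drain
    unsigned state_ = 0;             // high bit: writer entered; low bits: reader count

    static constexpr unsigned write_entered_ =
        1U << (sizeof(unsigned) * CHAR_BIT - 1);
    static constexpr unsigned n_readers_ = ~write_entered_;

public:
    // Exclusive (write) ownership
    void lock() {
        std::unique_lock<std::mutex> lk(mut_);
        // Gate 1: wait until no other writer has entered.
        while (state_ & write_entered_) gate1_.wait(lk);
        state_ |= write_entered_;  // from now on new readers block at gate 1
        // Gate 2: wait for the readers already inside to leave.
        while (state_ & n_readers_) gate2_.wait(lk);
    }

    bool try_lock() {
        std::lock_guard<std::mutex> lk(mut_);
        if (state_ != 0) return false;
        state_ = write_entered_;
        return true;
    }

    void unlock() {
        {
            std::lock_guard<std::mutex> lk(mut_);
            state_ = 0;
        }
        // Wake everyone at gate 1; the OS scheduler picks who wins.
        gate1_.notify_all();
    }

    // Shared (read) ownership
    void lock_shared() {
        std::unique_lock<std::mutex> lk(mut_);
        // Block while a writer has entered or the reader count is saturated.
        while ((state_ & write_entered_) ||
               (state_ & n_readers_) == n_readers_)
            gate1_.wait(lk);
        unsigned num_readers = (state_ & n_readers_) + 1;
        state_ = (state_ & ~n_readers_) | num_readers;
    }

    bool try_lock_shared() {
        std::lock_guard<std::mutex> lk(mut_);
        unsigned num_readers = state_ & n_readers_;
        if ((state_ & write_entered_) || num_readers == n_readers_)
            return false;
        state_ = (state_ & ~n_readers_) | (num_readers + 1);
        return true;
    }

    void unlock_shared() {
        std::lock_guard<std::mutex> lk(mut_);
        unsigned num_readers = (state_ & n_readers_) - 1;
        state_ = (state_ & ~n_readers_) | num_readers;
        if (state_ & write_entered_) {
            // A writer is waiting at gate 2; the last reader out wakes it.
            if (num_readers == 0) gate2_.notify_one();
        } else {
            // The count was saturated; one waiter at gate 1 may now proceed.
            if (num_readers == n_readers_ - 1) gate1_.notify_one();
        }
    }
};
```

Nothing in the sketch prefers readers over writers or the other way round: once a writer sets the entered bit, later arrivals of both kinds wait at the same gate, and notify_all leaves the choice to the scheduler.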
Just two comments for Boosters:

-> The Boost.Interprocess upgradable_mutex uses the same algorithm as Howard's, because I wrote it based on Howard's thorough explanations. So if you want to use this algorithm, use Howard's code, since it's surely better tested than my implementation. I plan to use Howard's algorithm in my next version.

-> Now that there is interest in implementing these mutex types, and the committee is considering adding synchronization primitives to the standard or TR2, I think we should implement synchronization algorithms in Boost.Thread that can be reused by other libraries, like Boost.Interprocess. I think that Howard's algorithm can be templatized so that everyone can plug in their own mutexes and condition variables and doesn't have to reinvent the wheel. Even if we add atomic operations to that algorithm, it will still be valid for shared memory synchronization primitives.

My 2 cents,

Ion