
Anthony Williams wrote:
Windows native Mutex objects are very expensive to use (as they are Kernel objects), and CRITICAL_SECTIONs have their own problems (such as lock hand-off and the lack of a timed_lock facility). The boost 1.35 implementation uses InterlockedXXX operations to try and avoid waiting on a kernel object, and an Event object for blocking waits. In my tests, this exhibits greater fairness, and runs faster on average than either Mutex or CRITICAL_SECTION based implementations. See http://www.justsoftwaresolutions.co.uk/articles/implementing_mutexes.html for the article I wrote for Overload describing the implementation and rationale.
Thanks for the link to the interesting article. What you end up with looks remarkably similar to how my futex-based mutex works. The main difference is that my counter saturates at 2, which makes things a bit simpler for me. With a bit of tweaking it could fit in to the Mutex<Futex>, Mutex<Yield>, Mutex<Spin> and Mutex<EvilSpinThenYield> scheme that I have been thinking about: struct WinEvent_WaitWake { // psuedo-code WinEvent_WaitWake(int& val) {} // val can be ignored void wait(int expected) { WaitForSingleEvent(); } // expected can be ignored void wake(int n_to_wake) { assert(n_to_wake==1); SetEvent(); } }; template <typename FUT_T = WinEvent_WaitWake> class Mutex: boost::noncopyable { int state; // 0=unlocked, 1=locked no waiters, 2=locked with waiters FUT_T fut; public: Mutex(): state(0), fut(state) {} ~Mutex() {} // Lock, waiting if necessary. inline void lock() { if (!try_lock()) { lock_body(); } } // Try to lock, but don't wait if it's not possible. // Return indicating whether lock is now held. inline bool try_lock() { return atomic_compare_and_swap(state,0,1); } inline void unlock() { if (atomic_post_dec(state) != 1) { unlock_body(); } } private: void lock_body() { while (atomic_swap(state,2) != 0) { // worry about a concurrent unlock() <HERE> fut.wait(2); // expected=2 means "return immediately if state!=2"; // this is needed for the futex implementation as it will drop // any wakes that were sent <HERE>. But with the win32 // implementation those wakes have been stored and will be // applied to this wait (IIUC), so the "expect" feature is // not needed. } } void unlock_body() { atomic_write(state,0); fut.wake(1); } }; Regards, Phil.