
"Peter Dimov" <pdimov@pdimov.com> writes:
Some wait times (2R+1W):
atomics: 7.673 microseconds
lightweight_mutex (CRITICAL_SECTION): 3.069 us
shared_mutex: 760 us
rw_mutex (my implementation): 665 us (same problem)
pthread_rwlock_t, pthreads-win32: 7.108 us
rw_mutex (Hinnant/Terekhov): 85.532 us
This last line uses my reimplementation of Howard Hinnant's read/write mutex based on his description; Howard credits Alexander Terekhov with the original algorithm. It does stall the writer a bit in exchange for optimal reader throughput, but doesn't suffer from outright starvation.
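A minimal sketch of a reader/writer contention test of this general shape (R readers, 1 writer) may help make the numbers concrete. It is not sp_atomic_mt_test and not any of the mutex implementations measured here: the names run_contention and contention_result are hypothetical, and std::shared_mutex (C++17) stands in for whichever rw mutex is under test. The writer performs a fixed number of locked updates while each reader counts how many shared-locked reads it completes in the meantime.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical result type: per-reader iteration counts plus the final
// value written, so the caller can compare readers against the writer.
struct contention_result {
    std::vector<unsigned long> reader_iterations; // one entry per reader
    long final_value;                             // should equal writer_iterations
};

// RwMutex must provide lock/unlock and lock_shared/unlock_shared,
// e.g. std::shared_mutex. This is an assumption of the sketch.
template <class RwMutex>
contention_result run_contention(std::size_t readers, long writer_iterations) {
    RwMutex mtx;
    long shared_value = 0;              // protected by mtx
    std::atomic<bool> done{false};      // set once the writer has finished
    std::vector<unsigned long> counts(readers);

    std::vector<std::thread> reader_threads;
    for (std::size_t i = 0; i < readers; ++i) {
        reader_threads.emplace_back([&, i] {
            unsigned long n = 0;
            long seen = 0;
            while (!done.load(std::memory_order_acquire)) {
                std::shared_lock<RwMutex> lk(mtx); // shared (read) lock
                seen = shared_value;
                ++n;
            }
            (void)seen;
            counts[i] = n; // each reader writes only its own slot
        });
    }

    std::thread writer([&] {
        for (long k = 0; k < writer_iterations; ++k) {
            std::unique_lock<RwMutex> lk(mtx);     // exclusive (write) lock
            ++shared_value;
        }
        done.store(true, std::memory_order_release);
    });

    writer.join();
    for (auto& t : reader_threads) t.join();

    return contention_result{counts, shared_value};
}
```

Calling something like run_contention<std::shared_mutex>(8, 1048576) and printing reader_iterations gives one count per reader to set against the writer's iteration count. Reader counts far below the writer's would suggest reader starvation; a long total run time with very high reader counts would suggest the writer is being stalled. Actual figures depend heavily on the platform and on the mutex implementation substituted for RwMutex.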
I've been running your sp_atomic_mt_test with 8 readers and 1 writer, using various rw mutex implementations, on my Core 2 Duo machine.

With the pthreads-win32 pthread_rwlock_t I see *serious* reader starvation: the whole thing completes in 0.5s, with 7 out of 8 reader threads having under 85000 iterations, and the 8th having just over 110000, for 1048576 writer iterations.

With your lightweight mutex it runs in around 10s, with around 1000000 iterations for each reader.

With boost::shared_mutex or your implementation of the Hinnant/Terekhov rw-mutex it takes 100-200s, with reader multipliers around 60-80. Statistically there's not a lot in it, though your Hinnant/Terekhov implementation runs a bit faster.

shared_mutex algorithms are *hard* to get right.

Anthony
-- 
Anthony Williams            | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL