
Peter Dimov wrote: [...]
Here's TryEnterCriticalSection (-recursivity) for reference:
bool try_lock( critical_section * p ) { return atomic_compare_exchange( p->LockCount, -1, 0 ) == 0; }
The point of "swap based mutex" exercise was to avoid CAS because i386 doesn't have it. With CAS (and absent timedlock operation), it can be made safe and fast (better than MS "critical" stuff).
CRITICAL_SECTIONs have no timed_lock.
That's because timedlock() inevitable introduces a race.
Your version can be made posix-safe at the expense of three atomic ops instead of one in unlock:
void unlock() { atomic_increment( refs_ );
// as before
if( atomic_decrement( refs_ ) == 0 ) { CloseHandle( event_ ); } }
void destroy() { if( atomic_decrement( refs_ ) == 0 ) { CloseHandle( event_ ); } }
Yes. Or something along the lines of pthreads-win32 mutex (also two atomic ops... MS critical section lock/unlock ;-) ). I did it in order to support timedlock (it's used to solve the race with respect to timed out waiters, but the same scheme would work to synchronize the destruction of "swap based" thing -- simply lock/unlock that critsect in the destructor).
That's too slow, I guess?
Yes. AFAICS, the best solution (i386 does have atomic_bit_test_set): mutex_trylock: RETURN !atomic_bit_test_set_ddacq(&lock, 1) mutex_lock: WHILE atomic_bit_test_set_ddacq(&lock, 1) lock_queue_wait(&lock, 1) // wait if locked bit is set mutex_timedlock: WHILE atomic_bit_test_set_ddacq(&lock, 1) IF lock_queue_wait(&lock, 1, timeout) RETURN TRUE RETUN FALSE mutex_unlock: uintptr_t lock_queue; IF atomic_decrement_rel(lock_queue = &lock) THEN lock_queue_wake(lock_queue, 1) or something like that with kernel lock_queue interface. It would be safe, i386-friendly, memory efficient, and fast.
But maybe you'd be able to think of a way to somehow combine the refs_ manipulation with the lock_ manipulation?
It would need CAS and would still be "slow", I'm afraid.
The event pool solution would also work, but the problem there is that the pool would need its own synchronization.
But it can be slow (MS "native" critical or mutex or event/same would work just fine). You don't usually need bizillions lock create/destroy per nanosecond.
Unless it's an "interlocked slist".
Yep. regards, alexander.