[threads] adopt_lock_t question

Hi! I've come across something that puzzles me. I've got code that, condensed to relevant parts, looks like this: recursive_mutex m; lock_guard<recursive_mutex> l1(m); lock_guard<recursive_mutex> l2(m, adopt_lock_t()); What happens is that if -DNDEBUG is set, the code passes. But if -DDEBUG is set, l1's destructor aborts. Further investigation shows that the destructor calls the mutexe's unlock() function (as expected), which in turn tries to unlock it's internal mutex object - in my case a pthread mutex. The difference between the compilation modes is that the return value of that unlocking function is wrapped in a BOOST_VERIFY(), i.e. assert() call. As I understand the documentation, my code is perfectly valid. It may not make a large amount of sense to use adopt_lock_t as the nature of recursive_mutex should be to not block in l2's constructor - but that's a different matter. Now according to the documentation, lock_guard expects a Lockable, recursive_mutex is a Lockable, and Lockable's unlock() specifies very clearly what sort of conditions need to be met for the function to work: Precondition: The current thread owns *this. Effects: Releases ownership by the current thread. Postcondition: The current thread no longer owns *this. Throws: Nothing So there you have it, in my case the current thread does not own *this after l2's destructor ran - in fact, no thread does. Now there are a two inconsistencies here: 1. The code works fine with other Lockable implementations, such as non-recursive boost mutexes. 2. The behaviour is different depending on compilation mode. Yes, that's the idea behind asserts, but parts of the documentation suggest one behaviour is to be expected, other parts suggest the other is. From that I must draw the conclusion that either recursive_mutex and the unlock() documentation is broken, or the non-recursive mutexes are. Now conceptually, I'd expect unlock() to always succeed. That is, if it's postcondition can be fulfilled, it should always, regardless of compilation mode, exit cleanly. In that case the precondition docs should read something like "*this is either owned by the current thread, or by no thread at all.". I find that to be the case especially for recursive mutexes, as they're intended to be locked/unlocked multiple times in the same thread (though, technically, the calls should always be balanced). All of this is with version 1.35.0, but the docs for 1.36.0 don't seem to be different in the relevant areas. Maybe I'm missing something vital here. If so, I'd greatly appreciate any insight you guys might have. (Incidentally, I find it very hard to disable BOOST_VERIFY with a BOOST_DISABLE_ASSERTS define. I'll have to dig deeper where the re-enabling of asserts sneaks in, but on some machines that has no effect, on others it solves my particular problem, regardless of how correct or incorrect it might be.) Thanks, Jens

Jens Finkhäuser <jens@unwesen.de> writes:
Hi!
I've come across something that puzzles me. I've got code that, condensed to relevant parts, looks like this:
recursive_mutex m; lock_guard<recursive_mutex> l1(m); lock_guard<recursive_mutex> l2(m, adopt_lock_t());
That is an error. adopt_lock_t means that l2 takes ownership of the already-locked mutex (i.e. doesn't lock it again), but it doesn't force l1 to relinquish ownership, so both l2 and l1 think they own the one and only lock. When l2 is destroyed it will unlock the mutex, and then when l1 is destroyed it will unlock the mutex AGAIN, which is undefined behaviour because it doesn't own the mutex at this point.
What happens is that if -DNDEBUG is set, the code passes. But if -DDEBUG is set, l1's destructor aborts. Further investigation shows that the destructor calls the mutexe's unlock() function (as expected), which in turn tries to unlock it's internal mutex object - in my case a pthread mutex.
That makes sense: in debug mode the unlock function detects the error.
As I understand the documentation, my code is perfectly valid.
Unfortunately you are wrong.
It may not make a large amount of sense to use adopt_lock_t as the nature of recursive_mutex should be to not block in l2's constructor - but that's a different matter.
That is true.
Now according to the documentation, lock_guard expects a Lockable, recursive_mutex is a Lockable, and Lockable's unlock() specifies very clearly what sort of conditions need to be met for the function to work:
Precondition: The current thread owns *this. Effects: Releases ownership by the current thread. Postcondition: The current thread no longer owns *this. Throws: Nothing
So there you have it, in my case the current thread does not own *this after l2's destructor ran - in fact, no thread does.
Yes. The problem is that l1's destructor then *also* calls unlock.
Now there are a two inconsistencies here:
1. The code works fine with other Lockable implementations, such as non-recursive boost mutexes.
That's just a fluke of undefined behaviour: sometimes it works how you thought it would.
2. The behaviour is different depending on compilation mode. Yes, that's the idea behind asserts, but parts of the documentation suggest one behaviour is to be expected, other parts suggest the other is.
From that I must draw the conclusion that either recursive_mutex and the unlock() documentation is broken, or the non-recursive mutexes are.
Nope. Your usage of lock_guard is broken.
Now conceptually, I'd expect unlock() to always succeed. That is, if it's postcondition can be fulfilled, it should always, regardless of compilation mode, exit cleanly. In that case the precondition docs should read something like "*this is either owned by the current thread, or by no thread at all.".
No. Detecting whether the current thread owns the mutex adds unnecessary overhead. The win32 implementation of boost::mutex *does not know* which thread owns the mutex. Anthony -- Anthony Williams Author of C++ Concurrency in Action | http://www.manning.com/williams Custom Software Development | http://www.justsoftwaresolutions.co.uk Just Software Solutions Ltd, Registered in England, Company Number 5478976. Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK

Hi! On Mon, Nov 03, 2008 at 08:00:08AM +0000, Anthony Williams wrote:
Jens FinkhÀuser <jens@unwesen.de> writes:
Hi!
I've come across something that puzzles me. I've got code that, condensed to relevant parts, looks like this:
recursive_mutex m; lock_guard<recursive_mutex> l1(m); lock_guard<recursive_mutex> l2(m, adopt_lock_t());
That is an error. adopt_lock_t means that l2 takes ownership of the already-locked mutex (i.e. doesn't lock it again), but it doesn't force l1 to relinquish ownership, so both l2 and l1 think they own the one and only lock. When l2 is destroyed it will unlock the mutex, and then when l1 is destroyed it will unlock the mutex AGAIN, which is undefined behaviour because it doesn't own the mutex at this point.
<snip/>
Thanks for your reply. I understand your argument, and what the code *does*, just not the reasoning: how then is adopt_lock_t *supposed* to work? It's not as if you can tell lock_guard to relinquish ownership of a mutex. If there is no need to adopt a mutex because there's only one lock in the code, adopt_lock_t is superfluous. If there are two locks involved, you'll always run into an issue where one lock gets to unlock the mutex first - and the other one fails. I've tried to find the standard proposal for this, and only found http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2447.htm Unfortunately, that page doesn't shed much light on adopt_lock_t's intended usage either. As it stands, the whole thing only makes sense if you assume that you've locked the internal, operating system specific mutex object with an operating system specific function prior to using adopt_lock_t... but that doesn't seem reasonable either. If someone could point me to documentation that details how adopt_lock_t is intended to be used (or give an example), that'd be helpful! Jens

Jens Finkhäuser <jens@unwesen.de> writes:
Hi!
On Mon, Nov 03, 2008 at 08:00:08AM +0000, Anthony Williams wrote:
Jens FinkhÀuser <jens@unwesen.de> writes:
Hi!
I've come across something that puzzles me. I've got code that, condensed to relevant parts, looks like this:
recursive_mutex m; lock_guard<recursive_mutex> l1(m); lock_guard<recursive_mutex> l2(m, adopt_lock_t());
That is an error. adopt_lock_t means that l2 takes ownership of the already-locked mutex (i.e. doesn't lock it again), but it doesn't force l1 to relinquish ownership, so both l2 and l1 think they own the one and only lock. When l2 is destroyed it will unlock the mutex, and then when l1 is destroyed it will unlock the mutex AGAIN, which is undefined behaviour because it doesn't own the mutex at this point.
<snip/>
Thanks for your reply. I understand your argument, and what the code *does*, just not the reasoning: how then is adopt_lock_t *supposed* to work? It's not as if you can tell lock_guard to relinquish ownership of a mutex.
No, you can't. If you need to do that, use unique_lock and the release() member function.
If there is no need to adopt a mutex because there's only one lock in the code, adopt_lock_t is superfluous. If there are two locks involved, you'll always run into an issue where one lock gets to unlock the mutex first - and the other one fails.
Yes. Don't do that. adopt_lock_t is for where you've locked the mutex *directly* using mutex.lock(), or where you've acquired a locked mutex from a unique_lock using unique_lock::release(). It works quite well with the boost::lock free functions: boost::mutex m1,m2; boost::lock(m1,m2); // lock both mutexes boost::lock_guard<boost::mutex> l1(m1,boost::adopt_lock); boost::lock_guard<boost::mutex> l2(m2,boost::adopt_lock); In this case, you can also use unique_lock with the defer_lock constructor: boost::unique_lock<boost::mutex> l1(m1,boost::defer_lock); boost::unique_lock<boost::mutex> l2(m2,boost::defer_lock); boost::lock(l1,l2); // lock both mutexes Anthony -- Anthony Williams Author of C++ Concurrency in Action | http://www.manning.com/williams Custom Software Development | http://www.justsoftwaresolutions.co.uk Just Software Solutions Ltd, Registered in England, Company Number 5478976. Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK
participants (2)
-
Anthony Williams
-
Jens Finkhäuser