[thread] more problems with condition

Hi,

I've posted this to the users list as well, but it might have got lost as I replied on the original thread.

Well, I still have problems with the conditions, even with the latest SVN. The problem seems to be that inside a block like:

    Lock lock (mutex);
    while (something.empty ())
        cond.wait (mutex);

I'm still getting into a state where something.empty () == false and yet I'm waiting on the condition. The only other access happens via:

    Lock lock (mutex);
    something.push (...);
    cond.notify_one ();

So I don't see how anything can go wrong, unless the transition from mutex.unlock () to cond.wait () is not fully atomic and the following happens:

Thread 1:

    while (something.empty ())
        cond.wait (mutex); // starts, releases lock but is not waiting yet

Now comes Thread 2:

    Lock lock ();
    something.push
    cond.notify_one (); // Nobody waiting
    lock.release ();

Thread 1 again:

    cond.wait (mutex); // starts waiting

I had a similar problem with my own condition implementation on Windows, with exactly the same symptoms (I've been following the "Strategies for Implementing POSIX Condition Variables on Win32" http://www.cs.wustl.edu/~schmidt/win32-cv-1.html; I only got it working properly with SignalObjectAndWait).

Currently, it always leaves exactly *one* thread waiting, while previously it left several threads waiting, so it has improved but there is still a problem.

Any ideas? Anything I can do to track this down further?

Cheers,
Anteru
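For reference, the usage pattern described above can be sketched as a small self-contained program. This is only an illustration under stated assumptions: it uses the standard `std::mutex` and `std::condition_variable` in place of the Boost.Thread types discussed in the thread, and the names `WorkQueue` and `run_producer_consumer` are hypothetical. With a condition variable that releases the mutex and starts waiting atomically, the consumer should never stay blocked while the queue is non-empty.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for the "something" + mutex + cond trio in the post.
struct WorkQueue {
    std::mutex mutex;
    std::condition_variable cond;
    std::queue<int> items;

    // Producer side: modify the shared state and notify while holding the lock,
    // mirroring the snippet in the original message.
    void push(int value) {
        std::lock_guard<std::mutex> lock(mutex);
        items.push(value);
        cond.notify_one();
    }

    // Consumer side: re-check the predicate in a loop, so a wake-up without
    // data (spurious or stolen) just goes back to waiting.
    int pop() {
        std::unique_lock<std::mutex> lock(mutex);
        while (items.empty()) {
            cond.wait(lock);  // releases the mutex and blocks atomically
        }
        int value = items.front();
        items.pop();
        return value;
    }
};

// Pushes `count` items from the calling thread and pops them on a second
// thread; returns how many items the consumer received.
std::size_t run_producer_consumer(std::size_t count) {
    WorkQueue queue;
    std::size_t received = 0;

    std::thread consumer([&] {
        for (std::size_t i = 0; i < count; ++i) {
            int value = queue.pop();
            assert(value == static_cast<int>(i));  // single consumer: FIFO order
            ++received;
        }
    });

    for (std::size_t i = 0; i < count; ++i) {
        queue.push(static_cast<int>(i));
    }

    consumer.join();  // would hang here if a notification were lost
    return received;
}
```

Calling `run_producer_consumer(1000)` from a test harness is one way to stress the pattern; a hang at `join()` would point at the condition implementation rather than at the calling code.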

On Saturday 19 April 2008 06:48, Anteru wrote:
Well, I still have problems with the conditions, even with the latest SVN. The problem seems to be that inside a block like:
Does the attached patch help? I do my development on Linux, so it's completely untested. -- Frank

On Saturday 19 April 2008 10:23, Frank Mori Hess wrote:
On Saturday 19 April 2008 06:48, Anteru wrote:
Well, I still have problems with the conditions, even with the latest SVN. The problem seems to be that inside a block like:
Does the attached patch help? I do my development on Linux, so it's completely untested.
Actually, I no longer think my earlier patch was useful. But what if the last notify_one() happens between the WaitForSingleObject() call in do_wait() and the start_wait_loop() call in the next iteration of the while loop? I don't understand the boost.thread code 100%, but it seems like it could be missed. The attached patch should eliminate that possibility.

-- Frank

Anteru
I've posted this to the users list as well, but it might have got lost as I replied on the original thread.
Well, I still have problems with the conditions, even with the latest SVN.
Yes, it's still broken. Thanks for catching this.

My original one-line suggested fix is better than what I committed to SVN: it has more spurious wakes, but at least it should work. I'll revert SVN trunk to that until I've got a better solution.

Anthony
--
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL
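The trade-off mentioned above (more spurious wakes, but still correct) can be illustrated with a small sketch. It does not reproduce the one-line fix, which is not shown in this thread; it only demonstrates, using the standard `std::condition_variable` as an assumed stand-in for the Boost.Thread type, that a waiter which re-checks its predicate in a loop tolerates notifications that arrive without any state change. The names `WakeStats` and `wait_with_extra_notifies` are hypothetical.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical result type: how often the waiter woke up, and whether it
// finished only after the predicate became true.
struct WakeStats {
    int wakeups;
    bool completed;
};

// Sends `extra_notifies` notifications that change no state, then sets the
// flag and notifies once more. The waiter loops on the predicate throughout.
WakeStats wait_with_extra_notifies(int extra_notifies) {
    std::mutex mutex;
    std::condition_variable cond;
    bool ready = false;  // the predicate, protected by `mutex`
    int wakeups = 0;     // written only by the waiter while holding `mutex`

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lock(mutex);
        while (!ready) {
            cond.wait(lock);
            ++wakeups;  // counts every return from wait(), useful or not
        }
    });

    // Notifications with no state change: from the waiter's point of view
    // these behave like spurious wakes.
    for (int i = 0; i < extra_notifies; ++i) {
        cond.notify_all();
        std::this_thread::yield();
    }

    {
        std::lock_guard<std::mutex> lock(mutex);
        ready = true;
        cond.notify_all();
    }

    waiter.join();
    return WakeStats{wakeups, ready};
}
```

The number of wake-ups depends on scheduling and is not deterministic; the point is only that extra wake-ups cost time, while a missed wake-up can block a thread indefinitely.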
participants (3): Anteru, Anthony Williams, Frank Mori Hess