Probably bug in condition::timed_wait()
data:image/s3,"s3://crabby-images/0b939/0b9397b80dfe65fd8e4e56664d7798dba1762adf" alt=""
Hi,
I'm porting custom library from UNIX to Windows and found that some
piece of code work correctly on UNIX and some time segfaulting on
Windows. I think, that problem is on boost::condition::timed_wait()
function. To be sure I write small test:
#include <stdexcept>
#include <iostream>
#include
data:image/s3,"s3://crabby-images/7e462/7e462d7dd00158b0a067f8a3b23a8e5edd2e9dce" alt=""
Pavel Syomin wrote:
Hi,
I'm porting custom library from UNIX to Windows and found that some piece of code work correctly on UNIX and some time segfaulting on Windows. I think, that problem is on boost::condition::timed_wait() function. To be sure I write small test:
[...] You are assuming that when condition::(timed_)wait returns the predicate (in your case m_counter != 0) is guaranteed to be true. This is not correct. condition::wait is allowed to return at any time (this is known as a "spurious wakeup".) You need to use a while loop. while( m_counter == 0 ) { if( !m_condition.timed_wait( lock, time ) ) return; // timeout } assert( m_counter != 0 ); --m_counter; I'm almost 100% certain that this was in the documentation once, but I can't find it now. Maybe it somehow got lost in the transition to the new doc layout.
data:image/s3,"s3://crabby-images/0b939/0b9397b80dfe65fd8e4e56664d7798dba1762adf" alt=""
Peter Dimov wrote:
You are assuming that when condition::(timed_)wait returns the predicate (in your case m_counter != 0) is guaranteed to be true. This is not correct. condition::wait is allowed to return at any time (this is known as a "spurious wakeup".) You need to use a while loop.
while( m_counter == 0 ) { if( !m_condition.timed_wait( lock, time ) ) return; // timeout }
assert( m_counter != 0 ); --m_counter;
I'm almost 100% certain that this was in the documentation once, but I can't find it now. Maybe it somehow got lost in the transition to the new doc layout.
Thank you! This is solve my problems, but why there is nothing about spurious wakeup in boost documentation?...
data:image/s3,"s3://crabby-images/7e462/7e462d7dd00158b0a067f8a3b23a8e5edd2e9dce" alt=""
Pavel Syomin wrote:
This is solve my problems, but why there is nothing about spurious wakeup in boost documentation?...
The older documentation http://web.archive.org/web/20041024220516/boost.org/libs/thread/doc/conditio... does have a few sentences that mention spurious wakeups; I've no idea how they got lost. I'll CC: the dev list.
data:image/s3,"s3://crabby-images/65735/65735192ead2ee1a1a66b8d00b06d39239008a20" alt=""
Peter Dimov wrote:
Pavel Syomin wrote:
Hi,
I'm porting custom library from UNIX to Windows and found that some piece of code work correctly on UNIX and some time segfaulting on Windows. I think, that problem is on boost::condition::timed_wait() function. To be sure I write small test:
[...]
You are assuming that when condition::(timed_)wait returns the predicate (in your case m_counter != 0) is guaranteed to be true. This is not correct. condition::wait is allowed to return at any time (this is known as a "spurious wakeup".) You need to use a while loop.
while( m_counter == 0 ) { if( !m_condition.timed_wait( lock, time ) ) return; // timeout }
Alternatively, you can avoid the need to explicitly use a while loop by calling the overloaded version of timed_wait() that takes a predicate argument. This version does the loop for you internally. E.g.: if (!(m_condition.timed_wait (lock, time, boost::bind<int> (boost::mem_fn (&AsyncQueue::m_counter), this)))) return; On a separate point, there is a bug in your usage of the xtime struct in main(): xtime_get(&time, TIME_UTC); time.nsec += 10000; queue.timed_pop(time); You cannot simply add a value to time.nsec and then use the struct, because nsec is required to be in the range [0,999999999]. [See http://www.cl.cam.ac.uk/~mgk25/time/c/.] By adding a value to nsec, you may cause it to overflow, which would invalidate the struct, and possibly cause unexpected behavior. You need to normalize the xtime struct by "carrying over" any excess part of nsec to sec. E.g.: #define NSECS_PER_SEC 1000000000 ... xtime_get(&time, TIME_UTC); time.nsec += 10000; if (time.nsec >= NSECS_PER_SEC) { time.sec += time.nsec / NSECS_PER_SEC; time.nsec = time.nsec % NSECS_PER_SEC; } queue.timed_pop(time); (This is an ugly but necessary consequence of Boost.Thread's unfortunate design decision to use a raw struct to represent time values. It would have been much better to use a class that encapsulates its internals, and that knows how to normalize itself.) Moshe
data:image/s3,"s3://crabby-images/0b939/0b9397b80dfe65fd8e4e56664d7798dba1762adf" alt=""
Hi!
Alternatively, you can avoid the need to explicitly use a while loop by calling the overloaded version of timed_wait() that takes a predicate argument. This version does the loop for you internally. E.g.:
if (!(m_condition.timed_wait (lock, time, boost::bind<int> (boost::mem_fn (&AsyncQueue::m_counter), this)))) return;
Thank you! I saw overloaded version of timed_wait() but didn't understand its purpose.
On a separate point, there is a bug in your usage of the xtime struct in main():
xtime_get(&time, TIME_UTC); time.nsec += 10000; queue.timed_pop(time);
You cannot simply add a value to time.nsec and then use the struct, because nsec is required to be in the range [0,999999999]. In really code I use sec, not nsec. But you are rigth.
participants (3)
-
Moshe Matitya
-
Pavel Syomin
-
Peter Dimov