Re: [boost] [thread] Timed waits in Boost.Thread potentially fundamentally broken on Windows (possibly rest of Boost too)

23 Jan 2015

On Friday 23 January 2015 12:08:38 Niall Douglas wrote:
...
> Dear all,
>
> CC: Stephan @ Microsoft - Stephan I'd love to know what the MSVC STL
> does below so we have the option of matching your behaviour.
>
> While investigating this bug report for Boost.Thread
> (https://svn.boost.org/trac/boost/ticket/9856) I have discovered a
> very worrying situation: it would appear that potentially all timed
> waits in Boost.Thread, and potentially in other parts of Boost, are
> broken on Windows Vista and later and have been for some years.
>
> The problem is in correct handling of timeouts. If one does this:
>
> mutex mtx;
> condition_variable cond;
> unique_lock<mutex> lk(mtx);
> assert(cv_status::timeout == cond.wait_for(lk, chrono::seconds(1)));
>
> ... one would reasonably expect occasional failures on POSIX due to
> spurious wakeups. It turns out that this also spuriously fails on
> Windows, which is a surprise probably to many as Windows hides signal
> handling (actually APCs) inside its Win32 APIs and automatically
> restarts the operation after interruption. There is, therefore, the
> potential that quite a lot of code written to use Boost.Thread on
> Windows makes the hard assumption that the assert above will never
> fail.

That assertion is wrong. Any code that assumes that cv.wait() does not
spuriously wake up is buggy, Windows or not. std::condition_variable is
specified the same way.
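
To illustrate the point about spurious wakeups, here is a minimal sketch of a timed wait that tolerates them, written against std::condition_variable. The names ready, wait_for_ready and set_ready are hypothetical and do not come from the ticket or from Boost.Thread; the sketch only assumes the standard predicate overload of wait_for.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex mtx;
std::condition_variable cond;
bool ready = false;  // hypothetical shared state, always accessed under mtx

// Returns true if `ready` became true within `timeout`, false otherwise.
bool wait_for_ready(std::chrono::milliseconds timeout)
{
    std::unique_lock<std::mutex> lk(mtx);
    // The predicate overload re-checks `ready` after every wakeup, so a
    // spurious wakeup simply resumes waiting until the deadline.
    return cond.wait_for(lk, timeout, [] { return ready; });
}

// Sets the flag under the lock, then wakes any waiters.
void set_ready()
{
    {
        std::lock_guard<std::mutex> guard(mtx);
        ready = true;
    }
    cond.notify_all();
}
```

With this shape the caller never inspects cv_status directly; the result depends only on the guarded state, which is what makes an early wakeup harmless.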
...
> This raises the question about what to do with Boost.Thread. We have
> the following options:
>
> Option 1: Timed waits are allowed to spuriously fail by the standard,
> so we mark this as wontfix and move on. Anyone using the predicate
> timed waits has never seen a problem here anyway.

That's right. There's nothing to fix: spurious wakeups are expected and should
be accounted for in the user's code regardless of the underlying operating
system.
...
> Option 2: We loop waiting until steady_clock (really
> QueryPerformanceCounter under Boost) shows the requested timeout has
> passed. Problem: This wastes battery power and generates needless
> wakeups. A more intelligent implementation would ask Windows for the
> thread quanta and transform timeouts to match the Vista kernel
> scheduler in combination with always using deadline scheduling, but
> this would slow down the timed waits implementation.

That is a missed notification waiting to happen.
...
> Option 3: We adjust Boost.Thread to return timeouts when Windows
> returns a timed out status code, even if the actual time waited is
> considerably lower than the time requested. Problem: some code
> written for POSIX where when you ask for a timeout you always get it
> may misbehave in this situation.

That is simply incorrect. Why would you indicate a timeout when none occurred? 
This will surely break some timed code.

The standard description is pretty clear: return cv_status::timeout only when 
the timeout has expired, otherwise return cv_status::no_timeout. Boost.Thread 
should follow this.
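
From the caller's side, the rule above can be sketched as a loop over an absolute deadline. This is an illustration under stated assumptions, not Boost.Thread's implementation: wait_flag and its parameters are hypothetical names, and the code uses std::condition_variable::wait_until with steady_clock.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Waits until `flag` is set or `timeout` has elapsed on steady_clock.
// Returns cv_status::timeout only if the deadline passed with `flag` unset.
std::cv_status wait_flag(std::mutex& m,
                         std::condition_variable& cv,
                         const bool& flag,
                         std::chrono::milliseconds timeout)
{
    // Compute the deadline once so repeated wakeups do not extend the wait.
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    std::unique_lock<std::mutex> lk(m);
    while (!flag) {
        if (cv.wait_until(lk, deadline) == std::cv_status::timeout) {
            // Deadline reached: the final state of the flag decides the result.
            return flag ? std::cv_status::no_timeout : std::cv_status::timeout;
        }
        // no_timeout: either a notification or a spurious wakeup;
        // the loop condition re-checks the flag in both cases.
    }
    return std::cv_status::no_timeout;
}
```

Using an absolute deadline rather than re-issuing a relative wait_for keeps the total wait bounded no matter how many early wakeups occur.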

Andrey Semashev