[thread] Timed waits in Boost.Thread potentially fundamentally broken on Windows (possibly rest of Boost too)

Dear all,
CC: Stephan @ Microsoft - Stephan I'd love to know what the MSVC STL
does below so we have the option of matching your behaviour.
While investigating this bug report for Boost.Thread
(https://svn.boost.org/trac/boost/ticket/9856) I have discovered a
very worrying situation: it would appear that potentially all timed
waits in Boost.Thread, and potentially in other parts of Boost, are
broken on Windows Vista and later and have been for some years.
The problem is in correct handling of timeouts. If one does this:
mutex mtx;
condition_variable cond;
unique_lock<mutex> lk(mtx);
assert(cv_status::timeout == cond.wait_for(lk, chrono::seconds(1)));
... one would reasonably expect occasional failures on POSIX due to spurious wakeups. It turns out that this also spuriously fails on Windows, which probably comes as a surprise to many, as Windows hides signal handling (actually APCs) inside its Win32 APIs and automatically restarts the operation after interruption. There is, therefore, the potential that quite a lot of code written to use Boost.Thread on Windows makes the hard assumption that the assert above will never fail.
The reason why Windows spuriously fails above isn't due to spurious
wakeups, it is in fact due to changes in the Vista kernel scheduler
as documented at
https://technet.microsoft.com/en-us/magazine/2007.02.vistakernel.aspx.
In essence, if you now ask Windows to go to sleep for X milliseconds,
Windows Vista onwards will in fact sleep for anywhere between zero
and X+N milliseconds where N is some arbitrarily long value. In other
words, timeouts in Windows are purely advisory, and are freely
ignored by the Windows kernel from Vista onwards. You can test this
for yourself using this little program which reduces the #9856 bug
report to its Win32 API essentials:
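(The original listing was truncated in the archive; what follows is a minimal sketch of such a test rather than the original program. It times a wait on a never-signalled event with QueryPerformanceCounter() and prints how long Windows actually waited.)

// Minimal sketch (not the original listing): request a 10 ms wait on an
// event that is never signalled, and measure how long the wait really
// took. On Vista onwards the measured time can differ considerably from
// the requested 10 ms in either direction.
#include <windows.h>
#include <stdio.h>

int main()
{
    HANDLE ev = CreateEvent(NULL, TRUE, FALSE, NULL); // manual-reset, never set
    LARGE_INTEGER freq, begin, end;
    QueryPerformanceFrequency(&freq);
    for (int n = 0; n < 20; ++n)
    {
        QueryPerformanceCounter(&begin);
        DWORD ret = WaitForSingleObject(ev, 10);      // ask for a 10 ms timeout
        QueryPerformanceCounter(&end);
        double ms = 1000.0 * (double)(end.QuadPart - begin.QuadPart) / (double)freq.QuadPart;
        printf("wait %d: returned %lu after %.3f ms\n", n, ret, ms);
    }
    CloseHandle(ev);
    return 0;
}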

On Friday 23 January 2015 12:08:38 Niall Douglas wrote:
> Dear all,
> CC: Stephan @ Microsoft - Stephan, I'd love to know what the MSVC STL does below so we have the option of matching your behaviour.
> While investigating this bug report for Boost.Thread (https://svn.boost.org/trac/boost/ticket/9856) I have discovered a very worrying situation: it would appear that potentially all timed waits in Boost.Thread, and potentially in other parts of Boost, are broken on Windows Vista and later and have been for some years.
> The problem is in correct handling of timeouts. If one does this:
> mutex mtx; condition_variable cond; unique_lock<mutex> lk(mtx); assert(cv_status::timeout == cond.wait_for(lk, chrono::seconds(1)));
> ... one would reasonably expect occasional failures on POSIX due to spurious wakeups. It turns out that this also spuriously fails on Windows, which probably comes as a surprise to many, as Windows hides signal handling (actually APCs) inside its Win32 APIs and automatically restarts the operation after interruption. There is, therefore, the potential that quite a lot of code written to use Boost.Thread on Windows makes the hard assumption that the assert above will never fail.
That assert is false. Any code that assumes that cv.wait() does not spuriously wake up is buggy, Windows or not. The standard specifies this for std::condition_variable as well.
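(As an illustration of the point, a minimal sketch of the conforming pattern using std::condition_variable's predicate overload; hypothetical example code, not taken from Boost.Thread:)

// The predicate overload loops internally, so a spurious wakeup merely
// re-checks the condition and goes back to waiting. It returns false
// only if the deadline passes with the predicate still false.
#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex mtx;
std::condition_variable cond;
bool ready = false; // the real condition being waited for

bool wait_ready_for_one_second()
{
    std::unique_lock<std::mutex> lk(mtx);
    return cond.wait_for(lk, std::chrono::seconds(1), []{ return ready; });
}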
> This raises the question about what to do with Boost.Thread. We have the following options:
> Option 1: Timed waits are allowed to spuriously fail by the standard, so we mark this as wontfix and move on. Anyone using the predicate timed waits has never seen a problem here anyway.
That's right. There's nothing to fix; spurious wakeups are expected and should be accounted for in the user's code regardless of the underlying operating system.
> Option 2: We loop waiting until steady_clock (really QueryPerformanceCounter under Boost) shows the requested timeout has passed. Problem: This wastes battery power and generates needless wakeups. A more intelligent implementation would ask Windows for the thread quanta and transform timeouts to match the Vista kernel scheduler in combination with always using deadline scheduling, but this would slow down the timed waits implementation.
That is a missed notification waiting to happen.
> Option 3: We adjust Boost.Thread to return timeouts when Windows returns a timed out status code, even if the actual time waited is considerably lower than the time requested. Problem: some code written for POSIX where when you ask for a timeout you always get it may misbehave in this situation.
That is simply incorrect. Why would you indicate a timeout when none occurred? This will surely break some timed code. The standard description is pretty clear: return cv_status::timeout only when the timeout has expired, otherwise return cv_status::no_timeout. Boost.Thread should follow this.

On 23 Jan 2015 at 15:35, Andrey Semashev wrote:
>> Option 2: We loop waiting until steady_clock (really QueryPerformanceCounter under Boost) shows the requested timeout has passed. Problem: This wastes battery power and generates needless wakeups. A more intelligent implementation would ask Windows for the thread quanta and transform timeouts to match the Vista kernel scheduler in combination with always using deadline scheduling, but this would slow down the timed waits implementation.
> That is a missed notification waiting to happen.
For reference for those pondering this option: there is no possibility of missed notifications on Windows because, unless you use PulseEvent() (we don't), they can't happen under the Win32 threading model. There are also options 2(a) and 2(b) here: (a) loop the wait, and (b) use deadline timer scheduling instead of timeouts. Note we already do the latter for larger timeouts, but it is currently not being adjusted for NT kernel quanta since Vista.
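(A minimal sketch of option 2(a) at the Win32 level; a hypothetical helper, not the Boost.Thread implementation. If the kernel reports WAIT_TIMEOUT before steady_clock agrees the deadline has passed, the remainder is computed and the wait is reissued:)

// Never report a timeout to the caller until steady_clock confirms the
// deadline has really passed.
#include <windows.h>
#include <chrono>

DWORD wait_until_deadline(HANDLE h, std::chrono::steady_clock::time_point deadline)
{
    for (;;)
    {
        auto now = std::chrono::steady_clock::now();
        if (now >= deadline)
            return WAIT_TIMEOUT;
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(deadline - now);
        DWORD ret = WaitForSingleObject(h, (DWORD)(ms.count() + 1)); // round up
        if (ret != WAIT_TIMEOUT)
            return ret; // signalled, abandoned or failed: pass straight through
        // The kernel said "timed out" early: loop and re-check the clock.
    }
}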
>> Option 3: We adjust Boost.Thread to return timeouts when Windows returns a timed out status code, even if the actual time waited is considerably lower than the time requested. Problem: some code written for POSIX where when you ask for a timeout you always get it may misbehave in this situation.
> That is simply incorrect. Why would you indicate a timeout when none occurred? This will surely break some timed code.
Some would say that if Windows claims a timeout, we should return a timeout. I suspect this is what the Dinkumware STL will do, and for compatibility we may wish to match that.
> The standard description is pretty clear: return cv_status::timeout only when the timeout has expired, otherwise return cv_status::no_timeout. Boost.Thread should follow this.
This is the current behaviour. However, and it is a big however, the semantics are subtly different. On POSIX you either get your wait as long as you asked or a spurious wakeup. On Windows you are ordinarily getting a wait between nothing and an arbitrarily higher amount than requested. This is a "spurious wakeup on steroids". The key point here is that Windows spurious wakeups are occurring *much* more frequently than on POSIX. This has implications for battery life and plenty more.
Niall

On Friday 23 January 2015 13:23:48 Niall Douglas wrote:
> On 23 Jan 2015 at 15:35, Andrey Semashev wrote:
>>> Option 2: We loop waiting until steady_clock (really QueryPerformanceCounter under Boost) shows the requested timeout has passed. Problem: This wastes battery power and generates needless wakeups. A more intelligent implementation would ask Windows for the thread quanta and transform timeouts to match the Vista kernel scheduler in combination with always using deadline scheduling, but this would slow down the timed waits implementation.
>> That is a missed notification waiting to happen.
> For reference for those pondering this option: there is no possibility of missed notifications on Windows because, unless you use PulseEvent() (we don't), they can't happen under the Win32 threading model.
There is, if a spurious wakeup happens. There is a window between returning from a wait function and re-locking the mutex. If you loop without checking for a condition, notifications in that window will be missed.
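(A sketch of the race being described; this is the broken pattern, hypothetical code rather than anything in Boost.Thread:)

// BROKEN: retrying the wait on the clock alone. A notify_one() delivered
// between two iterations (after one wait returns, before the next begins)
// does not make the loop exit, so the notification is effectively missed.
// The fix is to wait on a predicate that records the notified state.
#include <boost/chrono.hpp>
#include <boost/thread/condition_variable.hpp>
#include <boost/thread/mutex.hpp>

void broken_full_interval_wait(boost::condition_variable& cv,
                               boost::unique_lock<boost::mutex>& lk,
                               boost::chrono::steady_clock::time_point deadline)
{
    while (boost::chrono::steady_clock::now() < deadline)
        cv.wait_until(lk, deadline); // nobody checks *why* we woke up
}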
>>> Option 3: We adjust Boost.Thread to return timeouts when Windows returns a timed out status code, even if the actual time waited is considerably lower than the time requested. Problem: some code written for POSIX where when you ask for a timeout you always get it may misbehave in this situation.
>> That is simply incorrect. Why would you indicate a timeout when none occurred? This will surely break some timed code.
> Some would say that if Windows claims a timeout, we should return a timeout. I suspect this is what the Dinkumware STL will do, and for compatibility we may wish to match that.
We're not mimicking OS behavior in Boost.Thread - that's the point of it being a portability layer. The library implements standard C++ components and as such should behave as close to the standard as possible. If Windows is not well suited for it then oh well... MS should fix Windows then to be more efficient.
>> The standard description is pretty clear: return cv_status::timeout only when the timeout has expired, otherwise return cv_status::no_timeout. Boost.Thread should follow this.
> This is the current behaviour. However, and it is a big however, the semantics are subtly different. On POSIX you either get your wait as long as you asked or a spurious wakeup. On Windows you are ordinarily getting a wait between nothing and an arbitrarily higher amount than requested. This is a "spurious wakeup on steroids".
I don't see the difference. On POSIX, you're not guaranteed to be woken up exactly at the timeout either. And spurious wakeups can potentially happen as often as one can emit signals to the process. Granted, that usually doesn't happen that often, but conceptually this is not different from Windows.
> The key point here is that Windows spurious wakeups are occurring *much* more frequently than on POSIX. This has implications for battery life and plenty more.
So we're talking about efficiency, not correctness as you originally stated?

On 23 Jan 2015 at 16:46, Andrey Semashev wrote:
>> For reference for those pondering this option: there is no possibility of missed notifications on Windows because, unless you use PulseEvent() (we don't), they can't happen under the Win32 threading model.
> There is, if a spurious wakeup happens. There is a window between returning from a wait function and re-locking the mutex. If you loop without checking for a condition, notifications in that window will be missed.
Not in Boost.Thread. (Longer answer: Boost.Thread has a complex internal infrastructure of wait objects on Windows to enable emulation of thread cancellation amongst other things. We regularly unlock the user-supplied mutex for extended periods of time. We fix that up with more special code in the Boost.Thread condition variable implementation. This is why mixing std::condition_variable with Boost.Thread does not work, but the wider point is that we don't lose notifications if Boost.Thread primitives are used.)
>>>> Option 3: We adjust Boost.Thread to return timeouts when Windows returns a timed out status code, even if the actual time waited is considerably lower than the time requested. Problem: some code written for POSIX where when you ask for a timeout you always get it may misbehave in this situation.
>>> That is simply incorrect. Why would you indicate a timeout when none occurred? This will surely break some timed code.
>> Some would say that if Windows claims a timeout, we should return a timeout. I suspect this is what the Dinkumware STL will do, and for compatibility we may wish to match that.
> We're not mimicking OS behavior in Boost.Thread - that's the point of it being a portability layer.
The Dinkumware STL behaviour is not OS behaviour. It's one of the big three STLs.
> The library implements standard C++ components and as such should behave as close to the standard as possible.
The standard says nothing about what is or is not a spurious wakeup unfortunately.
> If Windows is not well suited for it then oh well... MS should fix Windows then to be more efficient.
Vista made these changes to scheduling for efficiency purposes. I suspect Boost.Thread was written for an XP or earlier target.
>>> The standard description is pretty clear: return cv_status::timeout only when the timeout has expired, otherwise return cv_status::no_timeout. Boost.Thread should follow this.
>> This is the current behaviour. However, and it is a big however, the semantics are subtly different. On POSIX you either get your wait as long as you asked or a spurious wakeup. On Windows you are ordinarily getting a wait between nothing and an arbitrarily higher amount than requested. This is a "spurious wakeup on steroids".
> I don't see the difference. On POSIX, you're not guaranteed to be woken up exactly at the timeout either. And spurious wakeups can potentially happen as often as one can emit signals to the process. Granted, that usually doesn't happen that often, but conceptually this is not different from Windows.
My reading of http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_timedwait.html says that the timed wait may not return timed out if abstime has not passed. Unfortunately abstime is measured against the system clock, which may arbitrarily move around, but that's the POSIX definition. C++ is written to use steady_clock, which doesn't move around, but otherwise I believe the guarantees are the same.
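(A minimal sketch of that POSIX contract, as a hypothetical example: pthread_cond_timedwait() may return 0 at any time, for a notification or spuriously, but may return ETIMEDOUT only once abstime has passed on the condition variable's clock:)

// 0 means woken (notified or spuriously) and is allowed at any time;
// ETIMEDOUT is only allowed once abstime has passed.
#include <cerrno>
#include <ctime>
#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t c = PTHREAD_COND_INITIALIZER;

bool timed_out_after_one_second()
{
    timespec abstime;
    clock_gettime(CLOCK_REALTIME, &abstime); // the default CV clock is the system clock
    abstime.tv_sec += 1;
    pthread_mutex_lock(&m);
    int rc = pthread_cond_timedwait(&c, &m, &abstime);
    pthread_mutex_unlock(&m);
    return rc == ETIMEDOUT; // never reported before abstime has passed
}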
>> The key point here is that Windows spurious wakeups are occurring *much* more frequently than on POSIX. This has implications for battery life and plenty more.
> So we're talking about efficiency, not correctness as you originally stated?
Let's call it correctness of expectation of behaviour by the community. It's why I'm asking for advice here instead of unilaterally deciding on my own. None of the three options is without unhelpful consequence. For example, would the community be happy if on Windows timed waits always lasted at least the timeout interval requested? In other words, we guarantee that timeout intervals requested are honoured?
Niall

On Friday 23 January 2015 14:12:19 Niall Douglas wrote:
> On 23 Jan 2015 at 16:46, Andrey Semashev wrote:
>>> Some would say that if Windows claims a timeout, we should return a timeout. I suspect this is what the Dinkumware STL will do, and for compatibility we may wish to match that.
>> We're not mimicking OS behavior in Boost.Thread - that's the point of it being a portability layer.
> The Dinkumware STL behaviour is not OS behaviour. It's one of the big three STLs.
What I'm saying is that Boost.Thread should implement the standard interface, not something else's. If Windows reports a timeout when it didn't actually happen (i.e. it has not been reached yet), Boost.Thread should not report the timeout to users. If this is what actually happens, and the Dinkumware STL does that, then there is a bug in the Dinkumware STL, and we should not copy it.
>> If Windows is not well suited for it then oh well... MS should fix Windows then to be more efficient.
> Vista made these changes to scheduling for efficiency purposes. I suspect Boost.Thread was written for an XP or earlier target.
I'm confused. Boost.Thread has always implemented the standard behavior, with the possibility of spurious wakeups. Windows before Vista did not exhibit spurious wakeups (which was ok); since Vista it started doing this, and this improved efficiency (I assume the estimate of the improvement included the negative effect on the applications dealing with the wakeups). Boost.Thread is still behaving correctly wrt the standard. So why would you want to change Boost.Thread and conceal spurious wakeups, making it less efficient? I'll reiterate that any current use of a cv must deal with spurious wakeups already.
>>>> The standard description is pretty clear: return cv_status::timeout only when the timeout has expired, otherwise return cv_status::no_timeout. Boost.Thread should follow this.
>>> This is the current behaviour. However, and it is a big however, the semantics are subtly different. On POSIX you either get your wait as long as you asked or a spurious wakeup. On Windows you are ordinarily getting a wait between nothing and an arbitrarily higher amount than requested. This is a "spurious wakeup on steroids".
>> I don't see the difference. On POSIX, you're not guaranteed to be woken up exactly at the timeout either. And spurious wakeups can potentially happen as often as one can emit signals to the process. Granted, that usually doesn't happen that often, but conceptually this is not different from Windows.
> My reading of http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_timedwait.html says that the timed wait may not return timed out if abstime has not passed. Unfortunately abstime is measured against the system clock, which may arbitrarily move around, but that's the POSIX definition.
Note that this means that a wait is allowed to return either before or after the timeout. So this is basically what I said.
>> So we're talking about efficiency, not correctness as you originally stated?
> Let's call it correctness of expectation of behaviour by the community. It's why I'm asking for advice here instead of unilaterally deciding on my own. None of the three options is without unhelpful consequence.
> For example, would the community be happy if on Windows timed waits always lasted at least the timeout interval requested? In other words, we guarantee that timeout intervals requested are honoured?
My opinion is that there is no point in that, since any current use of a cv must involve a loop and a condition anyway (or a wait with a predicate). Trying to deal with spurious wakeups in the cv implementation is just a waste of resources.

Andrey Semashev wrote:
> Trying to deal with spurious wakeups in the cv implementation is just a waste of resources.
It's worse than that - it's deceiving users that their code is correct, when it isn't.

On 23 Jan 2015 at 18:30, Andrey Semashev wrote:
>> Vista made these changes to scheduling for efficiency purposes. I suspect Boost.Thread was written for an XP or earlier target.
> I'm confused. Boost.Thread has always implemented the standard behavior, with the possibility of spurious wakeups. Windows before Vista did not exhibit spurious wakeups (which was ok); since Vista it started doing this, and this improved efficiency (I assume the estimate of the improvement included the negative effect on the applications dealing with the wakeups). Boost.Thread is still behaving correctly wrt the standard. So why would you want to change Boost.Thread and conceal spurious wakeups, making it less efficient? I'll reiterate that any current use of a cv must deal with spurious wakeups already.
Firstly, my thanks to both you and Peter for your thoughts on this.
After sleeping on it for a night, this is what I'll do: I'm going to update Thread's clamping of timeouts to better match Windows since Vista onwards. This basically means asking Windows what its current scheduling quantum is (it can vary from moment to moment according to what applications request) and clamping the timeout sent to Windows APIs to that quantum. Therefore, if you asked for a 10 ms timeout, Thread will clamp that probably to 15 ms before sending it on to Windows. I'll always round upwards, so there is a greater potential for a timed wait to take one quantum longer than it should before timing out, but then that can happen anyway at any time.
None of this will remove spurious wakeups, but it will reduce their frequency considerably. It means that the CPU spends more time sleeping and less being woken up and put back to sleep by a predicate check loop.
If anyone has a problem with this solution, now is the time to speak.
Niall
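(A minimal sketch of the proposed rounding, assuming the clock tick length reported by GetSystemTimeAdjustment() stands in for the quantum; a hypothetical helper, not the shipped code:)

// Round a millisecond timeout up to a whole number of clock ticks.
// GetSystemTimeAdjustment() reports the tick length in 100 ns units;
// treating that as the quantum is an assumption of this sketch.
#include <windows.h>

DWORD round_timeout_up_to_tick(DWORD timeout_ms)
{
    DWORD adjustment = 0, increment = 0;
    BOOL adjustment_disabled = FALSE;
    if (!GetSystemTimeAdjustment(&adjustment, &increment, &adjustment_disabled)
        || increment == 0)
        return timeout_ms;                      // fall back to the raw value
    DWORD tick_ms = (increment + 9999) / 10000; // 100 ns units -> ms, rounded up
    if (tick_ms == 0)
        tick_ms = 1;
    return ((timeout_ms + tick_ms - 1) / tick_ms) * tick_ms; // always round up
}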

Niall Douglas wrote:
> After sleeping on it for a night, this is what I'll do: I'm going to update Thread's clamping of timeouts to better match Windows since Vista onwards. This basically means asking Windows what its current scheduling quantum is (it can vary from moment to moment according to what applications request) and clamping the timeout sent to Windows APIs to that quantum. Therefore, if you asked for a 10 ms timeout, Thread will clamp that probably to 15 ms before sending it on to Windows. I'll always round upwards, so there is a greater potential for a timed wait to take one quantum longer than it should before timing out, but then that can happen anyway at any time.
> ...
> If anyone has a problem with this solution, now is the time to speak.
I do.
The point of the timeout waits is that you have a deadline. On a non-realtime OS, this deadline can never be absolute, but it still makes no sense for the library to deliberately arrange things so that this deadline will never be met.
You have no business changing the timeout value that the user has given you. Your task is to tell the kernel what the user wants. What the kernel does with this information is its own business and its responsibility.
It's not just the size of the quantum that matters, but also where in the quantum the thread is at the moment it makes the sleep call. The kernel knows that it would make no sense to wake up a thread and then immediately context-switch away from it, so it arranges things to avoid this by either waking it up earlier, so that it has a portion of the quantum still available, or later, at the start of the next quantum. Sometimes 'earlier' is closer to what you asked, so it wakes you up earlier. This is exactly how it must be.

On 24 Jan 2015 at 18:01, Peter Dimov wrote:
>> If anyone has a problem with this solution, now is the time to speak.
> I do.
> The point of the timeout waits is that you have a deadline. On a non-realtime OS, this deadline can never be absolute, but it still makes no sense for the library to deliberately arrange things so that this deadline will never be met.
> You have no business changing the timeout value that the user has given you. Your task is to tell the kernel what the user wants. What the kernel does with this information is its own business and its responsibility.
> It's not just the size of the quantum that matters, but also where in the quantum the thread is at the moment it makes the sleep call. The kernel knows that it would make no sense to wake up a thread and then immediately context-switch away from it, so it arranges things to avoid this by either waking it up earlier, so that it has a portion of the quantum still available, or later, at the start of the next quantum. Sometimes 'earlier' is closer to what you asked, so it wakes you up earlier. This is exactly how it must be.
You may not be aware that Thread already substantially manipulates the timeout sent to Windows. Firstly it converts any input timeouts/deadlines into a steady_clock deadline. That enters the Win32 implementation. This code then extracts a DWORD millisecond timeout interval suitable for feeding to Windows. If that timeout is 20 ms or higher, it takes a code path based around deadline scheduling kernel objects via CreateWaitableTimer. If that timeout is 19 ms or less it feeds it directly to the kernel wait composure routine.
The problem with the above strategy is that it was clearly designed for a time when Windows had a fixed tick interval of 10 ms multiples, the kernel didn't coalesce timers, and the kernel wasn't tickless.
What I'm planning to do is very simple: we always use deadline timer scheduling from now on, so quite literally the steady_clock deadline that comes in is exactly that handed to Windows unmodified. I was also going to try setting a tolerable delay via SetWaitableTimerEx() on Vista and later, where the tolerable delay is calculated as 10% of the interval to the deadline but clamped to:
timeGetTime() <= tolerable delay <= 250 ms
So, no one is lying to the kernel; if anything I'm removing all lying to the kernel and giving it more information with which to delay timeouts. Would you find this acceptable?
Niall
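(A sketch of that plan as a hypothetical stand-alone helper; the real code would sit inside Thread's internal wait composure. The deadline is handed to a waitable timer, with the 10%-clamped tolerable delay passed through SetWaitableTimerEx():)

// Wait on 'waitable' against a steady_clock deadline handled by a
// waitable timer, giving the kernel a tolerable delay so it may
// coalesce the wakeup.
#include <windows.h>
#include <chrono>

DWORD deadline_wait(HANDLE waitable, std::chrono::steady_clock::time_point deadline)
{
    HANDLE timer = CreateWaitableTimer(NULL, TRUE, NULL); // manual-reset
    auto rel = std::chrono::duration_cast<std::chrono::milliseconds>(
        deadline - std::chrono::steady_clock::now());
    if (rel.count() < 0)
        rel = std::chrono::milliseconds(0);

    LARGE_INTEGER due;
    due.QuadPart = -rel.count() * 10000; // negative = relative, in 100 ns units

    ULONG tolerable = (ULONG)(rel.count() / 10); // 10% of the interval...
    if (tolerable < 1)   tolerable = 1;
    if (tolerable > 250) tolerable = 250;        // ...clamped to <= 250 ms

    SetWaitableTimerEx(timer, &due, 0, NULL, NULL, NULL, tolerable);
    HANDLE handles[2] = { waitable, timer };
    DWORD ret = WaitForMultipleObjects(2, handles, FALSE, INFINITE);
    CloseHandle(timer);
    return ret; // WAIT_OBJECT_0: signalled; WAIT_OBJECT_0 + 1: deadline reached
}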

Niall Douglas wrote:
> So, no one is lying to the kernel; if anything I'm removing all lying to the kernel and giving it more information with which to delay timeouts. Would you find this acceptable?
Yes, of course. Thank you for the patient explanation.

On Sat, Jan 24, 2015 at 7:52 PM, Niall Douglas wrote:
> On 24 Jan 2015 at 18:01, Peter Dimov wrote:
>>> If anyone has a problem with this solution, now is the time to speak.
>> I do.
> You may not be aware that Thread already substantially manipulates the timeout sent to Windows. Firstly it converts any input timeouts/deadlines into a steady_clock deadline. That enters the Win32 implementation. This code then extracts a DWORD millisecond timeout interval suitable for feeding to Windows. If that timeout is 20 ms or higher, it takes a code path based around deadline scheduling kernel objects via CreateWaitableTimer. If that timeout is 19 ms or less it feeds it directly to the kernel wait composure routine.
> The problem with the above strategy is that it was clearly designed for a time when Windows had a fixed tick interval of 10 ms multiples, the kernel didn't coalesce timers, and the kernel wasn't tickless.
I'm not sure that was done to account for a specific time quantum duration. My understanding is that this was mainly done for two reasons. First, to make shorter waits more efficient. Second, waitable timers take into account system time shifts, which is more important for longer waits.
FWIW, as I remember, the Windows quantum duration is adjustable both by the user and by applications, and by default on desktop is about 15 ms. So it makes little sense to aim for a specific quantum value, much less a multiple of 10 ms. The optimization for shorter waits may not be relevant anymore, but before switching to waitable timers for all absolute waits one should conduct some performance tests.
And there's another consideration. Waitable timers are useful for absolute waits. For relative waits I would still like Boost.Thread to use relative system waits. The reason is that relative waits do not react to system time shifts and do not require a waitable timer kernel object.
> What I'm planning to do is very simple: we always use deadline timer scheduling from now on, so quite literally the steady_clock deadline that comes in is exactly that handed to Windows unmodified. I was also going to try setting a tolerable delay via SetWaitableTimerEx() on Vista and later, where the tolerable delay is calculated as 10% of the interval to the deadline but clamped to:
> timeGetTime() <= tolerable delay <= 250 ms
I'm not sure I understood this. timeGetTime() returns system time since boot. Did you mean that you would somehow discover the quantum duration and use it for the tolerable delay?

On 24 Jan 2015 at 21:29, Andrey Semashev wrote:
>> The problem with the above strategy is that it was clearly designed for a time when Windows had a fixed tick interval of 10 ms multiples, the kernel didn't coalesce timers, and the kernel wasn't tickless.
> I'm not sure that was done to account for a specific time quantum duration. My understanding is that this was mainly done for two reasons. First, to make shorter waits more efficient.
I can see that. Right now it creates a brand new waitable timer object on every single timed wait if the interval is >= 20 ms. I would lazily initialise and cache one in thread-local data instead.
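(A sketch of that caching, assuming C++11 thread_local is available; a hypothetical helper rather than the actual per-thread data Thread uses:)

// One lazily created waitable timer per thread, reused across timed
// waits instead of calling CreateWaitableTimer on every wait.
#include <windows.h>

HANDLE this_thread_waitable_timer()
{
    struct timer_holder
    {
        HANDLE h;
        timer_holder() : h(CreateWaitableTimer(NULL, TRUE, NULL)) {}
        ~timer_holder() { if (h) CloseHandle(h); }
    };
    static thread_local timer_holder t; // constructed on first use in each thread
    return t.h;
}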
> Second, waitable timers take into account system time shifts, which is more important for longer waits.
They _optionally_ do, yes. They can do relative as well as absolute waits, and if the value was absolute then it is adjusted for system clock shifts. Thread already has support for this.
> FWIW, as I remember, the Windows quantum duration is adjustable both by the user and by applications, and by default on desktop is about 15 ms. So it makes little sense to aim for a specific quantum value, much less a multiple of 10 ms.
The quantum can be reduced to 0.9 ms, or be anywhere between 0.9 ms and ~15 ms, and can change at any time.
> The optimization for shorter waits may not be relevant anymore, but before switching to waitable timers for all absolute waits one should conduct some performance tests.
Agreed.
> And there's another consideration. Waitable timers are useful for absolute waits. For relative waits I would still like Boost.Thread to use relative system waits. The reason is that relative waits do not react to system time shifts and do not require a waitable timer kernel object.
This is interesting actually. The win32 interruptible_wait() and uninterruptible_wait() functions consume a detail::timeout which is capable of transferring either a relative or an absolute timeout. However, condition_variable timed wait and thread timed join never call with anything but an absolute deadline timeout, and convert all relative to absolute. Oddly enough, this_thread::sleep_until seems to convert its absolute timeout to a relative one and thunks through this_thread::sleep_for, so here all absolutes are converted to relative.
So we have the odd situation that condition_variable and thread join are always implemented as absolute timeouts, while thread sleep is always implemented as relative timeouts. I also see that timed mutex always converts absolute to relative too on win32.
Some of this isn't win32 specific either. I can see some code there which looks like these are decisions being made by upper layers of code and so would affect POSIX as well.
I've added this issue as https://svn.boost.org/trac/boost/ticket/10967.
Some references to code:
Always uses absolute: https://github.com/boostorg/thread/blob/develop/include/boost/thread/win32/condition_variable.hpp#L92
Always uses relative: https://github.com/boostorg/thread/blob/develop/include/boost/thread/win32/basic_timed_mutex.hpp#L190
Always uses relative: https://github.com/boostorg/thread/blob/develop/include/boost/thread/v2/thread.hpp#L60
Always uses absolute: https://github.com/boostorg/thread/blob/develop/include/boost/thread/detail/thread.hpp#L551
>> What I'm planning to do is very simple: we always use deadline timer scheduling from now on, so quite literally the steady_clock deadline that comes in is exactly that handed to Windows unmodified. I was also going to try setting a tolerable delay via SetWaitableTimerEx() on Vista and later, where the tolerable delay is calculated as 10% of the interval to the deadline but clamped to:
>> timeGetTime() <= tolerable delay <= 250 ms
> I'm not sure I understood this. timeGetTime() returns system time since boot. Did you mean that you would somehow discover the quantum duration and use it for the tolerable delay?
Sorry, I was being sloppy. The current quantum is the KiCyclesPerClockQuantum kernel variable. I believe there is a user space function for that now... a quick google search says GetSystemTimeAdjustment(). It literally reads from a memory location in the kernel, so it's as quick as reading a variable.
Niall

On 24/01/15 20:56, Niall Douglas wrote:
> On 24 Jan 2015 at 21:29, Andrey Semashev wrote:
>> And there's another consideration. Waitable timers are useful for absolute waits. For relative waits I would still like Boost.Thread to use relative system waits. The reason is that relative waits do not react to system time shifts and do not require a waitable timer kernel object.
> This is interesting actually. The win32 interruptible_wait() and uninterruptible_wait() functions consume a detail::timeout which is capable of transferring either a relative or an absolute timeout. However, condition_variable timed wait and thread timed join never call with anything but an absolute deadline timeout, and convert all relative to absolute. Oddly enough, this_thread::sleep_until seems to convert its absolute timeout to a relative one and thunks through this_thread::sleep_for, so here all absolutes are converted to relative.
> So we have the odd situation that condition_variable and thread join are always implemented as absolute timeouts, while thread sleep is always implemented as relative timeouts. I also see that timed mutex always converts absolute to relative too on win32.
> Some of this isn't win32 specific either. I can see some code there which looks like these are decisions being made by upper layers of code and so would affect POSIX as well.
> I've added this issue as https://svn.boost.org/trac/boost/ticket/10967.
> Some references to code:
> Always uses absolute: https://github.com/boostorg/thread/blob/develop/include/boost/thread/win32/condition_variable.hpp#L92
> Always uses relative: https://github.com/boostorg/thread/blob/develop/include/boost/thread/win32/basic_timed_mutex.hpp#L190
> Always uses relative: https://github.com/boostorg/thread/blob/develop/include/boost/thread/v2/thread.hpp#L60
> Always uses absolute: https://github.com/boostorg/thread/blob/develop/include/boost/thread/detail/thread.hpp#L551
Thanks Niall for this report. As you know, I'm unable to check on Windows. Please look at the ticket; I'm here to help fix these inconsistencies.
Best,
Vicente

On 24/01/15 15:18, Niall Douglas wrote:
> On 23 Jan 2015 at 18:30, Andrey Semashev wrote:
>>> Vista made these changes to scheduling for efficiency purposes. I suspect Boost.Thread was written for an XP or earlier target.
>> I'm confused. Boost.Thread has always implemented the standard behavior, with the possibility of spurious wakeups. Windows before Vista did not exhibit spurious wakeups (which was ok); since Vista it started doing this, and this improved efficiency (I assume the estimate of the improvement included the negative effect on the applications dealing with the wakeups). Boost.Thread is still behaving correctly wrt the standard. So why would you want to change Boost.Thread and conceal spurious wakeups, making it less efficient? I'll reiterate that any current use of a cv must deal with spurious wakeups already.
> Firstly, my thanks to both you and Peter for your thoughts on this.
> After sleeping on it for a night, this is what I'll do: I'm going to update Thread's clamping of timeouts to better match Windows since Vista onwards. This basically means asking Windows what its current scheduling quantum is (it can vary from moment to moment according to what applications request) and clamping the timeout sent to Windows APIs to that quantum. Therefore, if you asked for a 10 ms timeout, Thread will clamp that probably to 15 ms before sending it on to Windows. I'll always round upwards, so there is a greater potential for a timed wait to take one quantum longer than it should before timing out, but then that can happen anyway at any time.
> None of this will remove spurious wakeups, but it will reduce their frequency considerably. It means that the CPU spends more time sleeping and less being woken up and put back to sleep by a predicate check loop.
> If anyone has a problem with this solution, now is the time to speak.
Niall, could we first fix the inconsistencies you have identified with respect to the standard specification, and then see whether we need to do something else?
Best,
Vicente

Niall Douglas wrote:
> The standard says nothing about what is or is not a spurious wakeup unfortunately.
What does it matter? We all know what is or is not a spurious wakeup - a return from the wait without the condition being notified.
> My reading of http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_timedwait.html says that the timed wait may not return timed out if abstime has not passed.
Correct, if it returns before abstime has passed, it returns 0, not ETIMEDOUT.
> Unfortunately abstime is measured against the system clock, which may arbitrarily move around, but that's the POSIX definition.
abstime is measured against the condition variable's clock, which is the system clock by default, but can be changed. But this is irrelevant.
> For example, would the community be happy if on Windows timed waits always lasted at least the timeout interval requested?
Code that works in the presence of spurious wakeups will retry the wait on its own. So the only people who would be helped would be those whose code doesn't expect spurious wakeups. We've been here before. Such code is simply incorrect. We know that "the community" would prefer for spurious wakeups to not exist, and in fact often pretends that they do not. But that's just wrong.

Niall Douglas wrote:
>> The standard says nothing about what is or is not a spurious wakeup unfortunately.
> What does it matter? We all know what is or is not a spurious wakeup - a return from the wait without the condition being notified.
Actually, I spoke too soon. This is what the standard says:
template <class Rep, class Period>
cv_status wait_for(unique_lock<mutex>& lock, const chrono::duration<Rep, Period>& rel_time);
Returns: cv_status::timeout if the relative timeout (30.2.4) specified by rel_time expired, otherwise cv_status::no_timeout.
participants (4): Andrey Semashev, Niall Douglas, Peter Dimov, Vicente J. Botet Escriba