[thread] Request review of new synchronisation object, boost::permit<>
Dear Boost,
Review and commentary upon a proposed new Boost object,
boost::permit<>, is invited. A quick description of its purpose:
"Permits come in two variants, consuming and non-consuming. There is
always a one-to-one relationship between the granting of a permit and
the receiving of that permit, the only difference with the
non-consuming variant is that the permit is automatically regranted
as soon as it is consumed. This leads to the only API difference to
condition variables: permit
On 4 May 2014 at 19:51, Peter Dimov wrote:
Review and commentary upon a proposed new Boost object, boost::permit<> is invited.
Aren't permits just Windows events?
A non-consuming permit looks quite like a Win32 event object, yes, except it has stronger guarantees useful for many-waiter many-signaller situations. A permit is completely user space, though, and portable.

A consuming permit looks totally different to Win32 event objects. I am not sure what else it resembles, actually; perhaps a reusable future/promise.

Niall
--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
Niall Douglas wrote:
A non-consuming permit looks quite like a Win32 event object yes, except it has stronger guarantees useful for many-waiter many-signaller situations.
What are these stronger guarantees?
A consuming permit looks totally different to Win32 event objects.
You do know that Win32 events can be auto-reset or manual-reset, right? Isn't an auto-reset event the same as a consuming permit?
On 4 May 2014 at 23:25, Peter Dimov wrote:
A non-consuming permit looks quite like a Win32 event object yes, except it has stronger guarantees useful for many-waiter many-signaller situations.
What are these stronger guarantees?
They are listed at the top of the documentation page, but in essence a manual-reset Win32 event object state change doesn't make strong guarantees about what happens to waiters. A permit grant, if non-consuming, will block until all waiting threads at the point of grant have been released. I believe the Win32 SetEvent() doesn't block. The fact that it does block can be useful - I am still unsure whether grant ought to return the number of threads released by the grant or not, hence asking here for feedback.

You also get guarantees about destroying the object during waits or grants, which can be very useful for patching up poorly written code to be race free.
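The blocking-grant guarantee described here can be sketched with standard C++ primitives. This is a hypothetical illustration of the semantics only (a mutex, two condition variables and a waiter count), not the proposed boost::permit<> implementation; all names below are invented for the sketch:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Sketch: a grant that blocks until every thread waiting at the point of
// grant has left wait(). Hypothetical, for illustration of the semantics.
class blocking_grant_event {
    std::mutex m_;
    std::condition_variable cv_, all_released_;
    bool granted_ = false;
    int waiters_ = 0;
public:
    void wait() {
        std::unique_lock<std::mutex> l(m_);
        ++waiters_;
        cv_.wait(l, [this] { return granted_; });
        if (--waiters_ == 0) all_released_.notify_all();
    }
    // Unlike Win32 SetEvent(), this returns only once all waiters present
    // at the moment of the grant have left wait().
    void grant() {
        std::unique_lock<std::mutex> l(m_);
        granted_ = true;
        cv_.notify_all();
        all_released_.wait(l, [this] { return waiters_ == 0; });
    }
};

int demo_blocking_grant() {
    blocking_grant_event ev;
    std::vector<std::thread> ts;
    for (int i = 0; i < 4; ++i) ts.emplace_back([&] { ev.wait(); });
    // Crude rendezvous for the sketch: give the waiters time to block.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    ev.grant();  // returns only after the blocked waiters have left wait()
    for (auto& t : ts) t.join();
    return 4;    // all four waiter threads were released
}
```

A waiter arriving after the grant sees the granted state and passes straight through, so the blocking applies only to the waiters present at the point of grant.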
A consuming permit looks totally different to Win32 event objects.
You do know that Win32 events can be auto-reset or manual-reset, right? Isn't an auto-reset event the same as a consuming permit?
It's time to put on my sheepish hat. You are absolutely correct, and I don't know how I cognitively disconnected from auto-reset event objects being semantically similar to consuming permits. The only explanation I can think of is the number of years it has been since I last used a Win32 event object. I'll get the docs updated soon to reflect this, but yes, you're right: permits are similar to a user-space, portable, shared-memory-capable Windows event object.

I think they're useful in some situations where existing facilities force you to roll something like them, and as we all know, rolling your own threading code is always risky. This permit implementation has taken a surprising amount of effort given how simple they supposedly are, e.g. ensuring they don't starve some waiting threads over others under heavy load, that sort of thing.
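For reference, the auto-reset event semantics being compared to a consuming permit can be sketched portably with standard primitives. This is an illustrative toy, not the proposed boost::permit<> API; all names are invented:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Sketch of an auto-reset event (the Win32 analogue of a consuming permit):
// each grant releases at most one waiter, and the state resets atomically
// as the waiter is released.
class auto_reset_event {
    std::mutex m_;
    std::condition_variable cv_;
    bool signalled_ = false;
public:
    void grant() {                    // like SetEvent on an auto-reset event
        std::lock_guard<std::mutex> l(m_);
        signalled_ = true;            // repeated grants with no waiter coalesce
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> l(m_);
        cv_.wait(l, [this] { return signalled_; });
        signalled_ = false;           // auto-reset: the grant is consumed
    }
};

int demo_auto_reset() {
    auto_reset_event ev;
    int released = 0;
    std::thread t([&] { ev.wait(); ++released; });
    ev.grant();
    t.join();                         // join synchronises the increment
    return released;
}
```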
2014-05-05 9:29 GMT+08:00 Niall Douglas
On 4 May 2014 at 23:25, Peter Dimov wrote:
A non-consuming permit looks quite like a Win32 event object yes, except it has stronger guarantees useful for many-waiter many-signaller situations.
asio::windows::object_handle permit(io_service, CreateEvent(0, 1, 0, 0));
permit.wait();
permit.async_wait( [](const boost::system::error_code&){ /* do something here */ } );
On 5 May 2014 at 11:03, microcai wrote:
A non-consuming permit looks quite like a Win32 event object yes, except it has stronger guarantees useful for many-waiter many-signaller situations.
asio::windows::object_handle permit(io_service, CreateEvent(0, 1, 0, 0));
permit.wait();
permit.async_wait( [](const boost::system::error_code&){ /* do something here */ } );
Equivalent with permits:

boost::permit_nc permit;
int fds[2];
pipe(fds);
pthread_permitnc_associate_fd(permit.native_handle(), fds);
asio::posix::stream_descriptor input(io_service, fds[0]), output(io_service, fds[1]);
input.async_wait([] { /* do something here */ });

I haven't checked this code, but something close should work. The point is you can associate the state of an fd with a non-consuming permit.
Niall Douglas wrote:
On 4 May 2014 at 23:25, Peter Dimov wrote:
A non-consuming permit looks quite like a Win32 event object yes, except it has stronger guarantees useful for many-waiter many-signaller situations.
What are these stronger guarantees?
They are listed at the top of the documentation page,
OK, I took that hint and read it. :-)
but in essence a manual reset Win32 event object state change doesn't make strong guarantees about what happens to waiters. A permit grant, if non-consuming, will block until all waiting threads at the point of grant have been released. I believe the Win32 SetEvent() doesn't block.
This doesn't make sense to me. How do you differentiate between waiting threads being released or not released? SetEvent doesn't block, but after it's called, the waiting threads are no longer blocked and will resume their execution at the scheduler's discretion. What would it mean for SetEvent to block until they are released? What does released mean?

As it stands, permits look very much like events to me - not just semantically similar, but equivalent. The manual-reset event has grant, revoke, wait; grant unblocks all current and future waiters. The auto-reset event has notify_one and wait, and notify_one unblocks one current or future waiter, except that two or more notify_one calls when there are no waiters are equivalent to one. It seems to me that this is exactly how permits work; or if there's a difference, I can't see it.

Incidentally, I think that both your examples of condition variables not working are wrong. The first one is where you say that "The problem here is that the notify can cause the permit object to be destroyed before the notify has exited -- which if tried with condition variables may produce memory corruption." Condition variables, in fact, do support this. If a condition variable implementation causes memory corruption in this scenario, then it's buggy.

The second one is where you say "And the compiler's optimiser is permitted by the standard to believe that result has not changed between its initialisation and the first test of its value, so ..." The compiler's optimizer is not permitted by the standard to hold such beliefs. There is a mutex acquisition between result = 0 and result != 0; this is an acquire operation, and acquire operations may synchronize-with release operations in different threads, and that synchronizes-with edge may cause the load of 'result' to see a different value - unless 'result' can be proven to be local to this thread, which it cannot be because of the [&].
On 5 May 2014 at 15:38, Peter Dimov wrote:
but in essence a manual reset Win32 event object state change doesn't make strong guarantees about what happens to waiters. A permit grant, if non-consuming, will block until all waiting threads at the point of grant have been released. I believe the Win32 SetEvent() doesn't block.
This doesn't make sense to me. How do you differentiate between waiting threads being released or not released? SetEvent doesn't block, but after it's called, the waiting threads are no longer blocked and will resume their execution at the scheduler's discretion. What would it mean for SetEvent to block until they are released? What does released mean?
Released = left the wait function.
As it stands, permits look very much like events to me - not just semantically similar, but equivalent. The manual reset event has grant, revoke, wait; grant unblocks all current and future waiters. The auto reset event has notify_one and wait, and notify_one unblocks one current or future waiter, except that two or more notify_one calls when there are no waiters are equivalent to one. It seems to me that this is exactly how permits work; or if there's a difference, I can't see it.
There are some subtle differences due to permits being user-space, portable and non-kernel, but generally, yes. Does this mean they are a wise or a bad addition to Boost.Thread?
Incidentally, I think that both your examples of condition variables not working are wrong. The first one is where you say that
"The problem here is that the notify can cause the permit object to be destroyed before the notify has exited -- which if tried with condition variables may produce memory corruption."
Condition variables, in fact, do support this. If a condition variable implementation causes memory corruption in this scenario, then it's buggy.
I see no guarantee of such in either the Boost documentation or http://www.cplusplus.com/reference/condition_variable/condition_variable/~condition_variable/. You are safe going the other way round, i.e. destroying a condvar which has wait functions in the process of exiting in other threads. I raised this point because clang's thread sanitiser pukes if you destroy a condvar which is notifying in other threads. It could be a false positive.
The second one is where you say
"And the compiler's optimiser is permitted by the standard to believe that result has not changed between its initialisation and the first test of its value, so ..."
The compiler optimizer is not permitted by the standard to hold such beliefs. There is a mutex acquisition between result = 0 and result != 0; this is an acquire operation, and acquire operations may synchronize-with release operations in different threads, which synchronize-with edge may cause the load of 'result' to see a different value unless 'result' can be proven to be local to this thread, which it cannot be because of the [&].
This is a difficult one, I agree. The way I came at it is this: examine this reduced code first:

  atomic<bool> &lock;
  int result = 0;
  while(!cas_lock(lock));
  if(!result) // can the compiler assume this is always false?

Sure, the CAS lock prevents reordering of reads and writes either side of the lock. But does it tell the compiler that it cannot assume that a *local* variable's value will not have changed across the CAS lock? I decided this is a valid optimisation, so the compiler can assume result will still be zero and can skip checking it.

This now opens the problem of the capturing lambda - does constructing a capturing lambda tell the compiler it cannot assume that any of the reference-captured values will not be modified? So, code like this:

  atomic<bool> &lock;
  int result = 0;
  [&]{};
  while(!cas_lock(lock));
  if(!result) // can the compiler assume this is always false?

If merely constructing a capturing lambda means local variables could be modified, that seems to me equal to "mark all local variables as volatile", which seems daft. Just to be sure, as I wasn't certain, I asked about this on Stack Overflow at https://stackoverflow.com/questions/22809921/do-global-reference-capturing-lambdas-in-c-inhibit-alias-optimisations and the answer appeared to be that indeed that consequence is daft. Hence the example in the docs. If I am missing something, I am all ears; this problem bothers me.

Do remember I did say in the docs that I am not claiming that any compiler actually does this optimisation; I'm merely saying that some future compiler potentially could. (And yes, the thread creation almost certainly involves a syscall, and a syscall always has the compiler assume all memory could have changed, so yes, it would always reload the boolean. However, while this is safe with boost::thread, it may not be with a boost::fiber, for example.)
Anyway, I still stick to my guns that predicate values checked in wait conditions really ought to be stored atomically, to prevent reordering and elision; atomic storage guarantees that.
Niall Douglas wrote:
I see no guarantee of such in either the Boost documentation or http://www.cplusplus.com/reference/condition_variable/condition_variable/~co.... You are safe going the other way round i.e. destroying a condvar which has wait functions in the process of exiting in other threads.
POSIX says "It shall be safe to destroy an initialized condition variable upon which no threads are currently blocked." No threads are currently blocked upon the condition variable in your example. C++11 says the same: "~condition_variable(); Requires: There shall be no thread blocked on *this." It is true that the POSIX standard supplies an example in which the thread calling pthread_cond_broadcast destroys the condvar, and it's true that C++11 addresses this same case in the note, but I think that the normative text clearly applies to your case too.
The way I came at it is this: examine this reduced code first:
atomic<bool> &lock;
int result = 0;
while(!cas_lock(lock));
if(!result) // can the compiler assume this is always false?
Sure, the CAS lock prevents reordering of reads and writes either side of the lock. But does it tell the compiler that it cannot assume that a *local* variable's value will not have changed across the CAS lock?
The compiler is within its rights to assume here that 'result' doesn't change, because it can prove that its address is not available to another thread. This is not the case in your example. The address of 'result' is - quite obviously - available to another thread, so a compiler that could prove that it isn't would be wrong.
Anyway, I still stick to my guns on that predicate checked values in wait conditions really ought to be stored atomic to prevent reordering and elision.
The classic mutex/cv pattern does provide the necessary guarantees. Using atomics makes it easier for the programmer to write incorrect code.

  atomic<bool> pred;

Thread 1:

  lock_guard<> lock( m );
  while( !pred )
  {
      cv.wait( lock );
  }

Thread 2:

  pred = true; // pred is atomic, needn't bother with m
  cv.notify_one();

The above is wrong. The corrected version is either:

Thread 2:

  {
      lock_guard<> lock( m );
      pred = true;
  }
  cv.notify_one();

or (!)

Thread 2:

  pred = true;
  { lock_guard<> lock( m ); }
  cv.notify_one();
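A runnable version of the corrected pattern, for the record (the names are invented for the demo; the predicate here is a plain bool, always accessed under the mutex, as the classic pattern prescribes):

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex demo_m;
std::condition_variable demo_cv;
bool demo_pred = false;  // plain bool: only ever touched under demo_m

// The producer modifies the predicate under the same mutex the waiter holds,
// so the waiter cannot check the predicate, see it false, and then miss the
// notification. The notify itself may happen outside the mutex.
int demo_cv_pattern() {
    int observed = 0;
    std::thread consumer([&] {
        std::unique_lock<std::mutex> lock(demo_m);
        demo_cv.wait(lock, [] { return demo_pred; });  // re-checks on every wake
        observed = 1;
    });
    {
        std::lock_guard<std::mutex> lock(demo_m);
        demo_pred = true;            // modified under the mutex
    }
    demo_cv.notify_one();            // outside the mutex: still correct
    consumer.join();
    return observed;
}
```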
On 5 May 2014 at 20:41, Peter Dimov wrote:
POSIX says
"It shall be safe to destroy an initialized condition variable upon which no threads are currently blocked."
No threads are currently blocked upon the condition variable in your example. C++11 says the same:
"~condition_variable(); Requires: There shall be no thread blocked on *this."
It is true that the POSIX standard supplies an example in which the thread calling pthread_cond_broadcast destroys the condvar, and it's true that C++11 addresses this same case in the note, but I think that the normative text clearly applies to your case too.
Ok, I'll repair the docs appropriately. Thanks for the info.
The way I came at it is this: examine this reduced code first:
atomic<bool> &lock;
int result = 0;
while(!cas_lock(lock));
if(!result) // can the compiler assume this is always false?
Sure, the CAS lock prevents reordering of reads and writes either side of the lock. But does it tell the compiler that it cannot assume that a *local* variable's value will not have changed across the CAS lock?
The compiler is within its rights to assume here that 'result' doesn't change, because it can prove that its address is not available to another thread. This is not the case in your example. The address of 'result' is - quite obviously - available to another thread, so a compiler that could prove that it isn't would be wrong.
But you've merely constructed the lambda, not executed it. Surely the compiler is right to think that, without it being executed, result's value will be constant?
Anyway, I still stick to my guns on that predicate checked values in wait conditions really ought to be stored atomic to prevent reordering and elision.
The classic mutex/cv pattern does provide the necessary guarantees. Using atomics makes it easier for the programmer to write incorrect code.
I never claimed you shouldn't be using mutexes AND atomics. Condition variables should always be used with mutexes, with no exception.
Niall Douglas wrote:
But you've merely constructed the lambda, not executed it. The compiler is surely right to think that without it being executed that result's value will be constant?
In the example in the docs, you are passing the lambda to another thread, which means that you're passing a reference to 'result' to another thread. So it's no longer local.
I never claimed you shouldn't be using mutexes AND atomics.
Yes, sure. But with an atomic variable as a predicate, people (even experienced programmers) do make this sort of mistake every time. And if the variable isn't atomic, they don't, because they always protect it with the mutex. Using atomic variables is thus more error prone, not less.
On 5 May 2014 at 22:31, Peter Dimov wrote:
But you've merely constructed the lambda, not executed it. Surely the compiler is right to think that, without it being executed, result's value will be constant?
In the example in the docs, you are passing the lambda to another thread, which means that you're passing a reference to 'result' to another thread. So it's no longer local.
I still don't see how the compiler knows anything about that. The compiler can see a syscall, that is all. If thread() doesn't do a syscall, all it can see is a store of a capturing lambda type to some memory location. I already explained why I don't see the compiler auto-volatiling every captured local variable as a result.
Consuming permits let you grant either with or without the mutex.
Non-consuming permits will deadlock if you grant holding the mutex because of the guarantee of blocking until the release of all current waiters.
OK. Here you go then.
* Initially P is 0 and p_mutex is unlocked.
* Tc1, Tc2 lock p_mutex and call wait.
* Tp1 locks p_mutex, sets P=1, unlocks p_mutex and gets suspended.
* Tp2 locks p_mutex, sets P=1, unlocks p_mutex and calls notify_one.
* Tc1 is unblocked with p_mutex locked, returns from wait, gets suspended.
* Tp1 is resumed and calls notify_one.
* Tc2 is unblocked and blocks on p_mutex, which is held by Tc1.
* Tc1 is resumed, sees P=1, sets P=0, unlocks p_mutex.
* Tc2 is unblocked, sees P=0.
Your original point was that the docs should not claim there is no need for predicate testing during waits. I asserted this because, unlike a condvar, permits carry state, i.e. their own predicate, so predicate testing - where that predicate indicates that a thread can continue - is unnecessary. In your example above, you add an unnecessary predicate which duplicates the one in the permit, except that it gets out of sync with the permit's state because you don't take measures to ensure they stay in sync. No wonder it deadlocks. I therefore don't get what point you're making. What should I change in the docs to alleviate your concern?
Niall Douglas wrote:
I still don't see how the compiler knows anything about that. The compiler can see a syscall, that is all.
It's the reverse. The compiler has to know that the syscall doesn't change 'result' in order to prove that 'result' is still zero. And the syscall in this case does change 'result' - by executing a function that calls a lambda that assigns to 'result'. So a compiler that "knew" that the syscall doesn't change 'result' would be broken.

This is really not much different on a conceptual level from:

  int main()
  {
      extern void f( int* );

      int result = 0;
      f( &result );

      std::cout << result << std::endl;
  }

The compiler cannot know that f doesn't change 'result'.
Your original point was that the docs should not claim no need for predicate testing during waits. I asserted this because unlike a condvar, permits carry state i.e. their own predicate, so predicate testing - where that predicate is for indicating a thread can continue - is unnecessary.
You assert that: "You will never see spurious wakeups from a permit object -- [...]. This means you don't need to do any wait predicate checking with permits..." That is, you specifically claim that it is the absence of spurious wakeups that frees one from predicate testing after wait returns, not the fact that the permit has state of its own. And the example I provided demonstrates a situation in which a thread sees something that is indistinguishable from a spurious wakeup, from that thread's point of view. The wait call returns, but the state of the program is as if nobody has called notify_one. The predicate P in my example represents program state. Your threads are calling wait because they are waiting for something to occur. P=1 represents a state in which this something has occurred. P=0 represents a state in which it has not. I can rephrase the example with a queue Q, so that P is Q.empty(), producers push items onto Q, and consumers pop all the items and do something with them. In this formulation, a spurious wakeup would mean that a consumer returns from wait() and finds the queue empty, which is exactly what happens in my example.
"You will never see spurious wakeups from a permit object -- [...]. This means you don't need to do any wait predicate checking with permits..."

I don't think that this is true. Suppose we have two producer threads Tp1 and Tp2 and two consumer threads Tc1 and Tc2, and a predicate P. I'll treat the predicate as a simple boolean variable below, for simplicity.

A producer thread sets P=1 and calls notify_one. A consumer thread calls wait and then sets P=0. (It has to set P=0 regardless of how many producers have set the predicate to 1, because - I assume - the permit object doesn't count the pending notify_one calls, unlike a semaphore and like an event.) So you have:

* Initially P is 0.
* Tc1, Tc2 call wait.
* Tp1 sets P=1 and calls notify_one.
* Tc1 is unblocked, returns from wait, but is immediately suspended by the scheduler for unfathomable scheduling reasons, like its code causing a page fault.
* Tp2 sets P=1 and calls notify_one.
* Tc2 is unblocked, returns from wait, sets P=0.
* Tc1 is resumed, sees P=0, panics.

I suspect that your guarantees, that grant/notify_one calls are blocking and mutually exclusive, were designed to prevent this, but they don't and can't, in theory. They _can_ make the above extremely improbable, but they aren't theoretically sound.

The specification of condition variables allows spurious wakeups not because they can't be prevented by the implementation; it permits spurious wakeups to make client code assume spurious wakeups, because code that is written to assume spurious wakeups is more likely to be correct. Or conversely, code that does not assume spurious wakeups is likely to be incorrect even if no spurious wakeups occur.
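The queue formulation of the argument, written defensively with a predicate loop, might look like this (an illustrative sketch with invented names): a consumer that wakes to find the queue empty, whether from a "stolen" wakeup as in the interleaving above or a genuinely spurious one, simply re-checks the predicate and waits again, so the corner case is harmless.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::mutex q_mutex;
std::condition_variable q_cv;
std::queue<int> work_q;
bool q_done = false;

// Consumer: pops all available items; a wakeup with an empty queue just
// falls back into the wait, thanks to the predicate loop.
int consume_all() {
    int consumed = 0;
    std::unique_lock<std::mutex> lock(q_mutex);
    for (;;) {
        q_cv.wait(lock, [] { return !work_q.empty() || q_done; });
        while (!work_q.empty()) { work_q.pop(); ++consumed; }
        if (q_done) return consumed;
    }
}

int demo_queue() {
    int c1 = 0, c2 = 0;
    std::thread tc1([&] { c1 = consume_all(); });
    std::thread tc2([&] { c2 = consume_all(); });
    auto produce = [](int n) {
        for (int i = 0; i < n; ++i) {
            { std::lock_guard<std::mutex> l(q_mutex); work_q.push(i); }
            q_cv.notify_one();
        }
    };
    std::thread tp1([&] { produce(50); });
    std::thread tp2([&] { produce(50); });
    tp1.join(); tp2.join();
    { std::lock_guard<std::mutex> l(q_mutex); q_done = true; }
    q_cv.notify_all();
    tc1.join(); tc2.join();
    return c1 + c2;  // every pushed item is popped exactly once
}
```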
On 5 May 2014 at 17:26, Peter Dimov wrote:
"You will never see spurious wakeups from a permit object -- [...]. This means you don't need to do any wait predicate checking with permits..."
I don't think that this is true. Suppose we have two producer threads Tp1 and Tp2 and two consumer threads Tc1 and Tc2, and a predicate P. I'll treat the predicate as a simple boolean variable below, for simplicity.
A producer thread sets P=1 and calls notify_one. A consumer thread calls wait and then sets P=0. (It has to set P=0 regardless of how many producers have set the predicate to 1, because - I assume - the permit object doesn't count the pending notify_one calls, unlike a semaphore and like an event.)
So you have
* Initially P is 0.
* Tc1, Tc2 call wait.
* Tp1 sets P=1 and calls notify_one.
* Tc1 is unblocked, returns from wait, but is immediately suspended by the scheduler for unfathomable scheduling reasons, like its code causing a page fault.
* Tp2 sets P=1 and calls notify_one.
* Tc2 is unblocked, returns from wait, sets P=0.
* Tc1 is resumed, sees P=0, panics.
I suspect that your guarantees, that grant/notify_one calls are blocking and mutually exclusive, were designed to prevent this, but they don't and can't, in theory. They _can_ make the above extremely improbable, but they aren't theoretically sound.
That isn't the reason actually, but I'll get back to that.

Firstly, permits are a notification object, not a serialisation object. If you have some predicate P whose state you are changing, you must protect it with a mutex, just like any other code. This is why permit.wait() takes a locked mutex. Reworking your example thusly:

* Initially P is 0 and p_mutex is unlocked.
* Tc1, Tc2 lock p_mutex and call wait.
* Tp1 locks p_mutex, sets P=1 and calls notify_one. Upon Tp1 releasing the mutex, Tc1 is unblocked with the mutex locked, returns from wait, but is immediately suspended by the scheduler for unfathomable scheduling reasons, like its code causing a page fault.
* Tp2 tries to set P=1, but gets blocked on the mutex being held by Tc1. When Tc1 is resumed and eventually unlocks the mutex, everything else proceeds as normal.

Maybe you meant the fact that the permit contains its own predicate, and hence there is no need for predicate checking? If so, then P equals the state of the permit, and:

* Initially P is 0.
* Tc1, Tc2 call wait.
* Tp1 sets P=1.
* Tc1 is unblocked, thus atomically resetting P=0, returns from wait, but is immediately suspended by the scheduler for unfathomable scheduling reasons, like its code causing a page fault.
* Tp2 sets P=1.
* Tc2 is unblocked, thus atomically resetting P=0, returns from wait.
* Tc1 is resumed, all is well. Exactly as many waiters were freed as there were granters.

Perhaps, though, you were in fact more concerned about two threads granting a permit but only one thread getting woken? The problem here is that in some cases this is exactly what you want; in other cases it is a lost wakeup and you should use a semaphore instead. Similarly, revoking non-consuming permits is always racy unless you have added synchronisation to ensure you're not being stupid. All threading primitives have their gotchas, no doubt. The question here is whether the presented implementation of this permit is a wise addition to Boost.
I suspect that your guarantees, that grant/notify_one calls are blocking and mutually exclusive, were designed to prevent this, but they don't and can't, in theory. They _can_ make the above extremely improbable, but they aren't theoretically sound.
Grants being blocking for non-consuming permits is actually a time complexity guarantee, so you are being guaranteed progress no matter what. The Windows kernel can "cheat" here in ways we cannot in portable code.
The specification of condition variables allows spurious wakeups not because they can't be prevented by the implementation; it permits spurious wakeups to make client code assume spurious wakeups, because code that is written to assume spurious wakeups is more likely to be correct. Or conversely, code that does not assume spurious wakeups is likely to be incorrect even if no spurious wakeups occur.
I would have said spurious wakeups come exclusively from kernel bugs and the fact that POSIX allows signals to escape syscalls, which is definitely a pre-threading-era legacy design choice. On Windows, spurious wakeups are trapped for you in user space so you never see them and therefore never have to deal with them, unless of course you elect to do so (e.g. any of the wait APIs able to return WAIT_IO_COMPLETION). Spurious wakeups absolutely can be prevented by the implementation; it's just that for backwards compatibility POSIX cannot do so.
Niall Douglas wrote:
Firstly, permits are a notification object, not a serialisation object. If you have some predicate P whose state you are changing you must protect it with a mutex, just like any other code. This is why permit.wait() takes a locked mutex.
But you do have a wait() function that takes nothing?
Reworking your example thusly:
Initially P is 0 and p_mutex is unlocked.
Tc1, Tc2 lock p_mutex and call wait.
Tp1 locks p_mutex, sets P=1 and calls notify_one. Upon Tp1 releasing the mutex, Tc1 is unblocked with mutex locked, returns from wait, but is immediately suspended by the scheduler for unfathomable scheduling reasons, like its code causing a page fault.
Tp2 tries to set P=1, but gets blocked on the mutex being held by Tc1. When Tc1 is resumed and eventually unlocks the mutex, everything else proceeds as normal.
This assumes that notify_one is called before releasing the mutex. Condition variables allow notify_one outside the mutex. Is this use not supported by permits?
On 5 May 2014 at 20:56, Peter Dimov wrote:
Firstly, permits are a notification object, not a serialisation object. If you have some predicate P whose state you are changing you must protect it with a mutex, just like any other code. This is why permit.wait() takes a locked mutex.
But you do have a wait() function that takes nothing?
That wait never sleeps, so wakeups cannot be lost. There is actually an internal facility for sleeping waits without a mutex, but it is deliberately not documented as you shouldn't use it (it's only there because it made the Windows implementation much more efficient).
Reworking your example thusly:
Initially P is 0 and p_mutex is unlocked.
Tc1, Tc2 lock p_mutex and call wait.
Tp1 locks p_mutex, sets P=1 and calls notify_one. Upon Tp1 releasing the mutex, Tc1 is unblocked with mutex locked, returns from wait, but is immediately suspended by the scheduler for unfathomable scheduling reasons, like its code causing a page fault.
Tp2 tries to set P=1, but gets blocked on the mutex being held by Tc1. When Tc1 is resumed and eventually unlocks the mutex, everything else proceeds as normal.
This assumes that notify_one is called before releasing the mutex. Condition variables allow notify_one outside the mutex. Is this use not supported by permits?
Consuming permits let you grant either with or without the mutex. Non-consuming permits will deadlock if you grant while holding the mutex, because of the guarantee of blocking until the release of all current waiters.

If you grant outside of the mutex, you accept that if your preferred waiter doesn't wake and claim its mutex before another wake occurs, you intend the grant to be lost. As I mentioned, sometimes you want that, sometimes you don't. I assume you'd like this sort of stuff documented?
Niall Douglas wrote:
Consuming permits let you grant either with or without the mutex. Non-consuming permits will deadlock if you grant holding the mutex because of the guarantee of blocking until the release of all current waiters.
OK. Here you go then.

* Initially P is 0 and p_mutex is unlocked.
* Tc1, Tc2 lock p_mutex and call wait.
* Tp1 locks p_mutex, sets P=1, unlocks p_mutex and gets suspended.
* Tp2 locks p_mutex, sets P=1, unlocks p_mutex and calls notify_one.
* Tc1 is unblocked with p_mutex locked, returns from wait, gets suspended.
* Tp1 is resumed and calls notify_one.
* Tc2 is unblocked and blocks on p_mutex, which is held by Tc1.
* Tc1 is resumed, sees P=1, sets P=0, unlocks p_mutex.
* Tc2 is unblocked, sees P=0.
Niall Douglas wrote:
Spurious wakeups absolutely can be prevented by the implementation,
I know.
just for backwards compatibility POSIX cannot do so.
No, the POSIX committee knew that they can be prevented, too. What I am saying is that they could have disallowed them when they wrote the specification, but did not, on purpose. Code that has problems with spurious wakeups also has problems with other corner cases (such as the one from my other post); and writing it to handle spurious wakeups takes care of these other corner cases as well. You are not the first to assume that spurious wakeups represent a deficiency in the POSIX specification and that it would have been better without them, but this is not true, at least in principle. (Pragmatically speaking, without spurious wakeups, incorrect code fails less often, sometimes "never" for practical purposes. But... well.)
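The coding style Peter describes, written to assume spurious wakeups, is the familiar predicate loop; a minimal sketch:

```cpp
#include <condition_variable>
#include <mutex>

// Code written to assume spurious wakeups re-checks its predicate in a
// loop, so an unexpected return from wait() is harmless. The same loop
// also handles the "stolen wakeup" interleaving from the earlier post.
std::mutex m;
std::condition_variable cv;
bool P = false;

void consumer()
{
    std::unique_lock<std::mutex> lk(m);
    while (!P)        // a spurious wakeup just loops back into wait()
        cv.wait(lk);
    P = false;        // consume the permit-like state
}

void producer()
{
    {
        std::lock_guard<std::mutex> g(m);
        P = true;
    }
    cv.notify_one();
}
```

Code in this shape is correct whether or not the implementation ever actually delivers a spurious wakeup, which is exactly Peter's point.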
Peter Dimov wrote:
You are not the first to assume that spurious wakeups represent a deficiency in the POSIX specification and that it would have been better without them, but this is not true, at least in principle. (Pragmatically speaking, without spurious wakeups, incorrect code fails less often, sometimes "never" for practical purposes. But... well.)
My experience of this was with signals, e.g. sigtimer, on Linux; a library that I was trying to use worked fine until I had a timer firing 50 times each second, which caused spurious wakeups in the futex syscall. This took a ridiculously long time to debug. It would actually be worthwhile to have a debug condvar implementation that would deliberately wakeup "spuriously" with some probability. That could even be a useful addition to Boost, if someone wanted to write it. Regards, Phil.
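Phil's suggested debug condition variable could be sketched along these lines (purely hypothetical; the class name, members, and fault probability are all invented for illustration):

```cpp
#include <condition_variable>
#include <mutex>
#include <random>
#include <thread>

// Hypothetical sketch of a condition variable wrapper that deliberately
// "wakes up" without any notification, with some probability, to flush
// out client code that mishandles spurious wakeups.
class spurious_cv
{
    std::condition_variable cv_;
    std::mt19937 rng_{std::random_device{}()};
    std::bernoulli_distribution spurious_{0.25}; // 25% fault injection

public:
    void wait(std::unique_lock<std::mutex>& lk)
    {
        if (spurious_(rng_)) // rng is protected: lk is held on entry
        {
            // Simulate a spurious wakeup: briefly drop and retake the
            // lock, then return without having been notified at all.
            lk.unlock();
            std::this_thread::yield();
            lk.lock();
            return;
        }
        cv_.wait(lk);
    }

    void notify_one() { cv_.notify_one(); }
    void notify_all() { cv_.notify_all(); }
};
```

Correctly written predicate-loop clients are unaffected by the injected wakeups; clients that assume one wakeup equals one notification fail quickly and reproducibly.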
Phil Endecott wrote:
My experience of this was with signals, e.g. sigtimer, on Linux; a library that I was trying to use worked fine until I had a timer firing 50 times each second, which caused spurious wakeups in the futex syscall. This took a ridiculously long time to debug.
Frequent signals are one way to cause behaviors such as the one I described. Under normal circumstances, a thread that has just been woken up is very, very unlikely to be immediately suspended. But signals borrow existing threads, and if a signal happens to borrow that thread and then spends the entire time slice in the signal handler, it would be as if the thread had been suspended just after it was woken up, which in my example manifests as something indistinguishable from a spurious wakeup. This is just speculation on my part though; I don't know if it applies to your case.
El 05/05/2014 22:13, Peter Dimov wrote:
Niall Douglas wrote:
Spurious wakeups absolutely can be prevented by the implementation,
I know.
just for backwards compatibility POSIX cannot do so.
No, the POSIX committee knew that they can be prevented, too.
What I am saying is that they could have disallowed them when they wrote the specification, but did not, on purpose. Code that has problems with spurious wakeups also has problems with other corner cases (such as the one from my other post); and writing it to handle spurious wakeups takes care of these other corner cases as well.
I think spurious wakeups also allow more efficient implementations on some systems, or in implementations written in user space. When emulating condition variables you can't guarantee (or it's extremely expensive to guarantee) that you wake up the waiting thread and that that thread can lock the mutex before anyone else can lock it and modify the predicate. Similar reasons are given in http://en.wikipedia.org/wiki/Spurious_wakeup and http://stackoverflow.com/questions/8594591/why-does-pthread-cond-wait-have-s...

Best,

Ion
On Monday 05 May 2014 18:37:26 Niall Douglas wrote:
On 5 May 2014 at 17:26, Peter Dimov wrote:
The specification of condition variables allows spurious wakeups not because they can't be prevented by the implementation; it permits spurious wakeups to make client code assume spurious wakeups, because code that is written to assume spurious wakeups is more likely to be correct. Or conversely, code that does not assume spurious wakeups is likely to be incorrect even if no spurious wakeups occur.
I would have said spurious wakeups come exclusively from kernel bugs and the fact POSIX allows signals to escape syscalls, which is definitely a pre-threading era legacy design choice. On Windows spurious wakeups are trapped for you in user space so you never see them and therefore never have to deal with them, unless of course you elect to do so (e.g. any of the wait APIs able to return WAIT_IO_COMPLETION).
MSDN explicitly states that condition variables are subject to spurious wakeups: http://msdn.microsoft.com/en-us/library/windows/desktop/ms682052(v=vs.85).as... With other APIs (including WaitFor* family) you do have to account for APC interruptions (as you said, by checking the result for WAIT_IO_COMPLETION value). So you are not relieved from spurious wakeups on Windows.
Spurious wakeups absolutely can be prevented by the implementation, just for backwards compatibility POSIX cannot do so.
AFAIR, the rationale for spurious wakeups is that preventing them would be too expensive for all uses of condition variables. Given that in most cases you need to check a condition upon wakeup anyway, that added cost is a waste. That said, spurious wakeups can be concealed in some implementations. I think on Linux, libc is able to process EINTR returned from futex wait operations and conceal the wakeup without returning from the pthread wait function.
On 05/03/2014 11:13 PM, Niall Douglas wrote:
you will have to use grant() and revoke() instead. When deciding to substitute permits for condition variables, you will need to decide carefully which kind of permit is the correct substitute."
Is it correct to say that permits are level-triggered, whereas condition variables are edge-triggered?
"Permit objects are actually very similar to a void promise/future i.e. they act as a reusable promise/future pair but without transporting any value or exception state, just the permission to
So a permit is a future without value :)
ermit_reference.pdf). As such, they can be very useful for situations where a promise/future is too heavy (permits can spin instead of sleep, plus they need not be repeatedly constructed and destructed)."
It may be a good idea to have example code of a spinning future (possibly by using expected<>). Another useful example could be a one-to-one latch (as in N3666).
I suggest that permit<> is renamed to basic_permit<>, and permit_c to permit. permit_nc should have a more meaningful name, such as singleshot_permit or permit_once.
On 5 May 2014 at 20:30, Bjorn Reese wrote:
you will have to use grant() and revoke() instead. When deciding to substitute permits for condition variables, you will need to decide carefully which kind of permit is the correct substitute."
Is it correct to say that permits are level-triggered, whereas condition variables are edge-triggered?
Correct. Though, technically speaking, a condvar + mutex combo is also level triggered if and only if the mutex is always held by all waiters and all notifiers while they run.
"Permit objects are actually very similar to a void promise/future i.e. they act as a reusable promise/future pair but without transporting any value or exception state, just the permission to
So a permit is a future without value :)
Also lighter weight, shared memory capable, no malloc etc.
ermit_reference.pdf). As such, they can be very useful for situations where a promise/future is too heavy (permits can spin instead of sleep, plus they need not be repeatedly constructed and destructed)."
It may be a good idea to have example code of a spinning future (possibly by using expected<>)
Remind me: is proposed expected<T> still essentially the value and exception transport mechanism of promise<T>/future<T>? It may have evolved since, or I got its intent wrong.
Another useful example could be a one-to-one latch (as in N3666.)
I had thought Alasdair's latch a sort of inverted semaphore? Can you clarify what you mean by one to one latch?
I suggest that permit<> is renamed to basic_permit<>
Makes sense.
, and permit_c to permit. permit_nc should have a more meaningful name, such as singleshot_permit or permit_once.
How does boost::gate sound for a non-consuming permit? So:

    typedef basic_permit<true> permit;
    typedef basic_permit<false> gate;

If gate is no good, we could use boost::event, though I personally think that is a bit too likely to clash.

Niall
--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
On 05/05/2014 09:21 PM, Niall Douglas wrote:
Remind me: is proposed expected<T> still essentially the value and exception transport mechanism of promise<T>/future<T>? It may have evolved since, or I got its intent wrong.
Last I looked, yes.
Another useful example could be a one-to-one latch (as in N3666.)
I had thought Alasdair's latch a sort of inverted semaphore? Can you clarify what you mean by one to one latch?
A permit triggers on the first "count-down". The "latch" should trigger after N count-downs. This may be useful to wake up a thread after, say, every N asynchronous operations. The idea of this example is simply to illustrate how a permit can be combined with an atomic counter.
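Such a latch could be sketched roughly as follows, with a plain condition variable standing in for the proposed permit (`latch_n` and its members are invented names, not part of any proposal):

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

// Rough sketch of the one-to-one latch idea: the waiter is released
// only after N calls to count_down(). With a permit, the condition
// variable machinery below would collapse to a single grant().
class latch_n
{
    std::atomic<int> count_;
    std::mutex m_;
    std::condition_variable cv_;
    bool released_ = false;

public:
    explicit latch_n(int n) : count_(n) {}

    void count_down()
    {
        // Only the Nth decrement releases the waiter.
        if (count_.fetch_sub(1, std::memory_order_acq_rel) == 1)
        {
            std::lock_guard<std::mutex> g(m_);
            released_ = true;
            cv_.notify_one(); // grant() in permit terms
        }
    }

    void wait()
    {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this]{ return released_; });
    }
};
```

The atomic counter keeps the N-1 non-final count-downs lock-free; only the final one touches the waitable object.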
If gate is no good, we could use boost::event, though I personally think that is a bit too likely to clash.
Gate and event are probably too general-purpose to be placed directly in the boost namespace.
2014-05-04 1:13 GMT+04:00 Niall Douglas
Dear Boost,
Review and commentary upon a proposed new Boost object, boost::permit<> is invited.
I've just started to read the documentation, and here is what looks wrong:
...
And the compiler's optimiser is permitted by the standard to believe that
result has not changed between its initialisation and the first test of its
value, so in fact the optimiser could compile this instead by eliding the
first test of result under the assumption that the predicate will always be
false for its first execution:
do {
    wait(guard);
} while (!pred());
...
This cannot happen, because the thread constructor is a `release` operation:

// main thread
int result = 0;
boost::thread(f); // release [2]
boost::unique_lock
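Antony's example is cut off above; the pattern at issue is presumably along these lines (a generic reconstruction using std::thread, not his original code — `f`, `run_example`, and the value 42 are invented for illustration):

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Because the thread constructor is a release operation that
// synchronises-with the start of f, the optimiser may NOT assume
// `result` is unchanged across it, so the first predicate test
// cannot legitimately be elided.
std::mutex m;
std::condition_variable cv;
int result = 0;

void f()
{
    std::lock_guard<std::mutex> g(m);
    result = 42;
    cv.notify_one();
}

int run_example() // the "main thread" side of the pattern
{
    std::thread t(f);                  // release operation
    std::unique_lock<std::mutex> guard(m);
    while (result == 0)                // first test must be kept
        cv.wait(guard);
    guard.unlock();
    t.join();
    return result;
}
```

If f may already have run by the time the predicate is first tested, eliding that first test would be a miscompilation, which seems to be the substance of the objection.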