[interprocess] Mutex and condition at process termination
Dear Experts,

Can we improve how interprocess mutexes and condition variables behave on process termination?

Currently, if a process terminates (e.g. it crashes, or you press ctrl-C), the Interprocess docs say nothing, as far as I can see, about what happens to locked mutexes and awaited conditions. In practice it seems that mutexes that were locked remain locked, and other processes will deadlock. (I'm using Linux.) A few thoughts:

* If a process were only reading the shared state, then it would be appropriate for the mutex to be unlocked on termination.

* If a process were modifying the shared state, then it would be wrong to unconditionally unlock the mutex. So it would be useful to distinguish between reader and writer locks, even if we're not implementing a single-writer/multiple-reader mutex.

* The system could be made more robust by blocking signals while a mutex is locked. This doesn't help with crashes, e.g. segfaults, but it would help with ctrl-C.

* It may be useful to cause all processes to terminate if one of them terminates with a mutex held for writing, either immediately or as soon as they try to lock the same mutex. Perhaps also to delete the presumed-corrupted shared memory segment.

* PTHREAD_MUTEX_ROBUST might be part of the solution. That seems to require a non-crashed process to do the clean-up, i.e. we would need to record whether the crashed process was reading or writing and react appropriately.

I'm less clear about what happens to condition variables, but it does seem that terminating a process while it is waiting on a condition may cause other processes to deadlock. Perhaps the wait conceptually returns and the mutex is re-locked during termination.

I have encountered this while trying to use a simple diagnostic program that just dumps some shared memory data structures and waits on a condition in a loop. I run this for a while and then press ctrl-C. Yes, a while after I disconnect the diagnostic program the system crashes... the worst sort of bug!

Regards, Phil.
On Thu, 14 May 2020 at 07:42, Phil Endecott via Boost <boost@lists.boost.org> wrote:
* The system could be made more robust by blocking signals while a mutex is locked. This doesn't help with crashes, e.g. segfaults, but it would help with ctrl-C.
I don't think this is a good idea. Signals are a property of the application, and libraries should rarely touch signals silently. Blocking signals (instead of handling them) is better, but it shouldn't happen across the return of some function without a good excuse.

I'm less clear about what happens to condition variables, but it does seem that perhaps terminating a process while it is waiting on a condition will cause other processes to deadlock. Perhaps the wait conceptually returns and the mutex is re-locked during termination.
These are all tricky questions. POSIX gets away with deferring the decision to the user, and it does work. The challenge here is adapting the problem to the vocabulary of the C++ object model.

If we do proceed with your idea about read/write locks, the PTHREAD_MUTEX_ROBUST solution could be mapped into the lock object:

unique_lock -> we can't assume anything, so we use the broadest assumption: a write lock and no recovery mechanism in the using code

unique_read_lock -> read lock

unique_robust_lock -> stores extra data about the lock state that the user can query after the lock is acquired

This should be enough to map PTHREAD_MUTEX_ROBUST behaviour and make it work with condition variables:

    unique_robust_lock lk{mtx};
    for (;;) {
        if (!lk.consistent()) {
            // mutex acquired
            // but in inconsistent state
            // user has to deal with it
        }
        if (predicate)
            break;
        cond.wait(lk);
    }

What do you think?

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
Phil Endecott wrote:
Can we improve how interprocess mutexes and condition variables behave on process termination?
Having given this some more thought:
I think it would be useful if Boost.Interprocess added
a robust mutex, as a straightforward wrapper around the
POSIX robust mutex and equivalents on other platforms if
they exist. I note that there is a patch that does this
on the Interprocess issue tracker but it unconditionally
cleans up the mutex when it finds that the other process
died, which is wrong. I believe that the lock() method
should fail in that case, and it should provide a
make_consistent method that the user can invoke if
appropriate before retrying. Then read and write locks,
with appropriate clean-up behaviour, can be implemented
on top of that.
Vinicius dos Santos Oliveira
After some more thought, here is another idea: PTHREAD_MUTEX_ROBUST is no longer a property of the mutex, but a property of the lock.
I don't see how that can be implemented on top of the
POSIX API, where robustness is a property of the mutex.
Andrey Semashev
* PTHREAD_MUTEX_ROBUST might be part of the solution. That seems to require the non-crashed process to do clean up, i.e. we would need to record whether the crashed process were reading or writing and react appropriately.
You can't do that reliably because the crashed process could have crashed between locking the mutex and indicating its intentions.
I don't follow. Say I have a bool in the mutex called being_written. It's initially false, the read lock doesn't touch it, and the write lock does:

    lock() {
        m.lock();
        being_written = true;
        memory_barrier();
    }

    unlock() {
        memory_barrier();
        being_written = false;
        m.unlock();
    }

If the process crashes between locking and setting being_written, then the process doing the cleanup will see being_written = false, and that's OK because the crasher hadn't actually written anything.

Regarding blocking signals, I agree this is not really something that should be part of the interprocess synchronisation primitives, but I do think that a modern wrapper around the ancient C signals API would be good to have.
I'm less clear about what happens to condition variables, but it does seem that perhaps terminating a process while it is waiting on a condition will cause other processes to deadlock. Perhaps the wait conceptually returns and the mutex is re-locked during termination.
AFAIR, pthread_cond_t uses a non-robust mutex internally, which means that condition variables are basically useless when you need robust semantics.
Yes.
If you need a condition variable-like behavior, in a robust way, I think your best bet is to use futexes directly.
Yes, that is the conclusion that I've also come to - but it is probably a very difficult problem. Note that robust mutexes use futexes rather differently from regular mutexes, and there is kernel involvement at process termination (see man get_robust_list). A robust condition variable would have to do something similar. I find this all rather surprising, as interrupting a waiting condition variable is often much more common than interrupting a locked mutex. Regards, Phil.
On 2020-05-16 21:35, Phil Endecott via Boost wrote:
Phil Endecott wrote:

Andrey Semashev wrote:

* PTHREAD_MUTEX_ROBUST might be part of the solution. That seems to require the non-crashed process to do clean up, i.e. we would need to record whether the crashed process were reading or writing and react appropriately.
You can't do that reliably because the crashed process could have crashed between locking the mutex and indicating its intentions.
I don't follow. Say I have a bool in the mutex called being_written. It's initially false, the read lock doesn't touch it, and the write lock does:
    lock() {
        m.lock();
        being_written = true;
        memory_barrier();
    }

    unlock() {
        memory_barrier();
        being_written = false;
        m.unlock();
    }
If the process crashes between locking and setting being_written, then the process doing the cleanup will see being_written = false, and that's OK because the crasher hadn't actually written anything.
What if the writer crashes in unlock(), between being_written = false and m.unlock()?
Regarding blocking signals, I agree this is not really something that should be part of the interprocess synchronisation primitives, but I do think that a modern wrapper around the ancient C signals API would be good to have.
I agree, although given that there are many different ways to handle signals, I have a hard time imagining what such a wrapper would look like.
If you need a condition variable-like behavior, in a robust way, I think your best bet is to use futexes directly.
Yes, that is the conclusion that I've also come to - but it is probably a very difficult problem. Note that robust mutexes use futexes rather differently from regular mutexes, and there is kernel involvement at process termination (see man get_robust_list). A robust condition variable would have to do something similar.
Yes, given that the robust list is an internal interface between the kernel and libc, you basically have to implement your own mechanism, which may not be absolutely equivalent to real robust mutexes. For example, in a pseudo_robust_mutex::lock you could use a timed wait on the internal futex, and on timeout check whether the mutex owner pid still exists. You may have other ways of detecting and handling abandoned locks depending on your application architecture.

A condition variable could be implemented without the internal mutex (i.e. it would have only one internal futex), if you have a guarantee that the associated external mutex is always locked when the condition variable methods are called.
Andrey Semashev
On 2020-05-16 21:35, Phil Endecott via Boost wrote:
Phil Endecott wrote:

Andrey Semashev wrote:

* PTHREAD_MUTEX_ROBUST might be part of the solution. That seems to require the non-crashed process to do clean up, i.e. we would need to record whether the crashed process were reading or writing and react appropriately.
You can't do that reliably because the crashed process could have crashed between locking the mutex and indicating its intentions.
I don't follow. Say I have a bool in the mutex called being_written. It's initially false, the read lock doesn't touch it, and the write lock does:
    lock() {
        m.lock();
        being_written = true;
        memory_barrier();
    }

    unlock() {
        memory_barrier();
        being_written = false;
        m.unlock();
    }
If the process crashes between locking and setting being_written, then the process doing the cleanup will see being_written = false, and that's OK because the crasher hadn't actually written anything.
What if the writer crashes in unlock(), between being_written = false and m.unlock()?
Not a problem; the writer completed its changes and the state is consistent. Regards, Phil.
On 16/05/2020 20:46, Andrey Semashev via Boost wrote:
Regarding blocking signals, I agree this is not really something that should be part of the interprocess synchronisation primitives, but I do think that a modern wrapper around the ancient C signals API would be good to have.
I agree, although given that there are many different ways to handle signals, I have a hard time imagining what such a wrapper would look like.
The latest revision of the "modern signals" paper can be found at https://wg21.link/P2069. Last month WG14 and the Austin Working Group reviewed it, and agreed with something close to it, in principle. Modulo bikeshed renaming, obviously. Niall
Niall Douglas wrote:
The latest revision of the "modern signals" paper can be found at https://wg21.link/P2069.
Thanks Niall, that's interesting.

I note that your signal_guard takes a callable. That's not easy to combine with a scoped lock_guard, i.e. if I want to make something that's exactly like std::lock_guard but also blocks SIGINT. Is there any way around that?

To block SIGINT for the short time that a lock is held, I was imagining calls to pthread_sigmask() when locking and unlocking - but I guess this is an actual system call, so it could take much longer than the mutex lock and unlock, which are typically just atomics.

If I understand correctly, your proposal avoids this by installing a signal handler once at startup, and then just changing some state at the start and end of the guard. So if I want only to block a signal rather than ignoring or handling it, with your proposal I would need to track pending signals and raise them at the end of the guard. Is that right?

I'm unclear what happens in a multi-threaded program; if one thread is in a critical section with SIGINT blocked, can the signal be delivered to a different thread and cause the whole process to be terminated, defeating the purpose of blocking it? Does your proposal change this behaviour, compared to pthread_sigmask()?

Regards, Phil.
On 19/05/2020 11:32, Phil Endecott via Boost wrote:
Niall Douglas wrote:
The latest revision of the "modern signals" paper can be found at https://wg21.link/P2069.
Thanks Niall, that's interesting.
I note that your signal_guard takes a callable. That's not easy to combine with a scoped lock_guard, i.e. if I want to make something that's exactly like std::lock_guard but also blocks SIGINT. Is there any way around that?
You can install guards for the current thread. That takes a callable which is to be protected, as you noticed. The callable based design is unavoidable here, for Win32 SEH compatibility. You can also install handlers globally. I'd suggest a global handler which examines thread_local state would suit you best.
To block SIGINT for the short time that a lock is held, I was imagining calls to pthread_sigmask() when locking and unlocking - but I guess this is an actual system call, so it could take much longer than the mutex lock and unlock, which are just atomics typically.
Correct.
If I understand correctly, your proposal
avoids this by installing a signal handler once at startup, and then just changing some state at the start and end of the guard.
Correct. We install sigaction() handlers at the start of process, and globally enable signals. Those are slow operations. We then "reimplement" signal handling locally, all of which avoids the kernel.
So if I want only to block a signal rather than ignoring or handling it, with your proposal I would need to track pending signals and raise them at the end of the guard. Is that right?
Under P2069, all globally installed handlers are *filtering* handlers. You'll get called when the signal raises, you can do your thing, and handlers installed after you get called in turn by default. So for your situation, you need to do nothing.
I'm unclear what happens in a multi-threaded program; if one thread is in a critical section with SIGINT blocked, can the signal be delivered to a different thread and cause the whole process to be terminated, defeating the purpose of blocking it? Does your proposal change this behaviour, compared to pthread_sigmask() ?
We don't touch the thread local signal mask at all. We globally enable signals for the process. Thus, whichever thread receives a signal is the thread which runs "modern signals". And that varies per POSIX implementation.

In terms of general thread safety, "modern signals" is always thread safe and reentrant safe, i.e. you can modify installed signals at any time from any thread. One caveat, which is documented, is that modifying the global signal install from within a global signal handler of the same signal number can cause an endless loop, so don't do that if you can avoid it. If you can't, it can be worked around, with effort.

BTW, I assume you realise that your proposed scheme won't be watertight, right? I mean that Windows doesn't send you a signal on process termination, and there are ways of terminating a POSIX process without it ever receiving a signal either. The biggest source of that on Linux is OOM, which is very irritating of it.

Niall
On 20/05/2020 01:22, Niall Douglas wrote:
BTW, I assume you realise that your proposed scheme won't be watertight right? I mean that Windows doesn't send you a signal on process termination, and there are ways of terminating a POSIX process without it ever receiving a signal either. The biggest source of that on Linux is OOM, which is very irritating of it.
I'm not sure if Boost.Interprocess makes use of them (I assume not) but on Windows the standard kernel interprocess mutex will report an "abandoned" mutex if something tries to acquire a mutex that was owned by a terminated process/thread, regardless of how it was terminated. This is a successful acquisition that indicates that the protected state may not be consistent. Though I expect that most apps don't handle it as such and either just treat it as failure or press ahead anyway and just hope they don't crash/corrupt.
Gavin Lambert wrote:
I'm not sure if Boost.Interprocess makes use of them (I assume not) but on Windows the standard kernel interprocess mutex will report an "abandoned" mutex if something tries to acquire a mutex that was owned by a terminated process/thread, regardless of how it was terminated.
This is a successful acquisition that indicates that the protected state may not be consistent.
OK, this is what POSIX "robust" mutexes do. It's useful to know that they are portable. Regards, Phil.
On 2020-05-17 21:40, Niall Douglas via Boost wrote:
The proposal to mark standard library functions as async signal safe, or alternatively as non-allocating, does not sound feasible given the committee's reluctance to mark standard library functions as noexcept (the Lakos rule).
On 18/05/2020 14:38, Bjorn Reese via Boost wrote:
On 2020-05-17 21:40, Niall Douglas via Boost wrote:
The proposal to mark standard library functions as async signal safe, or alternatively as non-allocating, does not sound feasible given the committee's reluctance to mark standard library functions as noexcept (the Lakos rule).
FYI I believe the Lakos rule is going away soon in any case, as we shall be gaining better ways of solving the problem it solved, while still marking functions that are obviously noexcept as noexcept.

Re: async signal safe, I believe EWG-I guidance was that marking all standard library functions is infeasible, but that marking "core" standard library functions is desirable, for some definition of "core". In theory, one could then deduce async signal safety of some arbitrary use of the standard library - or, to be more specific, the compiler could deduce it for you. In practice, I think people will subset the C++ they use within signal guarded sections. The requirement that no non-trivial destructors could ever be called is fairly limiting in any case; it precludes the portable use of much of the standard library.

Coming back to shared memory mutexes etc: personally, if I want a mutex that is shared across processes and is resilient to sudden process death, I just lock a shared file in /tmp using flock(). This is completely portable. LLFIO, thanks to recent WG21 feedback, now implements a file_handle matching the SharedMutex concept. Rather usefully, the content of the same shared file can also be the shared memory between the processes. One thus has exactly what the OP is looking for.

Niall
Niall Douglas wrote:
Coming back to shared memory mutexes etc, me personally if I want a mutex that is shared across processes and is resilient to sudden process death, I just lock a shared file in /tmp using flock(). This is completely portable: LLFIO, thanks to recent WG21 feedback, now implements SharedMutex Concept matching file_handle. Rather usefully, the content of the same shared file can also be the shared memory between the processes. One thus has exactly what the OP is looking for.
That's an interesting idea, but unfortunately I don't think there's a way to get a (robust) condition variable that works with this lock, and that's what I really need. Regards, Phil.
On 2020-05-19 15:55, Niall Douglas via Boost wrote:
Coming back to shared memory mutexes etc, me personally if I want a mutex that is shared across processes and is resilient to sudden process death, I just lock a shared file in /tmp using flock().
Do you have any performance numbers of flock vs. e.g. futex? I would expect the former to be considerably slower.
On 20/05/2020 13:18, Andrey Semashev via Boost wrote:
On 2020-05-19 15:55, Niall Douglas via Boost wrote:
Coming back to shared memory mutexes etc, me personally if I want a mutex that is shared across processes and is resilient to sudden process death, I just lock a shared file in /tmp using flock().
Do you have any performance numbers of flock vs. e.g. futex? I would expect the former to be considerably slower.
It depends hugely on the platform. Linux might do 100k ops/sec. FreeBSD might do 600k ops/sec. Windows might do 20k ops/sec (a Win32 event object or semaphore is much faster). All are very considerably slower than a cmpxchg into shared memory.

The only way that I know of to get both the speed of an atomic and robustness is to launch a dedicated monitor process which can break hanged locks if the owning process unexpectedly dies. LLFIO has the world's simplest child process support for exactly this purpose, as llfio::process_handle.

Niall
On Thu, 14 May 2020 at 07:42, Phil Endecott via Boost <boost@lists.boost.org> wrote:
Can we improve how interprocess mutexes and condition variables behave on process termination? [...]
After some more thought, here is another idea: PTHREAD_MUTEX_ROBUST is no longer a property of the mutex, but a property of the lock. So a normal unique_lock will map the behaviour of PTHREAD_MUTEX_STALLED (i.e. a non-robust mutex), while a unique_robust_lock will still succeed in acquiring a mutex whose owning process died while holding a lock (and then the user has to check whether the acquired lock is consistent). If there are multiple locks waiting to acquire the mutex whose process crashed, only a unique_robust_lock will succeed in acquiring the mutex.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
On 2020-05-14 13:41, Phil Endecott via Boost wrote:
Dear Experts,
Can we improve how interprocess mutexes and condition variables behave on process termination?
Currently if a process terminates (i.e. it crashes, or you press ctrl-C), the interprocess docs say nothing as far as I can see about what happens to locked mutexes and awaited conditions. In practice it seems that mutexes that were locked remain locked, and other processes will deadlock. (I'm using Linux.) A few thoughts:
* If a process were only reading the shared state, then it would be appropriate for the mutex to be unlocked on termination.
* If a process were modifying the shared state, then it would be wrong to unconditionally unlock the mutex. So it would be useful to distinguish between reader and writer locks, even if we're not implementing a single-writer/multiple-reader mutex.
* The system could be made more robust by blocking signals while a mutex is locked. This doesn't help with crashes, e.g. segfaults, but it would help with ctrl-C.
Catching signals is a good idea regardless of IPC and locking mutexes. As long as there is a moment when your application holds some valuable data or some state (e.g. a network connection) that needs to be properly saved or cleaned up on exit, you have to implement proper signal handling and graceful program termination.
* It may be useful to cause all processes to terminate if one of them terminates with a mutex held for writing, either immediately or as soon as they try to lock the same mutex. Perhaps also to delete the presumed-corrupted shared memory segment.
* PTHREAD_MUTEX_ROBUST might be part of the solution. That seems to require the non-crashed process to do clean up, i.e. we would need to record whether the crashed process were reading or writing and react appropriately.
You can't do that reliably, because the crashed process could have crashed between locking the mutex and indicating its intentions. For another process to be able to restart or roll back a failed operation, that operation has to be implemented in a lock-free fashion, so that each step is atomic - at which point mutexes become redundant. In my experience, the only sensible reaction to an abandoned operation (regardless of the way you detect the abandoned state) is to scrap it and abort, or start over in a new shared memory segment.
I'm less clear about what happens to condition variables, but it does seem that perhaps terminating a process while it is waiting on a condition will cause other processes to deadlock. Perhaps the wait conceptually returns and the mutex is re-locked during termination.
AFAIR, pthread_cond_t uses a non-robust mutex internally, which means that condition variables are basically useless when you need robust semantics. If you need a condition variable-like behavior, in a robust way, I think your best bet is to use futexes directly.
On 2020-05-14 20:43, Andrey Semashev wrote:
On 2020-05-14 13:41, Phil Endecott via Boost wrote:
Dear Experts,
Can we improve how interprocess mutexes and condition variables behave on process termination?
Currently if a process terminates (i.e. it crashes, or you press ctrl-C), the interprocess docs say nothing as far as I can see about what happens to locked mutexes and awaited conditions. In practice it seems that mutexes that were locked remain locked, and other processes will deadlock. (I'm using Linux.) A few thoughts:
* If a process were only reading the shared state, then it would be appropriate for the mutex to be unlocked on termination.
* If a process were modifying the shared state, then it would be wrong to unconditionally unlock the mutex. So it would be useful to distinguish between reader and writer locks, even if we're not implementing a single-writer/multiple-reader mutex.
* The system could be made more robust by blocking signals while a mutex is locked. This doesn't help with crashes, e.g. segfaults, but it would help with ctrl-C.
Catching signals is a good idea regardless of IPC and locking mutexes. As long as there is a moment when your application holds some valuable data or some state (e.g. a network connection) that needs to be properly saved or cleaned up on exit, you have to implement proper signal handling and graceful program termination.
To be clear, I don't mean that Boost.Interprocess should be dealing with signals. The user's application should.
* It may be useful to cause all processes to terminate if one of them terminates with a mutex held for writing, either immediately or as soon as they try to lock the same mutex. Perhaps also to delete the presumed-corrupted shared memory segment.
* PTHREAD_MUTEX_ROBUST might be part of the solution. That seems to require the non-crashed process to do clean up, i.e. we would need to record whether the crashed process were reading or writing and react appropriately.
You can't do that reliably, because the crashed process could have crashed between locking the mutex and indicating its intentions. For another process to be able to restart or roll back a failed operation, that operation has to be implemented in a lock-free fashion, so that each step is atomic - at which point mutexes become redundant.
In my experience, the only sensible reaction to an abandoned operation (regardless of the way you use to detect the abandoned state) is to scrap it and abort or start over in a new shared memory segment.
I'm less clear about what happens to condition variables, but it does seem that perhaps terminating a process while it is waiting on a condition will cause other processes to deadlock. Perhaps the wait conceptually returns and the mutex is re-locked during termination.
AFAIR, pthread_cond_t uses a non-robust mutex internally, which means that condition variables are basically useless when you need robust semantics.
If you need a condition variable-like behavior, in a robust way, I think your best bet is to use futexes directly.
participants (6)

- Andrey Semashev
- Bjorn Reese
- Gavin Lambert
- Niall Douglas
- Phil Endecott
- Vinícius dos Santos Oliveira