[interprocess] locked mutex and process killed

vicente.botet

23 Mar 2009

10:57 p.m.

Hi,

what happens when a process owning an interprocess::mutex is killed? Is the mutex unlocked before the process dies?

Thanks,
Vicente


Edouard A.

24 Mar 2009

10:08 a.m.

On Mon, 23 Mar 2009 23:57:19 +0100, "vicente.botet" <vicente.botet@wanadoo.fr> wrote:

...

Hi,

what happens when a process owning interprocess::mutex is killed? is the mutex unlocked before the process dies?

After a quick look at the source code:

inline void interprocess_mutex::lock()
{
   if (pthread_mutex_lock(&m_mut) != 0)
      throw lock_exception();
}

The answer to your question is "an exception will be thrown" since the call will return "EOWNERDEAD".

Generally speaking when a thread terminates while holding a mutex, the mutex is considered abandoned and shared resources protected by the mutex are in an undetermined state. Hope you like quantum physics. ;)

--
EA

Vicente Botet Escriba

4:39 p.m.

Edouard A. wrote:

...

On Mon, 23 Mar 2009 23:57:19 +0100, "vicente.botet" <vicente.botet@wanadoo.fr> wrote:

...
Hi,

what happens when a process owning interprocess::mutex is killed? is the mutex unlocked before the process dies?

After a quick look at the source code:

inline void interprocess_mutex::lock() { if (pthread_mutex_lock(&m_mut) != 0) throw lock_exception(); }

The answer to your question is "an exception will be thrown" since the call will return "EOWNERDEAD".

Generally speaking when a thread terminates while holding a mutex, the mutex is considered abandoned and shared resources protected by the mutex are in an undetermined state. Hope you like quantum physics. ;)

--

EA _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Hi,

if this is the case, what do you think about adding a kind of resource manager that will ensure all the shared resources are released (and in particular the interprocess::mutex are unlocked) on a cleanup phase just before dying?

Vicente

--
View this message in context: http://www.nabble.com/-interprocess--locked-mutex-and-process-killed-tp22671...
Sent from the Boost - Dev mailing list archive at Nabble.com.

Ion Gaztañaga

4:54 p.m.

Vicente Botet Escriba wrote:

...

Edouard A. wrote:

...
The answer to your question is "an exception will be thrown" since the call will return "EOWNERDEAD".

Generally speaking when a thread terminates while holding a mutex, the mutex is considered abandoned and shared resources protected by the mutex are in an undetermined state. Hope you like quantum physics. ;)

if this is the case, what do you think about adding a kind of resource manager that will ensure all the shared resources are released (and in particular the interprocess::mutex are unlocked) on a cleanup phase just before dying?

There is no guarantee at all. POSIX does not guarantee EOWNERDEAD if you don't use robust mutexes, which is now an option in POSIX 2008 (I'll try to add it ASAP but adding robust mutexes has some interface problems) but is not mandatory.

Adding a resource manager is not trivial, it needs to periodically check if processes are alive and that's not easy. Interprocess does not offer any guarantee that the underlying OS does not offer.

Best,

Ion
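The robust-mutex behaviour mentioned above can be sketched with plain POSIX calls. This is only an illustration, not what Boost.Interprocess does: the helper name lock_result_after_owner_killed is made up for this example, and it assumes a platform that implements the POSIX 2008 robust option (for instance Linux with glibc). A child process locks a process-shared robust mutex and is killed with SIGKILL; the parent then tries to lock it.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of Boost.Interprocess). Returns the code
// the parent gets from pthread_mutex_lock after the owning child was
// killed, or -1 if the setup itself failed.
int lock_result_after_owner_killed()
{
    // The mutex must live in memory that both processes share.
    void* mem = mmap(nullptr, sizeof(pthread_mutex_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    pthread_mutex_t* m = static_cast<pthread_mutex_t*>(mem);

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);  // opt in
    pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);

    pid_t child = fork();
    if (child < 0) { munmap(mem, sizeof(pthread_mutex_t)); return -1; }
    if (child == 0) {
        pthread_mutex_lock(m);  // child becomes the owner...
        raise(SIGKILL);         // ...and dies without unlocking
        _exit(1);               // not reached
    }
    waitpid(child, nullptr, 0);

    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        // The lock is held by us now, but the protected data may be
        // inconsistent. An application would repair it here, then:
        pthread_mutex_consistent(m);
    }
    if (rc == 0 || rc == EOWNERDEAD) pthread_mutex_unlock(m);
    pthread_mutex_destroy(m);
    munmap(mem, sizeof(pthread_mutex_t));
    return rc;
}
```

On such a platform the call is expected to return EOWNERDEAD. Without the pthread_mutexattr_setrobust line nothing is guaranteed, and the parent would most likely block forever.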

Vicente Botet Escriba

5:31 p.m.

Ion Gaztañaga wrote:

...

Vicente Botet Escriba wrote:

...
Edouard A. wrote:

...
The answer to your question is "an exception will be thrown" since the call will return "EOWNERDEAD".

Generally speaking when a thread terminates while holding a mutex, the mutex is considered abandoned and shared resources protected by the mutex are in an undetermined state. Hope you like quantum physics. ;)

if this is the case, what do you think about adding a kind of resource manager that will ensure all the shared resources are released (and in particular the interprocess::mutex are unlocked) on a cleanup phase just before dying?

There is no guarantee at all. POSIX does not guarantee EOWNERDEAD if you don't use robust mutexes, which is now an option in POSIX 2008 (I'll try to add it ASAP but adding robust mutexes has some interface problems) but is not mandatory.

Adding a resource manager is not trivial, it needs to periodically check if processes are alive and that's not easy. Interprocess does not offer any guarantee that the underlying OS does not offer.

Best,

Ion

Hi,

what kind of interface problems does adding robust mutexes have? Do you mean that you plan to use robust mutexes when available? Should this be an option?

My initial question was: in which state is a locked interprocess mutex when the owning process dies (with the current implementation)? Is it locked? Unlocked? Is it undefined? How can an application recover from this situation?

Best,
Vicente

Edouard A.

4:58 p.m.

On Tue, 24 Mar 2009 09:39:08 -0700 (PDT), Vicente Botet Escriba <vicente.botet@wanadoo.fr> wrote:

...

if this is the case, what do you think about adding a kind of resource manager that will ensure all the shared resources are released (and in particular the interprocess::mutex are unlocked) on a cleanup phase just before dying?

Hi Vicente,

If the process is killed by the operating system (or the user with SIGKILL/TerminateProcess) there is nothing you can do before dying (otherwise that means you could make a process immortal).

For mutexes they will be unlocked when leaving the scope of your function, so if you have some sort of exception translator you should be on the safe side even for division by zero, bad memory access and the like (in Windows I'm thinking about _set_se_translator). A lot of low memory errors will also be caught thanks to std::bad_alloc.

For other resources it may make sense to register a clean up function with atexit.

Kind regards.

--
EA

Vicente Botet Escriba

5:34 p.m.

Edouard A. wrote:

...

On Tue, 24 Mar 2009 09:39:08 -0700 (PDT), Vicente Botet Escriba <vicente.botet@wanadoo.fr> wrote:

...
if this is the case, what do you think about adding a kind of resource manager that will ensure all the shared resources are released (and in particular the interprocess::mutex are unlocked) on a cleanup phase just before dying?

Hi Vicente,

If the process is killed by the operating system (or the user with SIGKILL/TerminateProcess) there is nothing you can do before dying (otherwise that means you could make a process immortal).

For mutexes they will be unlocked when leaving the scope of your function, so if you have some sort of exception translator you should be on the safe side even for division by zero, bad memory access and the like (in Windows I'm thinking about _set_se_translator). A lot of low memory errors will also be caught thanks to std::bad_alloc.

For other resources it may make sense to register a clean up function with atexit.

Kind regards.

--

EA

How will the mutex be unlocked when leaving the scope of the function?

Vicente

Edouard A.

5:55 p.m.

On Tue, 24 Mar 2009 10:34:58 -0700 (PDT), Vicente Botet Escriba <vicente.botet@wanadoo.fr> wrote:

...

Edouard A. wrote:

...

How will the mutex be unlocked when leaving the scope of the function?

Well the destructor of the scoped_lock calls unlock, or am I missing something?

--
EA
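A minimal sketch of that mechanism, under stated assumptions: std::lock_guard stands in for boost::interprocess::scoped_lock, and flag_mutex is a made-up toy type that only records whether it is locked, so the effect is easy to observe. It covers the case where the scope is left through an exception, not the case where the process is killed.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for a real mutex: it only remembers its state.
struct flag_mutex {
    bool locked = false;
    void lock()   { locked = true; }
    void unlock() { locked = false; }
};

flag_mutex g_mutex;

// Takes the lock, then fails. The guard's destructor runs during stack
// unwinding and calls g_mutex.unlock().
void update_then_fail()
{
    std::lock_guard<flag_mutex> guard(g_mutex);
    throw std::runtime_error("simulated failure");
}

// Returns true if the mutex is no longer locked once the exception has
// propagated out of update_then_fail().
bool mutex_released_after_exception()
{
    try {
        update_then_fail();
    } catch (const std::runtime_error&) {
        // The guard has already been destroyed at this point.
    }
    return !g_mutex.locked;
}
```

The unlock happens only because the exception is caught somewhere and the stack is unwound. A process that is terminated from outside never reaches the destructor.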

vicente.botet

8:03 p.m.

----- Original Message -----
From: "Edouard A." <edouard@fausse.info>
To: <boost@lists.boost.org>
Sent: Tuesday, March 24, 2009 6:55 PM
Subject: Re: [boost] [interprocess] locked mutex and process killed

...

On Tue, 24 Mar 2009 10:34:58 -0700 (PDT), Vicente Botet Escriba <vicente.botet@wanadoo.fr> wrote:

...
Edouard A. wrote:

...
How will the mutex be unlocked when leaving the scope of the function?

Well the destructor of the scoped_lock calls unlock, or am I missing something?

Sorry if this question has an obvious answer. Is it certain that the destructors will be called when a process is killed?

Vicente

Edouard A.

10:49 p.m.

...

...
Well the destructor of the scoped_lock calls unlock, or am I missing something?

Sorry if this question has an obvious answer. Is it certain that the destructors will be called when a process is killed?

Sorry, I explained incorrectly. There is nothing obvious about these complex system programming issues.

You can translate into exceptions most of the "problems" that will cause a process to terminate. Access violation, division by zero, stack overflow, etc.

This way you can make sure the destructors are called thanks to stack unwinding. This is not a silver bullet but you will have some sort of cover. On top of that, if you add a handler for memory allocation error, you have yourself a reasonable guarantee that you will fail gracefully.

Of course when the process is terminated by the OS, nothing is called. From the kernel point of view, a process is just a structure with handles referencing kernel objects. When it says... "DIE !", all threads cease to be scheduled, virtual memory gets unmapped and everything created by the process is destroyed, except, as you noticed, shared objects such as a named mutex.

Since your threads are no longer scheduled, no chance that the code of your destructor is run. Dang! It's not even referenced by the VMM anymore! AAAaahhhhh. Tar pit.

If the process terminates by itself, functions registered with atexit and destructors of static variables are called.

Now the question is against which kind of process termination do you wish to protect yourself? On which platform?

If you wish to protect against the user terminating your process there's not much you can do as Ion explained.

You can remove termination rights, but again, if the user has got root/administrator privileges it can just sudo his way to your demise.

Kind regards.

--
EA
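The difference between a process that terminates by itself and one that is killed can be observed with a small POSIX experiment. This is a sketch only: the names cleanup_ran_in_child, report_cleanup and g_report_fd are invented for the example. A child registers an atexit handler that writes one byte into a pipe; the parent checks whether that byte ever arrives.

```cpp
#include <cassert>
#include <csignal>
#include <cstdlib>
#include <sys/wait.h>
#include <unistd.h>

namespace {
int g_report_fd = -1;  // write end of the pipe, set in the child only

// atexit handler: tells the parent that cleanup code actually ran.
void report_cleanup()
{
    const char mark = 'c';
    ssize_t ignored = write(g_report_fd, &mark, 1);
    (void)ignored;
}
}  // namespace

// Hypothetical helper: forks a child that either exits normally or is
// killed with SIGKILL, and returns true if its atexit handler ran.
bool cleanup_ran_in_child(bool kill_child)
{
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t child = fork();
    if (child < 0) { close(fds[0]); close(fds[1]); return false; }
    if (child == 0) {
        close(fds[0]);
        g_report_fd = fds[1];
        std::atexit(report_cleanup);
        if (kill_child) raise(SIGKILL);  // no user code runs after this
        std::exit(0);                    // normal exit: handlers run
    }

    close(fds[1]);
    char mark = 0;
    // read() returns 1 if the handler wrote its byte, or 0 (end of file)
    // if the child died without running it.
    ssize_t n = read(fds[0], &mark, 1);
    close(fds[0]);
    waitpid(child, nullptr, 0);
    return n == 1;
}
```

Calling cleanup_ran_in_child(false) should report that the handler ran, while cleanup_ran_in_child(true) should report that it did not, which is why neither atexit nor destructors can release a lock held by a killed process.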

Dmitry Goncharov

25 Mar 2009

8:40 a.m.

...

...
Sorry if this question has an obvious answer. Is it certain that the destructors will be called when a process is killed?

Sorry, I explained incorrectly. There is nothing obvious about these complex system programming issues.

You can translate into exceptions most of the "problems" that will cause a process to terminate. Access violation, division by zero, stack overflow, etc.

This way you can make sure the destructors are called thanks to stack unwinding. This is not a silver bullet but you will have some sort of cover. On top of that, if you add a handler for memory allocation error, you have yourself a reasonable guarantee that you will fail gracefully.

Of course when the process is terminated by the OS, nothing is called. From the kernel point of view, a process is just a structure with handles referencing kernel objects. When it says... "DIE !", all threads cease to be scheduled, virtual memory gets unmapped and everything created by the process is destroyed, except, as you noticed, shared objects such as a named mutex.

Since your threads are no longer scheduled, no chance that the code of your destructor is run. Dang! It's not even referenced by the VMM anymore! AAAaahhhhh. Tar pit.

If the process terminates by itself, functions registered with atexit and destructors of static variables are called.

Now the question is against which kind of process termination do you wish to protect yourself? On which platform?

Edouard A. wrote:

If you wish to protect against the user terminating your process there's not much you can do as Ion explained.

Even in this situation you can do something. After a thread (or a process) which holds a mutex locked gets killed any attempt to unlock the mutex will (and should) fail with EPERM. To resume using the mutex you can initialize it again. In other words, you can invoke pthread_mutex_init() on this mutex. This will unlock the mutex for you.

This is sort of a dirty trick. If a thread had a mutex locked when it was killed it is very likely that the data (that the mutex protects) is in some inconsistent state. So, besides unlocking the mutex (by the means of pthread_mutex_init()) you also have to restore the data.

...

You can remove termination rights, but again, if the user has got root/administrator privileges it can just sudo his way to your demise.

Kind regards.

BR, Dmitry

Edouard A.

8:53 a.m.

On Wed, 25 Mar 2009 11:40:29 +0300, Dmitry Goncharov <dgoncharov@unison.com> wrote:

...

Even in this situation you can do something. After a thread (or a process) which holds a mutex locked gets killed any attempt to unlock the mutex will (and should) fail with EPERM. To resume using the mutex you can initialize it again. In other words, you can invoke pthread_mutex_init() on this mutex. This will unlock the mutex for you. This is sort of a dirty trick. If a thread had a mutex locked when it was killed it is very likely that the data (that the mutex protects) is in some inconsistent state. So, besides unlocking the mutex (by the means of pthread_mutex_init()) you also have to restore the data.

Yes, if you know that the mutex is abandoned in the first place, which requires robust mutexes, if I understood correctly.

I think that there is no Windows specific implementation of interprocess mutex. This could be useful since the API guarantees you to return WAIT_ABANDONED when you wait on a mutex in this case. It's not a lot of work, I offer my help if needed.

--
EA

Kim Barrett

9:53 a.m.

At 11:40 AM +0300 3/25/09, Dmitry Goncharov wrote:

...

Even in this situation you can do something. After a thread (or a process) which holds a mutex locked gets killed any attempt to unlock the mutex will (and should) fail with EPERM.

Per POSIX, if the mutex type is PTHREAD_MUTEX_NORMAL, unlock by non-owner invokes undefined behavior.

...

To resume using the mutex you can initialize it again. In other words, you can invoke pthread_mutex_init() on this mutex. This will unlock the mutex for you.

Again per POSIX, reinitializing an initialized mutex invokes undefined behavior, as does destroying a locked mutex.

Just as an example of the problems, what happens to all the threads that were blocked waiting for the mutex that was just reinitialized out from under them?

...

This is sort of a dirty trick. If a thread had a mutex locked when it was killed it is very likely that the data (that the mutex protects) is in some inconsistent state. So, besides unlocking the mutex (by the means of pthread_mutex_init()) you also have to restore the data.

There are applications where that kind of roll-back is possible and even necessary, just perhaps tricky to write. That's why there is interest in so-called "robust" mutexes.

Dmitry Goncharov

10:19 a.m.

Kim Barrett wrote:

...

At 11:40 AM +0300 3/25/09, Dmitry Goncharov wrote:

...
Even in this situation you can do something. After a thread (or a process) which holds a mutex locked gets killed any attempt to unlock the mutex will (and should) fail with EPERM.

Per POSIX, if the mutex type is PTHREAD_MUTEX_NORMAL, unlock by non-owner invokes undefined behavior.

...
To resume using the mutex you can initialize it again. In other words, you can invoke pthread_mutex_init() on this mutex. This will unlock the mutex for you.

Again per POSIX, reinitializing an initialized mutex invokes undefined behavior, as does destroying a locked mutex.

Just as an example of the problems, what happens to all the threads that were blocked waiting for the mutex that was just reinitialized out from under them?

...
This is sort of a dirty trick. If a thread had a mutex locked when it was killed it is very likely that the data (that the mutex protects) is in some inconsistent state. So, besides unlocking the mutex (by the means of pthread_mutex_init()) you also have to restore the data.

There are applications where that kind of roll-back is possible and even necessary, just perhaps tricky to write. That's why there is interest in so-called "robust" mutexes.

You are exactly right. Your points prove that this is a dirty trick.

BR, Dmitry
