observing process blocked on find_or_construct

Hi,
We are testing an interprocess scenario: one process creates shared memory
objects and is then killed, and a second process opens the objects using
find_or_construct. The second process blocks while acquiring the lock.
The backtrace is shown below:
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007fa6b6733eb6 in _L_lock_941 () from /lib64/libpthread.so.0
#2 0x00007fa6b6733daf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7fa6b198b070) at ../nptl/pthread_mutex_lock.c:113
#3 0x0000000001bc3822 in lock (this=0x7fa6b198b070) at /x86/include/boost/interprocess/sync/posix/recursive_mutex.hpp:90
#4 lock (this=0x7fa6b198b070) at /x86/include/boost/interprocess/sync/interprocess_recursive_mutex.hpp:163
#5 scoped_lock (m=..., this=<synthetic pointer>) at /x86/include/boost/interprocess/sync/scoped_lock.hpp:81
#6 boost::interprocess::segment_manager

Murali Kishore wrote:
Hi,
We are testing an interprocess scenario: one process creates shared memory objects and is then killed, and a second process opens the objects using find_or_construct. The second process blocks while acquiring the lock.
The backtrace is shown below:
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007fa6b6733eb6 in _L_lock_941 () from /lib64/libpthread.so.0
#2 0x00007fa6b6733daf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7fa6b198b070) at ../nptl/pthread_mutex_lock.c:113
#3 0x0000000001bc3822 in lock (this=0x7fa6b198b070) at /x86/include/boost/interprocess/sync/posix/recursive_mutex.hpp:90

Detecting mutexes held by dead processes requires so-called POSIX robust mutexes. I can see in the Interprocess source code that those are enabled when the macro BOOST_INTERPROCESS_POSIX_ROBUST_MUTEXES is defined.

This macro is defined automatically
#if (_XOPEN_SOURCE >= 700 || _POSIX_C_SOURCE >= 200809L)
https://github.com/boostorg/interprocess/blob/29cee9c6067f1d20ddb6421af15977...
but I suppose you can also try defining it manually and see if that helps, because _XOPEN_SOURCE and _POSIX_C_SOURCE are also user macros and you probably aren't defining either of them.
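
For reference, the mechanism behind robust mutexes can be sketched with plain pthreads. This is a minimal sketch under stated assumptions, not Boost.Interprocess code: the function name lock_after_owner_death is made up for illustration, and an anonymous shared mapping stands in for a real named shared memory segment. A child process locks a process-shared robust mutex and exits without unlocking; the parent's lock call then returns EOWNERDEAD instead of blocking.

```cpp
#include <cerrno>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the result of pthread_mutex_lock() in the
// parent after a child died while holding the mutex, or -1 on setup failure.
int lock_after_owner_death()
{
    // Anonymous shared mapping: a stand-in for a named shared memory segment.
    void* mem = mmap(nullptr, sizeof(pthread_mutex_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    pthread_mutex_t* m = static_cast<pthread_mutex_t*>(mem);

    // The mutex must be both process-shared and robust.
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);

    pid_t child = fork();
    if (child < 0) {
        munmap(mem, sizeof(pthread_mutex_t));
        return -1;
    }
    if (child == 0) {
        pthread_mutex_lock(m);
        _exit(0);  // the child dies while still owning the mutex
    }
    int status = 0;
    waitpid(child, &status, 0);

    // A robust mutex reports the dead owner instead of blocking.
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        // The protected data may be inconsistent here; repair or discard it
        // before marking the mutex usable again.
        pthread_mutex_consistent(m);
    }
    if (rc == 0 || rc == EOWNERDEAD) pthread_mutex_unlock(m);

    pthread_mutex_destroy(m);
    munmap(mem, sizeof(pthread_mutex_t));
    return rc;
}
```

With a non-robust mutex, the parent's pthread_mutex_lock call would block, which matches the backtrace in the original report. Note that pthread_mutex_consistent only makes the mutex usable again; it does nothing for the data the mutex protects.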

Thanks, Peter Dimov. I have added logic to unlock the mutex in a signal
handler, and I no longer see this issue.
I am seeing one more issue: if I construct an object, do some work, and
clear it in a loop of ~30000 iterations, I see the following error in the
construct call.
#0 set_bits (b=0, n=...) at /x86/include/boost/interprocess/offset_ptr.hpp:728
#1 set_color (c=<optimized out>, n=...) at /x86/include/boost/intrusive/detail/rbtree_node.hpp:167
#2 boost::intrusive::rbtree_algorithms
-- Regards, Murali Kishore

On 1/28/25 15:16, Murali Kishore via Boost wrote:
Thanks, Peter Dimov. I have added logic to unlock the mutex in a signal handler, and I no longer see this issue.
I am seeing one more issue: if I construct an object, do some work, and clear it in a loop of ~30000 iterations, I see the following error in the construct call.

It's not enough to just unlock the mutex (or recover it if it was abandoned). You also need to restore the state it was protecting to a consistent state, which is often unrealistic since you don't know which part of the state is corrupted or how to repair it. For example, you don't have the means to repair the segment manager if its internal object tree is left corrupted, and you don't know whether any of the objects stored in it are half-constructed or otherwise inconsistent.

Typically, your best course of action when you detect an abandoned mutex is to scrap the data it protects and start from scratch. Also, try hard to avoid abandoning mutexes in the first place: for example, don't just kill the process on a signal; let it finish its work on the shared memory first.
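
The last point can be sketched as follows, assuming the worker is structured as a loop of self-contained operations on the shared memory. All names here (g_stop_requested, on_terminate, run_until_stopped, and the do_work callback) are hypothetical and not part of Boost.Interprocess. The signal handler only sets a flag; the loop checks it between operations, so no operation is interrupted halfway and no lock is left held.

```cpp
#include <csignal>

// Set by the signal handler, polled by the worker loop.
volatile std::sig_atomic_t g_stop_requested = 0;

// The handler does nothing except record the request. It does not touch
// the shared memory or any mutex.
extern "C" void on_terminate(int)
{
    g_stop_requested = 1;
}

// Runs do_work(i) for i = 0, 1, ... until max_iterations is reached or a
// SIGTERM has been seen. Each call to do_work stands in for one complete
// lock / modify / unlock cycle on the shared memory and always runs to
// completion. Returns the number of completed operations.
template <class Work>
int run_until_stopped(int max_iterations, Work do_work)
{
    std::signal(SIGTERM, on_terminate);
    int completed = 0;
    while (completed < max_iterations && !g_stop_requested) {
        do_work(completed);
        ++completed;
    }
    return completed;
}
```

This only helps with signals that can be caught; a process terminated with SIGKILL can still abandon a mutex, so detecting abandonment and discarding the protected data remains necessary as a fallback.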
participants (3)
- Andrey Semashev
- Murali Kishore
- Peter Dimov