BOOST_THREAD_HAS_EINTR_BUG...
Not really sure where to bring this issue up.

I have a singleton globals object with a session manager, share manager, etc. that is destroyed atexit() on final reference release. There are a few mutexes in the implementation. This functions fine on Windows, MacOS, etc - the globals object is released during atexit() and shutdown proceeds normally.

Under Linux when BOOST_THREAD_HAS_EINTR_BUG is defined we SIGABRT at this:

~mutex() { BOOST_VERIFY(!posix::pthread_mutex_destroy(&m)); }

The verify fails and we abort(). When I define BOOST_THREAD_HAS_NO_EINTR_BUG we do not abort - which means that posix::pthread_mutex_destroy() returns 0 in the case where BOOST_THREAD_HAS_NO_EINTR_BUG is defined and returns non-zero in the case of it not being defined.

Guidance? Also point me at a better place for this discussion if there is such a place.

Thanks,
David Bien
On Fri, Apr 26, 2024 at 13:51, David Bien via Boost wrote:
Under Linux when BOOST_THREAD_HAS_EINTR_BUG is defined we SIGABRT at this:
~mutex() { BOOST_VERIFY(!posix::pthread_mutex_destroy(&m)); }
How is the code for BOOST_THREAD_HAS_NO_EINTR_BUG? Are UNIX signals being involved?
--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
Apparently abort() calls raise(SIGABRT) under Linux.
/* This prints an "Assertion failed" message and aborts. */
extern void __assert_fail (const char *__assertion, const char *__file,
unsigned int __line, const char *__function)
__THROW __attribute__ ((__noreturn__));
And I guess that one way to abort is to throw an __attribute__ ((__noreturn__)).
Or, perhaps more likely, it is throwing into no catch() and then throwing itself causes an abort() due to no handler being present above atexit().
bien
Vinícius dos Santos Oliveira wrote:
On Fri, Apr 26, 2024 at 13:51, David Bien via Boost wrote:
Under Linux when BOOST_THREAD_HAS_EINTR_BUG is defined we SIGABRT at this:
~mutex() { BOOST_VERIFY(!posix::pthread_mutex_destroy(&m)); }
What is the return value of pthread_mutex_destroy here?
How is the code for BOOST_THREAD_HAS_NO_EINTR_BUG? Are UNIX signals being involved?
It's used here: https://github.com/boostorg/thread/blob/aec18d337f41d8e3081ee65f5cf3b5090179...

Basically, the relevant code is:

#ifdef BOOST_THREAD_HAS_EINTR_BUG
BOOST_FORCEINLINE BOOST_THREAD_DISABLE_THREAD_SAFETY_ANALYSIS
int pthread_mutex_destroy(pthread_mutex_t* m)
{
    int ret;
    do
    {
        ret = ::pthread_mutex_destroy(m);
    } while (ret == EINTR);
    return ret;
}
#else
BOOST_FORCEINLINE BOOST_THREAD_DISABLE_THREAD_SAFETY_ANALYSIS
int pthread_mutex_destroy(pthread_mutex_t* m)
{
    return ::pthread_mutex_destroy(m);
}
#endif

The only scenario in which the former can return nonzero when the latter doesn't would be if it returns EINTR, then tries to destroy the mutex again, and then gets EINVAL because the mutex has already been destroyed. But this doesn't make sense, because in this case the other function would have returned EINTR, which is also nonzero, so it should also fail the VERIFY.

So I'm at a loss here. More printf debugging (printing the return values in both cases) is needed.
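As a way to get the return values asked for above, a small logging helper could be dropped in temporarily. This is only a sketch: errno_name, destroy_and_report and destroy_unlocked_mutex are hypothetical names that do not exist in Boost.Thread, and the code calls the plain POSIX ::pthread_mutex_destroy rather than the posix:: wrapper quoted above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <pthread.h>
#include <string>

// Map the error codes discussed in this thread to their symbolic names.
std::string errno_name(int rc)
{
    switch (rc) {
        case 0:      return "0";
        case EBUSY:  return "EBUSY";
        case EINVAL: return "EINVAL";
        case EINTR:  return "EINTR";
        default:     return "other";
    }
}

// Hypothetical helper: destroy the mutex once, log the result, return it.
int destroy_and_report(pthread_mutex_t* m)
{
    const int rc = ::pthread_mutex_destroy(m);
    std::fprintf(stderr, "pthread_mutex_destroy -> %d (%s)\n",
                 rc, errno_name(rc).c_str());
    return rc;
}

// Initialise a default mutex and destroy it while it is unlocked.
int destroy_unlocked_mutex()
{
    pthread_mutex_t m;
    const int init_rc = pthread_mutex_init(&m, nullptr);
    assert(init_rc == 0);
    (void)init_rc;  // silence unused warning when NDEBUG is defined
    return destroy_and_report(&m);
}
```

Calling destroy_and_report from a local copy of ~mutex() in both configurations would show whether the two builds really see different return codes.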
On Fri, Apr 26, 2024 at 14:25, Peter Dimov wrote:
So I'm at a loss here. More printf debugging (printing the return values in both cases) is needed.
Same here. EINTR should only happen for UNIX signals AFAIK (and SIGABRT only happens much later... after the call to pthread_mutex_destroy already returned, so it's not important yet). However glibc isn't even using syscalls for pthread_mutex_destroy from what I'm seeing: https://github.com/bminor/glibc/blob/master/nptl/pthread_mutex_destroy.c
--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
It seems there is some type of race condition involved. It works, fails, or hangs regardless of whether BOOST_THREAD_HAS_EINTR_BUG is defined. Weird, because I never saw anything of the sort in weeks of testing under Windows, and a couple of days of testing under MacOS - both using boost synchronization objects.
When it aborts the return is EBUSY.
I am going to change it to use STL just as a data point. The condition variable/mutex pattern here:
~CShareMgr()
{
    {
        boost::lock_guard< boost::mutex > lock( m_cvMutex );
        m_fRunning = false;
    }
    m_cv.notify_one();
    if ( m_thread.joinable() ) // wait on cleanup thread here.
    {
        m_thread.join();
    }
}

void CShareMgr::_CleanupThread()
{
    while ( true )
    {
        boost::unique_lock< boost::mutex > lock( m_cvMutex );
        if ( m_cv.wait_for( lock, boost::chrono::minutes( 1 ), [ this ] { return !m_fRunning; } ) )
            return;
        lock.unlock();
        _CleanupThreadVolatile();
    }
}

void CShareMgr::_CleanupThreadVolatile() volatile
{
std::vector
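
For the STL data point mentioned above, the same shutdown pattern could be sketched with std::mutex, std::condition_variable and std::thread roughly as follows. ShareMgrSketch, its members and worker_exits_on_destruction are hypothetical stand-ins, not the poster's actual class, and the thread is assumed to be started in the constructor body, after every member has been constructed.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

class ShareMgrSketch {
public:
    explicit ShareMgrSketch(std::shared_ptr<std::atomic<bool>> exited)
        : m_exited(std::move(exited))
    {
        assert(m_exited != nullptr);
        // Started in the constructor body, so the mutex and condition
        // variable already exist when the thread first touches them.
        m_thread = std::thread([this] { cleanupThread(); });
    }

    ~ShareMgrSketch()
    {
        {
            std::lock_guard<std::mutex> lock(m_cvMutex);
            m_fRunning = false;
        }
        m_cv.notify_one();
        if (m_thread.joinable())
            m_thread.join();  // the thread has left cleanupThread() after this
    }

private:
    void cleanupThread()
    {
        while (true) {
            std::unique_lock<std::mutex> lock(m_cvMutex);
            // wait_for returns the predicate's value: true means shutdown.
            if (m_cv.wait_for(lock, std::chrono::minutes(1),
                              [this] { return !m_fRunning; })) {
                m_exited->store(true);
                return;
            }
            lock.unlock();
            // periodic cleanup work would go here
        }
    }

    std::shared_ptr<std::atomic<bool>> m_exited;  // set when the thread returns
    bool m_fRunning = true;
    std::mutex m_cvMutex;
    std::condition_variable m_cv;
    std::thread m_thread;  // declared last
};

// Creates and immediately destroys a manager; reports whether the thread
// had returned by the time the destructor finished.
bool worker_exits_on_destruction()
{
    auto exited = std::make_shared<std::atomic<bool>>(false);
    {
        ShareMgrSketch mgr(exited);
    }  // destructor signals the thread and joins it
    return exited->load();
}
```

Because the predicate is checked before the first wait, the thread exits even if the destructor sets the flag before the thread has reached wait_for.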
Sure, but then there must be a bug in my code... do you see one? It's pretty simple - and once again has worked for many weeks of testing under other platforms.
________________________________
From: Peter Dimov
When it aborts the return is EBUSY.
EBUSY means that you're destroying a locked mutex, so the abort is legitimate.
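
To illustrate the EBUSY case: POSIX leaves destroying a locked mutex undefined, and glibc chooses to report EBUSY instead of destroying it. A minimal probe with plain pthreads, assuming a glibc-based Linux system (DestroyProbe and probe_mutex_destroy are hypothetical names), might look like this:

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

struct DestroyProbe {
    int while_locked;  // result of destroying the mutex while it is held
    int after_unlock;  // result of destroying it again once released
};

DestroyProbe probe_mutex_destroy()
{
    pthread_mutex_t m;
    int rc = pthread_mutex_init(&m, nullptr);
    assert(rc == 0);
    rc = pthread_mutex_lock(&m);
    assert(rc == 0);
    (void)rc;  // silence unused warning when NDEBUG is defined

    DestroyProbe p{};  // both fields start at 0
    // Undefined by POSIX; glibc is expected to refuse with EBUSY.
    p.while_locked = pthread_mutex_destroy(&m);
    if (p.while_locked != 0) {
        // The destroy was refused, so the mutex is still valid:
        // release it and destroy it properly.
        pthread_mutex_unlock(&m);
        p.after_unlock = pthread_mutex_destroy(&m);
    }
    return p;
}
```

On other C libraries the first call may simply succeed, which is one possible reason the same program appears healthy on other platforms.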
David Bien wrote:
Sure, but then there must be a bug in my code... do you see one? It's pretty simple - and once again has worked for many weeks of testing under other platforms.
No, I don't see a bug in the code you posted. (Assuming that m_thread is joinable and executes _CleanupThread.) Incidentally, underscore followed by a capital letter is reserved to the implementation and you're not supposed to use such identifiers.
"Incidentally, underscore followed by a capital letter is reserved to the
implementation and you're not supposed to use such identifiers."
Yeah, people have been telling me that for 35 years of writing C++ - the fact is that the compiler mangles every single name - as you well know... I literally have never encountered a single problem due to my coding style.
Yeah, the pattern looks correct to me as well - I mean if the m_cv and m_cvMutex are passing then the thread should exit, no?
It seems foolproof - the code that is - no reason it shouldn't exit the cleanup thread. I'd love for someone to poke holes in this theory.
bien
Ok, so I asked copilot - LMAO!!!
It suggested, and I didn't think it made sense but I tried it, to change the class member declaration order to destruct the std::thread first.
std::weak_ptr< CGlobals > m_wpGlobals;
std::unordered_map< boost::uuids::uuid, CSession, std::hash< boost::uuids::uuid > > m_sessions;
boost::shared_mutex m_mutex;
std::chrono::minutes m_nMinutesCleanupSession;
bool m_fRunning;
boost::mutex m_cvMutex;
boost::condition_variable m_cv;
std::thread m_thread;
I had it somewhere in the middle - but importantly above the m_cv and m_cvMutex.
I ran it thirty times or so and didn't get a race or an EBUSY. Granted this is no guarantee - but it always failed before within a few runs.
Oh, and I also changed this - but things still failed after this change.
~CShareMgr()
{
    {
        boost::lock_guard< boost::mutex > lock( m_cvMutex );
        m_fRunning = false;
        m_cv.notify_one(); // moved this up from below the block.
    }
    if ( m_thread.joinable() )
    {
        m_thread.join();
    }
}
Does the change to destructing the std::thread first make any sense to anyone at all? Because it doesn't make sense to me - is the presumption that the thread wasn't in a joinable state when joinable() was called?
A bit confused,
bien
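
On the question of member order, two language rules are certain: non-static members are constructed in declaration order and destroyed in reverse declaration order, and the destructor body runs before any member is destroyed. So with join() in the destructor body, the thread has finished before m_cv or m_cvMutex is destroyed, whatever the order. Declaration order would matter on the construction side if, and this is only an assumption since the constructor was not posted, m_thread were started from the member initializer list while declared above m_cvMutex and m_cv. The sketch below uses a hypothetical Tracer type and stand-in member names just to record the order.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Event log shared by the tracers in this sketch.
std::vector<std::string>& events()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for a member: records its own lifetime.
struct Tracer {
    std::string name;
    explicit Tracer(std::string n) : name(std::move(n))
    {
        events().push_back("ctor " + name);
    }
    ~Tracer() { events().push_back("dtor " + name); }
};

// Members declared in the same relative order as the revised class.
struct Manager {
    Tracer cvMutex{"cvMutex"};
    Tracer cv{"cv"};
    Tracer thread{"thread"};
    ~Manager() { events().push_back("~Manager body"); }
};

// Builds and destroys one Manager and returns the recorded events.
std::vector<std::string> trace_manager_lifetime()
{
    events().clear();
    {
        Manager m;
        assert(events().size() == 3);  // three constructors have run
    }
    return events();
}
```

The recorded sequence is the three constructors in declaration order, then the destructor body, then the three member destructors in reverse order.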
participants (3)
- David Bien
- Peter Dimov
- Vinícius dos Santos Oliveira