On 15/03/12 20:25, John Rocha wrote:
On 3/15/2012 11:41 AM, Vicente J. Botet Escriba wrote:
Hi,
does this mean that there are some restrictions on the code that can be executed in the boost::thread_interrupted catch handler, or some guidelines that s/he must follow? Is it legal for the user to throw a boost::thread_interrupted exception themselves? Is there a way to clear the interrupted flag so that the join() is not interrupted in the boost::thread_interrupted catch handler?
What was wrong with the user code that made the library crash? Which precondition of the library was violated? Maybe some assertions would help so that the error is identified as soon as possible.
Best, Vicente
Hello Vicente,
I'd like to point out that the library didn't crash. My code did. This was caused by faulty logic on my part -- a race condition with a very small window of opportunity.
Thanks for the clarification. From a quick read of your post I thought that was the case. My bad.
main() told T1 to shut down, but it used two (2) methods to inform T1. The first was the thread interrupt, the second was to set a shutdown flag in T1's object. However, when main uses T1.interrupt(), that also causes a flag to be set in the Boost thread object.
T1 happens to be in its T1.check_for_shutdown() routine, after the boost::this_thread::interruption_point() check but before the "if shutdown flag" check.
T1.check_for_shutdown():
    // this will throw if the Boost interruption-requested flag is set
    boost::this_thread::interruption_point();

    // **** T1 is here when main() runs T1.shutdown() ****

    if (T1.m_shutdown) {
        throw boost::thread_interrupted();
    }
So I detected the shutdown with MY logic m_shutdown, not with an interruption point. Consequently the boost thread's interruption is still pending, and will remain pending until another interruption point is hit.
In my code, the thread_interrupted is caught by a handler that is waiting for this, and it then invokes the T1.thread_shutdown() routine, which for T1 means sending a Boost interrupt to T2 and then joining on T2.
BUT, join() is an interruption point, so my code would now act upon that pending interruption, throwing ANOTHER thread_interrupted, exiting join(), finishing off the shutdown logic and then exiting the thread.
So my code would terminate T1 before all of its child threads had terminated.
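The interleaving described above can be replayed deterministically with a toy model. This is only a sketch in plain standard C++, not Boost code: `interrupted`, `InterruptModel` and `replay_race` are hypothetical names made up for illustration, and the model assumes only what the post says, namely that an interruption point throws and clears the request when one is pending, and that join() begins with such a check.

```cpp
#include <cassert>

// Hypothetical stand-in for boost::thread_interrupted (not the Boost type).
struct interrupted {};

// Toy model of the per-thread state discussed in the post.
struct InterruptModel {
    bool pending = false;   // set by interrupt(), cleared when an interruption point throws
    bool shutdown = false;  // the application's own m_shutdown flag

    void interrupt() { pending = true; }

    // Models an interruption point: throws and clears the request if one is pending.
    void interruption_point() {
        if (pending) {
            pending = false;
            throw interrupted{};
        }
    }
};

// Replays the race single-threaded. Returns true if the modelled join() completed.
bool replay_race() {
    InterruptModel t1;
    bool child_joined = false;
    try {
        // check_for_shutdown(), first half: nothing is pending yet, so no throw.
        t1.interruption_point();

        // main() runs in this window: it interrupts T1 and sets the flag.
        t1.interrupt();
        t1.shutdown = true;

        // check_for_shutdown(), second half: the application flag fires.
        if (t1.shutdown) {
            throw interrupted{};
        }
    } catch (const interrupted&) {
        // Shutdown handler: the request from main() is still pending here.
        try {
            t1.interruption_point();  // modelled join() on T2 starts with this check
            child_joined = true;      // reached only if the join is not interrupted
        } catch (const interrupted&) {
            // join() was abandoned; T1 goes on to exit while T2 is still running.
        }
        assert(!t1.pending);  // the abandoned join consumed the pending request
    }
    return child_joined;
}
```

In this model `replay_race()` returns false: the shutdown was detected through the application flag, so the pending request survives into the handler and cuts the join short.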
Now, main legally unblocks from join(), since T1 exited, and then it invokes the destructor on T1, which does cascade destructions on the T1-T4 objects.
I don't know if there is a possible improvement here, as the destructor of the Tx implies the destructor of a thread instance. The Boost.Thread implementation detaches a joinable thread on destruction, while the C++11 standard calls terminate(). I don't know if a call to terminate() would show the issue more clearly in your case.
THIS is what leads to my segmentation fault, "pure virtual called", etc. errors, because there are threads still alive running code based on that object and accessing data from that object, which was just deleted out from underneath them.
Can the user throw a boost::thread_interrupted exception? To be honest this wasn't the problem. Throwing this doesn't impact the setting of the thread's "do I have an interrupt pending" flag. I could have thrown my own custom exception and I still would have encountered this problem. The problem is that my "check_for_shutdown" logic assumed that when it exited no interrupts would be pending.
I see.
Is there a way to clear the interrupted flag so that the join is not interrupted? I would argue that clearing the flag isn't correct. One could add logic such as:
    try {
        boost::this_thread::interruption_point();
    } catch (boost::thread_interrupted &) {
    }
    join();
Which would clear the flag.
Thanks for the trick.
However, while I am blocked in join(), some other thread could send me another interrupt, which would break me out of join(). Not what I wanted. I feel that Anthony's suggestion of blocking interrupts for this method is the appropriate way to go.
I agree that disabling interrupts is the correct way to avoid new interruptions. But you need to clear any pending one first.
I don't think any assertions could help, but maybe a documentation improvement? Or maybe it's there and I didn't read carefully enough?
I agree now. Asserts could not help in this case. IIUC, a boost::thread_interrupted catch handler should clear the interrupt flag (as the boost::thread_interrupted exception can be thrown by the user code) and disable interrupts before attempting to use any interruption point. Are there other ways? Best, Vicente
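The recipe summarised above (clear whatever is pending, then disable interruptions, then join) can be sketched with the same kind of toy model. Again this is plain standard C++ and not the Boost API: `InterruptModel`, `DisableGuard`, `join_child` and `shutdown_handler` are hypothetical names, the guard is only loosely modelled on boost::this_thread::disable_interruption, and the `late_interrupt` parameter stands in for another thread calling interrupt() while the join is in progress.

```cpp
#include <cassert>

// Hypothetical stand-in for boost::thread_interrupted (not the Boost type).
struct interrupted {};

// Toy model of one thread's interruption state.
struct InterruptModel {
    bool pending = false;  // an interruption has been requested
    bool enabled = true;   // interruption points are allowed to throw

    void interrupt() { pending = true; }

    // Throws and clears the request only if interruption is enabled and pending.
    void interruption_point() {
        if (enabled && pending) {
            pending = false;
            throw interrupted{};
        }
    }
};

// RAII guard: disables interruption on construction, restores it on destruction.
class DisableGuard {
    InterruptModel& model_;
    bool previous_;
public:
    explicit DisableGuard(InterruptModel& m) : model_(m), previous_(m.enabled) {
        model_.enabled = false;
    }
    ~DisableGuard() { model_.enabled = previous_; }
    DisableGuard(const DisableGuard&) = delete;
    DisableGuard& operator=(const DisableGuard&) = delete;
};

// Modelled join(): an interruption check followed by the wait itself.
void join_child(InterruptModel& self, bool& joined) {
    self.interruption_point();
    joined = true;
}

// Handler body following the recipe; returns true if the join completed.
bool shutdown_handler(InterruptModel& self, bool late_interrupt) {
    // Step 1: swallow any request that is already pending.
    try {
        self.interruption_point();
    } catch (const interrupted&) {
    }

    // Step 2: keep new requests from firing for the rest of this scope.
    DisableGuard guard(self);
    if (late_interrupt) {
        self.interrupt();  // models another thread interrupting us mid-join
    }

    bool joined = false;
    join_child(self, joined);
    return joined;
}
```

In this model the join completes in both cases. A request that arrives while the guard is active is not lost: it stays pending and would fire at the first interruption point reached after the guard goes out of scope.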