[boost.thread] at_thread_exit must be serialized?

Dear Mike,

Could you please give me the reasoning why the cleanup handlers that are specified in at_thread_exit must be serialized? It turns out that this is caused by a mutex located in on_thread_exit. Essentially this prohibits more than one exit handler chain from running at the same time. While there might be sound reasons for this, I'd like to know them, please.

How did I run into this? I wanted to bind the lifetime of a second thread to the one that started it. So I did something like:

    void start(void)
    {
        if (!at_thread_exit(stop))
            syslog_thrd = new cthread::cancelable_thread(&syslog_run);
    }

The stop function being declared as:

    void stop(void)
    {
        if (syslog_thrd) {
            syslog_thrd->cancel();
            syslog_thrd->join();
            delete syslog_thrd;
            syslog_thrd = 0;
        }
    }

When my first thread exits, stop is called properly. However, my cancel causes the second thread to exit (calling into its exit handler chain in turn). This will of course cause a deadlock (although in a non-obvious way), since stop has also been called from an exit handler chain. The thread cannot be joined because it never really ends. You see?

Perhaps you can even suggest another solution to my problem?

Regards,
Roland
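The blocking described in this message can be reproduced in miniature. The following is only a sketch using standard C++11 facilities, not Boost.Thread code: `exit_mutex` is a hypothetical stand-in for the mutex in on_thread_exit, and `second_chain_can_start` is a name invented for the illustration. A timed lock attempt is used so that the sketch reports the blocking instead of hanging.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the mutex held inside on_thread_exit.
std::timed_mutex exit_mutex;

// Returns true if a second thread could enter its own exit handler chain
// while the calling thread is still inside the locked region.
bool second_chain_can_start()
{
    // The first thread is inside its handler chain and holds the lock.
    std::lock_guard<std::timed_mutex> first(exit_mutex);

    bool acquired = false;
    std::thread second([&acquired] {
        // The second thread is exiting and wants to run its own chain.
        // A plain lock() here would wait forever, because the first
        // thread only releases the mutex after join() returns.
        acquired = exit_mutex.try_lock_for(std::chrono::milliseconds(100));
        if (acquired)
            exit_mutex.unlock();
    });
    second.join(); // plays the role of syslog_thrd->join() in stop()
    return acquired;
}
```

In this sketch the function returns false: the second thread cannot obtain the mutex while the first one is waiting in join. With a blocking lock in place of the timed attempt, join would never return.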

Roland wrote:
> Dear Mike,
> could you please give me the reasoning why the cleanup handlers that are specified in at_thread_exit must be serialized?
They don't; the mutex is to protect access to other global variables. Most of them were experimental and have been eliminated, but there's still one important one left: tls_key.
> It turns out that this is caused by a mutex located in on_thread_exit. Essentially this prohibits more than one exit handler chain from running at the same time. While there might be sound reasons for this, I'd like to know them, please.
I've adjusted the area where the mutex is locked to fix this problem.
Mike

On Fri, 06 Aug 2004 11:31:08 -0400 Michael Glassford <glassfordm@hotmail.com> wrote:
> They don't; the mutex is to protect access to other global variables. Most of them were experimental and have been eliminated, but there's still one important one left: tls_key.
I am afraid that in the meantime I found another reason why (but I think it's not a serious one): at_thread_exit may not access the list while the handlers are running. But this is odd usage anyway, isn't it?
> I've adjusted the area where the mutex is locked to fix this problem.
Thank you, I will try this.
BTW: I am still thinking about the leakage problem. I tried to forcibly allocate/deallocate the tss_data pointer, but MFC still shows leakage. I think I know very well why. So the on_process_init/on_process_term idea was of no use in this case.
In the meantime I had another thought: do you think the problem could be solved by reference counting, say using a shared_ptr?
Roland

Roland wrote:
> On Fri, 06 Aug 2004 11:31:08 -0400 Michael Glassford <glassfordm@hotmail.com> wrote:
>> They don't; the mutex is to protect access to other global variables. Most of them were experimental and have been eliminated, but there's still one important one left: tls_key.
> I am afraid that in the meantime I found another reason why (but I think it's not a serious one): at_thread_exit may not access the list while the handlers are running. But this is odd usage anyway, isn't it?
The change I made should fix this case, too. The mutex is now unlocked whenever cleanup handlers are being run.
>> I've adjusted the area where the mutex is locked to fix this problem.
> Thank you, I will try this.
> BTW: I am still thinking about the leakage problem.
> I tried to forcibly allocate/deallocate the tss_data pointer, but MFC still shows leakage. I think I know very well why. So the on_process_init/on_process_term idea was of no use in this case.
Is it possible that MFC is running its leak check before the code releasing everything is run?
> In the meantime I had another thought: do you think the problem could be solved by reference counting, say using a shared_ptr?
Sorry, I'm not sure which problem you mean--the leak or the recursion problem (already fixed)--or how reference counting would fix it.
Mike
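As an illustration of the change described in this message, the following sketch holds a mutex only while shared bookkeeping is touched and releases it before any handler runs. It is not the Boost.Thread source: the `sketch` namespace, `registry`, `registry_mutex` and the use of std::function are assumptions made for the example, and the standard C++11 thread library stands in for Boost.Thread. The return value of 0 on success only mirrors how at_thread_exit is used in the original post.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

namespace sketch {

typedef std::function<void()> handler;

// Shared state for all threads (a stand-in for what the real mutex guards).
std::mutex registry_mutex;
std::map<std::thread::id, std::vector<handler> > registry;

// Registers a handler for the calling thread; returns 0 on success.
int at_thread_exit(handler h)
{
    std::lock_guard<std::mutex> lock(registry_mutex);
    registry[std::this_thread::get_id()].push_back(std::move(h));
    return 0;
}

// Runs the calling thread's handlers, most recently registered first.
void on_thread_exit()
{
    for (;;) {
        std::vector<handler> pending;
        {
            // Hold the lock only while touching the shared registry.
            std::lock_guard<std::mutex> lock(registry_mutex);
            auto it = registry.find(std::this_thread::get_id());
            if (it == registry.end())
                return;
            pending.swap(it->second);
            if (pending.empty()) {
                registry.erase(it);
                return;
            }
        }
        // Lock released: a handler may join another thread or register
        // further handlers without blocking on registry_mutex.
        for (auto h = pending.rbegin(); h != pending.rend(); ++h)
            (*h)();
    }
}

} // namespace sketch
```

Because the pending handlers are moved out of the registry before they run, a handler that calls at_thread_exit simply queues work for the next pass of the loop, and a handler that joins a second thread no longer prevents that thread from running its own chain.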

On Fri, 06 Aug 2004 12:06:37 -0400 Michael Glassford <glassfordm@hotmail.com> wrote:
> The change I made should fix this case, too. The mutex is now unlocked whenever cleanup handlers are being run.
>>> I've adjusted the area where the mutex is locked to fix this problem.
>> Thank you, I will try this.
You didn't check this in yet, did you? Or could it be that the anonymous CVS access is skewed in time?
> Is it possible that MFC is running its leak check before the code releasing everything is run?
I think this is the case. The leak check can be seen to be triggered by a global destructor call of a process object. So it isn't obvious when the releasing code should be run. But if we can count on the prerequisite that all threads (including the main thread) have called their on_thread_exit handlers before global dtors are run, only a single-threaded problem remains. I think this can then be tackled by reference counting. Because the leak checker of MFC has to be correct for global objects, we simply need to destroy the shared tss_data when the last destructor of tss has gone. I think this could work similarly to the reference counting you have already implemented in tss_hooks for the exit handler chain.
> Sorry, I'm not sure which problem you mean--the leak or the recursion problem (already fixed)--or how reference counting would fix it.
I meant the leakage problem.
Roland
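The reference-counting idea in this message could look roughly like the sketch below. Everything in it is hypothetical: `tss_data` and `tss` here are simplified stand-ins rather than the Boost.Thread types, `live` exists only so the effect can be observed, and a plain counter is used where a shared_ptr would serve equally well. It assumes, as the message does, that all threads have finished their exit handlers before global destructors run, so no locking is shown.

```cpp
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for the library's shared bookkeeping.
struct tss_data {
    static int live;           // number of tss_data objects currently alive
    tss_data()  { ++live; }
    ~tss_data() { --live; }
};
int tss_data::live = 0;

tss_data* shared_data = nullptr;  // created on first use
std::size_t tss_count = 0;        // number of live tss objects

// Hypothetical stand-in for a tss object owned by user code.
class tss {
public:
    tss()
    {
        // The first tss object creates the shared data.
        if (tss_count++ == 0)
            shared_data = new tss_data;
    }
    ~tss()
    {
        // The last tss object to be destroyed releases the shared data,
        // so nothing is left for a leak checker that runs afterwards.
        if (--tss_count == 0) {
            delete shared_data;
            shared_data = nullptr;
        }
    }
    tss(const tss&) = delete;
    tss& operator=(const tss&) = delete;
};

} // namespace sketch
```

Whether this helps in practice depends on the leak check running after the last tss destructor, which the sketch cannot guarantee.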

"Roland" <roland.schwarz@chello.at> wrote in message news:20040806164129.GBDQ9307.viefep19-int.chello.at@speedsnail...
> On Fri, 06 Aug 2004 12:06:37 -0400 Michael Glassford <glassfordm@hotmail.com> wrote:
>> The change I made should fix this case, too. The mutex is now unlocked whenever cleanup handlers are being run.
>>>> I've adjusted the area where the mutex is locked to fix this problem.
>>> Thank you, I will try this.
> You didn't check this in yet, did you?
Yes, I did.
> Or could it be that the anonymous CVS access is skewed in time?
Yes, unfortunately it is.
>> Is it possible that MFC is running its leak check before the code releasing everything is run?
> I think this is the case. The leak check can be seen to be triggered by a global destructor call of a process object. So it isn't obvious when the releasing code should be run. But if we can count on the prerequisite that all threads (including the main thread) have called their on_thread_exit handlers before global dtors are run, only a single-threaded problem remains. I think this can then be tackled by reference counting. Because the leak checker of MFC has to be correct for global objects, we simply need to destroy the shared tss_data when the last destructor of tss has gone. I think this could work similarly to the reference counting you have already implemented in tss_hooks for the exit handler chain.
Which I've since pretty much removed. It seemed unnecessary and perhaps unreliable. It also had the problem of freeing too often: if you created a thread that used tss, then it exited, then you created another, then it exited, etc., TlsAlloc and TlsFree would be called over and over again.
>> Sorry, I'm not sure which problem you mean--the leak or the recursion problem (already fixed)--or how reference counting would fix it.
> I meant the leakage problem.
OK.
Mike
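The "freeing too often" objection can be made concrete with a small simulation. This is a sketch under stated assumptions, not the removed tss_hooks code: the counters merely stand in for calls to TlsAlloc and TlsFree (the Win32 functions themselves are not called), the threads are simulated by loop iterations, and `slot_stats` and `run_sequential_threads` are names invented for the example.

```cpp
namespace sketch {

// Counters standing in for calls to TlsAlloc and TlsFree.
struct slot_stats {
    int allocs;
    int frees;
};

// Simulates `threads` short-lived threads that use tss one after another,
// with the slot reference-counted per thread.
slot_stats run_sequential_threads(int threads)
{
    slot_stats stats = {0, 0};
    int users = 0;  // number of threads currently using the slot
    for (int i = 0; i < threads; ++i) {
        if (users++ == 0)
            ++stats.allocs;  // the first user allocates the slot
        // ... the thread body would use the slot here ...
        if (--users == 0)
            ++stats.frees;   // the last user frees it again
    }
    return stats;
}

} // namespace sketch
```

Since each simulated thread is both the first and the last user, every one of them triggers an allocation and a free. Allocating the slot once per process would avoid that churn, at the price of deciding when the slot may finally be released.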
participants (2)
- Michael Glassford
- Roland