
Hi Boosters, I've tried several times to contact Bill Kempf about moving forward with threads, even leaving him messages on his home answering machine. I've had no reply. It's important that we make an effort to move Boost towards a uniform licensing scheme, and the threads library is an especially important one to do that for. We've been talking about significant restructuring in this library; is it likely that we'll get to a point where the original code can be thrown out? That may be the only way we'll be able to change the license. IMO the docs really *need* to be thrown out and re-done, even if the design were to stay substantially the same. Thoughts? -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams wrote:
Hi Boosters,
I've tried several times to contact Bill Kempf about moving forward with threads, even leaving him messages on his home answering machine. I've had no reply.
That's too bad.
It's important that we make an effort to move Boost towards a uniform licensing scheme, and the threads library is an especially important one to do that for. We've been talking about significant restructuring in this library; is it likely that we'll get to a point where the original code can be thrown out?
Probably the most complex part to rewrite would be the win32 condition variable implementation and perhaps the read/write mutex implementation (though Bill didn't write that in the first place, and the original author has given permission to use the Boost license for his code, so maybe it's not necessary). The mutex, thread, and tss implementations are, in most places, fairly thin (though some think not thin enough) wrappers over platform APIs. Also, I think in a rewrite it would make sense to ditch the MPTasks implementation and assume Mac OS applications will use the pthreads implementation. Unless someone wants to step forward and volunteer to help with that part.
That may be the only way we'll be able to change the license. IMO the docs really *need* to be thrown out and re-done, even if the design were to stay substantially the same.
Any particular complaints?
Thoughts?
It would be a shame to have to duplicate work due to licensing issues, but if it must be done, it makes sense to do it now (well, post-1.32.0, of course). I'd be willing to attempt it. Mike

On Monday 19 July 2004 12:00 pm, Michael Glassford wrote:
David Abrahams wrote:
Hi Boosters,
I've tried several times to contact Bill Kempf about moving forward with threads, even leaving him messages on his home answering machine. I've had no reply.
That's too bad.
Anyone for a Boost Road Trip? <g>
Also, I think in a rewrite it would make sense to ditch the MPTasks implementation and assume Mac OS applications will use the pthreads implementation. Unless someone wants to step forward and volunteer to help with that part.
Is the author of the MPTasks implementation (Mac Murrett) unavailable?
It would be a shame to have to duplicate work due to licensing issues, but if it must be done, it makes sense to do it now (well, post-1.32.0, of course). I'd be willing to attempt it.
This is _very_ unfortunate. Doug

Doug Gregor wrote:
On Monday 19 July 2004 12:00 pm, Michael Glassford wrote:
David Abrahams wrote:
Hi Boosters,
I've tried several times to contact Bill Kempf about moving forward with threads, even leaving him messages on his home answering machine. I've had no reply.
That's too bad.
Anyone for a Boost Road Trip? <g>
Does that mean you're volunteering? <g>
Also, I think in a rewrite it would make sense to ditch the MPTasks implementation and assume Mac OS applications will use the pthreads implementation. Unless someone wants to step forward and volunteer to help with that part.
Is the author of the MPTasks implementation (Mac Murrett) unavailable?
Don't know. All I know is that I once asked for someone to compile a fairly simple change I had made or if anyone was even using the MPTasks implementation, and got zero responses. I still don't know for certain if the change even compiles or not. That doesn't look promising if more involved help is required.
It would be a shame to have to duplicate work due to licensing issues, but if it must be done, it makes sense to do it now (well, post-1.32.0, of course). I'd be willing to attempt it.
This is _very_ unfortunate.
That I'm willing to attempt it, or that it might be necessary? To make sure I was perfectly clear, I'd really very much rather not. It would be _much_ more productive to re-use what already exists even if a complete redesign were made. I meant only that I'm willing if it's absolutely necessary. I truly hope it's not. Mike

On Monday 19 July 2004 1:13 pm, Michael Glassford wrote:
Doug Gregor wrote:
Is the author of the MPTasks implementation (Mac Murrett) unavailable?
Don't know. All I know is that I once asked for someone to compile a fairly simple change I had made or if anyone was even using the MPTasks implementation, and got zero responses. I still don't know for certain if the change even compiles or not. That doesn't look promising if more involved help is required.
We should probably chase him down to see if we can get the license changed, even if we end up dropping support for it.
It would be a shame to have to duplicate work due to licensing issues, but if it must be done, it makes sense to do it now (well, post-1.32.0, of course). I'd be willing to attempt it.
This is _very_ unfortunate.
That I'm willing to attempt it, or that it might be necessary?
That it might be necessary. Doug

On Monday 19 July 2004 12:00 pm, Michael Glassford wrote:
David Abrahams wrote:
Hi Boosters,
I've tried several times to contact Bill Kempf about moving forward with threads, even leaving him messages on his home answering machine. I've had no reply.
That's too bad.
Anyone for a Boost Road Trip? <g>
Also, I think in a rewrite it would make sense to ditch the MPTasks implementation and assume Mac OS applications will use the pthreads implementation. Unless someone wants to step forward and volunteer to help with that part.
Is the author of the MPTasks implementation (Mac Murrett) unavailable?
Mac is around. I will probably see him this week - I can ask him if that's what people want. -- -- Marshall Marshall Clow Idio Software <mailto:marshall@idio.com> It is by caffeine alone I set my mind in motion. It is by the beans of Java that thoughts acquire speed, the hands acquire shaking, the shaking becomes a warning. It is by caffeine alone I set my mind in motion.

On Jul 19, 2004, at 8:03 PM, Marshall Clow wrote:
Is the author of the MPTasks implementation (Mac Murrett) unavailable?
Mac is around. I will probably see him this week - I can ask him if that's what people want. -- -- Marshall
FYI, we've been in touch with Mac Murrett. He's given his permission to move the MPTasks implementation over to the Boost Software License and is available to help out or answer questions. Doug

My first attempt at a reply appears to have disappeared into the ether. Here I go again: Doug Gregor wrote:
On Jul 19, 2004, at 8:03 PM, Marshall Clow wrote:
Is the author of the MPTasks implementation (Mac Murrett) unavailable?
Mac is around. I will probably see him this week - I can ask him if that's what people want. -- -- Marshall
FYI, we've been in touch with Mac Murrett. He's given his permission to move the MPTasks implementation over to the Boost Software License and is available to help out or answer questions.
Thanks (to Doug and to Mac)! I'll make the necessary changes, but I have a question: when updating the copyright statement, should I leave the date as it was (2001) or update it to the current date? Mike

Michael Glassford wrote:
I'll make the necessary changes, but I have a question: when updating the copyright statement, should I leave the date as it was (2001) or update it to the current date?
As it was. You can't change the date of someone else's copyright. (IANAL)

In article <p0620021fbd221d0600d5@[192.168.16.235]>, Marshall Clow <marshall@idio.com> wrote:
On Monday 19 July 2004 12:00 pm, Michael Glassford wrote:
David Abrahams wrote:
Hi Boosters,
I've tried several times to contact Bill Kempf about moving forward with threads, even leaving him messages on his home answering machine. I've had no reply.
That's too bad.
Anyone for a Boost Road Trip? <g>
Also, I think in a rewrite it would make sense to ditch the MPTasks implementation and assume Mac OS applications will use the pthreads implementation. Unless someone wants to step forward and volunteer to help with that part.
Is the author of the MPTasks implementation (Mac Murrett) unavailable?
Mac is around. I will probably see him this week - I can ask him if that's what people want.
MPTasks are layered on top of pthreads, so it should be fine to replace an MPTasks implementation of boost::threads with a pthreads one without hurting anyone, except for clients of boost::threads that depend on the implementation using MPTasks -- but I don't think that assumption is supported. By the way, does the boost::threads API assume preemptive scheduling, or would it be possible to use boost::threads as an abstraction over cooperative threads? meeroh -- If this message helped you, consider buying an item from my wish list: <http://web.meeroh.org/wishlist>

I'm signed up to the list again! Thanks to Bill and Marshall for the friendly nudges. On Jul 19, 2004, at 9:18 PM, Miro Jurisic wrote:
MPTasks are layered on top of pthreads, so it should be fine to replace an MPTasks implementation of boost::threads with a pthreads one without hurting anyone, except for clients of boost::threads that depend on the implementation using MPTasks -- but I don't think that assumption is supported.
The MP task implementation is really only interesting for Carbon and Mac OS 9-only targets; the pthreads implementation is more efficient and probably safer under Mac OS X/Mach-O.
By the way, does the boost::threads API assume preemptive scheduling, or would it be possible to use boost::threads as an abstraction over cooperative threads?
It was always Bill's thought that the API was portable to cooperative threads, but I do not believe that this was ever implemented or tested. Certainly many threaded solutions will not work without preemption, regardless of the thread library they are built against. M.

Mac Murrett <mmurrett@mac.com> writes:
I'm signed up to the list again! Thanks to Bill and Marshall for the friendly nudges.
You got a nudge from Bill Kempf?? If so, would you mind conveying some messages to him? He won't respond to mine, and we need him to formally transfer maintainership of Boost.Threads and if possible give his blanket permissions to update the license on his files. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

On Jul 20, 2004, at 4:07 AM, David Abrahams wrote:
Mac Murrett <mmurrett@mac.com> writes:
I'm signed up to the list again! Thanks to Bill and Marshall for the friendly nudges.
You got a nudge from Bill Kempf??
My apologies, that should have read "Jon and Marshall", referring to Jon Kalb. I guess we both have Bill on the mind. M.

Michael Glassford <glassfordm@hotmail.com> writes:
David Abrahams wrote:
Hi Boosters, I've tried several times to contact Bill Kempf about moving forward with threads, even leaving him messages on his home answering machine. I've had no reply.
That's too bad.
It's important that we make an effort to move Boost towards a uniform licensing scheme, and the threads library is an especially important one to do that for. We've been talking about significant restructuring in this library; is it likely that we'll get to a point where the original code can be thrown out?
Probably the most complex part to rewrite would be the win32 condition variable implementation and perhaps the read/write mutex implementation (though Bill didn't write that in the first place, and the original author has given permission to use the Boost license for his code, so maybe it's not necessary).
Sounds like it's not.
The mutex, thread, and tss implementations are, in most places, fairly thin (though some think not thin enough) wrappers over platform APIs.
Also, I think in a rewrite it would make sense to ditch the MPTasks implementation and assume Mac OS applications will use the pthreads implementation. Unless someone wants to step forward and volunteer to help with that part.
That may be the only way we'll be able to change the license. IMO the docs really *need* to be thrown out and re-done, even if the design were to stay substantially the same.
Any particular complaints?
Too many to write down ;-)
Thoughts?
It would be a shame to have to duplicate work due to licensing issues, but if it must be done, it makes sense to do it now (well, post-1.32.0, of course). I'd be willing to attempt it.
Thanks for that! -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

It is good to hear that there are now serious attempts to move forward with the threads library. If you need a volunteer, I would like to offer my help. I am used to the Win32 API and have a special interest in working on the TSS part (especially the annoying DLL issue). I have already had some contact with Michael Glassford, who seems to me to be very careful when it comes to design considerations. However, it would be too bad if we needed a _complete_ rewrite of the threading library. Could anyone please drop me a few lines on why this should be necessary at all? Is the new license incompatible with the old one? Wasn't there already an established procedure for how to move forward in the case of a contributor who has dropped out? As far as I understand the procedures, it should be sufficient to have the code undergo a new review. If the old license does not explicitly forbid this, we can reuse the proven good parts of the code (and there is a lot of it!) in the renewed threads lib. Any thoughts? Roland

Roland wrote:
However it would be too bad if we needed a _complete_ rewrite of the threading library. Could anyone please drop me some lines why this should be necessary at all?
A complete re-write should not be necessary, but the current model of associating a single method with a thread could be implemented as a specialization of a more object-oriented approach of modelling a thread as a class instance. Another advantage of modelling a thread as instance of a class rather than just a function is that you may elect to implement thread-specific variables as thread class members. This works nicely if you enforce a design rule of a single thread per instance of the Thread class. Some of the things you must do in POSIX style C thread programming, or in the Windows API for thread programming just aren't necessary when using C++ in an object-oriented fashion. C++ provides some natural protections when using threads. For example, POSIX threads must be join'ed to reap the status of the thread and release the thread stack and other resources associated with the thread. C++ destructors come to the rescue here. Another issue to address is the different abstractions that the different operating systems use. For example, Windows allows a thread to be created in a suspended state -- but emulating suspended initial state is fairly easy to implement in POSIX threads. I wouldn't, though, want to saddle non-Windows programmers with an emulation they don't need nor desire. This suggests a low-level Thread class that is almost a lowest common denominator on all platforms, with child classes that model the extensions of the various platforms. The extensions are implemented natively on the platforms that offer it, and emulated on those that don't. I'd like to see the synchronization objects separated into their own portion of the threads library because some synchronization methods require non-C++ standard implementations -- the volatile keyword isn't the same thing as a LOCK prefix or direct assembly language access to CAS instructions. Work in this area should be directed at suggested changes in the C++ standard to address multiprocessor problems. 
I mention these issues because I don't think the original Boost threads library fully considered them. I'd like to see these issues and others discussed before we jump in and implement. Glen
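The point about destructors reclaiming a thread's resources can be sketched as follows. This is only an illustration, not Boost.Threads code: it assumes std::thread from the later C++ standard library in place of a native handle, and the names JoiningThread and run_joining_demo are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical wrapper: owns one thread and joins it on destruction,
// so the thread's resources are always reclaimed.
class JoiningThread {
public:
    template <typename F>
    explicit JoiningThread(F f) : thread_(std::move(f)) {}

    JoiningThread(const JoiningThread&) = delete;
    JoiningThread& operator=(const JoiningThread&) = delete;

    ~JoiningThread() {
        if (thread_.joinable()) {
            thread_.join();  // blocks until the thread function returns
        }
        assert(!thread_.joinable());
    }

private:
    std::thread thread_;
};

// Returns the value written by the worker. The value is read only after
// the wrapper's destructor has joined the thread.
int run_joining_demo() {
    std::atomic<int> result{0};
    {
        JoiningThread worker([&result] { result.store(42); });
    }  // destructor joins here
    return result.load();
}
```

Note that the destructor only guarantees the join happens; as the replies below point out, the thread's status or result still has to be communicated by some other means.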

Dayton wrote:
Roland wrote:
However it would be too bad if we needed a _complete_ rewrite of the threading library. Could anyone please drop me some lines why this should be necessary at all?
[following paragraph re-ordered from end]
I mention these issues because I don't think the original Boost threads library fully considered them. I'd like to see these issues and others discussed before we jump in and implement.
Actually, I think a good many of the issues you raise below were considered and the current design was consciously chosen, though not necessarily completely followed through on because the implementation was never completed. For instance:
A complete re-write should not be necessary,
As indicated in other posts, the re-write may be necessary not for technical but for licensing reasons.
but the current model of associating a single method with a thread could be implemented as a specialization of a more object-oriented approach of modelling a thread as a class instance.
There are disadvantages to that approach, which have been detailed elsewhere (I believe in other postings in this mailing list, in comp.programming.threads, etc.). One disadvantage, for example, is that you have to do a lot of extra work to create a new type of worker thread, while in the Boost.Threads approach there's only one type of thread class and you do a different type of work simply by writing a new thread function. I've implemented threads the way you suggest before (more than once) and prefer the Boost.Threads approach, as I'm sure many (though not all) others do.
Another advantage of modelling a thread as instance of a class rather than just a function is that you may elect to implement thread-specific variables as thread class members. This works nicely if you enforce a design rule of a single thread per instance of the Thread class. Some of the things you must do in POSIX style C thread programming, or in the Windows API for thread programming just aren't necessary when using C++ in an object-oriented fashion.
True, you could implement thread-specific variables as thread class members, but why would you want to? One immediately obvious disadvantage is that you would then need to modify the thread class whenever you needed to add, change, or remove a thread-specific global variable.
C++ provides some natural protections when using threads. For example, POSIX threads must be join'ed to reap the status of the thread and release the thread stack and other resources associated with the thread. C++ destructors come to the rescue here.
Certainly destructors can help with releasing resources, but you would still need join or some other communication mechanism to reap the status and/or result of the thread. If you prefer another communication mechanism, you can use one by making the function you pass to the thread object call the mechanism before returning and making the creating thread not call join.
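A minimal sketch of that alternative, where the thread function hands over its result through a separate mechanism and the creator never calls join. It assumes std::promise and std::future from the later C++ standard library as the mechanism (they are not part of Boost.Threads as discussed here); worker and run_without_join are hypothetical names.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// The worker publishes its result through the promise just before
// returning, so the creator does not need join() to obtain it.
void worker(std::promise<int> result) {
    result.set_value(6 * 7);
}

int run_without_join() {
    std::promise<int> promise;
    std::future<int> future = promise.get_future();
    assert(future.valid());

    std::thread t(worker, std::move(promise));
    t.detach();  // the creating thread never calls join()

    return future.get();  // blocks until the worker has set the value
}
```

The trade-off is that a detached thread can no longer be waited on directly; the future only says that the value was set, not that the thread has finished exiting.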
Another issue to address is the different abstractions that the different operating systems use. For example, Windows allows a thread to be created in a suspended state -- but emulating suspended initial state is fairly easy to implement in POSIX threads. I wouldn't, though, want to saddle non-Windows programmers with an emulation they don't need nor desire. This suggests a low-level Thread class that is almost a lowest common denominator on all platforms, with child classes that model the extensions of the various platforms. The extensions are implemented natively on the platforms that offer it, and emulated on those that don't.
I'd like to see the synchronization objects separated into their own portion of the threads library because some synchronization methods require non-C++ standard implementations -- the volatile keyword isn't the same thing as a LOCK prefix or direct assembly language access to CAS instructions. Work in this area should be directed at suggested changes in the C++ standard to address multiprocessor problems.
Mike

"Michael Glassford" <glassfordm@hotmail.com> wrote in message news:ce68e5$48s$1@sea.gmane.org...
Dayton wrote:
Roland wrote:
but the current model of associating a single method with a thread could be implemented as a specialization of a more object-oriented approach of modelling a thread as a class instance.
There are disadvantages to that approach, which have been detailed elsewhere (I believe in other postings in this mailing list, in comp.programming.threads, etc.). One disadvantage, for example, is that you have to do a lot of extra work to create a new type of worker thread, while in the Boost.Threads approach there's only one type of thread class and you do a different type of work simply by writing a new thread function.
Which can be a stateful function object. I'm not sure if the OP is under the impression that only functions can be passed. Jeff F
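To illustrate the point about stateful function objects, here is a small sketch. It assumes std::thread as a stand-in for boost::thread (both accept any callable taking no arguments); SumRange and sum_in_thread are hypothetical names.

```cpp
#include <cassert>
#include <thread>

// A function object carrying its own state: the range to sum and
// where to put the result.
struct SumRange {
    int first;
    int last;
    long* out;

    void operator()() const {
        long total = 0;
        for (int i = first; i <= last; ++i) {
            total += i;
        }
        *out = total;
    }
};

long sum_in_thread(int first, int last) {
    assert(first <= last);
    long result = 0;
    SumRange job{first, last, &result};
    std::thread t(job);  // the thread gets its own copy of the function object
    t.join();            // after join, the write to result is visible here
    return result;
}
```

A different kind of work needs only a different function object; the thread class itself stays unchanged.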

Michael Glassford wrote:
True, you could implement thread-specific variables as thread class members, but why would you want to? One immediately obvious disadvantage is that you would then need to modify the thread class whenever you needed to add, change, or remove a thread-specific global variable.
Maybe that's not such a bad idea, assuming that the thread class is a template and the thread variables are members of its policy class? B.

Dayton wrote:
Another advantage of modelling a thread as instance of a class rather than just a function is that you may elect to implement thread-specific variables as thread class members. This works nicely if you enforce a design rule of a single thread per instance of the Thread class.
The main use of thread-specific storage is to make a library that keeps global state thread-neutral, giving every thread its own state (thread-specific free lists in a pool allocator are a good example), or to decentralize the thread-specific state in several separate modules. You can execute a Thread class in a boost::thread by using
    boost::thread th( boost::bind( &Thread::run, &my_thread_object ) );
but, of course, there are the lifetime issues to consider, which your proposal does not address, unless you want to enforce join() in the destructor. There is also
    boost::shared_ptr<Thread> my_thread_ptr( new MyThread( ... ) );
    boost::thread th( boost::bind( &Thread::run, my_thread_ptr ) );
which would automatically keep *my_thread_ptr alive until the thread exits _and_ you release my_thread_ptr.
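The shared_ptr variant can be tried out with the following self-contained sketch. It swaps in the standard-library equivalents (std::thread, std::bind, std::shared_ptr) for the Boost names used above, and the Thread class and run_bound_member function are hypothetical stand-ins.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <thread>

// Hypothetical worker class with a run() member, as in the discussion.
class Thread {
public:
    explicit Thread(int input) : input_(input), output_(0) {}
    void run() { output_ = input_ * 2; }
    int output() const { return output_; }

private:
    int input_;
    int output_;
};

int run_bound_member(int input) {
    std::shared_ptr<Thread> my_thread_ptr = std::make_shared<Thread>(input);

    // The bound function object stores a copy of the shared_ptr, so the
    // Thread object stays alive at least until the thread function ends,
    // even if the creator drops its own reference first.
    std::thread th(std::bind(&Thread::run, my_thread_ptr));
    th.join();

    assert(my_thread_ptr != nullptr);
    return my_thread_ptr->output();
}
```

With the raw-pointer form (&my_thread_object) nothing extends the object's lifetime, which is the lifetime issue mentioned above.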

Peter Dimov wrote: [...]
boost::thread th( boost::bind( &Thread::run, &my_thread_object ) );
but, of course, there are the lifetime issues to consider, which your proposal does not address, unless you want to enforce join() in the destructor.
Nah. In the *pre*destructor. ;-) Well, the ability to subclass "real" thread-objects (without any virtual "run" javanese) is useful but not critical. Nice to have, so to say. regards, alexander.

On Tue, 27 Jul 2004 01:04:32 -0700 Dayton <mvglen04-cnews@yahoo.com> wrote:
Another advantage of modelling a thread as instance of a class rather than just a function is that you may elect to implement thread-specific variables as thread class members.
I don't think so. The purpose of TSS variables is to be global to your namespace. Member variables are not. To access them you would need some pointer, possibly declared globally. To access this pointer in turn one again needs to resort to TSS. Roland

Roland wrote:
I don't think so. The purpose of TSS variables is to be global to your namespace. Member variables are not. To access them you would need some pointer, possibly declared globally. To access this pointer in turn one again needs to resort to TSS.
Possibly, but not necessarily. In C++ we tend to replace global variables with singletons. In the case of a thread object we could have a class with a single instance per thread ("thread singleton") and a static member function returning the instance for the current thread ("thread accessor"). This instance does not need to be stored in thread-local storage. It can be stored in a global table (the simplest option would be to use thread IDs as the table index), and thus still be available after thread destruction (e.g. for deterministic destruction and pulling results). B.

Bronek Kozicki wrote:
Roland wrote:
I don't think so. The purpose of TSS variables is to be global to your namespace. Member variables are not. To access them you would need some pointer, possibly declared globally. To access this pointer in turn one again needs to resort to TSS.
Possibly, but not necessarily. In C++ we tend to replace global variables with singletons. In the case of a thread object we could have a class with a single instance per thread ("thread singleton") and a static member function returning the instance for the current thread ("thread accessor"). This instance does not need to be stored in thread-local storage. It can be stored in a global table (the simplest option would be to use thread IDs as the table index), and thus still be available after thread destruction (e.g. for deterministic destruction and pulling results).
It looks like your global table indexed by thread ID is just a reinvention of TSS, which is guaranteed to be slower. And, BTW, you can't have a simple vector indexed by thread id -- because in NPTL thread ids do not start with zero. So you'd need std::map, which might be really slow, or some custom map, which doesn't exist yet. - Volodya

Vladimir Prus wrote:
It looks like your global table indexed by thread ID is just reinvention of TSS, which is guaranteed to be slower.
And, BTW, you can't have a simple vector indexed by thread id -- because in NPTL thread ids do not start with zero. So you'd need std::map, which might be really slow, or some custom map, which doesn't exist yet.
I'd rather waste something like 64KB of memory and make it a simple and fast table of pointers. B.

Bronek Kozicki wrote:
Vladimir Prus wrote:
It looks like your global table indexed by thread ID is just reinvention of TSS, which is guaranteed to be slower.
And, BTW, you can't have a simple vector indexed by thread id -- because in NPTL thread ids do not start with zero. So you'd need std::map, which might be really slow, or some custom map, which doesn't exist yet.
I'd rather waste something like 64KB of memory and make it a simple and fast table of pointers.
64KB? We had code which worked by creating such a table, and after an upgrade to NPTL it started throwing "out of memory" errors when resizing this table -- on a 2GB box. The thread ids can be very large values with NPTL, so you *need* either a map or some hashing. - Volodya
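The "thread accessor" idea combined with the hashing requirement could look roughly like the sketch below. It is an illustration under assumptions, not a proposal from the thread: it uses std::thread::id and std::unordered_map from the later C++ standard library, a single mutex for simplicity, and hypothetical names (ThreadState, ThreadTable, run_table_demo).

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical per-thread state.
struct ThreadState {
    int counter = 0;
};

// Global-style table keyed by thread id. Thread ids are opaque and may be
// large values, so a hash map is used instead of a vector index.
class ThreadTable {
public:
    // "Thread accessor": returns the calling thread's instance,
    // creating it on first use.
    ThreadState& current() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::unique_ptr<ThreadState>& slot = table_[std::this_thread::get_id()];
        if (!slot) {
            slot.reset(new ThreadState);
        }
        return *slot;  // heap object, so rehashing does not move it
    }

    // Entries outlive their threads, so results can be collected later.
    int total() {
        std::lock_guard<std::mutex> lock(mutex_);
        int sum = 0;
        for (const auto& entry : table_) {
            sum += entry.second->counter;
        }
        return sum;
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::thread::id, std::unique_ptr<ThreadState>> table_;
};

int run_table_demo(int thread_count, int increments) {
    assert(thread_count >= 0 && increments >= 0);
    ThreadTable table;
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&table, increments] {
            for (int n = 0; n < increments; ++n) {
                ++table.current().counter;
            }
        });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return table.total();
}
```

Every access takes a lock and a hash lookup, which is the cost the "reinvention of TSS" objection refers to. A thread id may also be reused once its thread has ended, so entries kept after thread exit can be picked up by a later thread.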

Vladimir Prus wrote:
64KB? We had code which worked by creating such a table, and after an upgrade to NPTL it started throwing "out of memory" errors when resizing this table
NPTL == Native Posix Thread Library? I think that we are looking for a way to avoid the shortcomings of native Win32 thread local storage, not the Posix one. Maybe NPTL does not need such workarounds? B.

Bronek Kozicki wrote:
Vladimir Prus wrote:
64KB? We had code which worked by creating such a table, and after an upgrade to NPTL it started throwing "out of memory" errors when resizing this table
NPTL == Native Posix Thread Library?
Yes.
I think that we are looking for a way to avoid the shortcomings of native Win32 thread local storage, not the Posix one. Maybe NPTL does not need such workarounds?
Probably I misunderstood your previous posts, but it sounded like you were saying that TSS is not needed at all, and should be replaced with singletons and a hand-crafted thread-id -> instance mapping. I believe that on Linux, using the provided TSS facilities is better, and so I need thread_specific_ptr. And to write portable code, that class should be present everywhere. Now, do you propose that the thread_specific_ptr implementation be changed to use a hand-crafted mapping? How will that help with cleanup issues? - Volodya

I think that we are looking for a way to avoid the shortcomings of native Win32 thread local storage, not the Posix one. Maybe NPTL does not need such workarounds?
Probably I misunderstood your previous posts, but it sounded like you say that TSS is not needed at all, and should be replaced with singletons and hand-crafted thread-id -> instance mapping. I believe that on Linux, using provided TSS facilities is better, and so I need thread_specific_ptr. And to write portable code, that class should be present everywhere.
I agree, and it should be able to clean up non-boost threads as well - consider what happens if you're writing a library, and not a program - you then have no control over who creates threads or how they do so. I believe it would be unacceptable to say "you can only use this library if you also use Boost.Threads"; in such cases thread_specific_ptr is exceptionally useful IMO. John.
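The library scenario can be sketched like this: a library function keeps per-thread state and relies on the runtime to destroy it when each calling thread exits. The sketch assumes the later standard keyword thread_local as a stand-in for thread_specific_ptr, and std::thread only as a convenient way to start threads; on typical implementations the cleanup is tied to thread exit rather than to the API that started the thread. PerThreadCache, library_call, and run_cleanup_demo are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Counts how many per-thread objects have been destroyed.
std::atomic<int> g_cleanups{0};

// Hypothetical per-thread state; its destructor stands in for the
// TSS cleanup handler.
struct PerThreadCache {
    int hits = 0;
    ~PerThreadCache() { ++g_cleanups; }
};

// Library entry point: it does not know how the calling thread was created.
int library_call() {
    thread_local PerThreadCache cache;  // constructed on first call in each thread
    return ++cache.hits;
}

// Runs two threads through the library and reports how many per-thread
// objects were cleaned up by the time both threads have been joined.
int run_cleanup_demo() {
    int before = g_cleanups.load();
    std::thread a([] { library_call(); library_call(); });
    std::thread b([] { library_call(); });
    assert(a.joinable() && b.joinable());
    a.join();
    b.join();
    return g_cleanups.load() - before;
}
```

Each thread that called into the library gets exactly one cleanup, without the library having created or joined the thread itself.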

On Wed, 28 Jul 2004 13:25:21 +0100 John Maddock <john@johnmaddock.co.uk> wrote:
I agree, and it should be able to clean up non-boost threads as well - consider what happens if you're writing a library, and not a program - you then have no control over who creates threads or how they do so. I believe it would be unacceptable to say "you can only use this library if you also use Boost.Threads"; in such cases thread_specific_ptr is exceptionally useful IMO.
While I have heard this argument quite often, I still do not understand how this can ever work (in a reliable way). Say you have a piece of code located in module A that starts a thread by whatever means (a system thread, e.g.). Now in your module B you have a (global) TSS variable. A calls into B. When should the TSS be initialized? On first use? I am afraid that is not trivially possible, since (as far as I understand, on Windows) you will need to acquire a slot for this variable when the process starts up (normally in DLL_PROCESS_ATTACH). You might argue that you can have a global object in B whose constructor can handle this. OK, this would work. You can acquire the slot during initialization, and can allocate your data on the heap on first use. _But_ what about c-runtime initialisation? Perhaps I am a little bit too paranoid about this, but the per-thread data holds items like
    unsigned long _tdoserrno;  /* _doserrno value */
    unsigned int  _fpds;       /* Floating Point data segment */
    unsigned long _holdrand;   /* rand() seed value */
    char *        _token;      /* ptr to strtok() token */
to cite only a few, that are clearly related to a couple of runtime lib functions. How can you ever be sure the c-runtime will behave correctly when these are not available/initialized for your thread (note that you never called _beginthread!)? So what I think is: one should not assume that your piece of code (located in B) is thread safe when it makes use of the c-runtime AND you do not know whether the original caller (located in A) started the thread with _beginthread. You should not even try to use the runtime lib in that case. A TSS implementation thus should "require" a certain (minimum) method for starting threads. I propose a method that has correct runtime initialization at minimum. When this requirement is not fulfilled, your module cannot be said to be "truly" thread safe. Note: if the TSS implementation AND the objects contained in a thread slot are independent of the c-runtime, the above discussion does not apply. But practically I think this is a much too severe restriction! View it another way: the TSS cleanup code has to consider any resources it is using at cleanup time, the c-runtime struct _tiddata being one of the important ones. Roland

Roland wrote:
Now in your module B you have a (global) TSS variable. A calls into B. When should the TSS be initialized? On first use? I am afraid that is not trivially
You do not need to worry about the CRT's thread-local data, at least if you use MSVC. Here is the explanation: your library B contains the function DllMain, and it is called just when the library gets loaded, right? Wrong. If you linked with the CRT (no matter which one: statically or dynamically linked, multi- or single-threaded, with debug symbols or without), your entry point is actually _DllMainCRTStartup (use dumpbin.exe to see it). You can find this function in the CRT sources (if you installed them together with MSVC). In the multi-threaded, statically linked version of the CRT this function will indirectly call TlsAlloc (or actually FlsAlloc, if present in "kernel32.dll", at least in the CRT delivered with MSVC71).
With the index for the CRT's local storage initialized, what remains is the CRT data for each thread. You are concerned that a thread started with CreateThread (or any API other than the "blessed" _beginthread) might not have valid CRT data in its local storage. That's true, but the CRT will receive DLL_THREAD_ATTACH notifications. They arrive at _DllMainCRTStartup if you linked the CRT statically, otherwise at the dynamic CRT library (BTW, its entry point is called _CRTDLL_INIT).
There is still the possibility that you disabled thread notifications in your library *and* linked the CRT statically. That might seem to be a problem, but if you grep for the _getptd function (again, it's in the CRT sources in your MSVC directory) you will see that there's no problem at all. This function is the only way other CRT functions access thread-local storage. It will create and initialize the CRT data if the current thread does not have it yet. Thus all CRT functions requiring access to thread-local data will get it without problem: it will be created no later than on first access.
Now about release. If you used the dynamically linked CRT library, that's no problem at all. It will receive DLL_THREAD_DETACH and execute _freeptd in order to free the CRT thread data. If you used the statically linked CRT, this notification will arrive at _DllMainCRTStartup (which will execute _freeptd). There is a problem, however, if you disable thread notifications in your library *and* use the statically linked CRT. This will result in a memory leak in the CRT, documented here: http://msdn.microsoft.com/library/en-us/dllproc/base/createthread.asp (see Remarks, bottom of page).
When your library is unloaded from memory, TlsFree (or FlsFree) is called indirectly in the entry point of your library, i.e. _DllMainCRTStartup (if you used the statically linked CRT). In the case of the dynamic CRT library, it calls TlsAlloc and TlsFree (or FlsAlloc and FlsFree) in its own entry point on DLL_PROCESS_ATTACH and DLL_PROCESS_DETACH; you can find it in crtlib.c in your MSVC directory. B.

On Thu, 29 Jul 2004 00:24:33 +0200 Bronek Kozicki <brok@rubikon.pl> wrote:
You do not need to worry about local thread data for CRT, at least if you use MSVC. Here is explanation: your library B contains function DllMain, and it's called just when library got loaded, right? Wrong.
Wrong. I assumed nothing about how B is packaged. To be more precise, I assume a static lib (no DLL), and I am linking statically to the C runtime.
you linked with CRT (no matter which one - statically or dynamically linked, multi- or single-threaded, with debug symbols or without), your entry point is actually _DllMainCRTStartup
Wrong. The CRT exposes mainCRTStartup (which is also the entry point specified in the PE header for the exe). If, and only if, you are mapping additional DLLs into your process, their entry point functions (usually specified by the linker as DllMainCRTStartup) are called (process/thread attach/detach). mainCRTStartup initializes the lib, and calls your main in turn.
You may find this function in CRT sources (if you installed it together with MSVC). In multi-threaded, statically linked version of CRT this function will indirectly call TlsAlloc (or actually FlsAlloc - if present in "kernel32.dll", at least in CRT delivered with MSVC71).
I don't believe so. TlsAlloc is called from within mainCRTStartup for the main thread. (Not from DllMainCRTStartup.)
.... You are concerned that thread started with CreateThread (or whatever other API different than "blessed" _beginthread) might not contain valid CRT data in its local storage. It's true, but CRT runtime will receive DLL_THREAD_ATTACH notifications. It will arrive to _DllMainCRTStartup if you linked CRT statically,
How should this ever work? When you link the CRT statically there is no chance of even exposing DllMainCRTStartup! You may only assign _one_ entry point to your executable, and this is already mainCRTStartup. I have already experimented with this, because the documentation is vague about when the entry point of an exe gets called. It says something about it being called on thread attach/detach as well. But nope, it really is not. The main problem is this: when an external module A calls into you, you cannot be sure your C runtime is set up for the calling thread, unless that thread was created with the CRT's _beginthread. Give it a try. Set a breakpoint on your entry function while statically linked. I bet you will never arrive there :-( Roland

Roland wrote:
Wrong. I assumed nothing about how B is packaged. To be more precise, I assume a static lib (no DLL), and I am linking statically to the C runtime.
I misunderstood you, then. I thought that you meant B to be dynamic link library.
mainCRTStartup initializes the lib, and calls your main in turn.
Correct.
You may find this function in CRT sources (if you installed it together with MSVC). In multi-threaded, statically linked version of CRT this function will indirectly call TlsAlloc (or actually FlsAlloc - if present in "kernel32.dll", at least in CRT delivered with MSVC71).
I don't believe so. TlsAlloc is called from within mainCRTStartup for the main thread. (Not from DllMainCRTStartup.)
_DllMainCRTStartup is in crtdll.c (when your DLL library is dynamically linked to CRT) or dllcrt0.c (when your DLL library is statically linked to CRT). In both files you will find _CRT_INIT, called from _DllMainCRTStartup just before user entry point (DllMain) is executed. In _CRT_INIT in dllcrt0.c you will find call to _mtinit, which is defined in tidtable.c. Here you will find "__tlsindex = FLS_ALLOC(&_freefls)". But I think that's all irrelevant in your case, as your library B (being static library) does not have an entry point.
How should this ever work? When you link the CRT statically there is no chance of even exposing DllMainCRTStartup! You may only assign _one_ entrypoint into your executable, and this is already mainCRTStartup.
Of course this will work, if you linked the CRT statically into your *dynamic* library. Things are much different with static libraries, as you pointed out. In this case the linker will see the symbol "__tlsindex" in the static runtime library that your static library is linked with (i.e. indirectly in your library), and when linking your library with the code of your client (i.e. the user of your static library) it will match it with __tlsindex in his CRT (whether he's using the static or the dynamic runtime). When his mainCRTStartup (or DllMainCRTStartup) is executed, this symbol will be initialized by a call to TlsAlloc (i.e. inside function _mtinit). This is enough to initialize the TLS index for the CRT. Because this index is shared (the linker did this) with the static CRT your static library is linked with, it will effectively share TLS with the CRT linked (statically or dynamically) with the code of your client. At least this is my idea of what the linker is doing, and I've confirmed that with a debugger. And this is enough; execution of TlsSetValue and initialization of CRT data is delayed until the first access of CRT TLS data through the _getptd function (defined in tidtable.c).
Give it a try. Set a breakpoint to your entry function while statically linked. I bet you will never arrive there :-(
... and it's not required by CRT to initialize its stuff in TLS. B.
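The lazy scheme described in this message (one slot allocated at startup, per-thread data created on first access) can be sketched with POSIX thread-specific data as a portable analogue. This is not the CRT's code: per_thread_data, get_per_thread_data and the helper names are hypothetical, and the pthread calls merely stand in for TlsAlloc, TlsGetValue and TlsSetValue.

```cpp
#include <pthread.h>
#include <cstdlib>

// Hypothetical stand-in for the CRT's per-thread block (_tiddata).
struct per_thread_data {
    unsigned long holdrand = 1;  // rand() seed value
    char* token = nullptr;       // strtok() position
};

namespace detail {

pthread_key_t slot;

// Called by the pthreads implementation when a thread with data exits.
void free_data(void* p) { delete static_cast<per_thread_data*>(p); }

// Allocates the slot once, before main() runs; plays the role that
// _mtinit calling TlsAlloc plays in the discussion.
struct slot_initializer {
    slot_initializer() {
        if (pthread_key_create(&slot, free_data) != 0) std::abort();
    }
};
const slot_initializer init_slot;

}  // namespace detail

// Analogue of _getptd: returns the calling thread's data and creates it
// on first access, no matter which API started the thread.
per_thread_data* get_per_thread_data() {
    void* p = pthread_getspecific(detail::slot);
    if (p == nullptr) {
        p = new per_thread_data;
        if (pthread_setspecific(detail::slot, p) != 0) std::abort();
    }
    return static_cast<per_thread_data*>(p);
}
```

A thread started with a bare pthread_create gets its own block the first time it calls get_per_thread_data(), which mirrors the claim that a CreateThread thread needs no explicit CRT setup.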

On Thu, 29 Jul 2004 14:02:18 +0200 Bronek Kozicki <brok@rubikon.pl> wrote:
When his mainCRTStartup (or DllMainCRTStartup) is executed, this symbol will be initialized by call to TlsAlloc (ie. inside function _mtinit).
I don't know what you mean by this. There is only ever _one_ mainCRTStartup in any single executable! If there were more, main() would consequently be invoked multiple times, which is definitely not the case. My only point is: whoever causes your module to be called _must_ have started his thread with _beginthread, because only this allocates the struct _tiddata. Note that this has nothing to do with TlsAlloc, which of course, as you already pointed out, has delivered a slot index at process startup. (Whoever called mainCRTStartup.)
This is enough to initialize TLS index for CRT. Because this index is shared (linker did this) with static CRT your static library is linked with, effectively it will share TLS with CRT linked (statically or dynamically) with code of your client. At least this is my idea of what linker is doing, and I've confirmed that with debugger. And this is enough; execution of TlsSetValue and initialization of CRT data is delayed to first access of CRT TLS data through _getptd function (defined in tidtable.c).
Ahhh. Now I understand what you are talking about! You convinced me. But then what happens at thread exit? During DLL_THREAD_DETACH the _tiddata has already been deallocated by _endthread. Now if one accesses one of the CRT functions again during DLL_THREAD_DETACH, this would cause _getptd to reallocate _tiddata for this thread. Hmm. This would also account for a memory leak, wouldn't it? But it definitely will not result in an access violation. While the situation is not as bad as I was afraid it could be, it still would be better to be able to call the TLS cleanup code from within _endthread, and not DLL_THREAD_DETACH. Thank you for this insightful discussion so far. Roland

Roland wrote: [...]
this would cause _getptd to reallocate _tiddata for this thread. Hmm. This will also account for a memory leak. Wouldn't it?
POSIX TSD has a "concept" of PTHREAD_DESTRUCTOR_ITERATIONS (the value must be at least 4). This is meant to solve the problem of dependencies without imposing any order on the TSD destructor invocation chain. MS should fix their TLS and have at least 5 PTHREAD_DESTRUCTOR_ITERATIONS (one can be used for their C/C++ runtime ;-) ). regards, alexander.
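A minimal sketch of the POSIX behaviour referred to here, assuming a pthreads platform: if a TSD destructor stores a new non-null value in its key (much as a late runtime call would re-create per-thread data), the implementation runs the destructor again, for up to PTHREAD_DESTRUCTOR_ITERATIONS rounds. The names run_destructor_demo, destroy_value and g_marker are made up for the illustration.

```cpp
#include <pthread.h>

namespace {

pthread_key_t g_key;
int g_destructor_calls = 0;  // written by the worker, read after pthread_join
int g_marker = 0;            // only its address is used, as a non-null value

// TSD destructor. On its first call it re-populates the slot, simulating
// per-thread data that is re-created during thread cleanup.
void destroy_value(void* /*old_value*/) {
    if (g_destructor_calls++ == 0) {
        pthread_setspecific(g_key, &g_marker);
    }
}

void* thread_body(void*) {
    pthread_setspecific(g_key, &g_marker);
    return nullptr;
}

}  // namespace

// Runs one thread and reports how often the destructor ran for it;
// returns -1 if the key or the thread could not be created.
int run_destructor_demo() {
    g_destructor_calls = 0;
    if (pthread_key_create(&g_key, destroy_value) != 0) return -1;
    pthread_t t;
    if (pthread_create(&t, nullptr, thread_body, nullptr) != 0) {
        pthread_key_delete(g_key);
        return -1;
    }
    pthread_join(t, nullptr);
    pthread_key_delete(g_key);
    return g_destructor_calls;
}
```

Under these assumptions the destructor runs twice for the one thread: once for the original value and once more for the value stored during cleanup, so nothing is leaked.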

Roland wrote:
TlsSetValue and initialization of CRT data is delayed to first access of CRT TLS data through _getptd function (defined in tidtable.c).
Ahhh. Now I understand what you are talking about! You convinced me.
:-)
During DLL_THREAD_DETACH the _tiddata has already been deallocated during _endthread. Now if one again accesses one of the CRT functions during DLL_THREAD_DETACH this would cause _getptd to reallocate _tiddata for this thread. Hmm. This will also account for a memory leak. Wouldn't it?
Probably yes. But:
* if the CRT is statically linked to a dynamic library, the user's DllMain is called prior to releasing the CRT data
* if the CRT is dynamically linked to the program or some of its libraries, I think that it will receive the DLL_THREAD_DETACH notification after the user's dynamic libraries have handled it (reverse order of loading libraries). I'm not sure about that.
You would still have a memory leak if you use a statically linked CRT *and* nobody is handling DLL_THREAD_DETACH for this CRT TLS index. But that does not apply to Windows Server 2003, see below.
In the case of Windows Server 2003 there is a callback function called FlsCallback, documented here: http://msdn.microsoft.com/library/en-us/dllproc/base/flscallback.asp It's called inside ExitThread, in the context of the exiting thread, *after* all DLLs have received the DLL_THREAD_DETACH notification. Actually, the CRT in MSVC71 uses the Fls* functions if available; the function _freefls (defined by the CRT) will be called to *reliably* release CRT TLS (or actually fiber local storage) on Windows Server 2003. You can find that in the file tidtable.c (available with MSVC71, directory vc7/crt/src). I'm looking for a way to achieve similar functionality in somewhat older versions of Windows, but it might be quite difficult.
While the situation is not as bad as I was afraid it could be, it still would be better to be able to call the tls cleanup code from within _endthread, and not DLL_THREAD_DETACH.
Hm, _endthread is executed before notifications DLL_THREAD_DETACH are sent in ExitThread. Or more precisely - ExitThread is called at the end of _endthread, and there is no way one could execute code after this call. Thus if user DllMain is expecting to find data that has been released in _endthread, that might be a problem (eg. in case of CRT - another allocation inside _getptd). B.

I agree, and it should be able to clean up non-boost threads as well - consider what happens if you're writing a library, and not a program - you then have no control over who creates threads or how they do so. I believe it would be unacceptable to say "you can only use this library if you also use Boost.Threads"; in such cases thread_specific_ptr is exceptionally useful IMO.
While having heard this argument quite often, I still do not understand how this can ever work (in a reliable way).
Say you have a piece of code located in module A that is starting a thread by whatever means (a system thread, e.g.).
Now in your module B you have a (global) TSS variable. A calls into B. When should the TSS be initialized? On first use? I am afraid that is not trivially possible, since you (as far as I understand on Windows) will need to acquire a slot for this variable when the process starts up (normally in DLL_PROCESS_ATTACH). You might argue that you can have a global object in B whose constructor can handle this. OK, this would work. You can acquire the slot during initialization, and can allocate your data on the heap on first use.
_But_ what about C runtime initialisation? Perhaps I am a little bit too paranoid about this, but the per-thread data holds items like
unsigned long _tdoserrno; /* _doserrno value */
unsigned int _fpds; /* Floating Point data segment */
unsigned long _holdrand; /* rand() seed value */
char * _token; /* ptr to strtok() token */
to cite only a few, that are clearly related to a couple of runtime lib functions.
How can you ever be sure the C runtime will behave correctly when these are not available/initialized for your thread (note that you never called _beginthread!)?
I agree violently that the calling thread must have been started with _beginthread. I can't think of any threading lib where this is not the case - even exception handling doesn't work in a thread unless it was created with _beginthread. This is very different however from requiring that the thread was created with Boost.Threads, which is what was being suggested. As for C runtime initialisation, I believe that this is a non-issue: the slot is allocated (TlsAlloc) when the lib is loaded, but no C runtime calls are made at this point. Calls to the C runtime are only made when the slot is first *accessed*, at which point the object gets created with a call to new, if it does not already exist. When it comes to cleanup we may have a problem if the exe is statically linked, but if it's dynamically linked then Boost.Threads cleanup *will* get called before that of the runtime on which it depends. I need to think some more about statically linked executables... John.

"John Maddock" <john@johnmaddock.co.uk> wrote in message news:020001c4755b$7dfad200$33570352@fuji...
I agree, and it should be able to clean up non-boost threads as well - consider what happens if you're writing a library, and not a program -
[snip]
How can you ever be sure the c-runtime will behave correctly when these are not available/initialized for your thread (note you did never call _beginthread!).
I agree violently that the calling thread must have been started with _beginthread. I can't think of any threading lib where this is not the case - even exception handling doesn't work in a thread unless it was created with _beginthread. This is very different however from requiring that the thread was created with Boost.Threads, which is what was being suggested.
I was also always under the same impression, but reading the MSDN docs on CreateThread it sounds a bit less restrictive than I remembered:
"A thread in an executable that is linked to the static C run-time library (CRT) should use _beginthread and _endthread for thread management rather than CreateThread and ExitThread. Failure to do so results in small memory leaks when the thread calls ExitThread. Another work around is to link the executable to the CRT in a DLL instead of the static CRT. Note that this memory leak only occurs from a DLL if the DLL is linked to the static CRT and a thread calls the DisableThreadLibraryCalls function. Otherwise, it is safe to call CreateThread and ExitThread from a thread in a DLL that links to the static CRT."
Also, in what way doesn't exception handling work if the thread wasn't created using _beginthread? I tried the following snippet under WinXP/VC7.1, which ran silently (yes, under debug config):
--
#include <cassert>
#include <exception>   // std::exception
#include <stdexcept>   // std::logic_error
#define WIN32_LEAN_AND_MEAN
#include <windows.h>

// Thread entry point; started with CreateThread, not _beginthread.
DWORD WINAPI ThreadFn(void*)
{
    // Local type whose destructor records that stack unwinding ran.
    struct local : std::exception
    {
        local(bool* p_dtor_called) : _p_dtor_called(p_dtor_called)
        { *_p_dtor_called = false; }
        ~local() { *_p_dtor_called = true; }
        bool* _p_dtor_called;
    };

    bool dtor_called = false;
    try
    {
        local l(&dtor_called);
        throw std::logic_error("test");
    }
    catch (std::exception&)
    {
        assert(dtor_called);   // l was destroyed before the handler ran
        return 0;
    }
    assert(false);   // not reached
    return 1;
}

int main(int, char*[])
{
    HANDLE h = CreateThread(NULL, 0, &ThreadFn, NULL, 0, NULL);
    WaitForSingleObject(h, INFINITE);
    return 0;
}
--
I don't pretend that I understand all subtleties of exception handling, but the above at least _seems_ to work ok. // Johan

I was also always under the same impression, but reading the MSDN docs on CreateThread it sounds a bit less restrictive than I remembered:
"A thread in an executable that is linked to the static C run-time library (CRT) should use _beginthread and _endthread for thread management rather than CreateThread and ExitThread. Failure to do so results in small memory leaks when the thread calls ExitThread. Another work around is to link the executable to the CRT in a DLL instead of the static CRT. Note that this memory leak only occurs from a DLL if the DLL is linked to the static CRT and a thread calls the DisableThreadLibraryCalls function. Otherwise, it is safe to call CreateThread and ExitThread from a thread in a DLL that links to the static CRT."
I never knew that, in fact I never knew that DisableThreadLibraryCalls function existed at all, thanks for the pointer!
Also, in what way doesn't exception handling work if the thread wasn't created using _beginthread? I tried the following snippet under WinXP/VC7.1 which ran silently (yes, under debug config):
I don't pretend that I understand all subtleties of exception handling, but the above at least _seems_ to work ok.
You could be right, actually; my memory is a little hazy on this one, but I think the problem was with Borland C++ rather than MSVC++, which I *think* implements exception handling on top of structured exceptions (which should always work), whereas other compilers do their own thing and rely on some global state to work correctly. Thanks, John.

John Maddock wrote:
implements exception handling on top of structured exceptions (which should always work), whereas other compilers do their own thing and rely on some global state to work correctly.
MSVC also implements C++ exceptions in terms of SEH. That is not the most efficient approach, and (as I have heard) MSVC8 will offer an option to replace it with some other mechanism. I do not think that MSVC would have any problems with exception handling in a thread started with CreateThread. I believe that MS tried to make _beginthread, _beginthreadex and CreateThread somehow "equal" citizens, also in the C++ world. Now about SEH exceptions leaking from a thread function started with CreateThread: their handling is implemented in the core of the operating system (I'm talking here about the Windows NT line of OSes). It's totally independent of the CRT and will result in process termination (ExitProcess). If the thread has been started with the CRT (_beginthread) the result is virtually the same: the exception won't reach the OS, but only because the CRT will catch it and call _exit. B.

"Roland" <roland.schwarz@chello.at> wrote in message news:20040728142223.OHAY27266.viefep18-int.chello.at@speedsnail...
On Wed, 28 Jul 2004 13:25:21 +0100 John Maddock <john@johnmaddock.co.uk> wrote:
I agree, and it should be able to clean up non-boost threads as well - consider what happens if you're writing a library, and not a program - you then have no control over who creates threads or how they do so. I believe it would be unacceptable to say "you can only use this library if you also use Boost.Threads"; in such cases thread_specific_ptr is exceptionally useful IMO.
While having heard this argument quite often, I still do not understand how this can ever work (in a reliable way).
Say you have a piece of code located in module A that is starting a thread by whatever means (System thread e.g.).
Now in your module B you have a (global) TSS variable. A calls into B. When should the TSS be initialized? On first use? I am afraid that is not trivially possible, since you (as far as I understand on Windows) will need to acquire a slot for this variable when the process starts up (normally in DLL_PROCESS_ATTACH).
You can do it at any time. The MSDN docs only say "TLS indexes are _typically_ allocated during ... initialization.". // Johan
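To illustrate allocating the slot on first use rather than at process startup, here is a hedged sketch that uses pthread_key_create as a POSIX stand-in for TlsAlloc. The function names lazy_slot, set_thread_value and get_thread_value are hypothetical, and the sketch relies on the C++11 guarantee that a function-local static is initialized exactly once even when several threads race to it.

```cpp
#include <pthread.h>
#include <system_error>

// Returns the process-wide key, creating it the first time any thread asks.
pthread_key_t lazy_slot() {
    static const pthread_key_t key = [] {
        pthread_key_t k;
        // No destructor is registered in this sketch.
        const int err = pthread_key_create(&k, nullptr);
        if (err != 0) {
            throw std::system_error(err, std::generic_category(),
                                    "pthread_key_create");
        }
        return k;
    }();
    return key;
}

// Per-thread accessors built on the lazily created key.
void set_thread_value(void* value) {
    const int err = pthread_setspecific(lazy_slot(), value);
    if (err != 0) {
        throw std::system_error(err, std::generic_category(),
                                "pthread_setspecific");
    }
}

void* get_thread_value() { return pthread_getspecific(lazy_slot()); }
```

A thread that has never stored anything reads back a null pointer, so callers can combine this with create-on-first-access for the data itself.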
participants (15)
- Alexander Terekhov
- Bronek Kozicki
- David Abrahams
- Dayton
- Doug Gregor
- Jeff Flinn
- Johan Nilsson
- John Maddock
- Mac Murrett
- Marshall Clow
- Michael Glassford
- Miro Jurisic
- Peter Dimov
- Roland
- Vladimir Prus