Re: [boost] Windows MSVC thread exit handler for staticly linked Boost.Thread

2 Aug 2004

      Roland wrote:
...
On Sun, 01 Aug 2004 21:28:30 -0500 "Aaron W. LaFramboise"
<aaronrabiddog51@aaronwl.com> wrote:
...
The trouble is that we actually want the callback field to point to
___xl_a + 4, not at ___xl_a itself, which is zero.  The tlssup.obj that
is part of MSVC6's runtime libraries gets this wrong, and so the TLS
callback list pointed to by the TLS directory looks like this:
[null pointer][user-specified callback][null pointer]
In other words, whatever bit of the PE loader responsible for calling
the TLS callbacks hits that first null, thinks (correctly) that it is
the end of the list, and never calls any of the callbacks.
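The failure mode described above can be reproduced without any Windows headers. The following is a minimal, portable sketch, not the loader's actual code: `Callback`, `run_callbacks`, `broken_start` and `correct_start` are hypothetical names, and the real TLS callback signature is simplified to a no-argument function.

```cpp
#include <cstddef>

// Hypothetical stand-in for a TLS callback. The real callback type takes
// arguments; they are dropped here to keep the sketch portable.
using Callback = void (*)();

static int g_calls = 0;
static void user_callback() { ++g_calls; }

// Walks a null-terminated callback list the way the post describes the
// loader doing it: stop at the first null entry.
// Returns the number of callbacks that were invoked.
std::size_t run_callbacks(const Callback* list) {
    std::size_t invoked = 0;
    for (; *list != nullptr; ++list) {
        (*list)();
        ++invoked;
    }
    return invoked;
}

// Layout from the post: [null pointer][user-specified callback][null pointer]
static const Callback g_table[] = { nullptr, user_callback, nullptr };

// Start address the faulty tlssup.obj is said to publish (___xl_a itself).
const Callback* broken_start() { return g_table; }

// Start address it should publish: one pointer slot further on
// (___xl_a + 4 when pointers are 4 bytes wide).
const Callback* correct_start() { return g_table + 1; }
```

Walking from `broken_start()` invokes nothing because the first slot already looks like the terminator; walking from `correct_start()` reaches the user callback.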
Unfortunately this seems not to be enough. Even when the first entry
is not zero (I tried setting it to a dummy stub), my callback allocated via
.CRT$XLB does not get called; there are still a lot of zeroes in between.
It seems as if the linker has a minimum size when emitting data segments.
...
...
In any case, the runtime fixup you mention appears to fix this, although
it might be doing more work than it needs to (you just need to replace
that first zero with something valid).  I must admit I am slightly
concerned about modifying a PE image at runtime to make it correct, for
the same reason I am concerned with hooking in production code.
Hmm. Do I really modify the PE image? I am just modifying data that
...
data segment. Isn't this ok? What are your concerns?
You are right; this sort of modification is probably not that big of a
deal.  I am just concerned that, since this data forms part of the PE
...
...
Also, on an unrelated point, is there any reason to use the .CRT$XC
section directly rather than use a global class?  They're really the
same thing, but the entire .CRT section is undocumented, and not very
well known.  It seems unnecessary to depend upon that interface if there
is no particular gain from using it over the well-defined interface.
Yes. We need to run after the last global c-tor has finished to be
sure that all our thread_specific_ptr ctors have been called. This is
because I rely on the well documented behaviour of the atexit function
that I use at
This is odd.  This doesn't happen for me.  The only problem I have is
the single leading 4-byte zero caused by the improper tlssup.obj code.
As I mentioned, I can swap the correct and incorrect tlssup.obj's out,
and see the problem come and go.  I am not sure what PECOFF says, but it
is the expected and well-known behavior that all sections with a $ in
them are sorted and merged before being linked.  I have no idea how
zeroes could come to be in the middle.
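The sort-and-merge behavior mentioned here can be modelled in a few lines. This is only a sketch of the expected ordering, not of what the MSVC linker actually does internally; `Contribution` and `merge_grouped` are hypothetical names, and the entry labels are symbolic stand-ins for pointer slots.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// One object file's contribution to a grouped section such as ".CRT$XLB".
struct Contribution {
    std::string name;                  // full section name, including "$suffix"
    std::vector<std::string> entries;  // symbolic labels standing in for slots
};

// Sorts contributions by full section name, which orders them by the part
// after '$' within one base section, then concatenates their entries.
std::vector<std::string> merge_grouped(std::vector<Contribution> parts) {
    std::stable_sort(parts.begin(), parts.end(),
                     [](const Contribution& a, const Contribution& b) {
                         return a.name < b.name;
                     });
    std::vector<std::string> merged;
    for (const Contribution& p : parts) {
        merged.insert(merged.end(), p.entries.begin(), p.entries.end());
    }
    return merged;
}
```

Under this model, contributions named `.CRT$XLA`, `.CRT$XLB` and `.CRT$XLZ` end up adjacent and in that order regardless of the order in which the object files were supplied, so nothing should land between the list head and the user callback.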

lies in the
format, a debugger or loader might rely on it not being altered at runtime.

this time
...
to schedule the main-thread exit. This in turn means the exit handler
runs before any of the global dtors (e.g. thread_specific_ptr) get to
run. This solves the wrong dtor ordering problem.
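The ordering argument relies on atexit handlers and static destructors running in reverse order of registration and construction. A minimal portable sketch follows; it is not the Boost.Thread code, and `GlobalResource`, `on_main_thread_exit` and `schedule_main_thread_exit` are hypothetical names.

```cpp
#include <cstdio>
#include <cstdlib>

// Set by the atexit handler so the destructor below can report the order.
static bool g_handler_ran = false;

// Stand-in for a global object such as a thread_specific_ptr.
struct GlobalResource {
    ~GlobalResource() {
        std::printf("global dtor (exit handler already ran: %s)\n",
                    g_handler_ran ? "yes" : "no");
    }
};
static GlobalResource g_resource;

// Stand-in for the main-thread exit handler.
static void on_main_thread_exit() {
    g_handler_ran = true;
    std::puts("main-thread exit handler");
}

// Meant to be called once all global ctors have finished. Because the
// handler is registered after g_resource was constructed, it runs before
// g_resource is destroyed. Returns true if registration succeeded.
bool schedule_main_thread_exit() {
    return std::atexit(on_main_thread_exit) == 0;
}
```

Calling `schedule_main_thread_exit()` from `main` is enough to see the effect in this sketch. The scheme in the post instead runs the registration from a late initializer slot, so that it happens after every global ctor without the user's help.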
Yes, you are right here.  This does seem to be the best way to get this
right.
...
And then: should there be any unforeseeable problems in the future, we
can always revert to the piggy-back DLL solution without (noticeable)
effect for the user of the library.
I suppose once it's in CVS and a few testers run it, we'll see if
anything major breaks.  But I wonder if any of the testers are running,
e.g., Windows 95.  A lot of developers still care about users on
Windows 95, but probably none are still using it.  That OS is almost a
decade old now, and is older than the copy of the PECOFF specification
that I have.  It would be good to have a confirmation that this works there.

Another thing I am curious about is the case where Boost.Thread is
statically linked to user code in a DLL.  Will this TLS callback code
still work there?

Aaron W. LaFramboise