
Roland wrote:
On Sun, 01 Aug 2004 21:28:30 -0500 "Aaron W. LaFramboise" <aaronrabiddog51@aaronwl.com> wrote:
The trouble is that we actually want the callback field to point to ___xl_a + 4, not at ___xl_a itself, which is zero. The tlssup.obj that is part of MSVC6's runtime libraries gets this wrong, and so the TLS callback list pointed to by the TLS directory looks like this:
[null pointer][user-specified callback][null pointer]
In other words, whatever bit of the PE loader is responsible for calling the TLS callbacks hits that first null, thinks (correctly) that it is the end of the list, and never calls any of the callbacks.
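To make the failure mode concrete, here is a minimal sketch of a null-terminated list walk. It is not the loader's actual code: `Callback`, `run_callbacks`, `user_callback` and `broken_list` are hypothetical names made up for illustration, and the callback signature is simplified (the real TLS callback takes three arguments).

```cpp
#include <cstddef>

// Hypothetical stand-in for the real TLS callback type, simplified to
// take no arguments.
using Callback = void (*)();

// Hypothetical walker: calls entries until it reaches the first null,
// which it treats as the end of the list, and returns how many it called.
std::size_t run_callbacks(const Callback* list) {
    std::size_t called = 0;
    for (; *list != nullptr; ++list) {
        (*list)();
        ++called;
    }
    return called;
}

static int g_hits = 0;
void user_callback() { ++g_hits; }

// The layout described above: [null pointer][user callback][null pointer].
const Callback broken_list[] = { nullptr, user_callback, nullptr };
```

Walking from `broken_list` stops at once and calls nothing, while walking from `broken_list + 1` (the analogue of ___xl_a + 4 with 4-byte pointers) reaches `user_callback`.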
Unfortunately this seems to be not enough. Even when the first entry is not zero (I tried setting it to a dummy stub), my callback allocated via .CRT$XLB does not get called; there are still a lot of zeroes in between. It seems as if the linker has a minimum size when emitting data segments.
This is odd. This doesn't happen for me. The only problem I have is the single leading 4-byte zero caused by the improper tlssup.obj code. As I mentioned, I can swap the correct and incorrect tlssup.obj's out, and see the problem come and go. I am not sure what PECOFF says, but it is the expected and well-known behavior that all sections with a $ in them are sorted and merged before being linked. I have no idea how zeroes could come to be in the middle.
In any case, the runtime fixup you mention appears to fix this, although it might be doing more work than it needs to (you just need to replace that first zero with something valid). I must admit I am slightly concerned about modifying a PE image at runtime to make it correct, for the same reason I am concerned with hooking in production code.
Hmm. Do I really modify the PE image? I am just modifying data that lies in the data segment. Isn't this ok? What are your concerns?
You are right; this sort of modification is probably not that big of a deal. I am just concerned that, since this data forms part of the PE format, a debugger or loader might rely on it not being altered at runtime.
Also, on an unrelated point, is there any reason to use the .CRT$XC section directly rather than use a global class? They're really the same thing, but the entire .CRT section is undocumented, and not very well known. It seems unnecessary to depend upon that interface if there is no particular gain from using it over the well-defined interface.
Yes. We need to run after the last global c-tor has finished to be sure that all our thread_specific_ptr ctors have been called. This is because I rely on the well documented behaviour of the atexit function that I use at this time to schedule the main-thread exit. This in turn causes it to be run before any of the global dtors (e.g. thread_specific_ptr) get to run. This solves the wrong dtor ordering problem.
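The ordering argument can be sketched with a toy model. This is only an illustration, not the Boost.Thread code: `ShutdownModel`, `simulate_shutdown` and the object names are hypothetical. It assumes the standard rule that atexit handlers and static-object destructors run in the reverse order of their registration and construction.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy model only: records shutdown actions in the order they become
// pending and replays them in reverse, as the runtime does at exit.
class ShutdownModel {
public:
    void static_object_constructed(std::string name) {
        actions_.push_back("dtor: " + std::move(name));
    }
    void atexit_registered(std::string name) {
        actions_.push_back("atexit: " + std::move(name));
    }
    // Returns the actions in the order they would run at shutdown.
    std::vector<std::string> unwind() const {
        return std::vector<std::string>(actions_.rbegin(), actions_.rend());
    }
private:
    std::vector<std::string> actions_;
};

// Hypothetical scenario: two global thread_specific_ptr objects are
// constructed, and only then is the main-thread exit handler registered.
std::vector<std::string> simulate_shutdown() {
    ShutdownModel model;
    model.static_object_constructed("tsp_a");
    model.static_object_constructed("tsp_b");
    model.atexit_registered("main-thread exit");
    return model.unwind();
}
```

In this model the handler registered last comes out first, ahead of both destructors, which is the property the late registration is meant to secure.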
Yes, you are right here. This does seem to be the best way to get this right.
And then: should there be any unforeseeable problems in the future, we can always revert to the piggy-back DLL solution without (noticeable) effect for the user of the library.
I suppose once it's in CVS and a few testers run it, we'll see if anything major breaks. But I wonder if any of the testers are running, for example, Windows 95. A lot of developers still care about users on Windows 95, but probably none are still using it. That OS is almost a decade old now, and is older than the copy of the PECOFF specification that I have. It would be good to have a confirmation that this works there.
Another thing I am curious about is the case where Boost.Thread is statically linked to user code in a DLL. Will this TLS callback code still work there?
Aaron W. LaFramboise