
Roland wrote:
On Sun, 01 Aug 2004 16:41:18 -0500 "Aaron W. LaFramboise" <aaronrabiddog51@aaronwl.com> wrote:
Just as an FYI, I now have a copy of MSVC6, and am working on this.
MSVC6 does, in fact, have the necessary support, but there is a bug (I had noticed this before, and this was one of the reasons I wasn't able to offer more information a few months ago, and I had entirely forgotten about it. Oops.). Fortunately, the bug is in the runtime library, not in the linker or anything else.
Yes, the bug is that the TLS handlers must be in a contiguous area between the __xl_a and __xl_z symbols. I fixed this by running a small piece of code during startup (in the __xi_a .. __xi_z area).
I think that the problem is something else. The linker sorts everything correctly and puts it into a contiguous section. The problem is apparently that no one ever used the __xl_a code (until 7.1... what does this mean?) and so never noticed it's broken.

The linker merges the sections like this:

    .CRT$XLA
    ___xl_a:
        .long 0                    ; provided by tlssup.obj in the runtime
    .CRT$XL?                       ; B through Y
        pointer_to_tls_callback
    ___xl_z:
    .CRT$XLZ
        .long 0                    ; provided by tlssup.obj also

A relocation is generated that assigns ___xl_a to the TLS callback field of the TLS directory. The storage referred to by ___xl_z null-terminates the list, as specified by PECOFF.

The trouble is that we actually want the callback field to point to ___xl_a + 4, not at ___xl_a itself, which holds zero. The tlssup.obj that is part of MSVC6's runtime libraries gets this wrong, and so the TLS callback list pointed to by the TLS directory looks like this:

    [null pointer][user-specified callback][null pointer]

In other words, whatever bit of the PE loader is responsible for calling the TLS callbacks hits that first null, thinks (correctly) that it is the end of the list, and never calls any of the callbacks.

Apparently someone noticed this, and fixed it for MSVC7.1. If you try to use the MSVC6 tlssup.obj with MSVC7.1, you'll get the same broken behavior. (You can't do the reverse because the objects aren't backwards compatible.)

In any case, the runtime fixup you mention appears to fix this, although it might be doing more work than it needs to (you just need to replace that first zero with something valid). I must admit I am slightly concerned about modifying a PE image at runtime to make it correct, for the same reason I am concerned with hooking in production code. It seems a little hackish, and it seems like it might cause surprising behavior.
The alternative is to provide an implementation of tlssup.obj that isn't broken, but this is also slightly hackish (although it does at least produce an image that is correct with no runtime fixups needed). I was hoping there might be some way to tweak the real MSVC6 tlssup.obj into behaving correctly, but there does not seem to be any way other than doing some sort of runtime fixup, or flat-out replacing the whole object.

In any case, no sort of runtime fixup should be done on anything other than MSVC6, since later versions seem to get it right. On these versions, I think we really should be marking the callbacks const and using bss_seg rather than data_seg. This matches the behavior of the rest of the native TLS support, and I think is more likely to work in general.

Also, on an unrelated point, is there any reason to use the .CRT$XC section directly rather than use a global class? They're really the same thing, but the entire .CRT section is undocumented, and not very well known. It seems unnecessary to depend upon that interface if there is no particular gain from using it over the well-defined interface.

Aaron W. LaFramboise