[Serialization] Thread-safety again

Sean Huang

13 Dec 2006

3:24 a.m.

I posted this about a year ago but got no definite answer from Robert. http://thread.gmane.org/gmane.comp.lib.boost.devel/132838/focus=132838 I vaguely remember that there were similar discussions about this issue a few months back, but I do not know whether they reached any conclusion. Now my programs have started crashing because of this problem and I am wondering if there is a good solution. Is turning the function-scope static objects into static members, with its caveats, the best we can do?

Thanks, Sean


Roland Schwarz

13 Dec

6:38 p.m.

Sean Huang wrote:

...

I posted this about a year ago but got no definite answer from Robert. http://thread.gmane.org/gmane.comp.lib.boost.devel/132838/focus=132838 I vaguely remember that there were similar discussions about this issue a few months back, but I do not know whether they reached any conclusion. Now my programs have started crashing because of this problem and I am wondering if there is a good solution. Is turning the function-scope static objects into static members, with its caveats, the best we can do?

The short answer: There is no nice solution to this problem.

The longer: Basically you would need to protect the static by a mutex, i.e.:

    void foo() {
        static mutex m;
        lock lk(m);
        static myobject mo;
        lk.unlock();
    }

But wait, this does not work, since the Boost mutex is currently not statically initializable either. So the only thing you can currently do is the following ugly solution:

    myobject& getmo() {
        static myobject mo;
        return mo;
    }

    void foo() {
        static int flag;
        run_once(&flag, getmo);
        myobject& mo = getmo();
    }

(Pls. don't take the above literally, it hasn't seen a compiler.)

You might have a look at the regex tests, which use this approach. Or you can search for the thread "Proposal for thread safe initialization of local static and global objects". We did not adopt that proposal because it does not solve all problems, but it might be enough for you.

Roland
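For readers on a C++11 or later toolchain, here is a minimal sketch of the two ideas above in standard C++ rather than the Boost.Thread of 2006. The `myobject` definition is a hypothetical stand-in, and the names `getmo_locked`, `mo_mutex`, `mo_ptr` and `mo_constructions` are invented for illustration. Two facts make this work: since C++11 the initialization of a function-local static is itself thread-safe, and `std::mutex` has a constexpr constructor, so a namespace-scope mutex does not have the static-initialization problem described above.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the "myobject" in the message above.
struct myobject {
    int value;
    myobject() : value(42) {}
};

// Variant 1: rely on the language. Since C++11, the initialization of a
// function-local static runs exactly once even under concurrent calls.
myobject& getmo() {
    static myobject mo;
    return mo;
}

namespace {
// std::mutex is constant-initialized, so it is usable before any
// dynamic initialization in this file has run.
std::mutex mo_mutex;
myobject* mo_ptr = nullptr;
int mo_constructions = 0;  // counts how often the object was built
}

// Variant 2: the mutex-guarded lazy initialization from the message,
// written with a mutex that is safe to use as a static.
myobject& getmo_locked() {
    std::lock_guard<std::mutex> lk(mo_mutex);
    if (mo_ptr == nullptr) {
        mo_ptr = new myobject();  // intentionally never deleted
        ++mo_constructions;
    }
    assert(mo_ptr != nullptr);
    return *mo_ptr;
}
```

In standard C++, `std::call_once` with a `std::once_flag` is the closest counterpart of the `run_once` idea. None of this was available to a portable Boost library in 2006, which is why the thread looks for other ways out.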

Robert Ramey

7:19 p.m.

Roland Schwarz wrote:

...

Sean Huang wrote:

...
I posted this about a year ago but got no definite answer from Robert. http://thread.gmane.org/gmane.comp.lib.boost.devel/132838/focus=132838 I vaguely remember that there were similar discussions about this issue a few months back, but I do not know whether they reached any conclusion. Now my programs have started crashing because of this problem and I am wondering if there is a good solution. Is turning the function-scope static objects into static members, with its caveats, the best we can do?

The short answer: There is no nice solution to this problem.

The longer: Basically you would need to protect the static by a mutex, i.e.:

I'm not convinced it's that bad. We don't need a general solution - all we need is a specific one for this particular problem. The alterations I checked into the HEAD rely on known special behavior of the serialization library.

The original method was for all globals to use access functions containing a local static. This was a general solution to fix the order of initialization, and it addressed problems associated with dynamic loading/unloading of DLLs containing serializable data types. These functions are called things like "get_instance()".

But it trips up when multi-tasking.

So, an attempt has been made to make sure that these functions get called at pre-execution time or when a DLL containing serialization code is loaded. The idea is that these will be invoked before multi-tasking is initiated, thereby avoiding the race condition. I don't know if this has been achieved, as it is hard to figure out how to test it.

So, though I'm not sure it's really fixed, I do believe that this can be fixed by considering the sequence of code execution paths which can really occur with these functions. That is, a general solution is not required and a specific solution is (I believe) possible - if it's not done already.

As an aside - I'm intrigued about the possibilities of Dave's boost/archive/detail/dynamically_initliazed.hpp as a more general approach to the problem.

Robert Ramey
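To make the idea concrete, here is a minimal sketch of forcing a "get_instance()"-style accessor to run during a file's dynamic initialization rather than on first use. It is an illustration only, not the code checked into the serialization library; `type_table`, `type_table_constructions` and `force_early_construction` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for a table the library keeps in a global.
struct type_table {
    int entries;
    type_table();
};

// Plain int with a constant initializer: it is set during static
// initialization, before any constructor in this file runs.
int type_table_constructions = 0;

type_table::type_table() : entries(0) { ++type_table_constructions; }

// Accessor with a local static, in the style of the "get_instance()"
// functions described above.
type_table& get_instance() {
    static type_table t;
    return t;
}

// Namespace-scope reference with a dynamic initializer. Initializing it
// calls get_instance(), so the local static is constructed during this
// file's dynamic initialization instead of on first use by some thread.
type_table& force_early_construction = get_instance();
```

On mainstream implementations this runs before main() or when the containing DLL is loaded, which is before most programs start their threads. As the later messages in this thread point out, the standard does not promise that timing, and a linker may drop a translation unit that nothing references.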

Emil Dotchevski

7:35 p.m.

I apologize if this has been discussed before, but this message caught my attention and I decided to ask: why do you need any statics/globals for the serialization?

I understand that you need global class registration to handle the case of reading a base pointer when the stream contains derived object, but when you serialize data you just assume that all classes have been registered ahead of time. Because you don't need mutable access to the class registration table, you don't have thread safety problems.

The other piece of data that I can think of is the table that tracks which objects have already been serialized so that when you serialize a second pointer that points to the same object you don't serialize the object again, however this table has to be per-stream because I may have multiple opened streams that I'm reading from. Assuming you don't access the same stream from multiple threads, I see no thread safety problems here either.

Am I missing something?


Robert Ramey

9:30 p.m.

Emil Dotchevski wrote:

...

I apologize if this has been discussed before, but this message caught my attention and I decided to ask: why do you need any statics/globals for the serialization?

The static instantiations are used for a couple of things:

a) To maintain a table which is used to convert between different pointers to different types of objects at runtime - with or without RTTI existing. See void_cast ...
b) To hold a constant string type "export" key as RTTI systems don't provide a portable one.
c) To hold pointers to serialization functions for derived pointers.
d) To guarantee instantiation of serialization functions not otherwise explicitly referred to.

For more information - feel free to peruse the code.

...

I understand that you need global class registration to handle the case of reading a base pointer when the stream contains derived object, but when you serialize data you just assume that all classes have been registered ahead of time.

Not necessarily. The export functionality eliminates the requirement to explicitly register all derived types. Its original implementation initialized the type table upon first usage - the current one initializes the type table on start-up. This is one of the changes I've been referring to in my belief that the situation may no longer exist.

...

Because you don't need mutable access to the class registration table, you don't have thread safety problems.

If and only if you can get everything initialized at pre-main time or upon DLL loading. Which has been exactly the problem up until now.

...

The other piece of data that I can think of is the table that tracks which objects have already been serialized so that when you serialize a second pointer that points to the same object you don't serialize the object again, however this table has to be per-stream because I may have multiple opened streams that I'm reading from. Assuming you don't access the same stream from multiple threads, I see no thread safety problems here either.

This is correct - there are no threading problems associated with tracking.

Robert Ramey

Emil Dotchevski

14 Dec

2:14 a.m.

...

...
I understand that you need global class registration to handle the case of reading a base pointer when the stream contains derived object, but when you serialize data you just assume that all classes have been registered ahead of time.

Not necessarily. The export functionality eliminates the requirement to explicitly register all derived types. Its original implementation initialized the type table upon first usage - the current one initializes the type table on start-up. This is one of the changes I've been referring to in my belief that the situation may no longer exist.

Let me see if I understand how this works. So, in foo.cpp you #include headers from the serialization lib, and then you say EXPORT(foo) and it (automatically) registers class foo with all archive classes you've included. However I don't see how this can be done automatically. I suppose you're using some kind of global object to do the registration (because you say it's done on startup). However, the global object isn't guaranteed to be initialized at startup, all you know is that it will be initialized before you enter any function from foo.cpp. In fact I have seen Metrowerks (correctly) consider a similar "automatic" registration dead code and remove all foo.cpp code from the executable (not just the global object, the entire class foo is removed.)

Robert Ramey

5:28 a.m.

Emil Dotchevski wrote:

...

However I don't see how this can be done automatically. I suppose you're using some kind of global object to do the registration (because you say it's done on startup). However, the global object isn't guaranteed to be initialized at startup, all you know is that it will be initialized before you enter any function from foo.cpp.

I didn't see how to do this either. Dave Abrahams finally did it. Look in export.hpp

...

In fact I have seen Metrowerks (correctly) consider a similar "automatic" registration dead code and remove all foo.cpp code from the executable (not just the global object, the entire class foo is removed.)

I notice that export.hpp has some special code for Metrowerks - maybe that's related to your point. Look in export.hpp

Robert Ramey

Emil Dotchevski

15 Dec

7:14 p.m.

...

Emil Dotchevski wrote:

...
However I don't see how this can be done automatically. I suppose you're using some kind of global object to do the registration (because you say it's done on startup). However, the global object isn't guaranteed to be initialized at startup, all you know is that it will be initialized before you enter any function from foo.cpp.

I didn't see how to do this either. Dave Abrahams finally did it. Look in export.hpp

...
In fact I have seen Metrowerks (correctly) consider a similar "automatic" registration dead code and remove all foo.cpp code from the executable (not just the global object, the entire class foo is removed.)

I notice that export.hpp has some special code for Metrowerks - maybe that's related to your point.

Look in export.hpp

I looked in export.hpp. It seems to me that on non-Metrowerks compilers it simply defines a namespace-scope const reference object, in a nameless namespace. This means that the reference is guaranteed to be initialized only if you execute a function in the compilation unit that includes export.hpp. So again, you're at the mercy of the optimizer to not deadstrip this code.

Second, even if the code is not deadstripped, you don't know when the registration takes place; all you know is that it'll happen before you enter a function from the compilation unit that includes export.hpp. Therefore, you are still prone to threading problems. There is nothing in the C++ standard which guarantees globals will be initialized before main().

For Metrowerks, the code in export.hpp has this comment:

// CodeWarrior fails to construct static members of class templates
// when they are instantiated from within templates, so on that
// compiler we ask users to specifically register base/derived class
// relationships for exported classes. On all other compilers, use of
// this macro is entirely optional.

I don't have access to Metrowerks right now but I suspect that this has nothing to do with initializing static members of class templates. If I correctly recall my experience from about 2 years ago, Metrowerks simply deadstrips the entire compilation unit, functions, global objects -- everything -- simply because it figures that nothing calls this code, and this has nothing to do with templates. I believe that this is correct, standard-conforming behavior.

In my opinion, the only portable option is to require manual registration, and leave it up to the user to come up with their own, non-portable solution for automatic registration, and deal with multi-threading/deadstripping problems this could cause. I personally find it easier to register my classes "manually" (note that systems other than the serialization library also require class registration.)
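The contrast between the two registration styles can be sketched as follows. This is a hypothetical illustration, not the mechanism in export.hpp: `registered_keys`, `register_class`, `foo_registrar` and `register_all_classes` are invented names, and the "registry" is just a set of strings.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical registry of export keys, reached through an accessor so
// that it exists no matter which registration runs first.
std::set<std::string>& registered_keys() {
    static std::set<std::string> keys;
    return keys;
}

void register_class(const std::string& key) {
    registered_keys().insert(key);
}

namespace {
// "Automatic" registration: a namespace-scope object whose constructor
// registers the class as a side effect of dynamic initialization.
// When it runs, and whether the object survives dead-stripping, is up
// to the implementation.
struct foo_registrar {
    foo_registrar() { register_class("foo"); }
};
const foo_registrar register_foo_at_startup;
}

// "Manual" registration: an ordinary function the user calls at a known
// point, for example at the top of main() before any thread is started.
void register_all_classes() {
    register_class("bar");
}
```

The manual variant costs one explicit call, but the point at which the registry is mutated is then under the program's control, which is what makes it portable and easy to reason about in a multi-threaded program.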

Robert Ramey

9:02 p.m.

Emil Dotchevski wrote:

...

Therefore, you are still prone to threading problems. There is nothing in the C++ standard which guarantees globals will be initialized before main().

According to the reference I use: The C++ Programming Language by Stroustrup page 244 10.4.9: "A global, namespace or class static object which is created once "at the start of the program" and destroyed at once at the termination of the program." I've always interpreted this to mean what it says and this is reflected in the standard. But I'm sure someone else on this list can give a definitive answer as to what the standard actually says.

As far as "deadstripping" code - this has indeed been a big problem - especially in release mode. I've referred to it as overzealous optimization. Compilers vary quite a bit in this regard and in practice it has been addressed on an ad-hoc basis. In some cases code has been marked "_export" just so the optimizer doesn't drop it. This is the function of the module "force_include.hpp".

...

In my opinion, the only portable option is to require manual registration,

That may well be true.

...

and leave it up to the user to come up with their own, non-portable solution for automatic registration, and deal with multi-threading/deadstripping problems this could cause.

Or the user could use the system included with the serialization library which has already dealt with these issues.

...

I personally find it easier to register my classes "manually" (note that systems other than the serialization library also require class registration.)

I'm not going to disagree with that. This was the first method I used and it works well and is portable. But the lack of an automatic method like "export" raised a chorus of howls which had to be addressed. So now we have both. Take your pick. Robert Ramey

Emil Dotchevski

10:51 p.m.

...

...
Therefore, you are still prone to threading problems. There is nothing in the C++ standard which guarantees globals will be initialized before main().

According to the reference I use: The C++ Programming Language by Stroustrup page 244 10.4.9: "A global, namespace or class static object which is created once "at the start of the program" and destroyed at once at the termination of the program." I've always interpreted this to mean what it says and this is reflected in the standard. But I'm sure someone else on this list can give a definitive answer as to what the standard actually says.

3.6.2.3: It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.

...

As far as "deadstripping" code - this has indeed been a big problem - especially in release mode. I've referred to it as overzealous optimization.

It isn't overzealous -- it's within the specifications of the standard, whether you like it or not.

...

Compilers vary quite a bit in this regard and In practice it has been addressed on an ad-hoc basis. In some cases code has been marked "_export" just so the optimizer doesn't drop it. This is the function of the module "force_include.hpp".

If I find that a piece of code breaks when optimizations are enabled, I treat it as a bug. Sometimes it's a bug in the compiler, but more often than not it isn't.

...

...
In my opinion, the only portable option is to require manual registration,

That may well be true.

...
and leave it up to the user to come up with their own, non-portable solution for automatic registration, and deal with multi-threading/deadstripping problems this could cause.

Or the user could use the system included with the serialization library which has already dealt with these issues.

The only issue I see here is that you are trying to prevent the compiler from doing something that the standard allows it to do. Besides (I'm repeating myself here), the "problem" you are solving is beyond the scope of the serialization library, because there are other systems that also need type registration and other one-time dynamic initialization. If there is a consensus that it is desirable to provide for automatic dynamic initialization of namespace-scope objects even if no function from their compilation unit is ever executed, perhaps this needs to be separated in a boost library of its own, as a first step of (possibly) adding such feature to the C++ standard.

...

...
I personally find it easier to register my classes "manually" (note that systems other than the serialization library also require class registration.)

I'm not going to disagree with that. This was the first method I used and it works well and is portable. But the lack of an automatic method like "export" raised a chorus of howls which had to be addressed.

I've seen many instances of chorus of howls from folks that write sub-standard code and complain that it isn't working as they have hoped. I tend to quote the C++ standard in response.

Robert Ramey

16 Dec

12:04 a.m.

Emil Dotchevski wrote:

...

...
...
Therefore, you are still prone to threading problems. There is nothing in the C++ standard which guarantees globals will be initialized before main().

According to the reference I use: The C++ Programming Language by Stroustrup page 244 10.4.9: "A global, namespace or class static object which is created once "at the start of the program" and destroyed at once at the termination of the program." I've always interpreted this to mean what it says and this is reflected in the standard.. But I'm sure some else on this list can give a definitive answer as to what the standard actually says.

3.6.2.3: It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.

OK - of course we've actually observed this behavior in the loading and unloading of DLLs. I have no idea whether the standard addresses dynamic loading/unloading and the construction/destruction issues. I supposed not and had to make changes when we came upon this. I expect that these same changes also address the

...

...
Compilers vary quite a bit in this regard and In practice it has been addressed on an ad-hoc basis. In some cases code has been marked "_export" just so the optimizer doesn't drop it. This is the function of the module "force_include.hpp".

If I find that a piece of code breaks when optimizations are enabled, I treat it as a bug. Sometimes it's a bug in the compiler, but more often than not it isn't.

Actually when I first came upon this I thought it was a compiler bug. As you've pointed out, it's not. Not that it matters to me - it's just one more problem to be solved one way or another.

...

...
...
In my opinion, the only portable option is to require manual registration,

That may well be true.

...
and leave it up to the user to come up with their own, non-portable solution for automatic registration, and deal with multi-threading/deadstripping problems this could cause.

Or the user could use the system included with the serialization library which has already dealt with these issues.

The only issue I see here is that you are trying to prevent the compiler from doing something that the standard allows it to do.

Agreed. That's exactly what we've done.

...

Besides (I'm repeating myself here), the "problem" you are solving is beyond the scope of the serialization library, because there are other systems that also need type registration and other one-time dynamic initialization.

The "problem" is addressed in the context and only in the context of the serialization library. I'm aware that the "problem" is more general and we have no illusion that this is anything other than an ad hoc solution. It's just the best we could do and it does address the issue for this library.

...

If there is a consensus that it is desirable to provide for automatic dynamic initialization of namespace-scope objects even if no function from their compilation unit is ever executed, perhaps this needs to be separated in a boost library of its own, as a first step of (possibly) adding such feature to the C++ standard.

Great idea! I'll look forward to seeing this!! But as you point out there's no way to do this in a portable way. If there had been, we would have done it that way.

...

...
But the lack of an automatic method like "export" raised a chorus of howls which had to be addressed.

...

I've seen many instances of chorus of howls from folks that write sub-standard code and complain that it isn't working as they have hoped. I tend to quote the C++ standard in response.

Oh, I'm sure that's much appreciated. I suppose I could have used that approach when this issue was first raised. But, had I done so, I don't think we would have any serialization library today. Sometimes real solutions to real problems can't be crafted while restricting oneself to portable, standard conforming code.

But we've come full circle. If you want portability, stick to those facilities - in this case explicit registration - if it's not a concern you have the other method available.

Robert Ramey

David Abrahams

12:07 a.m.

"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...

...
...
Therefore, you are still prone to threading problems. There is nothing in the C++ standard which guarantees globals will be initialized before main().

According to the reference I use: The C++ Programming Language by Stroustrup page 244 10.4.9: "A global, namespace or class static object which is created once "at the start of the program" and destroyed at once at the termination of the program." I've always interpreted this to mean what it says and this is reflected in the standard.

You should use the standard rather than TC++PL for this sort of thing; only the standard is a definitive reference.

...

...
But I'm sure someone else on this list can give a definitive answer as to what the standard actually says.

Emil does

...

3.6.2.3: It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope is <...snip...>

Emil, I agree 100% with everything you write here, but especially with this:

...

If there is a consensus that it is desirable to provide for automatic dynamic initialization of namespace-scope objects even if no function from their compilation unit is ever executed, perhaps this needs to be separated in a boost library of its own, as a first step of (possibly) adding such feature to the C++ standard.

Yes, that's an important thing to have in many cases, and I wish we had a portable way of doing it in C++. A Boost library (if such a thing is possible) would be a great step.

...

...
I'm not going to disagree with that. This was the first method I used and it works well and is portable. But the lack of an automatic method like "export" raised a chorus of howls which had to be addressed.

I've seen many instances of chorus of howls from folks that write sub-standard code and complain that it isn't working as they have hoped. I tend to quote the C++ standard in response.

:^) -- Dave Abrahams Boost Consulting www.boost-consulting.com

Martin Bonner

9:44 p.m.

Emil Dotchevski wrote:

...

3.6.2.3: It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.

Yes, but I think you will find that unless the implementation /does/ perform dynamic initialization before the first statement of main, it will have unsolvable ordering problems. Consider a class A defined in a.cpp, and B defined in b.cpp.

a.cpp: extern B theB; A::A() { .... }

b.cpp: extern A theA; B::B() { .... }

Now then. If something from a.cpp is referenced externally then the compiler must initialize 'theB' before it starts using any of the functions in a.cpp. To do that, it has to execute B::B, which means it must initialize 'theA', which means it must call a function (A::A) in a.cpp ... before it is allowed to use any of the functions in a.cpp. Therefore dynamic initialization must be done before main.

Note, of course, that this argument assumes no prior undefined behaviour. The particular /form/ of undefined behaviour which is most likely to break it, is loading a DLL.

...

...
As far as "deadstripping" code - this has indeed been a big problem - especially in release mode. I've referred to it as overzealous optimization.

...

It isn't overzealous -- it's within the specifications of the standard, whether you like it or not.

Chapter and verse? (Given the above argument that dynamic initialization must happen first.)

Ignoring the above, I agree that a boost library for reliable dynamic registration would be really useful (even if I think the standard says it ought to work, it clearly doesn't work with all real compilers)

-- Martin Bonner Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB4 6WZ +44 1223 203894


David Abrahams

10:18 p.m.

"Martin Bonner" <martin.bonner@pitechnology.com> writes:

...

Emil Dotchevski wrote:

...
3.6.2.3: It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.

Yes, but I think you will find that unless the implementation /does/ perform dynamic initialization before the first statement of main, it will have unsolvable ordering problems.

You can have unsolvable ordering problems anyway. That's the whole reason for the Meyers Singleton.
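For reference, a minimal sketch of the Meyers Singleton applied to a one-directional dependency between two globals. The classes here are hypothetical (they are not the A and B of the example quoted below in full), and `saw_b` exists only so the effect can be observed. Each global is replaced by an accessor function with a local static, so the object is constructed on first use regardless of the order in which translation units are initialized.

```cpp
#include <cassert>

struct A;
struct B;

// Accessor functions replace the namespace-scope objects theA and theB.
A& theA();
B& theB();

struct B {
    int value;
    B() : value(2) {}
};

struct A {
    int value;
    bool saw_b;  // records whether B was already usable in A's constructor
    A();
};

// A's constructor uses B. Calling theB() constructs B on demand, so it
// does not matter which translation unit is initialized first.
A::A() : value(1), saw_b(theB().value == 2) {}

A& theA() {
    static A a;  // constructed on first call
    return a;
}

B& theB() {
    static B b;  // constructed on first call
    return b;
}
```

This handles a dependency in one direction only. If the constructors of A and B each needed the other, the accessors would re-enter a local static that is still being initialized, which is undefined behaviour, so a truly circular dependency stays unsolvable either way.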

...

Consider a class A defined in a.cpp, and B defined in b.cpp.

a.cpp: extern B theB; A::A() { .... }

b.cpp: extern A theA; B::B() { .... }

Now then. If something from a.cpp is referenced externally then the compiler must initialize 'theB' before it starts using any of the functions in a.cpp.

Not according to the standard. Why do you say it "must?"

...

...
...
As far as "deadstripping" code - this has indeed been a big problem - especially in release mode. I've referred to it as overzealous optimization.

...
It isn't overzealous -- it's within the specifications of the standard, whether you like it or not.

Chapter and verse? (Given the above argument that dynamic initialization must happen first.)

Completely unrelated, even if your argument about ordering didn't have holes in it. Even if it is a fact that event X cannot follow event Y, there's still no reason event X has to happen at all. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Emil Dotchevski

17 Dec 17 Dec

10 p.m.

...

...
3.6.2.3: It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.

Yes, but I think you will find that unless the implementation /does/ perform dynamic initialization before the first statement of main, it will have unsolvable ordering problems. Consider a class A defined in a.cpp, and B defined in b.cpp.

a.cpp: extern B theB; A::A() { .... }

b.cpp: extern A theA; B::B() { .... }

Now then. If something from a.cpp is referenced externally then the compiler must initialize 'theB' before it starts using any of the functions in a.cpp.

I couldn't find anything like this in the standard. However, in general you can't safely assume that theB, theA, or any other global object is initialized (beyond the mandatory zero initialization) because you may be called from a constructor of another global object. Note that as a library writer, this is beyond your control -- the global object that calls you could be in user code.
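A minimal sketch of the Meyers Singleton mentioned earlier in the thread, applied to the theA/theB example. The type bodies and the accessor name `getA()` are assumptions made for illustration; nothing here is taken from the serialization library. The point is only that a function-local static is constructed on first call, so a constructor of another global object can use it safely regardless of initialization order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for class A from the a.cpp/b.cpp example.
struct A {
    std::vector<std::string> log;
};

// Meyers Singleton accessor (the name getA is an assumption).
// The static is constructed the first time getA() is called, so callers
// never observe an unconstructed A, whatever the order of globals.
A& getA() {
    static A instance;
    return instance;
}

// Hypothetical stand-in for class B: its constructor runs during dynamic
// initialization and uses A. With "extern A theA;" this could touch an
// A that has only been zero-initialized; going through getA() avoids that.
struct B {
    B() { getA().log.push_back("B constructed"); }
};

B theB;  // namespace-scope object, initialized at some point before use
```

Note that in C++03 the initialization of a function-local static is not protected against concurrent first calls; C++11 and later guarantee that it is.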

...

...
...
As far as "deadstripping" code - this has indeed been a big problem - especially in release mode. I've referred to it as overzealous optimization.

...
It isn't overzealous -- it's within the specifications of the standard, whether you like it or not.

Chapter and verse? (Given the above argument that dynamic initialization must happen first.)

The standard does not deal with deadstripping but to me it's obvious that if A) the compiler is not required to initialize a namespace-scope object until a function from its translation unit is called, and B) no function from that translation unit is called, the obvious conclusion is that the object will never be initialized and therefore can be safely deadstripped.
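To make the argument concrete, here is a rough sketch of registration done by a namespace-scope object. All names (`registered_names`, `AutoRegister`, `is_registered`) are hypothetical and this is not the code in export.hpp. It is written as one file so it can be compiled on its own; the comments mark which part would sit in a separate translation unit.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical table of registered class names. A function-local static
// is used so the table exists whenever a registration runs.
std::set<std::string>& registered_names() {
    static std::set<std::string> names;
    return names;
}

// Hypothetical helper whose constructor performs the registration.
struct AutoRegister {
    explicit AutoRegister(const char* name) { registered_names().insert(name); }
};

// --- imagine this part lives in foo.cpp ---------------------------------
namespace {
    // Dynamic initialization of a namespace-scope object. Per 3.6.2/3 it
    // may be deferred until just before the first use of something defined
    // in foo.cpp. If nothing in foo.cpp is ever used, nothing in the
    // standard forces this constructor to run at all.
    const AutoRegister register_foo("foo");
}

// --- imagine this part lives in another file ----------------------------
bool is_registered(const std::string& name) {
    return registered_names().count(name) != 0;
}
```

In a single translation unit the registration is visible, because `is_registered` is defined in the same file as `register_foo`. The portability question discussed here arises only when the two parts are in different object files and nothing from the first one is referenced.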

Robert Ramey

10:20 p.m.

Emil Dotchevski wrote:

...

...
...
...
As far as "deadstripping" code - this has indeed been a big problem - especially in release mode. I've referred to it as overzealous optimization.

...
It isn't overzealous -- it's within the specifications of the standard, whether you like it or not.

Chapter and verse? (Given the above argument that dynamic initialization must happen first.)

The standard does not deal with deadstripping but to me it's obvious that if A) the compiler is not required to initialize a namespace-scope object until a function from its translation unit is called, and B) no function from that translation unit is called, the obvious conclusion is that the object will never be initialized and therefore can be safely deadstripped.

If I recall, the "dead stripping" occurred even if a function was called from the class. test_exported calls the constructor for a derived class (via new) but doesn't explicitly call any "serialize" function. Without extra "help", the program won't link in release mode. I called this "overzealous optimization". I'm not sure we're referring to the same thing here. I solved it by marking the dropped functions with "_export" so that the linker wouldn't know they were never called. Again, I'm not sure we're referring to the same thing here. Then there's the question of an inline constructor. It's not at all clear to me which translation unit, if any, it belongs to. Note that the problem I've been referring to occurs even in the absence of export, so I don't think this (at least this particular problem) is a C++ standards issue. Except perhaps to say that C++ doesn't even address things like dynamic export of functions, which is very common in Windows programs at least. Robert Ramey

David Abrahams

16 Dec 16 Dec

12:01 a.m.

"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...

I looked in export.hpp.

It seems to me that on non-metrowerks compilers it simply defines a namespace-scope const reference object, in a nameless namespace. This means that the reference is guaranteed to be initialized only if you execute a function in the compilation unit that includes export.hpp. So again, you're at the mercy of the optimizer to not deadstrip this code.

It is not allowed to deadstrip the code if any other code in the translation unit gets used, because the object must be initialized before the first use of any object or function in the TU.

...

Second, even if the code is not deadstripped, you don't know when the registration takes place, all you know is it'll happen before you enter a function from the compilation unit that includes export.hpp. Therefore, you are still prone to threading problems. There is nothing in the C++ standard which guarantees globals will be initialized before main().

Correct.

...

For Metrowerks, the code in export.hpp has this comment:

// CodeWarrior fails to construct static members of class templates // when they are instantiated from within templates, so on that // compiler we ask users to specifically register base/derived class // relationships for exported classes. On all other compilers, use of // this macro is entirely optional.

I don't have access to Metrowerks right now but I suspect that this has nothing to do with initializing static members of class templates.

I think you're wrong; I was pretty careful in my analysis.

...

If I correctly recall my experience from about 2 years ago, Metrowerks simply deadstrips the entire compilation unit, functions, global objects -- everything -- simply because it figures that nothing calls this code, and this has nothing to do with templates. I believe that this is correct, standard-complying behavior.

Yes, it is standard conforming to deadstrip the TU if nothing calls the code, and yes, Metrowerks is really good at making that optimization, but no, that's not what the comment is referring to. There really is a compiler bug IIRC.

...

In my opinion, the only portable option is to require manual registration, and leave it up to the user to come up with their own, non-portable solution for automatic registration, and deal with multi-threading/deadstripping problems this could cause. I personally find it easier to register my classes "manually" (note that systems other than the serialization library also require class registration.)

Yes, if you can't guarantee the TU isn't deadstripped (by guaranteeing that it is used somehow), you need to do manual registration. On the other hand, that was the case before I came along and made my order-independence changes to export.hpp; I figured that the Serialization library must have already had some way of guaranteeing that the TU was used, and I wasn't trying to solve that problem. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Emil Dotchevski

12:20 a.m.

...

...
It seems to me that on non-metrowerks compilers it simply defines a namespace-scope const reference object, in a nameless namespace. This means that the reference is guaranteed to be initialized only if you execute a function in the compilation unit that includes export.hpp. So again, you're at the mercy of the optimizer to not deadstrip this code.

It is not allowed to deadstrip the code if any other code in the translation unit gets used, because the object must be initialized before the first use of any object or function in the TU.

Right, but as far as I can see, the whole point of this automatic registration is that it happens just because you link a particular object file that includes the serialization library and uses the registration macros, without a need to explicitly call a function defined in it.

...

...
I don't have access to Metrowerks right now but I suspect that this has nothing to do with initializing static members of class templates.

I think you're wrong; I was pretty careful in my analysis.

...
If I correctly recall my experience from about 2 years ago, Metrowerks simply deadstrips the entire compilation unit, functions, global objects -- everything -- simply because it figures that nothing calls this code, and this has nothing to do with templates. I believe that this is correct, standard-complying behavior.

Yes, it is standard conforming to deadstrip the TU if nothing calls the code, and yes, Metrowerks is really good at making that optimization, but no, that's not what the comment is referring to. There really is a compiler bug IIRC.

Ah I understand. Right, so it's a different issue altogether, but the automatic registration is still not portable. Or did I completely misunderstand the documentation? Robert, was the intended meaning that *if* you explicitly call a function from foo.cpp, then using the class export macro guarantees that class foo is properly registered with the serialization library?

David Abrahams

1:36 a.m.

"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...

Or did I completely misunderstand the documentation? Robert, was the intended meaning that *if* you explicitly call a function from foo.cpp, then using the class export macro guarantees that class foo is properly registered with the serialization library?

That is certainly the only correct way to document it unless we find other means to make that work. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

3:13 a.m.

David Abrahams wrote:

...

"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...
Or did I completely misunderstand the documentation? Robert, was the intended meaning that *if* you explicitly call a function from foo.cpp, then using the class export macro guarantees that class foo is properly registered with the serialization library?

My intention is and was that proper registration will occur whether or not the type is explicitly referred to. That was my goal and I believe it has been achieved for all compilers that boost supports. In order to accomplish this I resorted to compiler specific syntax not described by the standard.

Now it seems the question being raised is whether this is/was the right thing to do. For me the answer was easy. The comments in the first review made it clear that this was a feature considered necessary by a significant number of users. Given that I had set the goal of getting this library into boost, I felt I had to find a way to do it. This is reinforced by my feeling that using the export macro is the way that most users prefer to address this issue. This permits better scalability and automatic registration of those and only those types included. So the question is - what do you tell those users who want this?

It's not really a C++ question but a more general and interesting one. What is a library writer to do when the language standard conflicts with the way you and/or your target audience want to do things? In some cases, you stick to the standard and suffer some inconvenience in order to guarantee portability. In other cases - portability is not a requirement. In this case I "squared the circle" by "solving the issue" for each compiler in a different way so that it's "portable" between this subset and still considered convenient. Was this the wrong decision? Should I have tried harder to say - You shouldn't want that facility because it's not going to be guaranteed to be portable. I tried that and it didn't fly.

Writing a library that hopes to become the most widely used in its class is about a lot more than adhering to the standard. This is reflected in some aspects of the serialization library. I would wager that this is not the only one that has done this.

...

That is certainly the only correct way to document it unless we find other means to make that work.

I would say it has been made to work - it's just not guaranteed to be portable to new compilers. Robert Ramey

David Abrahams

10:13 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...
Or did I completely misunderstand the documentation? Robert, was the intended meaning that *if* you explicitly call a function from foo.cpp, then using the class export macro guarantees that class foo is properly registered with the serialization library?

My intention is and was that proper registration will occur whether or not the type is explicitly referred to.

"Whether the type is explicitly referred to" is completely irrelevant. The question is whether any objects or functions from the translation unit containing the EXPORT are used. Did you intend that proper registration occur even if no objects or functions from the translation unit containing the EXPORT are used?

...

That was my goal and I believe it has been achieved for all compilers that boost supports.

I would be very surprised if that were the case for Metrowerks; as pointed out by Emil, that compiler is very good at optimizing out unused translation units.

...

In order to accomplish this I resorted to compiler specific syntax not described by the standard.

Not for every compiler... and there is no guarantee that the next version of GCC or MSVC won't apply the optimization more aggressively.

...

Now it seems the question being raised is whether this is/was the right thing to do.

No, I don't think that is the question at all, at least, not mine. My question is what promises you can legitimately make to users about what will happen, and whether you have made your promises sufficiently explicit.

...

It's not really a C++ question but a more general and interesting one. What is a library writer to do when the language standard conflicts with the way you and/or your target audience want to do things? In some cases, you stick to the standard and suffer some inconvenience in order to guarantee portability. In other cases - portability is not a requirement. In this case I "squared the circle" by "solving the issue" for each compiler in a different way so that it's "portable" between this subset and still considered convenient.

That's a legitimate thing to do, but I think you need to be very explicit about what you're doing.

...

...
That is certainly the only correct way to document it unless we find other means to make that work.

I would say it has been made to work

I would be very surprised. Do any of your tests actually exercise the case where the EXPORT is in a TU with no used functions or objects?

...

- it's just not guaranteed to be portable to new compilers.

Or new versions of existing compilers. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

11:08 p.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
David Abrahams wrote:

...
"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...
Or did I completely misunderstand the documentation? Robert, was the intended meaning that *if* you explicitly call a function from foo.cpp, then using the class export macro guarantees that class foo is properly registered with the serialization library?

My intention is and was that proper registration will occur whether or not the type is explicitly referred to.

"Whether the type is explicitly referred to" is completely irrelevant. The question is whether any objects or functions from the translation unit containing the EXPORT are used.

Did you intend that proper registration occur even if no objects or functions from the translation unit containing the EXPORT are used?

I don't see how one can determine at compile time whether or not an object of a particular type is going to be used. In fact, it may well vary from one execution run to the next. The situation I had in mind is that of serialization of a derived class through an abstract base class. Suppose one is going to load an archive. The sequence of events is very roughly:

a) read the export tag
b) look it up in the extended type info table
c) using this pointer, look up the deserializer function
d) and invoke it

So the extended type info table has to be initialized with all the types that might be in the archive. This is what export attempts to do. But when compiled for release mode, some compilers drop the deserialization code. This results in a run time error "unregistered type". I always interpreted this as the compilers dropping code that was never explicitly called. This is what occurs when we have an executable or dll which contains entry points never called. Some compilers provide the keyword "__export" to tell the compilers that such functions shouldn't be dropped even though they don't seem to be called. So I used these to be sure the specified code didn't get dropped. Perhaps I should have described this but

...

...
That was my goal and I believe it has been achieved for all compilers that boost supports.

...

I would be very surprised if that were the case for Metrowerks; as pointed out by Emil, that compiler is very good at optimizing out unused translation units.

Well, that would explain one reason we've had so many problems with this compiler - it's too conforming!!!

...

...
In order to accomplish this I resorted to compiler specific syntax not described by the standard.

Not for every compiler... and there is no guarantee that the next version of GCC or MSVC won't apply the optimization more aggressively.

...
Now it seems the question being raised is whether this is/was the right thing to do.

No, I don't think that is the question at all, at least, not mine. My question is what promises you can legitimately make to users about what will happen, and whether you have made your promises sufficiently explicit.

...

...

...

That's a legitimate thing to do, but I think you need to be very explicit about what you're doing.

Well, that's easy to fix. I guess that all we need to say is: a) functionality of export cannot be guaranteed to work on all compilers. b) If you want your code to be guaranteed to be portable to other compilers, don't use export - use explicit registration instead.

...

Do any of your tests actually exercise the case where the EXPORT is in a TU with no used functions or objects?

Well, I thought demo_pimpl did this - but I looked at it and it doesn't use export. So I checked test_exported. It's only one translation unit so it doesn't explicitly test this situation - the derived class constructor is called explicitly. This could easily be moved to another *.cpp file to test this. I did find one thing that was new to me and very interesting:

BOOST_CLASS_EXPORT(polymorphic_derived1)
// MWerks users can do this to make their code work
BOOST_SERIALIZATION_MWERKS_BASE_AND_DERIVED(polymorphic_base, polymorphic_derived1)

which I had never seen before. I don't know if this is related in some way.

...

...
- it's just not guaranteed to be portable to new compilers.

Or new versions of existing compilers.

Obviously. Robert Ramey

David Abrahams

18 Dec 18 Dec

9:41 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
David Abrahams wrote:

...
"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...
Or did I completely misunderstand the documentation? Robert, was the intended meaning that *if* you explicitly call a function from foo.cpp, then using the class export macro guarantees that class foo is properly registered with the serialization library?

My intention is and was that proper registration will occur whether or not the type is explicitly referred to.

"Whether the type is explicitly referred to" is completely irrelevant. The question is whether any objects or functions from the translation unit containing the EXPORT are used.

Did you intend that proper registration occur even if no objects or functions from the translation unit containing the EXPORT are used?

You didn't answer my question. Will you answer the question?

...

I don't see how one can determine at compile time whether or not an object of a particular type is going to be used. In fact, it may well vary from one execution run to the next. The situation I had in mind is that of serialization of a derived class through an abstract base class.

"Of a particular type" is irrelevant. The question is whether any objects or functions in the TU are going to be used. The compiler only has to prove that none will be used, and it can optimize out the TU. No, obviously that doesn't happen at compile-time; it happens at link time.

...

suppose one is going to load an archive. The sequence of events is very roughly:

a) read the export tag
b) look it up in the extended type info table
c) using this pointer, look up the deserializer function
d) and invoke it
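The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: `type_table`, `Loader`, `load_by_tag` and the two classes are hypothetical names, and steps (b) and (c) are collapsed into a single map lookup for brevity.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const = 0;
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Hypothetical stand-in for the extended type info table: it maps an
// export tag to a function that creates (deserializes) the object.
typedef std::unique_ptr<Base> (*Loader)();

std::map<std::string, Loader>& type_table() {
    static std::map<std::string, Loader> table;
    return table;
}

// Hypothetical loader for Derived; a real one would read from the archive.
std::unique_ptr<Base> load_derived() {
    return std::unique_ptr<Base>(new Derived);
}

// (a) the tag has been read from the archive and is passed in,
// (b)/(c) look it up to find the loader, (d) invoke the loader.
std::unique_ptr<Base> load_by_tag(const std::string& tag) {
    std::map<std::string, Loader>::const_iterator it = type_table().find(tag);
    if (it == type_table().end())
        throw std::runtime_error("unregistered type: " + tag);
    return it->second();
}
```

If the entry for a tag was never inserted, for example because the code that would have inserted it was dropped, the lookup fails at run time, which corresponds to the "unregistered type" error described in this message.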

So the extended type info table has to be initialized with all the types that might be in the archive. This is what export attempts to do. But when compiled for release mode, some compilers drop the deserialization code. This results in a run time error "unregistered type". I always interpreted this as the compilers dropping code that was never explicitly called. This is what occurs when we have an executable or dll which contains entry points never called. Some compilers provide the keyword "__export" to tell the compilers that such functions shouldn't be dropped even though they don't seem to be called. So I used these to be sure the specified code didn't get dropped. Perhaps I should have described this but

but?

...

...
...
That was my goal and I believe it has been achieved for all compilers that boost supports.

...
I would be very surprised if that were the case for Metrowerks; as pointed out by Emil, that compiler is very good at optimizing out unused translation units.

Well, that would explain one reason we've had so many problems with this compiler - it's too conforming!!!

Maybe, but there's also one bug (at least).

...

...
...
Now it seems the question being raised is whether this is/was the right thing to do.

No, I don't think that is the question at all, at least, not mine. My question is what promises you can legitimately make to users about what will happen, and whether you have made your promises sufficiently explicit.

...
That's a legitimate thing to do, but I think you need to be very explicit about what you're doing.

Well, that's easy to fix. I guess that all we need to say is:

a) functionality of export cannot be guaranteed to work on all compilers.

I don't think that's sufficiently precise. EXPORT works just fine as long as you ensure that some function in the TU gets called. You could have 50 EXPORTs and one dummy function in the TU, called from main(), that prevents it from getting linked out.

...

b) If you want your code to be guaranteed to be portable to other compilers, don't use export - use explicit registration instead.

Too big a hammer, IMO.

...

...
Do any of your tests actually exercise the case where the EXPORT is in a TU with no used functions or objects?

Well, I thought demo_pimpl did this - but I looked at it and it doesn't use export.

So I checked test_exported. It's only one translation unit so it doesn't explicitly test this situation - the derived class constructor is called explicitly. This could easily be moved to another *.cpp file to test this.

I did find one thing that was new to me and very interesting:

BOOST_CLASS_EXPORT(polymorphic_derived1)

// MWerks users can do this to make their code work

BOOST_SERIALIZATION_MWERKS_BASE_AND_DERIVED(polymorphic_base, polymorphic_derived1)

which I had never seen before. I don't know if this is related in some way.

Sure you have; I pointed it out to you very explicitly when I checked in the new export code. It's not related to the optimization; it is a workaround for the Metrowerks bug we've been discussing elsewhere in this thread.

...

...
...
- it's just not guaranteed to be portable to new compilers.

Or new versions of existing compilers.

Obviously

Of course, if the compiler's docs say it will respect __export, then you're pretty darned safe in the future. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams

3 Jan 3 Jan

4:32 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
David Abrahams wrote:

...
"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...
Or did I completely misunderstand the documentation? Robert, was the intended meaning that *if* you explicitly call a function from foo.cpp, then using the class export macro guarantees that class foo is properly registered with the serialization library?

My intention is and was that proper registration will occur whether or not the type is explicitly referred to.

"Whether the type is explicitly referred to" is completely irrelevant. The question is whether any objects or functions from the translation unit containing the EXPORT are used.

Did you intend that proper registration occur even if no objects or functions from the translation unit containing the EXPORT are used?

I don't see how one can determine at compile time whether or not an object of a particular type is going to be used.

It has nothing to do with whether an object "of a particular type" is going to be used; it has to do with whether objects **defined in the translation unit** are used.

...

In fact, it may well vary from one execution run to the next.

So what? The compiler/linker make that determination, and for code to be stripped they don't need to determine precisely whether all objects are used or not; it's sufficient for them to be able to prove that certain objects *won't* be used. That is sometimes provable, so initialization code is sometimes skipped and/or stripped. Can you answer the question? Did you intend that proper registration occur even if no objects or functions from the translation unit containing the EXPORT are used?

...

suppose one is going to load an archive. The sequence of events is very roughly:

a) read the export tag
b) look it up in the extended type info table
c) using this pointer, look up the deserializer function
d) and invoke it

So the extended type info table has to be initialized with all the types that might be in the archive. This is what export attempts to do.

It doesn't do enough, and there is no portable thing it can do that would be enough.

...

But when compiled for release mode, some compilers drop the deserialization code. This results in a run time error "unregistered type". I always interpreted this as the compilers dropping code that was never explicitly called.

That's a very imprecise way to put it. The compilers are legally optimizing out the initialization of objects in translation units that define no functions or objects used outside the TU. Whether or not code is actually removed from the executable image is another matter.

...

This is what occurs when we have an executable or dll which contains entry points never called. Some compilers provide the keyword "__export" to tell the compilers that such functions shouldn't be dropped even though they don't seem to be called. So I used these to be sure the specified code didn't get dropped. Perhaps I should have described this but

The documentation should describe the limitations on portability of relying on the export code being registered even if nothing else in the TU is used, and should describe how to ensure that it does get registered (just call a little stub function in the TU from main()).
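A rough sketch of the stub-function approach, assuming hypothetical names throughout (`exported_classes`, `RegisterFoo`, `use_foo_translation_unit`, `program_start`). It is written as one file so it compiles on its own; the comments show which part would belong to which source file, and `program_start` stands in for the beginning of main().

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical registration table shared by the whole program.
std::set<std::string>& exported_classes() {
    static std::set<std::string> names;
    return names;
}

// --- would live in foo.cpp -----------------------------------------------
namespace {
    // Stands in for the effect of an EXPORT: a namespace-scope object
    // whose constructor registers the class.
    struct RegisterFoo {
        RegisterFoo() { exported_classes().insert("foo"); }
    };
    const RegisterFoo register_foo;
}

// Stub with external linkage. Calling it is a use of a function defined
// in foo.cpp, so the initialization above must have happened by the time
// the stub is first called, and the object file cannot be dropped as unused.
void use_foo_translation_unit() {}

// --- would live in main.cpp ----------------------------------------------
// In a real project the stub would be declared in a header.
bool program_start() {
    use_foo_translation_unit();  // the single stub call
    return exported_classes().count("foo") == 1;
}
```

One stub per translation unit is enough, however many registrations that translation unit contains.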

...

...
...
That was my goal and I believe it has been achieved for all compilers that boost supports.

...
I would be very surprised if that were the case for Metrowerks; as pointed out by Emil, that compiler is very good at optimizing out unused translation units.

Well, that would explain one reason we've had so many problems with this compiler - it's too conforming!!!

You clearly forgot the smiley.

...

...
No, I don't think that is the question at all, at least, not mine. My question is what promises you can legitimately make to users about what will happen, and whether you have made your promises sufficiently explicit.

...
...

...
That's a legitimate thing to do, but I think you need to be very explicit about what you're doing.

Well, that's easy to fix. I guess that all we need to say is:

a) functionality of export cannot be guaranteed to work on all compilers.

I think that's too broad. It will work as long as you ensure something in the TU is used.

...

b) If you want your code to be guaranteed to be portable to other compilers, don't use export - use explicit registration instead.

Again, too broad. It's much less work to ensure the TU is used (one stub function call) than to replace all the uses of EXPORT with explicit registrations.

...

...
Do any of your tests actually exercise the case where the EXPORT is in a TU with no used functions or objects?

Well, I thought demo_pimpl did this - but I looked at it and it doesn't use export.

So I checked test_exported. It's only one translation unit so it doesn't explicitly test this situation - the derived class constructor is called explicitly. This could easily be moved to another *.cpp file to test this.

Not a bad idea.

...

I did find one thing that was new to me and very interesting:

BOOST_CLASS_EXPORT(polymorphic_derived1)

// MWerks users can do this to make their code work

BOOST_SERIALIZATION_MWERKS_BASE_AND_DERIVED(polymorphic_base, polymorphic_derived1)

which I had never seen before.

Sure you have. I told you about it and asked you to look it over when I checked in the EXPORT ordering fixes.

...

I don't know if this is related in some way.

No, it's a completely separate issue. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

6:08 p.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...

...
I don't see how one can determine at compile time whether or not an object of a particular type is going to be used.

...

It has nothing to do with whether an object "of a particular type" is going to be used; it has to do with whether objects **defined in the translation unit** are used.

...

...
In fact, it may well vary from one execution run to the next.

So what? The compiler/linker make that determination, and for code to be stripped they don't need to determine precisely whether all objects are used or not; it's sufficient for them to be able to prove that certain objects *won't* be used. That is sometimes provable, so initialization code is sometimes skipped and/or stripped.

...

Can you answer the question?

Did you intend that proper registration occur even if no objects or functions from the translation unit containing the EXPORT are used?

Hmmm - there are a couple of things going on which are a little confusing:

a) instantiation of code to serialize data type T with archive A. This may include (I forget the exact names) basic_archive_serializer<A, T>. It was my intention that this code be instantiated for all combinations of archive types and data types actually used and no others. This is the fundamental function of BOOST_CLASS_EXPORT.
b) construction of an instance of void_caster for each pair of types B and D where B is a base class of D. This is the function of BOOST_SERIALIZATION_BASE_OBJECT_NVP.
c) construction of an instance of extended_type_info for type T which is used to support the above.

So, if I make a program which loads objects of derived class D through a pointer to base class B, I will need the above described code to be instantiated even though no calls to the functions in the header containing EXPORT are explicitly made. If this code isn't found at runtime - due to code stripping - the program will throw an "unregistered" exception. This is how it is intended to function and is indeed how I believe it functions on the compilers we test with.

...

...
So the extended type info table has to be initialized with all the types that might be in the archive. This is what export attempts to do.

It doesn't do enough, and there is no portable thing it can do that would be enough.

Hmmm - well it does work. But I DID have to include "_export" - non portable construct to make it work - that is the function of the "force_include.hpp" header.
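A hedged sketch of what a per-compiler "keep this symbol" macro can look like. `SKETCH_KEEP` and `sketch_registration_hook` are hypothetical names, and the thread does not say that force_include.hpp uses exactly these spellings. `__declspec(dllexport)` (MSVC) and `__attribute__((used))` (GCC-compatible compilers) are real vendor extensions, and neither is part of the C++ standard.

```cpp
// Select a vendor extension that asks the toolchain to keep a symbol.
#if defined(_MSC_VER)
#  define SKETCH_KEEP __declspec(dllexport)
#elif defined(__GNUC__)
#  define SKETCH_KEEP __attribute__((used))
#else
#  define SKETCH_KEEP  // no known extension: the symbol may be stripped
#endif

// A function marked so that the compiler emits it even if it appears unused.
SKETCH_KEEP int sketch_registration_hook() { return 1; }
```

On a compiler that matches neither branch the macro expands to nothing, which is exactly the portability gap under discussion.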

...

The documentation should describe the limitations on portability of relying on the export code being registered even if nothing else in the TU is used, and should describe how to ensure that it does get registered

Well, that's an easy fix. On the other hand, I don't know that all compilers will correctly implement this behavior. For example, will VC behave as expected when "function level linking" (which I always use and test with) is enabled? Only testing can really resolve this.

...

(just call a little stub function in the TU from main()).

I can already hear the howling in the distance. If one can do this then it would be just as easy to explicitly register all the derived types at the beginning of an archive and avoid BOOST_CLASS_EXPORT altogether. Then there is the question of dynamically loaded DLLS which is involved here. Given that it is OS dependent, I expect the usage of dynamically loaded DLLS is not portable in any case so it's easy to address just by so indicating in the documentation. In any case, it's easy to add this explanation to the documentation and I will do this.
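A minimal sketch of the "stub function" alternative, using a hypothetical name table instead of the library's own registration facilities (`known_types`, `register_shapes` and `is_registered` are all made up for illustration). The standard only guarantees that a translation unit's deferred initializers run before the first use of a function or object defined in it, so calling the stub from main() is what makes the registration portable; here the stub simply performs the registration itself.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the table of registered type names.
inline std::set<std::string>& known_types() {
    static std::set<std::string> table;
    return table;
}

// Imagine this defined in shapes.cpp. main() calls it once at startup,
// which also forces that translation unit to be linked and initialized.
void register_shapes() {
    known_types().insert("circle");
    known_types().insert("square");  // inserting twice is harmless for a set
}

inline bool is_registered(const std::string& name) {
    return known_types().count(name) != 0;
}
```

Because the table is a set, calling `register_shapes()` more than once is harmless.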

...

...
Well, that would explain one reason we've had so much problem with this compiler - it's too conforming!!!

...

You clearly forgot the smiley.

I'm not sure I meant that as a joke. BTW, in the latest HEAD cw tests there are only two test failures (excluding a couple of failures due to artifacts of the tests). So I'm thinking that this is going to be a non-issue for all practical purposes. On the other hand, I don't think the tests have been run with this compiler in release mode so it's not a real test. On the other hand, if this compiler has the capability to build windows DLLS, it has to have something like the _export keyword. Currently, force_include.hpp doesn't include anything like this for CW but maybe it could. Note that other compilers in wide usage (MS, Borland, GCC) all include some syntax for addressing this issue. The header force_include.hpp defines macros for inserting this syntax in the proper place for each compiler. I'm surprised that no other libraries that I know of have had this problem. If it were more common I would expect something like "force_include.hpp" would be needed at the Boost level rather than the BS level.

As an aside - sort of - implementing serialization in a way which meets the expectations of most programmers - as reflected on this and the user list - could not be done without a number of contortions. It would be interesting to consider what changes/refinements to the C++ language would be helpful in making a better system of this type. Aspects of this are:

a) extended type info - no portable external representation of a type and no obvious way to handle namespaces.
b) code instantiation in certain cases like the above
c) sequence of initialization constructors
d) DLLS and all of the above.
e) compile time reflection

Robert Ramey

Emil Dotchevski

6:42 p.m.

...

...
Did you intend that proper registration occur even if no objects or functions from the translation unit containing the EXPORT are used?

Hmmm - there are a couple of things going on which are a little confusing:

a) instantiation of code to serialize data type T with archive A. This may include (I forget the exact names) basic_archive_serializer<A, T>. It was my intention that this code be instantiated for all combinations of archive types and data types actually used and no others. This is the fundamental function of BOOST_CLASS_EXPORT

How does BOOST_CLASS_EXPORT help me if I have my own archive serializer? Even if I'm using one of the archives the library provides for me, I think that it is unreasonable to automatically register all combinations of archivers and types because it introduces physical coupling between translation units which don't depend on each other (in a program where only one archive type is used, which is typical.)

...

b) construction of an instance of void_caster for each pair of types B and D where B is a base class of D. This is the function of .. BOOST_SERIALIZATION_BASE_OBJECT_NVP

c) construction of an instance of extended_type_info for type T which is used to support the above.

So, if I make a program which loads objects of derived class D through a pointer to base class B I will need the above described code to be instantiated even though no calls to the functions in the header containing EXPORT are explicitly made.

This discussion is not about whether or not we need those templates to be instantiated; of course we do.

...

...
...
So the extended type info table has to be initialized with all the types that might be in the archive. This is what export attempts to do.

It doesn't do enough, and there is no portable thing it can do that would be enough.

Hmmm - well it does work. But I DID have to include "_export" - non portable construct to make it work - that is the function of the "force_include.hpp" header.

It does work until it doesn't. In several projects I have worked on, I have seen similar systems relying on "auto registration" to fail due to seemingly unrelated changes in the code base, or when the code is ported to another operating system or platform. The only compiler I have observed to not deadstrip such code consistently is MSVC. I still think that whether it works or not is irrelevant if the documented behavior can not be delivered within the limits of the C++ standard. In my mind it only works because the compiler is not smart enough.

...

...
The documentation should describe the limitations on portability of relying on the export code being registered even if nothing else in the TU is used, and should describe how to ensure that it does get registered

Well, that's an easy fix.

On the other hand, I don't know that all compilers will correctly implement this behavior. For example, will VC behave as expected when "function level linking" (which I always use and test with) is enabled? Only testing can really resolve this.

This is true for any provision in the standard that is optional, or behaviors that are "implementation defined". I don't think it is a good idea to rely on implementation-defined behavior. If we insist that we rely on it, we should clearly state that the behavior is not portable and has only been tested on such and such compilers/platforms. In my opinion the "auto registration" should be removed from the serialization library altogether, and replaced with an example demonstrating how it could be done, provided that your compiler doesn't dead strip this kind of code even though it is allowed to.

Robert Ramey

8:35 p.m.

Emil Dotchevski wrote:

...

...
...
Did you intend that proper registration occur even if no objects or functions from the translation unit containing the EXPORT are used?

Hmmm - there are a couple of things going on which are a little confusing:

a) instantiation of code to serialize data type T with archive A. This may include (I forget the exact names) basic_archive_serializer<A, T>. It was my intention that this code be instantiated for all combinations of archive types and data types actually used and no others. This is the fundamental function of BOOST_CLASS_EXPORT

How does BOOST_CLASS_EXPORT help me if I have my own archive serializer? Even if I'm using one of the archives the library provides for me, I think that it is unreasonable to automatically register all combinations of archivers and types because it introduces physical coupling between translation units which don't depend on each other (in a program where only one archive type is used, which is typical.)

All combinations of A and T actually used - and only those combinations - are used to instantiate code. That is, no dead code is generated and all required code IS instantiated. Dave's implementation of this facility (unlike my previous one) does NOT depend on knowing in advance which archive classes might be used. It just works.

...

...
b) construction of an instance of void_caster for each pair of types B and D where B is a base class of D. This is the function of .. BOOST_SERIALIZATION_BASE_OBJECT_NVP

c) construction of an instance of extended_type_info for type T which is used to support the above.

So, if I make a program which loads objects of derived class D through a pointer to base class B I will need the above described code to be instantiated even though no calls to the functions in the header containing EXPORT are explicitly made.

This discussion is not about whether or not we need those templates to be instantiated; of course we do.

...

...
...
...
So the extended type info table has to be initialized with all the types that might be in the archive. This is what export attempts to do.

It doesn't do enough, and there is no portable thing it can do that would be enough.

Hmmm - well it does work. But I DID have to include "_export" - non portable construct to make it work - that is the function of the "force_include.hpp" header.

It does work until it doesn't. In several projects I have worked on, I have seen similar systems relying on "auto registration" to fail due to seemingly unrelated changes in the code base, or when the code is ported to another operating system or platform. The only compiler I have observed to not deadstrip such code consistently is MSVC.

That's what I found when I tested it. I addressed it in the only way I could figure out how to do it. Now it turns out that it's not portable according to the standard - and can't be made so. So you have two choices - use an alternate facility (explicit registration) which is provided or not.

...

I still think that whether it works or not is irrelevant if the documented behavior can not be delivered within the limits of the C++ standard. In my mind it only works because the compiler is not smart enough.

...

I don't think it is a good idea to rely on implementation-defined behavior.

Not everyone shares that opinion. For example, anyone using dynamic loading of DLLS can't be sharing that opinion.

...

If we insist that we rely on it, we should clearly state that the behavior is not portable and has only been tested on such and such compilers/platforms.

We've agreed to do this.

...

In my opinion the "auto registration" should be removed from the serialization library altogether, and replaced with an example demonstrating how it could be done, provided that your compiler doesn't dead strip this kind of code even though it is allowed to.

LOL - remember the only reason EXPORT is even in there is because its absence was considered a "show stopper" to getting the library approved. So your idea is wildly unrealistic. Of course there's nothing to prevent you from stripping the EXPORT functionality from your own copy. But then, that would be the same as just not using it. So it's weird to me that you think we should prohibit other users from using parts of the library for things that C++ doesn't address. How does this prevent you from using the library in the way you feel is appropriate? Robert Ramey

Emil Dotchevski

10:22 p.m.

...

...
How does BOOST_CLASS_EXPORT help me if I have my own archive serializer? Even if I'm using one of the archives the library provides for me, I think that it is unreasonable to automatically register all combinations of archivers and types because it introduces physical coupling between translation units which don't depend on each other (in a program where only one archive type is used, which is typical.)

All combinations of A and T actually used - and only those combinations - are used to instantiate code. That is, no dead code is generated and all required code IS instantiated. Dave's implementation of this facility (unlike my previous one) does NOT depend on knowing in advance which archive classes might be used. It just works.

For a combination of A and T to work, you need to turn an instantiation of a << template to a boost::function call (or something similar) that doesn't use templates. Are you saying that this instantiation is somehow postponed until actually used? If not, this is code that might not be called, and in my opinion the fact that it is instantiated is a problem.

...

...
...
...
...
So the extended type info table has to be initialized with all the types that might be in the archive. This is what export attempts to do.

It doesn't do enough, and there is no portable thing it can do that would be enough.

Hmmm - well it does work. But I DID have to include "_export" - non portable construct to make it work - that is the function of the "force_include.hpp" header.

It does work until it doesn't. In several projects I have worked on, I have seen similar systems relying on "auto registration" to fail due to seemingly unrelated changes in the code base, or when the code is ported to another operating system or platform. The only compiler I have observed to not deadstrip such code consistently is MSVC.

That's what I found when I tested it. I addressed it in the only way I could figure out how to do it. Now it turns out that it's not portable according to the standard - and can't be made so. So you have two choices - use an alternate facility (explicit registration) which is provided or not.

The serialization library provides a feature which leads to writing more non-portable code. I do want compilers to deadstrip such global objects because it makes sense and because the standard allows it. I wouldn't care if it was just any other library we're talking about; but a boost library is rather influential and should encourage portability, not the other way around.

...

...
I still think that whether it works or not is irrelevant if the documented behavior can not be delivered within the limits of the C++ standard. In my mind it only works because the compiler is not smart enough.

...
I don't think it is a good idea to rely on implementation-defined behavior.

Not everyone shares that opinion. For example, anyone using dynamic loading of DLLS can't be sharing that opinion.

Dynamic loading of DLLs is in fact one of the reasons why the standard does not require that initialization of global objects must be done at startup. Note also that DLLs require specific, well documented support from the compiler. The behavior the auto-registration uses is a failure to remove dead code. I hope you agree that there is a difference.

...

...
In my opinion the "auto registration" should be removed from the serialization library altogether, and replaced with an example demonstrating how it could be done, provided that your compiler doesn't dead strip this kind of code even though it is allowed to.

LOL - remember the only reason EXPORT is even in there is because its absence was considered a "show stopper" to getting the library approved. So your idea is wildly unrealistic.

I don't remember as I was not part of that discussion. :) Are you suggesting that once a feature is added to a (boost) library, it is unreasonable for someone to argue against it?

...

Of course there's nothing to prevent you from stripping the EXPORT functionality from your own copy. But then, that would be the same as just not using it. So it's weird to me that you think we should prohibit other users from using parts of the library for things that C++ doesn't address. How does this prevent you from using the library in the way you feel is appropriate?

It's not about my own copy and whether or not I would use auto registration; rest assured that I won't. :) The only reason why I pointed out this problem a few weeks ago is that I wanted to share my own experience and opinion. I used to use auto registration in several projects I worked on. I have to admit, every time auto registration broke, I was able to fix it, at least temporarily. Eventually I got tired of it and I recognized that my efforts were misguided. We should help compilers remove dead code, not try to prevent them from doing it!

David Abrahams

4 Jan 4 Jan

4:48 a.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

All combinations of A and T actually used - and only those combinations - are used to instantiate code.

I don't think that can be true for any reasonable definition of "used." Code in one TU can't tell what other TUs are doing, so the code is instantiated for all archives whose headers have been included. IIRC it gets around the ordering problems you were having before by doing the instantiation as the result of overload resolution on an unqualified call, which happens in phase 2 and thus doesn't depend on having seen the overload for a particular archive type before the serializable type is EXPORTed. It doesn't do any other deep magic, IIRC. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

5:55 a.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
All combinations of A and T actually used - and only those combinations - are used to instantiate code.

I don't think that can be true for any reasonable definition of "used." Code in one TU can't tell what other TUs are doing, so the code is instantiated for all archives whose headers have been included. IIRC it gets around the ordering problems you were having before by doing the instantiation as the result of overload resolution on an unqualified call, which happens in phase 2 and thus doesn't depend on having seen the overload for a particular archive type before the serializable type is EXPORTed. It doesn't do any other deep magic, IIRC.

After I thought about this some more I've also come to agree with this. So the above would be better said

All combinations of A and T actually used - and only those combinations - are used to instantiate code - in a translation unit with at least one function called either directly or indirectly from main.

I never really picked up on this. This is probably because all but a couple of tests are only one module long. I suspect that were I to modify one of the export tests - it would fail in release mode in one or another of the compilers. But maybe not. The inclusion of "force_include.hpp" and corresponding usage of "_export" will probably guarantee that the compiled code is never stripped. So the problem may never be manifested in such a test.

question: does the CW compiler include something like "_export" in order to compile windows (or *nix?) DLLS (shared libraries)?

question: what if anything does the standard say about something like "_export" which is necessary in order to communicate that the corresponding code should never be stripped?

I've been under the impression that

a) "_export" is not portable
b) "_export" or something like it is necessary to suppress code stripping
c) suppression of code stripping is necessary in order to make a DLL

which would imply that in order to produce a DLL, a compiler must be non-conforming - or at least support non-conformant behavior.

Am I missing something here?

Robert Ramey

David Abrahams

3:39 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
All combinations of A and T actually used - and only those combinations - are used to instantiate code.

I don't think that can be true for any reasonable definition of "used." Code in one TU can't tell what other TUs are doing, so the code is instantiated for all archives whose headers have been included. IIRC it gets around the ordering problems you were having before by doing the instantiation as the result of overload resolution on an unqualified call, which happens in phase 2 and thus doesn't depend on having seen the overload for a particular archive type before the serializable type is EXPORTed. It doesn't do any other deep magic, IIRC.

After I thought about this some more I've also come to agree with this.

So the above would be better said

All combinations of A and T actually used - and only those combinations - are used to instantiate code - in a translation unit with at least one function called either directly or indirectly from main.

Sorry, I don't think that's an accurate statement of anything. This is accurate:

In a given translation unit (TU), all combinations of A and T, where A is an archive registered (in the TU) with BOOST_SERIALIZATION_REGISTER_ARCHIVE and T is a class exported (in the TU) with BOOST_CLASS_EXPORT(**) are used to instantiate polymorphic pointer serialization code. Period.

The translation unit or some portion of it may later be discarded if the linker can prove none of its functions or objects are used, or else its initializers may be skipped. Separate issue.

In more detail: each time you use BOOST_SERIALIZATION_REGISTER_ARCHIVE, it instantiates pointer serialization for that archive in combination with all BOOST_CLASS_EXPORTed types, and vice versa: each time you use BOOST_CLASS_EXPORT, it instantiates pointer serialization for that class in combination with all BOOST_SERIALIZATION_REGISTER_ARCHIVEed archives. In a TU where the uses of these macros do not appear, no such instantiation takes place.

By the way, my earlier explanation of the use of phase 2 lookups was inaccurate. The mechanism I describe in this paragraph doesn't depend on (and indeed, can't use) phase 2 lookups.

(**) **PLEASE** be a good Boost citizen and change that name to BOOST_SERIALIZATION_EXPORT_CLASS!! The meaning of "exporting a class" cannot remain reserved by the serialization library throughout Boost. It's too general a description of things that many libraries will want to do, and while Boost.Serialization is important, it's not more important than every other library that might want to "export a class."

...

I never really picked up on this. This is probably because all but a couple of tests are only one module long. I suspect that were I to modify one of the export tests - it would fail in release mode in one or another of the compilers. But maybe not. The inclusion of "force_include.hpp" and corresponding usage of "_export" will probably guarantee that the compiled code is never stripped. So the problem may never be manifested in such a test.

question: does the CW compiler include something like "_export" in order to compile windows (or *nix?) DLLS (shared libraries)?

I don't know. I'm sure there are online docs somewhere.

...

question:what if anything does the standard say about something like "_export" which is necessary in order to communicate that the corresponding code should never be stripped?

Nothing; it's a pure extension.

...

I've been under the impression that

a) "_export" is not portable
b) "_export" or something like it is necessary to suppress code stripping
c) suppression of code stripping is necessary in order to make a DLL

When making a DLL, the linker is free to decide not to strip anything. DLLs are outside the standard (a pure extension), and the standard doesn't mandate any stripping anyhow.

...

which would imply that in order to produce a DLL, a compiler must be non-conforming - or at least support non-conformant behavior.

Am I missing something here?

Yeah, there's no law that says a linker ever has to strip anything. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Emil Dotchevski

10:05 p.m.

New subject: [Serliazation] Auto registration/physical coupling (was: Thread-safety again)

...

Sorry, I don't think that's an accurate statement of anything. This is accurate:

In a given translation unit (TU), all combinations of A and T, where A is an archive registered (in the TU) with BOOST_SERIALIZATION_REGISTER_ARCHIVE and T is a class exported (in the TU) with BOOST_CLASS_EXPORT(**) are used to instantiate polymorphic pointer serialization code. Period.

The translation unit or some portion of it may later be discarded if the linker can prove none of its functions or objects are used, or else its initializers may be skipped. Separate issue.

Just to clarify, besides the portability issues, I brought up the issue of the templates being instantiated by the BOOST_CLASS_EXPORT macro because it introduces physical coupling between the classes being registered and the serialization library. Perhaps I need to clarify what I mean.

//foo.h

class foo;

template <class A> void operator<<( A &, foo const & );

class foo
{
public:
    foo();
private:
    int m_;
    template <class A> friend void operator<<( A &, foo const & );
};

template <class A> void operator<<( A & a, foo const & x ) { a << x.m_; }

The above foo.h defines class foo, and also specifies how objects of class foo are serialized, yet there is no (physical) coupling between foo.h and a serialization library, because foo.h does not include any headers. This is a good thing: typically, most of the compilation units that #include "foo.h" will not serialize foo objects; the ones that do serialize foo objects will know to #include "foo.h" and a (compatible) serialization library. Also consider that I can write a program that makes use of class foo yet doesn't use/include/link a serialization library.

This would not be possible if foo.cpp auto-registers class foo with a serialization library.
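To make the point concrete, here is a small runnable variant of the foo.h idea, under two assumptions that are not in the post: foo() is defined inline (the post only declares it), and `text_archive_sketch` is a hypothetical toy archive, not a Boost.Serialization one. The foo part needs no serialization header; the archive is only introduced afterwards, by the code that actually wants to serialize.

```cpp
#include <sstream>
#include <string>

// ---- foo.h: no serialization library headers are included ----
class foo;

template <class A> void operator<<(A&, foo const&);

class foo {
public:
    foo() : m_(42) {}  // assumption: defined inline with an arbitrary value
private:
    int m_;
    template <class A> friend void operator<<(A&, foo const&);
};

// Serialization of foo, written against an unknown archive type A.
template <class A>
void operator<<(A& a, foo const& x) { a << x.m_; }

// ---- client code that chooses an archive (hypothetical toy archive) ----
struct text_archive_sketch {
    std::ostringstream out;
};

// How the toy archive stores an int; found via argument-dependent lookup
// when the template above is instantiated for text_archive_sketch.
inline void operator<<(text_archive_sketch& a, int v) { a.out << v << ' '; }
```

A translation unit that includes only the foo.h part compiles and links without any archive type being defined at all.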

Robert Ramey

10:36 p.m.

New subject: [Serliazation] Auto registration/physical coupling(was: Thread-safety again)

Emil Dotchevski wrote:

...

...
Sorry, I don't think that's an accurate statement of anything. This is accurate:

In a given translation unit (TU), all combinations of A and T, where A is an archive registered (in the TU) with BOOST_SERIALIZATION_REGISTER_ARCHIVE and T is a class exported (in the TU) with BOOST_CLASS_EXPORT(**) are used to instantiate polymorphic pointer serialization code. Period.

The translation unit or some portion of it may later be discarded if the linker can prove none of its functions or objects are used, or else its initializers may be skipped. Separate issue.

Just to clarify, besides the portability issues, I brought up the issue of the templates being instantiated by the BOOST_CLASS_EXPORT macro because it introduces physical coupling between the classes being registered and the serialization library. Perhaps I need to clarify what I mean.

//foo.h

class foo;

template <class A> void operator<<( A &, foo const & );

class foo { public: foo(); private: int m_; template <class A> friend void operator<<( A &, foo const & ); };

template <class A> void operator<<( A & a, foo const & x ) { a << x.m_; }

The above foo.h defines class foo, and also specifies how objects of class foo are serialized, yet there is no (physical) coupling between foo.h and a serialization library, because foo.h does not include any headers. This is a good thing: typically, most of the compilation units that #include "foo.h" will not serialize foo objects; the ones that do serialize foo objects will know to #include "foo.h" and a (compatible) serialization library. Also consider that I can write a program that makes use of class foo yet doesn't use/include/link a serialization library.

This would not be possible if foo.cpp auto-registers class foo with a serialization library.


Robert Ramey

10:53 p.m.

New subject: [Serliazation] Auto registration/physical coupling(was: Thread-safety again)

Emil Dotchevski wrote:

...

...
Sorry, I don't think that's an accurate statement of anything. This is accurate:

In a given translation unit (TU), all combinations of A and T, where A is an archive registered (in the TU) with BOOST_SERIALIZATION_REGISTER_ARCHIVE and T is a class exported (in the TU) with BOOST_CLASS_EXPORT(**) are used to instantiate polymorphic pointer serialization code. Period.

The translation unit or some portion of it may later be discarded if the linker can prove none of its functions or objects are used, or else its initializers may be skipped. Separate issue.

Just to clarify, besides the portability issues, I brought up the issue of the templates being instantiated by the BOOST_CLASS_EXPORT macro because it introduces physical coupling between the classes being registered and the serialization library. Perhaps I need to clarify what I mean.

//foo.h

class foo;

template <class A> void operator<<( A &, foo const & );

class foo { public: foo(); private: int m_; template <class A> friend void operator<<( A &, foo const & ); };

template <class A> void operator<<( A & a, foo const & x ) { a << x.m_; }

The above foo.h defines class foo, and also specifies how objects of class foo are serialized, yet there is no (physical) coupling between foo.h and a serialization library, because foo.h does not include any headers. This is a good thing: typically, most of the compilation units that #include "foo.h" will not serialize foo objects; the ones that do serialize foo objects will know to #include "foo.h" and a (compatible) serialization library. Also consider that I can write a program that makes use of class foo yet doesn't use/include/link a serialization library.

We have considered that. And there is even a specific test to be sure that serialization code isn't included if serialization isn't being used. The test is test_include. You can use BOOST_CLASS_EXPORT in a header and extra code will be generated until you do: ar << t

a) So, what I suggest is using BOOST_CLASS_EXPORT in one's header;
b) Declare - but don't define the serialize function
c) in a separate foo.ipp define serialize function.
d) make one module which includes foo.ipp and compile it and add it to your library of common application types. This module will include headers for all archives you expect to use or alternatively, the polymorphic archive.
e) include foo.hpp in any program which calls any functions on foo - including serialization.
f) compile application.
g) link with library of common application types.

If the application uses serialization, it gets the code from the library. Otherwise not.
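Steps b) through d) can be sketched in a single file as follows. This is only an illustration under assumptions: `string_archive_sketch` is a hypothetical archive, the file boundaries are marked by comments, and explicit template instantiation is one possible way to realize step d) that the post itself does not spell out. BOOST_CLASS_EXPORT is deliberately left out so the block does not depend on Boost.

```cpp
#include <sstream>

// ---- hypothetical archive header ----
struct string_archive_sketch {
    std::ostringstream out;
    template <class T>
    string_archive_sketch& operator&(const T& v) { out << v << ' '; return *this; }
};

// ---- foo.hpp: serialize is declared but not defined (step b) ----
class foo;
template <class Archive> void serialize(Archive& ar, foo& x);

class foo {
public:
    explicit foo(int m = 0) : m_(m) {}
private:
    int m_;
    template <class Archive> friend void serialize(Archive&, foo&);
};

// ---- foo.ipp: the definition (step c) ----
template <class Archive>
void serialize(Archive& ar, foo& x) { ar & x.m_; }

// ---- one module in the common library (step d) ----
// Explicit instantiation for each archive the library expects to support.
template void serialize<string_archive_sketch>(string_archive_sketch&, foo&);
```

The explicit instantiation line is where the set of supported archives has to be listed, so that module does need to know the archive types in advance.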

...

This would not be possible if foo.cpp auto-registers class foo with a serialization library.

Robert Ramey

Emil Dotchevski

11:17 p.m.

New subject: [Serliazation] Auto registration/physical coupling(was:Thread-safety again)

I think I do not understand.

...

...
Just to clarify, besides the portability issues, I brought up the issue of the templates being instantiated by the BOOST_CLASS_EXPORT macro because it introduces physical coupling between the classes being registered and the serialization library. Perhaps I need to clarify what I mean.

//foo.h

class foo;

template <class A> void operator<<( A &, foo const & );

class foo { public: foo(); private: int m_; template <class A> friend void operator<<( A &, foo const & ); };

template <class A> void operator<<( A & a, foo const & x ) { a << x.m_; }

The above foo.h defines class foo, and also specifies how objects of class foo are serialized, yet there is no (physical) coupling between foo.h and a serialization library, because foo.h does not include any headers. This is a good thing: typically, most of the compilation units that #include "foo.h" will not serialize foo objects; the ones that do serialize foo objects will know to #include "foo.h" and a (compatible) serialization library. Also consider that I can write a program that makes use of class foo yet doesn't use/include/link a serialization library.

We have considered that. And there is even a specific test to be sure that serialization code isn't included if serialization isn't being used. The test is test_include. You can use BOOST_CLASS_EXPORT in a header and extra code will be generated until you do: ar << t

a) So, what I suggest is using BOOST_CLASS_EXPORT in one's header;

How am I supposed to use BOOST_CLASS_EXPORT without including a header file from the serialization library? Including serialization library headers introduces physical coupling between foo.h and the serialization library, because as Dave pointed out, BOOST_CLASS_EXPORT instantiates all kinds of templates. You would be relying on the optimizer to strip those instances if they're not used, but the physical coupling is still there.

...

b) Declare - but don't define the serialize function c) in a separate foo.ipp define serialize function. d) make one module which includes foo.ipp and compile it and add it to your library of common application types. This module will include headers for all archives you expect to use or alternatively, the polymorphic archive.

Wouldn't that mean that foo.ipp (which I suppose is part of the "foo" stuff and is therefore shared between all projects that use foo) would have to know in advance what archives I will be using to serialize foo objects? Note that in my example above, I don't merely declare the serialization function, I define it but I do not include any headers. Now each project that uses foo can include different (including unknown in advance) archives and still use the serialization functionality defined by foo.h. Alternatively, a project that uses foo is free to not include any archives and to not call serialization functions. I don't see how this could be achieved in the system you describe. I would not be able to build a project that uses foo if boost serialization is unavailable, whereas in my example that's possible.

...

e) include foo.hpp in any program which calls any functions on foo - including serialization. f) compile application. g) link with library of common application types.

If the application uses serialization, it gets the code from the library. Otherwise not.

Yet there is still physical coupling between foo.h and the serialization library. Or am I missing something?

Robert Ramey

11:26 p.m.

New subject: [Serliazation] Auto registration/physicalcoupling(was:Thread-safety again)

Emil Dotchevski wrote:

...

I think I do not understand.

...
...
Just to clarify, besides the portability issues, I brought up the issue of the templates being instantiated by the BOOST_CLASS_EXPORT macro because it introduces physical coupling between the classes being registered and the serialization library. Perhaps I need to clarify what I mean.

//foo.h

class foo;

template <class A> void operator<<( A &, foo const & );

class foo
{
public:
    foo();
private:
    int m_;
    template <class A> friend void operator<<( A &, foo const & );
};

template <class A> void operator<<( A & a, foo const & x )
{
    a << x.m_;
}

The above foo.h defines class foo, and also specifies how objects of class foo are serialized, yet there is no (physical) coupling between foo.h and a serialization library, because foo.h does not include any headers. This is a good thing: typically, most of the compilation units that #include "foo.h" will not serialize foo objects; the ones that do serialize foo objects will know to #include "foo.h" and a (compatible) serialization library. Also consider that I can write a program that makes use of class foo yet doesn't use/include/link a serialization library.

We have considered that. And there is even a specific test to be sure that serialization code isn't included if serialization isn't being used. The test is test_include. You can use BOOST_CLASS_EXPORT in a header and no extra code will be generated until you do: ar << t

a) So, what I suggest is using BOOST_CLASS_EXPORT in one's header;

How am I supposed to use BOOST_CLASS_EXPORT without including a header file from the serialization library? Including serialization library headers introduces physical coupling between foo.h and the serialization library, because as Dave pointed out, BOOST_CLASS_EXPORT instantiates all kinds of templates. You would be relying on the optimizer to strip those instances if they're not used, but the physical coupling is still there.

...
b) Declare - but don't define the serialize function c) in a separate foo.ipp define serialize function. d) make one module which includes foo.ipp and compile it and add it to your library of common application types. This module will include headers for all archives you expect to use or alternatively, the polymorphic archive.

Wouldn't that mean that foo.ipp (which I suppose is part of the "foo" stuff and is therefore shared between all projects that use foo) would have to know in advance what archives I will be using to serialize foo objects?

Yes - that's why I recommend compiling the modules to a library and linking the final application with the library. Hence code is only included when an archive is created. Note that "function level linking" is useful here.

...

Note that in my example above, I don't merely declare the serialization function, I define it but I do not include any headers. Now each project that uses foo can include different (including unknown in advance) archives and still use the serialization functionality defined by foo.h.

Alternatively, a project that uses foo is free to not include any archives and to not call serialization functions. I don't see how this could be achieved in the system you describe. I would not be able to build a project that uses foo if boost serialization is unavailable, whereas in my example that's possible.

...
e) include foo.hpp in any program which calls any functions on foo - including serialization. f) compile application. g) link with library of common application types.

If the application uses serialization, it gets the code from the library. Otherwise not.

Yet there is still physical coupling between foo.h and the serialization library.

Or am I missing something?

Why don't you look at export.hpp and compile and link test_inclusion and tell me what I'm missing? Robert Ramey

Emil Dotchevski

5 Jan 5 Jan

1:15 a.m.

New subject: [Serliazation] Autoregistration/physicalcoupling(was:Thread-safety again)

...

Why don't you look at export.hpp and compile and link test_inclusion and tell me what I'm missing?

export.hpp includes many boost headers, then defines the templates that BOOST_CLASS_EXPORT instantiates, and finally defines BOOST_CLASS_EXPORT itself. Any header that calls BOOST_CLASS_EXPORT is physically coupled with boost serialization, as well as with other boost components. My guess is that you're telling me that one can simply not define the templates that specify how a particular class is serialized (only provide declarations) and later use explicit template instantiation to make the compiler generate the instances that are needed. However, if you use BOOST_CLASS_EXPORT, your classes are physically coupled with boost serialization and with boost in general. Beyond that, any project that includes a header that uses BOOST_CLASS_EXPORT will link the template instances generated by BOOST_CLASS_EXPORT, even if no call to boost serialization is ever made. Compare this to the example I provided a few levels up: foo.h does not include any header files and yet defines how objects of class foo are to be serialized. Correct me if I'm wrong but I think that that example is essentially compatible with boost serialization; it is the intended use of BOOST_CLASS_EXPORT in header files that introduces the physical coupling I'm complaining about. Or am I missing something still? :)
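The "declare only, then explicitly instantiate" arrangement guessed at above can be sketched as follows. This is an illustration under stated assumptions, not the library's own code: toy_archive and first_saved_value are hypothetical names, and the three files (foo.hpp, foo.ipp, the library module) are collapsed into one listing so it compiles on its own. The member signature mirrors the usual `serialize(Archive &, const unsigned int)` shape used with Boost.Serialization.

```cpp
#include <cassert>
#include <vector>

// Hypothetical archive type standing in for a real archive class.
struct toy_archive
{
    std::vector<int> values;
    toy_archive & operator&( int v ) { values.push_back(v); return *this; }
};

// --- foo.hpp: declaration only; the template body is not visible to users ---
class foo
{
public:
    explicit foo( int m ) : m_(m) {}
    template <class Archive> void serialize( Archive & ar, unsigned int version );
private:
    int m_;
};

// --- foo.ipp: the definition, included only by the library module ---
template <class Archive>
void foo::serialize( Archive & ar, unsigned int /*version*/ )
{
    ar & m_;
}

// --- library module: one explicit instantiation per archive type in use ---
template void foo::serialize<toy_archive>( toy_archive &, unsigned int );

// --- application code: in a real build it would see only foo.hpp ---
int first_saved_value()
{
    toy_archive ar;
    foo f(7);
    f.serialize(ar, 0u);   // resolved at link time against the library module
    return ar.values.at(0);
}
```

The explicit instantiation line is where the archive types have to be named in advance, which is the limitation raised earlier in the thread: an archive type that the library module did not list would fail at link time in a real multi-file build.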

David Abrahams

6:41 p.m.

New subject: [Serliazation] Autoregistration/physicalcoupling

"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...

Compare this to the example I provided a few levels up: foo.h does not include any header files and yet defines how objects of class foo are to be serialized. Correct me if I'm wrong but I think that that example is essentially compatible with boost serialization; it is the intended use of BOOST_CLASS_EXPORT in header files that introduces the physical coupling I'm complaining about.

Is there something wrong with defining your classes in foo.hpp and exporting them in a foo_serialization.hpp that isn't included by foo.hpp? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Emil Dotchevski

7 Jan 7 Jan

9:16 p.m.

New subject: [Serliazation] Autoregistration/physicalcoupling

...

Is there something wrong with defining your classes in foo.hpp and exporting them in a foo_serialization.hpp that isn't included by foo.hpp?

Nothing wrong with that. The question is where is foo_serialization.hpp #included? I am guessing that what you have in mind is that it would be included by all compilation units that serialize foo objects. Strictly speaking, you don't need the class registration in those units. It could be argued that something that serializes foo objects is physically coupled with the serialization library anyway, so the additional coupling with the class registration is not that big of a deal. The point I am making is that even code that serializes foo objects doesn't have to be physically coupled with the serialization library; all it needs is a function template overload for serializing foo objects, which -- as I illustrated -- doesn't have to be physically coupled with any serialization library. Going back to your suggestion to use foo_serialization.hpp: if we assume that all it does is register class foo (but doesn't serialize foo objects), I don't see why it's a separate header. I'd rather have a single compilation unit in my program which registers all archives I'm using, all necessary upcasts and classes that need to be serialized dynamically (in that program).

David Abrahams

5 Jan 5 Jan

6:39 p.m.

New subject: ** Please don't overquote **

"Robert Ramey" <ramey@rrsd.com> writes: <snip about 70 lines>

...

...
serialize foo objects?

Yes - that's why I recommend compiling the modules to a library and linking the final application with the library. Hence code is only included when an archive is created. Note that "function level linking" is useful here.

Robert, please try not to overquote. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams

4 Jan 4 Jan

4:50 a.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

...
(just call a little stub function in the TU from main()).

I can already hear the howling in the distance. If one can do this, then it would be just as easy to explicitly register all the derived types at the beginning of an archive and avoid BOOST_CLASS_EXPORT altogether.

Not at all; that couples the archive to the types being serialized, and requires one explicit registration (a relatively complicated thing) per type being serialized as opposed to one call to a void function (simple) per TU including EXPORT. -- Dave Abrahams Boost Consulting www.boost-consulting.com
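The "one call to a void function per TU" idea can be sketched with a toy registry. Everything named here (registry, registrar, ensure_shapes_registered, circle_is_registered) is hypothetical and stands in for whatever the export machinery would register; it is not Boost.Serialization's implementation. The two translation units are shown in one listing so that it compiles on its own.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical registry of exported class names, constructed on first use
// so that it exists before any registrar touches it.
std::set<std::string> & registry()
{
    static std::set<std::string> r;
    return r;
}

// --- shapes.cpp: the TU that holds the registrations ---
namespace
{
    struct registrar
    {
        registrar() { registry().insert("circle"); }
    };
    // Runs as part of this TU's static initialization.
    registrar const circle_registrar;
}

// The "little stub": it does nothing itself, but a call to it from main()
// references this TU, so the linker keeps it and its initializers are run
// before the stub is entered.
void ensure_shapes_registered() { }

// --- main.cpp side ---
bool circle_is_registered()
{
    ensure_shapes_registered();
    return registry().count("circle") == 1;
}
```

The application pays one trivial call per translation unit that contains registrations, instead of one explicit registration per serialized type.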

Robert Ramey

5:42 a.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
...
(just call a little stub function in the TU from main()).

I can already hear the howling in the distance. If one can do this, then it would be just as easy to explicitly register all the derived types at the beginning of an archive and avoid BOOST_CLASS_EXPORT altogether.

Not at all; that couples the archive to the types being serialized, and requires one explicit registration (a relatively complicated thing) per type being serialized as opposed to one call to a void function (simple) per TU including EXPORT.

I'll buy this argument. Robert Ramey

Emil Dotchevski

17 Dec 17 Dec

10:20 p.m.

I don't think it is unreasonable to require that users register each class and archive type before they use them together. If portability is not a problem for someone, they can stick this registration in a global object in the cpp file that defines the class, and suffer later when they port to another system. Even if portability was not an issue, I would still do the registration "manually". I don't think the author of class foo should be the one who decides what archives will be used to serialize objects of class foo. Besides, what if I want to use class foo but I will never serialize it? The physical coupling introduced by the "automatic" registration will link all kinds of dead code to my executable. No, thanks. On top of that you have multi-threading problems to deal with.

Robert Ramey

11:29 p.m.

Emil Dotchevski wrote:

...

I don't think it is unreasonable to require that users register each class and archive type before they use them together.

I'm not the person that has to be convinced. Frankly it was easier to implement the functionality than to convince a large number of people that such functionality wasn't necessary.

...

If portability is not a problem for someone, they can stick this registration in a global object in the cpp file that defines the class, and suffer later when they port to another system.

I don't think they will suffer any with the compilers we've seen so far.

...

Even if portability was not an issue, I would still do the registration "manually".

Note that there is nothing preventing you from doing this. In fact in some special cases, it is actually required - see demo_pimpl.

...

I don't think the author of class foo should be the one who decides what archives will be used to serialize objects of class foo.

I don't think so either and I don't think the current implementation requires this.

...

Besides, what if I want to use class foo but I will never serialize it? The physical coupling introduced by the "automatic" registration will link all kinds of dead code to my executable. No, thanks.

Actually this doesn't occur. Code is generated only for those archives whose headers are included. The only "dead code" is for those classes "exported" and serialized with a particular archive class. The code is there but may never be called. But that is unknowable at the translation unit level, so it has to be there in case some other translation unit creates one of these things to be serialized. So it's not even clear that "dead code" is the correct thing to call it. Maybe it's only dead code because most optimizers work at the translation unit level. But even global optimizers can't know what some DLL is going to create, etc. Honestly, I think there are more subtleties to this question than first appears, and I think that a deeper examination would lead one to the conclusion that most of them are handled by the library in an effective, practical and correct manner. Robert Ramey

David Abrahams

3 Jan 3 Jan

4:35 p.m.

"Emil Dotchevski" <emildotchevski@hotmail.com> writes:

...

I don't think it is unreasonable to require that users register each class and archive type before they use them together. If portability is not a problem for someone, they can stick this registration in a global object in the cpp file that defines the class, and suffer later when they port to another system.

More work than necessary. Ensuring a particular TU's initializations are done is trivial. It's just that the library can't (portably) do it for you.

...

Even if portability was not an issue, I would still do the registration "manually". I don't think the author of class foo should be the one who decides what archives will be used to serialize objects of class foo. Besides, what if I want to use class foo but I will never serialize it? The physical coupling introduced by the "automatic" registration will link all kinds of dead code to my executable. No, thanks.

No need to be concerned with that; you can put the uses of EXPORT in a separate header that includes the other header. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Roland Schwarz

14 Dec 14 Dec

11:34 a.m.

Robert Ramey wrote:

...

I'm not convinced it's that bad. We don't need a general solution - all we need is a specific one for this particular problem.

Yes of course. What I was talking about: we currently do not have a general solution to this problem that is worth making a part of e.g. Boost.Thread. Agreed? Roland

45 comments

6 participants

participants (6)

David Abrahams
Emil Dotchevski
Martin Bonner
Robert Ramey
Roland Schwarz
Sean Huang