[serialization] Compatibility between 1.32 and 1.33

Klaus Nowikow

5 Jul 2005

7:41 a.m.

Will a file created with a binary archive from Boost 1.32 be readable with Boost 1.33? I am just building the latest version from the cvs to try it myself, but I thought it would also be a good idea to ask this here. Best regards, Klaus


Klaus Nowikow

5 Jul

10:20 a.m.

Klaus Nowikow <nowikow <at> decomsys.com> writes:

...

Will a file created with a binary archive from Boost 1.32 be readable with Boost 1.33? I am just building the latest version from the cvs to try it myself, but I thought it would also be a good idea to ask this here.

Just wanted to let you know: I tried it now with my own classes (no pointers, but std::vectors and maps), and it seems to work. I did not do extensive testing, though. Regards, Klaus

Robert Ramey

3:20 p.m.

Klaus Nowikow wrote:

...

Will a file created with a binary archive from Boost 1.32 be readable with Boost 1.33?

That is the intention.

...

I am just building the latest version from the cvs to try it myself, but I thought it would also be a good idea to ask this here.

I certainly would appreciate someone trying this. Note that for shared_ptr, one will have to include shared_ptr_132.hpp before shared_ptr.hpp in order to be able to load previously archived shared_ptrs. This hasn't made it into the manual yet. Robert Ramey

David Abrahams

4:14 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

Klaus Nowikow wrote:

...
Will a file created with a binary archive from Boost 1.32 be readable with Boost 1.33?

That is the intention.

...
I am just building the latest version from the cvs to try it myself, but I thought it would also be a good idea to ask this here.

I certainly would appreciate someone trying this.

Note that for shared_ptr, one will have to include shared_ptr_132.hpp before shared_ptr.hpp in order to be able to load previously archived shared_ptrs.

What is the mechanism at work here? Anything involving an #include order dependency worries me, because usually these things are a playground for ODR violations or other inducers of undefined behavior. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Edward Diener

5:06 p.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
Klaus Nowikow wrote:

...
Will a file created with a binary archive from Boost 1.32 be readable with Boost 1.33?

That is the intention.

...
I am just building the latest version from the cvs to try it myself, but I thought it would also be a good idea to ask this here.

I certainly would appreciate someone trying this.

Note that for shared_ptr, one will have to include shared_ptr_132.hpp before shared_ptr.hpp in order to be able to load previously archived shared_ptrs.

What is the mechanism at work here? Anything involving an #include order dependency worries me, because usually these things are a playground for ODR violations or other inducers of undefined behavior.

I made a similar point about #include order dependency and the serialization library in this NG, but my point was that the creator of a library can always make sure that a particular header file is included if necessary, through checking #defines, and that the end user should never have to worry about header include order.

I think this is really important from an end-user's point of view. I do not believe that any library in which, from the end-user's perspective, header files have to be included in a particular order can be really robust, even though I know that this is sometimes done in some famous situations.

Regarding the particular situation explained above, I would much rather that the end-user have to #define a macro for a particular file in order to load a previously archived object from an earlier version, if that is what it takes, than that he/she need to include header files in a particular order.

Header file order dependencies, from an end-user's perspective, are a minefield it is best to avoid at all costs.

Robert Ramey

6:02 p.m.

Edward Diener wrote:

...

I made a similar point about #include order dependency and the serialization library in this NG, but my point was that the creator of a library can always make sure that a particular header file is included if necessary, through checking #defines, and that the end user should never have to worry about header include order.

I think this is really important from an end-user's point of view. I do not believe that any library in which, from the end-user's perspective, header files have to be included in a particular order can be really robust, even though I know that this is sometimes done in some famous situations.

Regarding the particular situation explained above, I would much rather that the end-user have to #define a macro for a particular file in order to load a previously archived object from an earlier version, if that is what it takes, than that he/she need to include header files in a particular order.

Header file order dependencies, from an end-user's perspective, are a minefield it is best to avoid at all costs.

Well, that would be my preference, but in some cases it's unavoidable. In the serialization library I've got a couple of situations:

export.hpp:

The function of BOOST_CLASS_EXPORT("classname") is to instantiate code for the particular class for all archives used. The list of all archives used is built by looking at the guard macros from previously seen *archive.hpp files. This permits instantiations to be limited to those *archive classes whose declarations have actually been included. The alternative would be to instantiate code for all archives, which would make programs much bigger and take much longer to compile.

two-phase lookup:

In general, the existence of two-phase lookup can alter the semantics of the program depending on header order. I believe that this can be addressed with partial template specialization, but not all compilers supported by the serialization library support this. This resulted in a bunch of quirky and non-obvious rules about which namespace to put serialization specializations in - see the 1.32 documentation. The rules depended on whether or not partial template specialization was supported, so the STL serialization was filled with a lot of #ifdef ... . By adhering to the rule that all *archive.hpp headers come before all *serialization headers, all these problems were resolved.

typeinfo_implementation.hpp:

The serialization system needs a system for handling the type of data at runtime. During the review, there was one reviewer who made a point that RTTI should not be required. I wasn't really convinced, but it turned out I had to make a thing called extended_type_info which supplemented the functionality of RTTI by adding a GUID (globally unique identifier - a string constant). As I worked on this, I was able to factor out the implementation so that in fact I wasn't really tied to RTTI to implement it. The result was that the system permits one to plug in his own preferred extended_type_info implementation.

This is tested and demoed in test_no_rtti. So far so good. Now it turns out that no one really uses this facility, as the RTTI one is more convenient. So I implemented the idea that if no type_info system has been #included, the extended_type_info_typeid.hpp one is #included as a default.

So, that is how things arrived at the current situation. As I said, it wouldn't be my first choice, but I find it much preferable to the alternatives. So I imposed the rule:

"all <boost/archive/...> should be listed before all the <boost/serialization/...> includes"

It's very easy to remember and is enforced by an #error ... if the rule is violated. It does inhibit the mixing of <boost/archive/..> and <boost/serialization/..> includes in other header modules. But in my view this should never be done anyway, as doing so compromises the orthogonality of the <boost/serialization/..> and <boost/archive/..> headers, which is a key concept of the library implementation.

I'm sitting here hoping against hope that this will not turn into another very long thread.

Robert Ramey
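The guard-macro mechanism described for export.hpp can be illustrated in a single file. This is only a minimal sketch under stated assumptions, not the library's code: every `demo_` / `DEMO_` name is hypothetical, and the sections marked "pretend header" stand in for separate header files.

```cpp
#include <cassert>
#include <string>
#include <vector>

// --- pretend header: demo/archive/text_oarchive.hpp (hypothetical) ---
#ifndef DEMO_ARCHIVE_TEXT_OARCHIVE_HPP
#define DEMO_ARCHIVE_TEXT_OARCHIVE_HPP
struct demo_text_oarchive {};
#endif

// A second archive header, demo/archive/binary_oarchive.hpp, is deliberately
// NOT "included" here, so DEMO_ARCHIVE_BINARY_OARCHIVE_HPP stays undefined.

// --- pretend header: demo/serialization/export.hpp (hypothetical) ---
#ifndef DEMO_SERIALIZATION_EXPORT_HPP
#define DEMO_SERIALIZATION_EXPORT_HPP

// Collect the names of the archive headers seen before this point by
// testing their include guards. Headers included later are invisible here,
// which is why the order of inclusion matters.
inline std::vector<std::string> demo_archives_seen_by_export() {
    std::vector<std::string> seen;
#ifdef DEMO_ARCHIVE_TEXT_OARCHIVE_HPP
    seen.push_back("text_oarchive");
#endif
#ifdef DEMO_ARCHIVE_BINARY_OARCHIVE_HPP
    seen.push_back("binary_oarchive");
#endif
    return seen;
}

#endif
```

Calling `demo_archives_seen_by_export()` in this file yields only the text archive, because it is the only archive whose guard macro was defined before the export section was processed.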

Edward Diener

9:22 p.m.

Robert Ramey wrote:

...

Edward Diener wrote:

...
I made a similar point about #include order dependency and the serialization library in this NG, but my point was that the creator of a library can always make sure that a particular header file is included if necessary, through checking #defines, and that the end user should never have to worry about header include order.

I think this is really important from an end-user's point of view. I do not believe that any library in which, from the end-user's perspective, header files have to be included in a particular order can be really robust, even though I know that this is sometimes done in some famous situations.

Regarding the particular situation explained above, I would much rather that the end-user have to #define a macro for a particular file in order to load a previously archived object from an earlier version, if that is what it takes, than that he/she need to include header files in a particular order.

Header file order dependencies, from an end-user's perspective, are a minefield it is best to avoid at all costs.

Well, that would be my preference, but in some cases it's unavoidable. In the serialization library I've got a couple of situations: snipped...

Thanks for the explanation. Although I really dislike header file dependencies, the rule that you have about including archive headers before serialization headers is easy to understand.

Regarding the partial specialization and no-RTTI problems, and without knowing your internal code, I would just like to suggest that for those using conforming compilers and RTTI you should strive not to impose the header file ordering rule of having the header archives files included before the header serialization files. I realize you are doing this, even in this case, anyway because the BOOST_CLASS_EXPORT("class") works properly with the rule, but I would still seek to eliminate this header file dependency for conforming compilers and those who naturally use RTTI.

Do not think, by saying the above, I do not applaud your heroic efforts to get serialization to work with non-conforming compilers and those people who prefer to turn off RTTI, but only that there must come a point where those who use standard C++ should not face any of the restrictions of those who do not. Not that it is necessarily the programmer's fault that their compiler is deficient, or that they must not use RTTI for some reason, but I consider both non-standard C++.

As a long shot, and again not knowing the internals, is it not possible to have BOOST_CLASS_EXPORT("class") work as each archive is included, perhaps by including export.h at the end of each of your own archives and building the exported information one at a time? No, I am not trying to create extra work for you <g>, but just suggesting future possibilities to eliminate the very reasonable header file rule you have.

Robert Ramey

10:22 p.m.

Edward Diener wrote:

...

Thanks for the explanation. Although I really dislike header file dependencies, the rule that you have about including archive headers before serialization headers is easy to understand.

...

Regarding the partial specialization and no-RTTI problems, and without knowing your internal code, I would just like to suggest that for those using conforming compilers and RTTI you should strive not to impose the header file ordering rule of having the header archives files included before the header serialization files. I realize you are doing this, even in this case, anyway because the BOOST_CLASS_EXPORT("class") works properly with the rule, but I would still seek to eliminate this header

It's generally underestimated how much effort it takes to make something like this work on a wide variety of platforms. In order to accomplish this one sometimes tries for the least common denominator. Other times different code is used for different platforms. One fixed point that I tried to maintain is that it be possible to write code that will work on all platforms that boost supports. I viewed this as essential to permit other library writers to write serialization modules for their classes once and only once. This sort of supports the lowest common denominator approach, which I tend to favor.

...

file dependency for conforming compilers and those who naturally use RTTI.

I should say I wasn't crazy about providing an alternative to RTTI. But I ended up doing it for a couple of reasons. I had a list of criticisms to address from the first rejection. There were a number of other criticisms that I wasn't crazy about addressing either. But in addressing each of these one by one, I came to change my mind and see them as better ideas that made a better package. The last one was no-rtti. I thought this was really not worth the effort, but then I thought - hmmm, that's what I thought about the others also. So, I'll experiment and see what happens. I factored out the extended_type_info code as a separate header and implementation and made a non-rtti implementation. I still wasn't convinced it was a valuable feature. But it did isolate a fundamental piece in a more logical, understandable and modular form. So I left it in. It turns out that this plays a key part in "user" code - the new serialization of shared_ptr - so now I have to add extended_type_info to the documentation - sort of a pain. I don't have a real point here. I'm just trying to convey how things get to where they are and how it's not always easy to make a concise rationale for the way things are.
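The idea of identifying types by a string constant rather than by RTTI can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the extended_type_info interface: `demo_guid`, `demo_same_type` and the two shape types are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstring>

// Primary template is left undefined on purpose: using a type that was
// never given a key is a compile-time error rather than a silent mismatch.
template <class T> struct demo_guid;

struct demo_circle {};
struct demo_square {};

// "Registration": attach a unique string constant to each type.
template <> struct demo_guid<demo_circle> {
    static const char* value() { return "demo_circle"; }
};
template <> struct demo_guid<demo_square> {
    static const char* value() { return "demo_square"; }
};

// Type identity is decided by comparing the string keys; typeid() is never
// used, so the sketch also compiles with RTTI switched off.
template <class A, class B>
bool demo_same_type() {
    return std::strcmp(demo_guid<A>::value(), demo_guid<B>::value()) == 0;
}
```

Because the key is an ordinary string, it can also be written into an archive and compared again when the archive is read, which is something a compiler-specific `type_info` name cannot portably offer.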

...

Do not think, by saying the above, I do not applaud your heroic efforts to get serialization to work with non-conforming compilers and those people who prefer to turn off RTTI, but only that there must come a point where those who use standard C++ should not face any of the restrictions of those who do not. Not that it is necessarily the programmer's fault that their compiler is deficient, or that they must not use RTTI for some reason, but I consider both non-standard C++.

For better or worse, this is a feature of the boost process. In order to get a library accepted, almost every requested requirement has to be accommodated unless it directly conflicts with some other requested feature. The net result is a library which covers far more than any one programmer or project needs. When someone makes a "strong" case for some arcane but hard to implement "feature", there is no one (other than the library author) with a strong case for not doing it. And the library ends up including almost everything. (Oh, don't forget compile-time and run-time performance as well.) The difficulty of getting a library accepted grows disproportionately to its size and the number of people that might be interested in it. I'm not saying this is necessarily a bad thing. That's just the way it is.

...

As a long shot, and again not knowing the internals, is it not possible to have BOOST_CLASS_EXPORT("class") work as each archive is included, perhaps by including export.h at the end of each of your own archives and building the exported information one at a time ? No I am not trying to create extra work for you <g>, but just suggesting future possibilities to eliminate the very reasonable header file rule you have.

I never considered such an idea, so it's hard to comment on. During the first review there was a lot of objection that modules were way too big. So I made an effort so that one only had to include what one used. So one has to include export.hpp if and only if BOOST_CLASS_EXPORT is in fact being used. That's how we got here.

Robert Ramey

Vladimir Prus

6 Jul

5:59 a.m.

Robert Ramey wrote:

...

...
Header file order dependencies, from an end-user's perspective, are a minefield it is best to avoid at all costs.

Well, that would be my preference, but in some cases it's unavoidable. In the serialization library I've got a couple of situations:

export.hpp:

The function of BOOST_CLASS_EXPORT("classname") is to instantiate code for the particular class for all archives used. The list of all archives used is built by looking at the guard macros from previously seen *archive.hpp files. This permits instantiations to be limited to those *archive classes whose declarations have actually been included. The alternative would be to instantiate code for all archives, which would make programs much bigger and take much longer to compile.

I recall that I did suggest one approach. BOOST_CLASS_EXPORT can register the class with all previously included archives and, unconditionally, with the polymorphic archive. During saving, you can check whether the saved class is registered with the specific archive type. If not, you wrap the archive in a polymorphic archive and save. That would be slower, but in most situations the extra virtual function call won't be a practical problem.

In the rare case where it will be a problem, the user can:

1. Include the necessary archive headers.
2. Invoke BOOST_CLASS_EXPORT *again* in his module, after including another archive header.

If BOOST_CLASS_EXPORT tolerates multiple invocations, this will instantiate the code for the needed archive type, and make saving to that archive type go without the polymorphic archive.
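The lookup-then-fallback idea in this proposal can be sketched as follows. This is a hedged illustration only: all `demo_` names are hypothetical, nothing here is the library's API, and the sketch shows just the run-time check and the virtual-call fallback, not the compile-time instantiation savings the proposal is after.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical polymorphic archive interface: one virtual call per write.
struct demo_polymorphic_oarchive {
    virtual ~demo_polymorphic_oarchive() = default;
    virtual void write(const std::string& text) = 0;
};

// A concrete archive that appends everything to a string.
struct demo_text_oarchive {
    std::string out;
    void write(const std::string& text) { out += text; }
};

// Adapter exposing any concrete archive through the polymorphic interface.
template <class Archive>
struct demo_polymorphic_adapter : demo_polymorphic_oarchive {
    explicit demo_polymorphic_adapter(Archive& ar) : ar_(ar) {}
    void write(const std::string& text) override { ar_.write(text); }
    Archive& ar_;
};

using demo_key = std::pair<std::type_index, std::type_index>;

// Registry of (archive type, class type) pairs that were exported directly.
inline std::set<demo_key>& demo_registry() {
    static std::set<demo_key> registry;
    return registry;
}

// Stand-in for an export macro: records that T is registered with Archive.
// Calling it twice for the same pair is harmless.
template <class Archive, class T>
void demo_export() {
    demo_registry().insert(std::make_pair(std::type_index(typeid(Archive)),
                                          std::type_index(typeid(T))));
}

struct demo_widget { int value; };

// Saves the widget and reports which path was taken.
template <class Archive>
std::string demo_save(Archive& ar, const demo_widget& w) {
    const demo_key key = std::make_pair(std::type_index(typeid(Archive)),
                                        std::type_index(typeid(demo_widget)));
    if (demo_registry().count(key) != 0) {
        ar.write(std::to_string(w.value));      // registered: direct call
        return "direct";
    }
    demo_polymorphic_adapter<Archive> wrapped(ar);
    demo_polymorphic_oarchive& poly = wrapped;  // not registered: virtual call
    poly.write(std::to_string(w.value));
    return "polymorphic";
}
```

Saving a `demo_widget` before `demo_export<demo_text_oarchive, demo_widget>()` has run takes the polymorphic path; saving it afterwards takes the direct one, and the archived bytes are the same either way.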

...

two-phase lookup:

In general, the existence of two-phase lookup can alter the semantics of the program depending on header order. I believe that this can be addressed with partial template specialization, but not all compilers supported by the serialization library support this. This resulted in a bunch of quirky and non-obvious rules about which namespace to put serialization specializations in - see the 1.32 documentation. The rules depended on whether or not partial template specialization was supported, so the STL serialization was filled with a lot of #ifdef ... . By adhering to the rule that all *archive.hpp headers come before all *serialization headers, all these problems were resolved.

I never understood this. BTW, did you have a chance to try my ADL patch?

...

"all <boost/archive/...> should be listed before all the <boost/serialization/...> includes"

It's very easy to remember and is enforced by an #error ... if the rule is violated. It does inhibit the mixing of <boost/archive/..> and <boost/serialization/..> includes in other header modules. But in my view this should never be done anyway, as doing so compromises the orthogonality of the <boost/serialization/..> and <boost/archive/..> headers, which is a key concept of the library implementation.

I'm sitting here hoping against hope that this will not turn into another very long thread.

Sorry, it might turn into a long thread after all, because you still haven't answered the attached message of mine. To summarize, I claim that the rule immediately prevents the "include my own headers right at the top" practice that I use.

- Volodya

Robert Ramey

4 p.m.

Vladimir Prus wrote: Robert Ramey wrote:

...

...
...
Header file order dependencies, from an end-user's perspective, are a minefield it is best to avoid at all costs.

Well, that would be my preference, but in some cases it's unavoidable. In the serialization library I've got a couple of situations:

export.hpp:

The function of BOOST_CLASS_EXPORT("classname") is to instantiate code for the particular class for all archives used. The list of all archives used is built by looking at the guard macros from previously seen *archive.hpp files. This permits instantiations to be limited to those *archive classes whose declarations have actually been included. The alternative would be to instantiate code for all archives, which would make programs much bigger and take much longer to compile.

I recall that I did suggest one approach. BOOST_CLASS_EXPORT can register the class with all previously included archives and, unconditionally, with the polymorphic archive. During saving, you can check whether the saved class is registered with the specific archive type. If not, you wrap the archive in a polymorphic archive and save. That would be slower, but in most situations the extra virtual function call won't be a practical problem.

...

In the rare case where it will be a problem, the user can:

1. Include the necessary archive headers.
2. Invoke BOOST_CLASS_EXPORT *again* in his module, after including another archive header.

If BOOST_CLASS_EXPORT tolerates multiple invocations, this will instantiate the code for the needed archive type, and make saving to that archive type go without the polymorphic archive.

export.hpp is already quite complicated. This would seem to add a huge amount of effort at compile time and at runtime. It would also add dead code to almost every executable using the library. It would seem a huge price to pay just to permit one to say

#include <boost/serialization/export.hpp>
#include <boost/archive/text_iarchive.hpp>

rather than

#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/export.hpp>

...

I never understood this. BTW, did you have a chance to try my ADL patch?

I considered your patch, tested it on my own system, and rolled into the main CVS sometime ago.

...

...
"all <boost/archive/...> should be listed before all the <boost/serialization/...> includes"

It's very easy to remember and is enforced by an #error ... if the rule is violated. It does inhibit the mixing of <boost/archive/..> and <boost/serialization/..> includes in other header modules. But in my view this should never be done anyway, as doing so compromises the orthogonality of the <boost/serialization/..> and <boost/archive/..> headers, which is a key concept of the library implementation.

I'm sitting here hoping against hope that this will not turn into another very long thread.

...

Sorry, it might turn into a long thread after all, because you still haven't answered the attached message of mine. To summarize, I claim that the rule immediately prevents the "include my own headers right at the top" practice that I use.

Here is a copy of your previous post - I'll answer it here.

...

...
...
...
How? Clearly, if I include base_object.hpp in a header, I cannot obey the above rule. Ok leaving just:

#include <boost/serialization/access.hpp> #include <boost/serialization/split_member.hpp> #include <boost/serialization/base_object.hpp>

...

...
I don't understand you. If A.hpp contains the above, and my .cpp files contains

#include "A.hpp"

#include <boost/archive/text_oarchive.hpp>

I would recommend that the above be changed to:

#include <boost/archive/text_oarchive.hpp>
#include "A.hpp"

...

...
Please take a look at:

http://zigzag.lvk.cs.msu.su/~ghost/serialization_problems/

I looked at your examples, made some changes in the library, tested them and checked them in.

...

...
The second example is typical and very easily resolved. Just move the *archive includes above static.h (which indirectly includes the serialization of class Module data).

...

As I did explain, I want "static_data.h" (just like every header corresponding to my .cpp) to be the very first include in my .cpp file.

...

Did you see my arguments? Do you find them weak? If so, can you explain why?

Your requirement that static_data.h be first in every header conflicts with my requirement that serialization headers be after archive headers. I've explained why, given the alternatives I have available, I think it's the best one. That's the best I can do.

Robert Ramey

Robert Ramey

5 Jul

5:35 p.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

What is the mechanism at work here? Anything involving an #include order dependency worries me, because usually these things are a playground for ODR violations or other inducers of undefined behavior.

The idea is that I didn't want to burden users of the library with loading the shared_ptr_132.hpp header if they don't need it. So if one doesn't need to read older archives, one can use:

#include <boost/serialization/shared_ptr.hpp>

which is what everyone would expect. This skips over useless and wasteful code that is not needed. If one needs his code to be able to read a shared_ptr archived under the 1.32 system, he would use:

#include <boost/serialization/shared_ptr_132.hpp>
#include <boost/serialization/shared_ptr.hpp>

Of course an alternative would be for <boost/serialization/shared_ptr.hpp> to include .../shared_ptr_132.hpp, but I thought people without old archives would object to that - as I would personally.

Robert Ramey

Jody Hagins

6:11 p.m.

On Tue, 5 Jul 2005 10:35:21 -0700 "Robert Ramey" <ramey@rrsd.com> wrote:

...

Of course an alternative would be for <boost/serialization/shared_ptr.hpp> to include .../shared_ptr_132.hpp but I thought people without old archives would object to that - as I would personally.

Sounds like a preprocessor definition should be used instead, and then serialization/shared_ptr.hpp can optionally include the 132 header file, if the appropriate definition for use-132-shared-ptr exists... This effectively hides it from everyone, and only those who need it would make the appropriate definition...

Robert Ramey

7:07 p.m.

Jody Hagins wrote:

...

On Tue, 5 Jul 2005 10:35:21 -0700 "Robert Ramey" <ramey@rrsd.com> wrote:

...
Of course an alternative would be for <boost/serialization/shared_ptr.hpp> to include .../shared_ptr_132.hpp but I thought people without old archives would object to that - as I would personally.

Sounds like a preprocessor definition should be used instead, and then serialization/shared_ptr.hpp can optionally include the 132 header file, if the appropriate definition for use-132-shared-ptr exists...

This effectively hides it from everyone, and only those who need it would make the appropriate definition...

So I presume that instead of

#include <boost/serialization/shared_ptr_132.hpp>
#include <boost/serialization/shared_ptr.hpp>

one would use

#define BOOST_SERIALIZATION_SHARED_PTR_132_COMPATIBILITY
#include <boost/serialization/shared_ptr.hpp>

Is that really an improvement?

Actually, I'm wondering about having shared_ptr_132.hpp include shared_ptr.hpp, so one would include just one or the other.

Robert Ramey
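How a macro could switch the legacy code in or out from inside a single header can be sketched as follows. This is a minimal single-file illustration under stated assumptions: the `DEMO_` macro, the function names, and the version numbers 0 and 1 are all hypothetical and are not taken from the library.

```cpp
#include <cassert>
#include <string>

// Normally this would be set project-wide (for example on the compiler
// command line); it is defined here only so the sketch is self-contained.
#define DEMO_SHARED_PTR_132_COMPATIBILITY

// --- pretend header: demo/serialization/shared_ptr.hpp (hypothetical) ---
#ifndef DEMO_SERIALIZATION_SHARED_PTR_HPP
#define DEMO_SERIALIZATION_SHARED_PTR_HPP

#ifdef DEMO_SHARED_PTR_132_COMPATIBILITY
// Stand-in for the code a separate legacy header would contribute.
// Builds that do not define the macro never see or compile it.
inline std::string demo_load_legacy_format() {
    return "loaded legacy record";
}
#endif

// Reports which (hypothetical) on-disk versions this build can read.
inline bool demo_can_read_version(unsigned file_version) {
#ifdef DEMO_SHARED_PTR_132_COMPATIBILITY
    return file_version <= 1;   // version 0 (legacy) and version 1 (current)
#else
    return file_version == 1;   // current format only
#endif
}

#endif
```

With this arrangement there is only one header to include, and whether legacy support is compiled in is decided in one place for the whole project rather than by the include list of each source file.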

Jody Hagins

8:10 p.m.

On Tue, 5 Jul 2005 12:07:07 -0700 "Robert Ramey" <ramey@rrsd.com> wrote:

...

So I presume that instead of

#include <boost/serialization/shared_ptr_132.hpp> #include <boost/serialization/shared_ptr.hpp>

one would use

#define BOOST_SERIALIZATION_SHARED_PTR_132_COMPATIBILITY #include <boost/serialization/shared_ptr.hpp>

Is that really an improvement?

I was thinking that the define would be in the proper makefiles, or config headers, depending on the project build specifications.

...

Actually, I'm wondering about having shared_ptr_132.hpp including shared_ptr.hpp So one would include just one or the other.

I think you'd still end up with application developers defining something, so that when they change their code, they do not have to go change all the locations where they included the shared_ptr_WHATEVER headers.

Peter Dimov

8:15 p.m.

Robert Ramey wrote:

...

So I presume that instead of

#include <boost/serialization/shared_ptr_132.hpp> #include <boost/serialization/shared_ptr.hpp>

one would use

#define BOOST_SERIALIZATION_SHARED_PTR_132_COMPATIBILITY #include <boost/serialization/shared_ptr.hpp>

Is that really an improvement?

It might be. The macro can be #defined at project level, ensuring consistency.

AlisdairM

10:34 p.m.

Jody Hagins wrote:

...

...
Of course an alternative would be for <boost/serialization/shared_ptr.hpp> to include .../shared_ptr_132.hpp but I thought people without old archives would object to that - as I would personally.

Sounds like a preprocessor definition should be used instead, and then serialization/shared_ptr.hpp can optionally include the 132 header file, if the appropriate definition for use-132-shared-ptr exists...

This effectively hides it from everyone, and only those who need it would make the appropriate definition...

Definitely a step in the right direction, but it does not remove the need for shared_ptr_132.hpp to check for the header-guard of serialization/shared_ptr.hpp and flag errors on mis-use. Belt AND braces if you please <g> -- AlisdairM
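The header-guard check suggested here can be sketched in one file. It is an illustration only: the `DEMO_` guard names and the two constants are hypothetical, and the sections marked "pretend header" stand in for separate files that appear here in the accepted order.

```cpp
#include <cassert>

// --- pretend header: demo/serialization/shared_ptr_132.hpp (hypothetical) ---
#ifndef DEMO_SERIALIZATION_SHARED_PTR_132_HPP
#define DEMO_SERIALIZATION_SHARED_PTR_132_HPP

// If the current-format header has already been seen, the order is wrong.
// Stop the build with a readable message instead of misbehaving later.
#ifdef DEMO_SERIALIZATION_SHARED_PTR_HPP
#error "include shared_ptr_132.hpp before shared_ptr.hpp"
#endif

constexpr bool demo_legacy_support_enabled = true;

#endif

// --- pretend header: demo/serialization/shared_ptr.hpp (hypothetical) ---
#ifndef DEMO_SERIALIZATION_SHARED_PTR_HPP
#define DEMO_SERIALIZATION_SHARED_PTR_HPP

constexpr bool demo_current_support_enabled = true;

#endif
```

Swapping the two sections would define `DEMO_SERIALIZATION_SHARED_PTR_HPP` first, so the `#error` line would fire and the misuse would be reported at compile time.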

Thorsten Ottosen

6 Jul

3:55 p.m.

Robert Ramey <ramey <at> rrsd.com> writes:

...

Klaus Nowikow wrote:

...
Will a file created with a binary archive from Boost 1.32 be readable with Boost 1.33?

That is the intention.

...
I am just building the latest version from the cvs to try it myself, but I thought it would also be a good idea to ask this here.

I certainly would appreciate someone trying this.

How can we be even close to guaranteeing this if we have no test that shows it?

br

-Thorsten

Robert Ramey

4:05 p.m.

The library contains the test "test_shared_ptr_132.cpp", which I used to test backward compatibility for shared_ptr. This is the only serialization whose version changed between the two releases. I concede this isn't exhaustive. A truly exhaustive test would take all the tests we have now, use them to create 1.32 version files, and read them with 1.33 version code. That would be quite a project. I'm pleased to encourage anyone who wants to take this on.

Robert Ramey

Thorsten Ottosen wrote:

...

Robert Ramey <ramey <at> rrsd.com> writes:

...
Klaus Nowikow wrote:

...
Will a file created with a binary archive from Boost 1.32 be readable with Boost 1.33?

That is the intention.

...
I am just building the latest version from the cvs to try it myself, but I thought it would also be a good idea to ask this here.

I certainly would appreciate someone trying this.

How can we be even close to guaranteeing this if we have no test that shows it?

br

-Thorsten
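The kind of test discussed in this exchange, writing data in an older format and reading it back with newer code, can be sketched independently of the library. This is a minimal illustration under stated assumptions: the `demo_` names, the text format, and the version numbers 0 and 1 are hypothetical and do not describe any real archive layout.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct demo_point { int x = 0; int y = 0; int z = 0; };

// Writer as an older release might have produced it: version 0 stores x, y.
inline std::string demo_save_v0(int x, int y) {
    std::ostringstream os;
    os << 0 << ' ' << x << ' ' << y;
    return os.str();
}

// Current writer: version 1 adds z.
inline std::string demo_save_v1(const demo_point& p) {
    std::ostringstream os;
    os << 1 << ' ' << p.x << ' ' << p.y << ' ' << p.z;
    return os.str();
}

// Current loader: accepts both versions and rejects anything else.
inline bool demo_load(const std::string& data, demo_point& p) {
    std::istringstream is(data);
    unsigned version = 0;
    if (!(is >> version) || version > 1) return false;
    if (!(is >> p.x >> p.y)) return false;
    if (version == 1) {
        if (!(is >> p.z)) return false;
    } else {
        p.z = 0;   // field did not exist in version 0; use a default
    }
    return true;
}
```

A compatibility test in this style keeps the old writer's output, or files captured from the old release, as fixtures and checks that the current loader still reads them, alongside the usual round trip through the current writer.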
