Vladimir Prus wrote:
Robert Ramey wrote:
In that case, the issue David encountered is unambiguously a bug in Boost.Serialization, and should be fixed in a future version of Boost.
The version # was envisioned as a small integer. All the examples, tests, and demos use it that way. The problem arose because nobody anticipated that someone would want to encode actual data (i.e. a date in this case) in a version #. Note that this is the first time in 9 years that this has come up.
Well, I assume that most folks who use Boost.Serialization don't post here to report what version scheme they have used. So, I suggest you don't make broad conclusions based on the number of reports.
I would expect that people running across problems with the scheme would post here. So I think it's correct that this hasn't been a problem up until now.
If you have changed the number of bytes used to store the class version, and you do *not* store the Boost version number in the archive, then how can 1.42 read an archive created by 1.41, even assuming the classes being serialized did not change themselves?
This should be clear from reading the documentation. If it's not, we can enhance the documentation. It's very simple. Each class is assigned a version # starting with 0. When a new member is added to the class, the version # is changed with BOOST_CLASS_VERSION(name, #). The signature for loading is:

    template<class Archive>
    void load(Archive & ar, T & t, const unsigned int version){
        ar >> t.m_x;
        ...
        if(version >= 1)
            ar >> t.m_z;
    }

In addition, there is a serialization library version, returned by get_library_version. This is used internally by the library to address changes in serialization of primitives and other types for which class information is not kept in the archive. I believe that this version # is now up to 4.

The class version # is the version # of the class - NOT boost, not the application, not anything else.
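To make the per-class version scheme concrete, here is a minimal sketch that uses only the standard library. It is a toy model, not the Boost.Serialization API: `Point`, `save_point`, `load_point` and the "version first, then members" text layout are all assumptions made for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy stand-in for a user class. m_z plays the role of a member that was
// added later, which bumps the class version from 0 to 1.
struct Point {
    int m_x = 0;
    int m_z = 0;  // only present in archives written with class version >= 1
};

// Hypothetical writer: stores the class version, then the members that
// exist in that version of the class.
std::string save_point(const Point& p, unsigned int version) {
    assert(version <= 1);  // this sketch only knows class versions 0 and 1
    std::ostringstream os;
    os << version << ' ' << p.m_x;
    if (version >= 1) os << ' ' << p.m_z;
    return os.str();
}

// Hypothetical reader: reads the stored class version and loads only the
// members that this version wrote, mirroring "if(version >= 1) ar >> m_z".
Point load_point(const std::string& data) {
    std::istringstream is(data);
    unsigned int version = 0;
    Point p;
    is >> version >> p.m_x;
    if (version >= 1) is >> p.m_z;
    return p;
}
```

A reader built against the newer class definition can load both the old and the new layout, because the version stored with the data tells it which members to expect.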
If not, how do you make changes to the archive format (e.g. the change David found in 1.42.0) without breaking old archives?
In this case, David overloaded the version # in a way I never anticipated. Their usage of the version number presumed a 32-bit integer. Binary archives only use 16 bits, so their change would make it impossible for them to use binary archives. This issue doesn't show up in text archives, since a variable-length string is used for integers. I made the change to detect exactly this type of unintended usage, which would make serializations non-portable. Obviously I did this a few years too late.
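The 16-bit limit is easy to see with plain integer arithmetic. The sketch below assumes a hypothetical date-style version such as 20100115 (the thread does not give the actual value) and shows that it does not survive a 16-bit field, whereas a small integer does. The function names are made up for this example.

```cpp
#include <cstdint>

// Simulates writing a class version into a 16-bit field, which is the
// width the thread gives for binary archives. Only the low 16 bits survive.
std::uint16_t store_in_16_bits(std::uint32_t version) {
    return static_cast<std::uint16_t>(version);
}

// True if the version reads back unchanged after the 16-bit round trip.
bool survives_16_bit_field(std::uint32_t version) {
    return store_in_16_bits(version) == version;
}
```

With these helpers, `survives_16_bit_field(3)` holds, while `survives_16_bit_field(20100115u)` does not: the value wraps modulo 65536. A text representation has no such limit, because the integer is written out as a string of digits.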
This is described in the documentation. The version # is maintained on a class basis and is completely independent of any other number such as program or boost version. A little reflection should make it clear why it pretty much has to be this way.
Unless you promise and document that the format used by Boost.Serialization will never change, it seems like you also have to include the version number for the archive format itself.
It's in the documentation as get_library_version(). So far I don't think any user has ever had to call this function. We do call it in the implementation of serialization for collections. I think the need for this arose in the implementation of fast array serialization, which made a "shortcut" through the normal procedure.
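As an illustration of how a format-level version stored in the archive lets a reader cope with layout changes, here is a toy sketch. It does not reproduce the real Boost.Serialization byte layout: the one-byte library version, the rule "library versions below 2 store the class version in one byte, later ones in two bytes", and the names `write_header` and `read_class_version` are all assumptions for the example.

```cpp
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical header: [library_version][class_version bytes...].
// Assumed rule: library_version < 2 uses a 1-byte class version,
// library_version >= 2 uses 2 bytes, little-endian.
Bytes write_header(std::uint8_t library_version, std::uint16_t class_version) {
    Bytes out;
    out.push_back(library_version);
    out.push_back(static_cast<std::uint8_t>(class_version & 0xFF));
    if (library_version >= 2)
        out.push_back(static_cast<std::uint8_t>(class_version >> 8));
    return out;
}

// The reader checks the stored library version first and then decodes the
// class version with the width that this library version used.
std::uint16_t read_class_version(const Bytes& in) {
    const std::uint8_t library_version = in.at(0);
    std::uint16_t version = in.at(1);
    if (library_version >= 2)
        version = static_cast<std::uint16_t>(version | (in.at(2) << 8));
    return version;
}
```

The point of the sketch is only that one reader can decode both layouts, because the archive itself records which layout was used.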
How would an application ever be able to exchange data with older, deployed, versions of itself, without this capability?
Again, a little reflection will make it clear that an older version of a program can't anticipate changes in a subsequent version. I'm sorry - it's just logically not possible. Think about it.
Assuming the user classes are not changed, why can't a program built with 1.41 read an archive created by 1.42?
It can. In this particular case, the situation is that a program built with 1.42 cannot read an archive created by 1.41.

Robert Ramey