Jarl Lindrud wrote:
The version # was envisioned as a small integer. All the examples, tests, and demos used this. The problem comes about because it was unanticipated that someone would like to include actual data (i.e. a date in this case) in a version #. Note that this is the first time in 9 years that this has come up. So I think it's a little much to characterize it as a bug in the library. It would better be called an unanticipated usage of the version #.
IIUC, in 1.41.0 and earlier, the version number was an int. In 1.42.0, it is now 16 bits, which is a breaking change on just about every platform.
The version # has always been 16 bits. The binary archive has always stored 16 bits for the version #. The code used an int, whose size varies from 16 to 64 bits depending on the platform. Text archives convert the int to a string, and this conversion doesn't trap when the number exceeds 16 bits.
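To make the difference concrete, here is a minimal sketch using only the standard library. It is not the Boost.Serialization code: the helper names are hypothetical, and the value 20100115 is just an assumed example of a date packed into a version #.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: what survives when a wider value is stored in a
// 16-bit field, as a binary format with a 16-bit version # would do.
std::uint16_t store_as_16_bits(std::uint32_t version) {
    return static_cast<std::uint16_t>(version & 0xFFFFu);
}

// Hypothetical helper: a text format writes the decimal digits and parses
// them back, so nothing is lost and nothing is trapped.
std::uint32_t round_trip_as_text(std::uint32_t version) {
    std::ostringstream out;
    out << version;
    std::istringstream in(out.str());
    std::uint32_t result = 0;
    in >> result;
    return result;
}
```

A small version # such as 7 survives both paths. The assumed date value 20100115 survives the text round trip but comes back as 46099 from the 16-bit field, which is why the problem stays hidden until something enforces the 16-bit limit.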
The responsibility of dealing with this archive format change surely lies with Boost.Serialization itself?
There is no format change in the library.
Or do Boost.Serialization users need to know that archives they write are not necessarily readable by later versions?
Hmmm - storing a 32 bit integer in a value saved as a 16 bit value (binary_archive) is not a good idea. I recognize that it was not obvious when one did that and that it could work in some cases - such as this user's. That's exactly what the level 4 warning was telling me. So I fixed the code to suppress the warning! And here we are.
I can't see much middle ground here - either you're backwards compatible, or you're not.
lol - no question about that.
If not, how do you make changes to the archive format (e.g. the change David found in 1.42.0) without breaking old archives?
This is described in the documentation. The version # is maintained on a class basis and is completely independent of any other number such as program or boost version. A little reflection should make it clear why it pretty much has to be this way.
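As an illustration of per-class versioning, here is a hand-rolled sketch that does not use the library. In Boost.Serialization the same idea is expressed with BOOST_CLASS_VERSION and the version argument passed to serialize(); the Point type, the text layout, and the function names below are assumptions made up for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical class. Member z was added in class version 1.
struct Point {
    int x = 0;
    int y = 0;
    int z = 0;
};

// The version # belongs to the class, not to the program or to Boost.
const unsigned int kPointVersion = 1;

// Write the class version first, then the members of the current layout.
std::string save_point(const Point& p) {
    std::ostringstream out;
    out << kPointVersion << ' ' << p.x << ' ' << p.y << ' ' << p.z;
    return out.str();
}

// Read the class version, then only the members that version contained.
Point load_point(const std::string& data) {
    std::istringstream in(data);
    unsigned int version = 0;
    Point p;
    in >> version >> p.x >> p.y;
    if (version >= 1) {
        in >> p.z;  // absent from version 0 data, so z keeps its default
    }
    assert(!in.fail());  // precondition: data is well formed
    return p;
}
```

The newer loader reads data written at class version 0 as well as version 1. A loader built when only version 0 existed has no way to know about z, which is the asymmetry discussed in this thread.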
I'm talking about changes within Boost.Serialization itself, not changes to user-defined types. The 32-bit-to-16-bit change that triggered this discussion is a good example. How will Boost.Serialization, in the future, know whether to read a 16 or 32 bit version number from an archive?
In this particular case, the situation is not that bad. This particular code has only been tested with text archives. (It would break immediately with binary ones.) So the only issue is what size the version # should be read into. Even here it's a specific case, as on a machine with a 16 bit int the user's code would have already failed. I'm still thinking about this, but I can see that reading the version # into an int rather than an int_least16_t would solve his problem - though it wouldn't address the other issues I've mentioned. I'll consider this for version 1.43. This would permit him to load old archives. 1.42 will trap when a version # exceeds 16 bits. I wouldn't expect this to change though. So the problem of how to use the version # will still have to be dealt with.
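The kind of trap described here can be sketched as a range check before narrowing. This is only an illustration under assumptions: the function name and the exception type are hypothetical and are not taken from the library.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical check: accept a version # only if it fits in 16 bits,
// instead of silently truncating it.
std::uint16_t checked_version(unsigned long version) {
    if (version > std::numeric_limits<std::uint16_t>::max()) {
        throw std::out_of_range("version # does not fit in 16 bits");
    }
    return static_cast<std::uint16_t>(version);
}
```

With this check, a small version # passes through unchanged, while an assumed date-like value such as 20100115 raises an error at the point of use rather than producing a different number later.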
If it always reads a 16 bit version number, then you've broken compatibility with all pre-1.42.0 archives. If it always reads a 32 bit version number, then you've broken compatibility with 1.42.0.
To deal with this, you really need to know which version of Boost was used to create the archive.
There is a mechanism for addressing these kinds of issues - it's the library version # as described in the documentation. So far, that # is up to 4.
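A sketch of how a library version # stored in the archive can select the read path follows. The byte layout, the threshold value 5, and the names are all assumptions for illustration; this is not the actual archive format or library code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: byte 0 holds the library version #, followed by the
// class version # in little-endian order. Archives written by a library
// version below kWideVersionSince use 2 bytes, later ones use 4 bytes.
const std::uint8_t kWideVersionSince = 5;  // assumed threshold, illustration only

std::uint32_t read_class_version(const std::vector<std::uint8_t>& bytes) {
    assert(!bytes.empty());
    const std::uint8_t library_version = bytes[0];
    const std::size_t width = (library_version < kWideVersionSince) ? 2 : 4;
    assert(bytes.size() >= 1 + width);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value |= static_cast<std::uint32_t>(bytes[1 + i]) << (8 * i);
    }
    return value;
}
```

Because the reader checks the library version # before touching the class version #, one reader can accept both widths. The reverse direction still does not work: a reader built before the wider field existed cannot know about it.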
How would an application ever be able to exchange data with older, deployed, versions of itself, without this capability?
Again, a little reflection will make it clear that an older version of a program can't anticipate changes in a subsequent version. I'm sorry - it's just logically not possible. Think about it.
Do you realize that e.g. Microsoft Word 2007 can be instructed to save files in such a way that they can be loaded with Word 2003? What is logically impossible about that?
Can Microsoft Word 2003 load files created with Microsoft Word 2007? That is what we're talking about here. The question of being able to create archives in a previous version's format has been discussed. In fact, there is a section of the documentation in which this is discussed as a possible extension. It wouldn't be all that hard to implement - but no one has shown any interest in doing it.
Your response to David seemed to be essentially "too bad, maybe you can find a way around it yourself", so I can't see that (1) is being taken very seriously.
If I had an easy answer, honestly I would share it. Really. I don't. Sorry.
Fair enough, but then it should be stated clearly in the documentation: "Archives created by one version of Boost.Serialization are *not* guaranteed to be readable by subsequent versions of Boost.Serialization."
Hmmm - I might be willing to say a) that the intention is to make such a guarantee b) and every effort has been made to that end c) and that every attempt has been made to anticipate the usage of the library d) and that the library has been in use for many years e) and that versioning is a widely used facility f) that has had very few problems from users g) and that continual efforts are being made to make that guarantee stronger h) but that it's possible that there is something I haven't anticipated which will create a problem. But I suppose that goes without saying.
Of course... The point is that with a robust versioning scheme in place, archive format changes can be implemented without breaking older software.
A robust (and efficient) versioning scheme has been in place since the beginning. It was never designed to be able to hold extra data. It's unfortunate that I didn't trap such an unintended usage. I try really hard - but I haven't been able to trap every case where something is used in a way that doesn't occur to me.
How can you call it robust? It is evidently not providing compatibility in either the backwards, or the forwards, direction.
Honestly, I can't help but wonder if you've read the documentation or used the library.

Robert Ramey