Re: [Boost-users] [serialization] class versioning changes in boost 1.42

25 Feb 2010

Jarl Lindrud wrote:
...
...
The version # was envisioned as a small integer.  All the examples,
tests, and demos used this.  The problem comes about because it was
unanticipated that someone would want to include actual data (i.e. a
date in this case) in a version #.  Note that this is the first time
in 9 years that this has come up.  So I think it's a little much to
characterize it as a bug in the library.  It would better be called an
unanticipated usage of the version #.
IIUC, in 1.41.0 and earlier, the version number was an int. In
1.42.0, it is now 16 bits, which is a breaking change on just about
every platform.
The version # has always been 16 bits.  The binary archive has
always stored 16 bits for the version #.  The code used an int -
whose size varies from 16 to 64 bits depending on the platform.
Text archives convert the int to a string and this conversion doesn't
trap when the number exceeds 16 bits.
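
A minimal sketch of the difference described above, using only the
standard library. It is not Boost.Serialization code: the helper names
and the date-style value 20100225 are assumptions made up for
illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Binary-style storage (assumed layout): only the low 16 bits are kept.
inline std::uint16_t store_as_16_bits(std::uint32_t version) {
    return static_cast<std::uint16_t>(version & 0xFFFFu);
}

// Text-style storage: the full value is written as decimal digits,
// so an oversized version number passes through unnoticed.
inline std::string store_as_text(std::uint32_t version) {
    return std::to_string(version);
}

inline void check_storage() {
    const std::uint32_t date_as_version = 20100225u;  // hypothetical date-style value
    assert(store_as_16_bits(3u) == 3u);                    // small versions survive
    assert(store_as_16_bits(date_as_version) == 46209u);   // high bits are lost
    assert(store_as_text(date_as_version) == "20100225");  // text keeps every digit
}
```

The text path round-trips the large value, which is why the problem
stayed hidden for a user who only ever used text archives.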
...
The responsibility of dealing with this archive
format change surely lies with Boost.Serialization itself?
There is no format change in the library.
...
Or do
Boost.Serialization users need to know that archives they write are
not necessarily readable by later versions?
Hmmm - storing a 32 bit integer in a value saved as a 16 bit
value (binary_archive) is not a good idea.  I recognize that it
was not obvious when one did that and that it could work in
some cases - such as this user's.  That's exactly what the level
4 warning was telling me.  So I fixed the code to suppress the
warning, and here we are.
...
I can't see much middle ground here - either you're backwards
compatible, or you're not.
lol - no question about that.
...
...
...
If not, how do you make changes
to the archive format (e.g. the change David found in 1.42.0)
without breaking old archives?
This is described in the documentation.  The version # is maintained
on a class basis and is completely independent of any other number
such as program or boost version.  A little reflection should make
it clear why it pretty much has to be this way.
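
A minimal sketch of per-class versioning, written against the standard
library only. The type Route, its fields, and the save/load functions
are hypothetical and are not the library's API; they only illustrate a
small integer that belongs to one class and selects which fields to
read.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Version of the Route class only; unrelated to any program or Boost version.
constexpr unsigned int route_class_version = 1;

struct Route {
    int stops = 0;     // present since class version 0
    int capacity = 0;  // added in class version 1
};

inline void save(std::ostream& os, const Route& r) {
    os << route_class_version << ' ' << r.stops << ' ' << r.capacity << '\n';
}

inline bool load(std::istream& is, Route& r) {
    unsigned int stored = 0;
    if (!(is >> stored >> r.stops)) return false;
    if (stored > route_class_version) return false;  // written by newer code
    if (stored >= 1) {
        if (!(is >> r.capacity)) return false;
    } else {
        r.capacity = 0;  // field did not exist yet: use a default
    }
    return true;
}

inline void check_route_versions() {
    std::istringstream old_data("0 12");  // archive from class version 0
    Route r;
    assert(load(old_data, r) && r.stops == 12 && r.capacity == 0);
}
```

Newer code can read data written with an older class version, while a
stored version higher than the one the code knows about is rejected.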
...
I'm talking about changes within Boost.Serialization itself, not
changes to user-defined types. The 32-bit-to-16-bit change that
triggered this discussion, is a good example. How will
Boost.Serialization in the future, know whether to read a 16 or 32
bit version number, from an archive?
In this particular case, the situation is not that bad.  This particular
code has only been tested with text archives. (It would break
immediately with binary ones).  So the only issue is what
size should the version # be read into.  Even here it's a specific
case, since on a machine with a 16 bit int the user's code would
have already failed.  I'm still thinking about this, but I can
see that reading the version # into an int rather than an int_least16_t
would solve his problem - though it wouldn't address the
other issues I've mentioned.  I'll consider this for version 1.43.
This would permit him to load old archives.
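
A minimal sketch of why the width of the variable matters when a text
archive is read back. It uses a plain std::istringstream rather than a
Boost archive, and read_version is a hypothetical helper. It assumes
int is wider than 16 bits and that int_least16_t is a 16 bit type on
the platform.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Parse a decimal version number into T; returns false if extraction fails,
// which includes a value that does not fit in T.
template <class T>
bool read_version(const std::string& text, T& out) {
    std::istringstream in(text);
    in >> out;
    return !in.fail();
}

inline void check_read_width() {
    std::int_least16_t narrow = 0;
    int wide = 0;
    assert(!read_version("20100225", narrow));  // too large: stream sets failbit
    assert(read_version("20100225", wide) && wide == 20100225);
    assert(read_version("3", narrow) && narrow == 3);  // small values read either way
}
```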

1.42 will trap when a version # exceeds 16 bits.  I wouldn't
expect this to change though.  So the problem of how to
use the version # will have to be dealt with.
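
A minimal sketch of one way to trap an oversized version number. How
the library implements its own check is not shown in this thread; the
helper checked_version and the use of an exception are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Narrow a version number to 16 bits, refusing values that would be truncated.
inline std::uint16_t checked_version(std::uint32_t version) {
    if (version > std::numeric_limits<std::uint16_t>::max()) {
        throw std::out_of_range("class version does not fit in 16 bits");
    }
    return static_cast<std::uint16_t>(version);
}

inline void check_trap() {
    assert(checked_version(7u) == 7u);
    bool trapped = false;
    try {
        checked_version(20100225u);  // hypothetical date-style version
    } catch (const std::out_of_range&) {
        trapped = true;
    }
    assert(trapped);
}
```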
...
If it always reads a 16 bit
version number, then you've broken compatibility with all pre-1.42.0
archives. If it always reads a 32 bit version number, then you've
broken compatibility with 1.42.0.
To deal with this, you really need to know which version of Boost was
used to create the archive.
There is a mechanism for addressing these kinds of issues - it's the
library version # as described in the documentation.  So far, that #
is up to 4.
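
A minimal sketch of how a format number stored in the archive could
select the width of a field on load. Everything here is hypothetical:
the header layout, the little-endian byte order, and the cut-over value
5 are invented for illustration and do not describe any real
Boost.Serialization archive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cut-over: formats below this number stored a 32 bit class version.
constexpr std::uint32_t first_narrow_format = 5;

// Read `width` little-endian bytes starting at `pos`, then advance `pos`.
inline std::uint32_t read_le(const std::vector<std::uint8_t>& buf,
                             std::size_t& pos, std::size_t width) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value |= static_cast<std::uint32_t>(buf.at(pos + i)) << (8 * i);
    }
    pos += width;
    return value;
}

// Assumed header: a 2 byte format number, followed by the class version.
inline std::uint32_t read_class_version(const std::vector<std::uint8_t>& buf) {
    std::size_t pos = 0;
    const std::uint32_t format = read_le(buf, pos, 2);
    const std::size_t width = (format < first_narrow_format) ? 4 : 2;
    return read_le(buf, pos, width);
}

inline void check_format_dispatch() {
    const std::vector<std::uint8_t> old_archive = {0x04, 0x00, 0x81, 0xB4, 0x32, 0x01};
    const std::vector<std::uint8_t> new_archive = {0x05, 0x00, 0x03, 0x00};
    assert(read_class_version(old_archive) == 20100225u);  // 32 bit field
    assert(read_class_version(new_archive) == 3u);         // 16 bit field
}
```

The reader needs no knowledge of which program wrote the data; the
format number in the archive itself decides how the next field is read.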
...
...
...
How would an application ever be able to exchange data with older,
deployed, versions of itself, without this capability?
Again, a little reflection will make it clear that an older version
of a program can't anticipate changes in a subsequent version.  I'm
sorry - it's just logically not possible.  Think about it.
Do you realize that e.g. Microsoft Word 2007 can be instructed to
save files in such a way that they can be loaded with Word 2003? What
is logically impossible about that?
Can Microsoft Word 2003 load files created with Microsoft Word 2007?
That is what we're talking about here.

The question of being able to create previous versions has been
discussed. In fact, there is a section of the documentation in
which this is discussed as a possible extension.  It wouldn't be
all that hard to implement - but no one has shown any interest
in doing it.
...
...
...
Your response to David seemed to be essentially "too bad, maybe you
can find a way around it yourself", so I can't see that (1) is being
taken very seriously.
If I had an easy answer, honestly I would share it.  Really.  I
don't. Sorry.
Fair enough, but then it should be stated clearly in the
documentation: "Archives created by one version of
Boost.Serialization are *not* guaranteed to be readable by subsequent
versions of Boost.Serialization.".
Hmmm - I might be willing to say
a) that the intention is to make such a guarantee
b) and every effort has been made to that end
c) and that every attempt has been made to anticipate the
usage of the library
d) and that the library has been in usage for many years
e) and that versioning is a widely used facility
f) that has had very few problems from users
g) and that continual efforts are being made to make that
guarantee stronger
h) but that it's possible that there is something I haven't
anticipated which will create a problem.

But I suppose that goes without saying.
...
...
...
Of course... The point is that with a robust versioning scheme in
place, archive format changes can be implemented without breaking
older software.
A robust (and efficient) versioning scheme has been in
place since the beginning.  It was never designed to be able to hold
extra data.  It's unfortunate that I didn't trap such an unintended
usage. I try really hard - but I haven't been able to trap every
case where something is used in a way that doesn't occur to me.
How can you call it robust? It is evidently not providing
compatibility in either the backwards, or the forwards, direction.
Honestly, I can't help but wonder if you've read the documentation
or used the library.

Robert Ramey