[serialization] Reading back old archives

Hello,
I have a problem reading back serialization archives generated by an older version of boost. I used to have a YYYYMMDD class versioning scheme. This broke with boost 1.42, which truncates the class version numbers to 16 bits, causing my code to generate corrupted archives. I fixed my code by implementing my own class versioning when compiling with boost 1.42, so I am again able to create archives that I can read back. Unfortunately this fixes only half of the problem. How can I read back archives that were saved using older releases of the serialization library?
I saw that someone else opened ticket 3990 for the same problem. Here you suggested using the function get_library_version() in the following way:
template<class Archive>
void serialization(Archive & ar, unsigned int version){
    library_version_type library_version = get_library_version
    if(library_version < 6)
        // the library version is a date
        my_date = version;
    else{
        // don't use the boost serialization number, use ours instead
        ar & my_date;
    }
    ar && old_data_items
    if(ar.is_loading)
        if(my_date > "jan 10, 2010")
            ar & new_data_item;
but I can't get this to work: the version argument is already truncated, and can't be interpreted as a date. As the bug reporter said, this is a case of data loss. Should I reopen ticket 3990, or is this code supposed to work with svn (I admit I did not test the trunk)?
Thanks for your help,
David.
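The decision logic of the suggestion quoted from ticket 3990 can be sketched without Boost as plain functions. This is only an illustration under assumptions: resolve_class_date, has_new_data_item, check_examples and the constant name are hypothetical and not part of the library; only the threshold 6, the YYYYMMDD scheme and the 10 January 2010 cut-off come from the message above. In real code the library version would come from the archive (the ticket names get_library_version()), and the sketch only helps if the class version reaches serialize() untruncated, which is exactly what fails here.

```cpp
#include <cassert>
#include <cstdint>

// Threshold taken from the ticket 3990 suggestion: archives whose library
// version is below 6 carry the YYYYMMDD date as the class version.
constexpr std::uint32_t first_library_version_with_own_date = 6;

// Hypothetical helper, not a Boost function: pick the date that describes
// the class layout of the archive being read.
//   library_version : archive format version reported by the library
//   class_version   : the 'version' argument passed to serialize()
//   stored_date     : YYYYMMDD date written by the application itself
constexpr std::uint32_t resolve_class_date(std::uint32_t library_version,
                                           std::uint32_t class_version,
                                           std::uint32_t stored_date)
{
    return library_version < first_library_version_with_own_date
        ? class_version   // old archive: the class version is the date
        : stored_date;    // new archive: use the application's own date
}

// Members added after 10 January 2010 exist only in newer archives.
constexpr bool has_new_data_item(std::uint32_t class_date)
{
    return class_date > 20100110u;
}

// Spot checks of both branches.
inline void check_examples()
{
    assert(resolve_class_date(5, 20090123u, 0) == 20090123u);
    assert(resolve_class_date(6, 0, 20100215u) == 20100215u);
    assert(!has_new_data_item(20090123u));
    assert(has_new_data_item(20100215u));
}
```

With an archive written before the change, resolve_class_date(5, 20090123, 0) yields 20090123 and has_new_data_item reports that the newer member must not be read.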

David Raulo wrote:
Hello,
I have a problem reading back serialization archives generated by an older version of boost. I used to have a YYYYMMDD class versioning scheme. This broke with boost 1.42, which truncates the class version numbers to 16 bits, causing my code to generate corrupted archives. I fixed my code by implementing my own class versioning when compiling with boost 1.42, so I am again able to create archives that I can read back. Unfortunately this fixes only half of the problem. How can I read back archives that were saved using older releases of the serialization library?
I saw that someone else opened ticket 3990 for the same problem. Here you suggested using the function get_library_version() in the following way:
template<class Archive>
void serialization(Archive & ar, unsigned int version){
    library_version_type library_version = get_library_version
    if(library_version < 6)
        // the library version is a date
        my_date = version;
    else{
        // don't use the boost serialization number, use ours instead
        ar & my_date;
    }
    ar && old_data_items
    if(ar.is_loading)
        if(my_date > "jan 10, 2010")
            ar & new_data_item;
but I can't get this to work: the version argument is already truncated, and can't be interpreted as a date. As the bug reporter said, this is a case of data loss. Should I reopen ticket 3990, or is this code supposed to work with svn (I admit I did not test the trunk)?
I've addressed this in the trunk and merged to release. I took special care to address situations such as yours. BUT..., I really haven't a good way to test this. So I would be grateful if you could download the latest release version and verify that this works as I believe it should. If not, we should be able to fix it.
Robert Ramey

On 12 Jul 2010, at 01:16, Robert Ramey wrote:
I've addressed this in the trunk and merged to release. I took special care to address situations such as yours. BUT..., I really haven't a good way to test this. So I would be grateful if you could download the latest release version and verify that this works as I believe it should. If not, we should be able to fix it.
Robert Ramey
Thank you, I will test and report back here. David.

Hi,
Sorry this took so long, I was hit by another, unrelated bug while compiling my code with boost 1.43 (BTW that bug is already tracked in ticket #4351 and concerns mpl, but I suspect many users of your library are at risk since it helps trigger that bug).
On 12 Jul 2010, at 01:16, Robert Ramey wrote:
David Raulo wrote:
I saw that someone else opened ticket 3990 for the same problem. Here you suggested using the function get_library_version() in the following way: [...]
I've addressed this in the trunk and merged to release. I took special care to address situations such as yours. BUT..., I really haven't a good way to test this. So I would be grateful if you could download the latest release version and verify that this works as I believe it should. If not, we should be able to fix it.
Unfortunately it does not work. The serialization library throws an exception before reaching my serialize() method. This happens at the following XML line:
<MyClass class_id="11" tracking_level="1" version="20090123" object_id="_11">
Here is the relevant part of the backtrace:
#8 0x001df891 in boost::serialization::throw_exception<boost::archive::archive_exception> (e=@0xbfffe2d0) at throw_exception.hpp:36
#9 0x001a3cee in boost::archive::detail::iserializer<...>::load_object_data (this=0xbea698, ar=@0xbffff284, x=0x2a19770, file_version=36107) at iserializer.hpp:173
Here you can see that the original version 20090123 got truncated to 36107 == 20090123 & 65535. This in itself might be a bug, I'm not sure.

David Raulo wrote:
Here you suggested using the function get_library_version() in the following way:
[...]
I've addressed this in the trunk and merged to release. I took special care to address situations such as yours. BUT..., I really haven't a good way to test this. So I would be grateful if you could download the latest release version and verify that this works as I believe it should. If not, we should be able to fix it.
Unfortunately it does not work. The serialization library throws an exception before reaching my serialize() method. This happens at the following XML line:
<MyClass class_id="11" tracking_level="1" version="20090123" object_id="_11">
Here is the relevant part of the backtrace:
#8 0x001df891 in boost::serialization::throw_exception<boost::archive::archive_exception> (e=@0xbfffe2d0) at throw_exception.hpp:36
#9 0x001a3cee in boost::archive::detail::iserializer<...>::load_object_data (this=0xbea698, ar=@0xbffff284, x=0x2a19770, file_version=36107) at iserializer.hpp:173
Here you can see that the original version 20090123 got truncated to 36107 == 20090123 & 65535. This in itself might be a bug, I'm not sure.
I can't see this. But I do see how this assertion would be tripped. If you comment out this assertion does this make it work?
Robert Ramey

On 12 Jul 2010, at 22:43, Robert Ramey wrote:
David Raulo wrote:
#8 0x001df891 in boost::serialization::throw_exception<boost::archive::archive_exception> (e=@0xbfffe2d0) at throw_exception.hpp:36
#9 0x001a3cee in boost::archive::detail::iserializer<...>::load_object_data (this=0xbea698, ar=@0xbffff284, x=0x2a19770, file_version=36107) at iserializer.hpp:173
Here you can see that the original version 20090123 got truncated to 36107 == 20090123 & 65535. This in itself might be a bug, I'm not sure.
I can't see this. But I do see how this assertion would be tripped.
In the backtrace, the value of the file_version argument is 36107, whereas in the XML archive, the version is 20090123. So it looks to me that something truncates the value to 16 bits.
If you comment out this assertion does this make it work?
Will do tomorrow, and report back. Thanks. David.
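The arithmetic behind that observation can be checked with a few lines of standard C++. This is a minimal sketch of what storing a 32-bit value in a 16-bit unsigned type does; truncate_to_16_bits and check_truncation are hypothetical helpers written for this illustration, not library code.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 16 bits of a value, which is what happens when a
// 32-bit number is stored in a 16-bit unsigned type.
constexpr std::uint16_t truncate_to_16_bits(std::uint32_t value)
{
    return static_cast<std::uint16_t>(value & 0xFFFFu);
}

// The version written in the XML archive and the value seen in the backtrace.
static_assert(truncate_to_16_bits(20090123u) == 36107u,
              "20090123 & 65535 == 36107");

// Distinct 32-bit values collapse onto the same 16-bit value, so the
// original number cannot be reconstructed after truncation.
static_assert(truncate_to_16_bits(20090123u + 65536u) == 36107u,
              "values 65536 apart are indistinguishable after truncation");

// Runtime spot check of the same relationship.
inline void check_truncation()
{
    assert(truncate_to_16_bits(20090123u) == (20090123u % 65536u));
}
```

Since 20090123 = 306 * 65536 + 36107, the upper bits (the 306) are discarded, which is why the date cannot be recovered from the truncated value alone.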

Hello,
On 12 Jul 2010, at 22:43, Robert Ramey wrote:
David Raulo wrote:
#8 0x001df891 in boost::serialization::throw_exception<boost::archive::archive_exception> (e=@0xbfffe2d0) at throw_exception.hpp:36
#9 0x001a3cee in boost::archive::detail::iserializer<...>::load_object_data (this=0xbea698, ar=@0xbffff284, x=0x2a19770, file_version=36107) at iserializer.hpp:173
Here you can see that the original version 20090123 got truncated to 36107 == 20090123 & 65535. This in itself might be a bug, I'm not sure.
I can't see this. But I do see how this assertion would be tripped. If you comment out this assertion does this make it work?
Then the execution proceeds without exceptions, my serialize() method is called, but I verified that it receives 36107 as the version instead of 20090123. David Raulo.

Hi,
On 13 Jul 2010, at 14:03, David Raulo wrote:
On 12 Jul 2010, at 22:43, Robert Ramey wrote:
David Raulo wrote:
Here you can see that the original version 20090123 got truncated to 36107 == 20090123 & 65535. This in itself might be a bug, I'm not sure.
I can't see this. But I do see how this assertion would be tripped. If you comment out this assertion does this make it work?
Then the execution proceeds without exceptions, my serialize() method is called, but I verified that it receives 36107 as the version instead of 20090123.
So, what can I do to help?
In my view there are 2 problems:
- since boost 1.42, the version numbers are internally truncated to 16 bits when reading archives; this causes data loss, which the user might be able to work around in some cases.
- since boost 1.43, reading these archives will raise an assertion (iserializer.hpp:173) before the user gets any chance to work around the truncated version numbers.
I addressed the second problem in the attached patch. Comments welcome.
For the first problem, I believe the only solution is to change the definition of version_type in basic_archive.hpp back to uint_least32_t. This would fix a data-loss regression, without any downside that I can think of. Please note that your wish to prevent any new version number bigger than 255 would still be enforced by the static_assert in serialization/version.hpp you introduced with boost 1.43.
Is that acceptable?
Thanks for your help.
David.
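The point that a compile-time cap on new version numbers is independent of the width used when reading old ones can be illustrated with a small sketch. Everything here is a hypothetical stand-in (stored_version_type, declared_version, read_version, check_independence); it does not reproduce the definitions in basic_archive.hpp or serialization/version.hpp, it only assumes a 255 cap and a 32-bit read type as described in the message above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the type used to hold a version read from an archive.
typedef std::uint_least32_t stored_version_type;

// Hypothetical cap on newly declared class versions.
constexpr unsigned int max_new_class_version = 255;

// New versions are declared through this template, so an oversized value
// is rejected when the program is compiled, not when an archive is read.
template<unsigned int N>
struct declared_version
{
    static_assert(N <= max_new_class_version,
                  "newly declared class versions must not exceed 255");
    static constexpr unsigned int value = N;
};

// Reading keeps whatever number the old archive contains, unmodified.
inline stored_version_type read_version(std::uint_least32_t raw)
{
    return raw;
}

// Spot check: a legacy date-style version survives reading, while new
// declarations stay within the cap.
inline void check_independence()
{
    assert(read_version(20090123u) == 20090123u);
    assert(declared_version<3>::value == 3u);
}
```

In this sketch, declared_version<20090123> would fail to compile, while read_version(20090123) still returns the full value, so the cap on new code does not require truncating what old archives contain.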

David Raulo wrote:
Hi,
On 13 Jul 2010, at 14:03, David Raulo wrote:
On 12 Jul 2010, at 22:43, Robert Ramey wrote:
David Raulo wrote:
Here you can see that the original version 20090123 got truncated to 36107 == 20090123 & 65535. This in itself might be a bug, I'm not sure.
I can't see this. But I do see how this assertion would be tripped. If you comment out this assertion does this make it work?
Then the execution proceeds without exceptions, my serialize() method is called, but I verified that it receives 36107 as the version instead of 20090123.
So, what can I do to help?
In my view there are 2 problems:
- since boost 1.42, the version numbers are internally truncated to 16 bits when reading archives; this causes data loss, which the user might be able to work around in some cases.
- since boost 1.43, reading these archives will raise an assertion (iserializer.hpp:173) before the user gets any chance to work around the truncated version numbers.
I addressed the second problem in the attached patch. Comments welcome.
For the first problem, I believe the only solution is to change the definition of version_type in basic_archive.hpp back to uint_least32_t. This would fix a data-loss regression, without any downside that I can think of. Please note that your wish to prevent any new version number bigger than 255 would still be enforced by the static_assert in serialization/version.hpp you introduced with boost 1.43.
Is that acceptable?
I have been working on this. I thought I had it. I hadn't realized that the issue arose prior to 1.43. I think that your analysis is correct. I'll take yet another look at this. Thanks for your patience.
Thanks for your help.
David.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

David Raulo wrote:
Hi,
In my view there are 2 problems:
- since boost 1.42, the version numbers are internally truncated to 16 bits when reading archives; this causes data loss, which the user might be able to work around in some cases.
- since boost 1.43, reading these archives will raise an assertion (iserializer.hpp:173) before the user gets any chance to work around the truncated version numbers.
I've suppressed this.
I addressed the second problem in the attached patch. Comments welcome.
For the first problem, I believe the only solution is to change the definition of version_type in basic_archive.hpp back to uint_least32_t. This would fix a data-loss regression, without any downside that I can think of.
Unfortunately this would break binary archives created by versions 1.42 and 1.43. I've looked more carefully into this: I've discovered that class_id_type has the same problem. The current trunk version is now passing all the tests I expect it to pass.
That leaves the problem of reading version 6 binary archives, of which there are two types. I made the 1.44 library version 8. That leaves version 7 available for the "mis-labeled" binary archives. I've included in the example directory a program called fix_six. This program will change the library version # from 6 to 7. So if one has a problem with a version 6 binary archive, it should be addressable by running fix_six <filename>. Then everything should be fine.
So, what can I do to help?
I'd appreciate it if you or anyone else wants to try the trunk and the fix_six program to verify that this actually addresses the issue and actually makes these broken archives readable. One could also verify that no information is lost when reading back the version #. That is, internal to the library, the version has been left at int32 so that it can be detected as originally written.
Robert Ramey

Hi,
On 20 Jul 2010, at 00:24, Robert Ramey wrote:
I'd appreciate it if you or anyone else wants to try the trunk and the fix_six program to verify that this actually addresses the issue and actually makes these broken archives readable.
One could also verify that no information is lost when reading back the version #. That is, internal to the library, the version has been left at int32 so that it can be detected as originally written.
I confirm that I am now able to access the original version number. Thanks a lot for fixing this! I will try the fix_six program on binary archives as soon as I can. Will this code be included in the 1.44 boost release? Best regards, David Raulo

David Raulo wrote:
Hi,
On 20 Jul 2010, at 00:24, Robert Ramey wrote:
I'd appreciate it if you or anyone else wants to try the trunk and the fix_six program to verify that this actually addresses the issue and actually makes these broken archives readable.
One could also verify that no information is lost when reading back the version #. That is, internal to the library, the version has been left at int32 so that it can be detected as originally written.
I confirm that I am now able to access the original version number. Thanks a lot for fixing this!
Thanks even more for testing this! In fact, if you can test this some more and on other archives I would be very grateful.
I will try the fix_six program on binary archives as soon as I can. Will this code be included in the 1.44 boost release?
YES Robert Ramey
Participants (2): David Raulo, Robert Ramey