In Boost 1.66.0, there was a fix to date_time for the year 2038 problem: https://github.com/boostorg/date_time/pull/35

The serialization code in date_time had no static assertions enforcing the size of data values, and adding proper versioning support was missed by me in the release and not picked up in code review. As a result, deserializing anything written by earlier Boost versions fails for date_time related structures, which might be considered serious enough to justify a 1.66.1 release. (We could also get the latest msvc compatibility in there, I suppose...)

I'm adding proper versioning right now; however, Boost 1.66.0 will remain a problem since it serializes in/out as version 0 with 64-bit values, while 1.65.1 serializes in/out version 0 with 32-bit values, so I'm not certain there will be a way to detect this issue. The issue is recorded on GitHub here: https://github.com/boostorg/date_time/issues/56 Any additional thoughts are appreciated.

I'm looking at adding a test that writes out a version 0 archive and then reads it in when the current version is 1, but I don't want to add test-only support code to the headers that get shipped. So if anyone can point me at a current example unit test that writes out a version "n-1" and then reads in version "n" with boost::serialization, I would appreciate it.

Thanks,
Jim
On 12/23/17 7:58 AM, James E. King, III via Boost wrote:
In Boost 1.66.0, there was a fix to date_time for the year 2038 problem:
<snip> Personally, I think any effort toward making a point release would be better spent on the next release. In other words, I don't think this should be done. What I think would be better is:

a) make changes in the develop branch as soon as is convenient
b) watch the test matrix and the results from the CI implementations
c) when one is satisfied with these results, merge to master

If someone has an urgent need to include these latest fixes, he can merge the latest master to his local system before the next release. Of course he should run the test suite on this new version of the master on his local system.

As Boost gets bigger, managing a release where everything is "perfect" cannot scale. Thus we are seeing the difficulty of making such a release increase over time. I believe a better goal would be to "allow users to have a perfect implementation of the libraries they use" on their own schedule. I think this is more realistic, practical, and timely for users, and easier for maintainers. I recognize that it's a radical change:

a) It effectively decouples Boost libraries from each other - some more than others.
b) It places more responsibility on users to test the libraries they use on the systems they use them on.
c) b) above requires infrastructure - procedures, utilities and practices - that is either not defined or not documented.

But not moving in this direction is holding us back.
I'm looking at adding a test to write out a version 0 and then read it in when the current version is 1, but I don't want to add test-only support code to the headers that get shipped, so if anyone can point me at a current example unit test that writes out a "n-1" version and then reads in a "n" version with boost::serialization I would appreciate it.
This is something which would be very useful and has always been desired.

a) It would be quite a bit of work to implement such a series of tests
b) and to describe how to set it up
c) The serialization library already places a lot of burden on the testing infrastructure. This would increase that load significantly.

Robert Ramey
Thanks,
Jim
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
On Sat, Dec 23, 2017 at 11:25 AM, Robert Ramey via Boost < boost@lists.boost.org> wrote:
On 12/23/17 7:58 AM, James E. King, III via Boost wrote:
I'm looking at adding a test to write out a version 0 and then read it in when the current version is 1, but I don't want to add test-only support code to the headers that get shipped, so if anyone can point me at a current example unit test that writes out a "n-1" version and then reads in a "n" version with boost::serialization I would appreciate it.
This is something which would be very useful and has always been desired.
a) It would be quite a bit of work to implement such a series of tests b) and to describe how to set it up c) The serialization library already places a lot of burden on the testing infrastructure. This would increase that load significantly.
Robert Ramey
Two ways I can see of doing this:

1. Craft a version 0 output stream manually, or with the help of serialization code directly in the test, then read it in.

2. Typically the conditionals on version are on the load() side; is there a way to tell the archive to:

   ar << use_version<boost::posix_time::ptime, 0>() << ptime(some_value)

and add conditional code to the save() code path so that it would write out version 0 like the previous release did? I'm not sure it's typical to see the version number actually used in the save() code path. That would be pretty useful and would allow for understandable unit tests, but I don't know how easy it would be to implement.

- Jim
On 12/23/17 9:08 AM, James E. King, III via Boost wrote:
Two ways I can see doing this:
1. Craft a version 0 output stream manually or with the help of serialization code directly in the test, then read it in.
Actually, you'd need test archives for ALL previous versions. This would be quite a thing to manage.
2. Typically the conditionals for version are on the load() side; is there a way to tell archive to: ar << use_version<boost::posix_time::ptime, 0>() << ptime(some_value) and add conditional code to the save() code path so that it would write out a version 0 like the previous release did? I'm not sure it's typical to see the version number actually used in the save() code path. That would be pretty useful and allow for understandable unit tests, but I don't know how easy it would be to implement.
The capability to produce previous versions of an archive has been requested multiple times. The first few times I was dismissive of the idea. Once I was goaded into looking into it, I could see that it was in fact doable, and there was a natural interface through the archive opening flags. In other words, it was possible to implement this without turning the whole serialization library into an unholy mess - assuming one doesn't believe that it already is. But it would still be a lot of tricky work, a large increment in testing, etc. And as one might imagine, I would be reluctant to undertake a large investment of effort to implement an enhancement to the library which would be of value to only a few people. Of course, if someone were to offer me (a lot of) money ...

Robert Ramey
- Jim
On Sat, Dec 23, 2017 at 12:19 PM, Robert Ramey via Boost < boost@lists.boost.org> wrote:
And as one might imagine, I would be reluctant to undertake a large investment of effort to implement an enhancement to the library which would be of value to only a few people. Of course, if someone were to offer me (a lot of) money ...
Robert Ramey
It would be useful to anyone writing a unit test for any class that uses Boost.Serialization with versioned output - not just a few people. - Jim
On Sat, Dec 23, 2017 at 12:51 PM, Peter Dimov via Boost < boost@lists.boost.org> wrote:
James E. King, III wrote:
1. Craft a version 0 output stream manually or with the help of
serialization code directly in the test, then read it in.
This looks perfectly sensible to me. Is there a problem with it?
I have a pull request out that does this, please check it out: https://github.com/boostorg/date_time/pull/58 - Jim
James E. King, III wrote:
1. Craft a version 0 output stream manually or with the help of serialization code directly in the test, then read it in.
This looks perfectly sensible to me. Is there a problem with it?
I have a pull request out that does this, please check it out:
I agree with Steven - I think that the v0 serialized source should not be generated by the current codebase, but generated once by the old codebase and then preserved and hardcoded into the test. Otherwise you're not testing whether you can read old files - you're testing whether you can read new-compatible-with-old files.
On 12/23/17 9:51 AM, Peter Dimov via Boost wrote:
James E. King, III wrote:
1. Craft a version 0 output stream manually or with the help of serialization code directly in the test, then read it in.
This looks perfectly sensible to me. Is there a problem with it?
LOL - I'm not sure whether you're thinking of a one-time effort or a general solution. In either case, I'm thinking it would be more effort than first meets the eye. Another way would be to think about enhancing the test infrastructure to save/maintain older versions of the test output files. This would be pretty useful, but looks like quite an effort with lots of resources. Robert Ramey
AMDG On 12/23/2017 11:25 AM, Robert Ramey via Boost wrote:
On 12/23/17 9:51 AM, Peter Dimov via Boost wrote:
James E. King, III wrote:
1. Craft a version 0 output stream manually or with the help of serialization code directly in the test, then read it in.
This looks perfectly sensible to me. Is there a problem with it?
LOL - I'm not sure whether you're thinking of a one-time effort or a general solution. In either case, I'm thinking it would be more effort than first meets the eye.
Another way would be to think about enhancing the test infrastructure to save/maintain older versions of the test output files. This would be pretty useful, but looks like quite an effort with lots of resources.
I think something like this should be stored in git along with the tests, rather than being maintained directly by the test infrastructure. (We really want the format to be memorialised for all time, regardless of future changes to the test system.) It should be pretty simple to set up for portable archive formats. The main problem that I see is sorting out which binary archive is for the current platform. Pseudo-code:

dir = guess_archive_directory_for_platform()
test_loading_all_old_archives(dir)
write_current_archive_format(new_file)
if new_file is not identical to one of the old archives then
    error: "archive format has changed"

In Christ,
Steven Watanabe
On 12/23/17 10:49 AM, Steven Watanabe via Boost wrote:
I think something like this should be stored in git along with tests, rather being maintained directly by the test infrastructure.
I'm inclined to fuse the following:

a) test system
b) build system
c) source control/git/...

This kind of makes sense, since each of these depends on the same file layout, naming conventions, scripts, etc. I agree with your statement above, and with Peter's opinion on the same subject. That is, writing code to create previous versions would really be an orthogonal issue to testing the ability to read previous archive versions. The latter would/should/could be implemented by making the "test infrastructure" that much more elaborate. It doesn't take too much imagination to figure out how much enthusiasm I or anyone else could muster up for that.

(We really want the format to be memorialised for all time, regardless of future changes to the test system). It should be pretty simple to set up for portable archive formats.
Right. I'm not even sure it could be done for binary archives given the evolution in compilers/libraries etc.
The main problem that I see is sorting out which binary archive is for the current platform.
Since binary archives are not likely to be used for longer-term storage, the problem would/could be ignored for this case. Of course, someone would raise the issue of using serialization to pass data on the wire between different versions of an application.

This is all quite interesting. But I'm inclined to just accept the fact that I/we can't solve every problem. Since I burned myself on this in a bad way in May 2010, the number of cases where this problem has actually occurred has been very, very small. Somehow I'm thinking that it won't happen to Mr. King again, at least.

Robert Ramey
AMDG On 12/23/2017 02:25 PM, Robert Ramey via Boost wrote:
On 12/23/17 10:49 AM, Steven Watanabe via Boost wrote:
(We really want the format to be memorialised for all time, regardless of future changes to the test system). It should be pretty simple to set up for portable archive formats.
Right. I'm not even sure it could be done for binary archives given the evolution in compilers/libraries etc.
Actually, I think you could make it work, if you make the key a list of the sizes of all known built in types + various standard typedefs (size_t, ptrdiff_t, streamsize). New toolchains that change any of this will automatically go into a new bin. We can at least guarantee that the archives are readable between ABI compatible toolchains (which is the most that we can reasonably expect for binary archives).
The main problem that I see is sorting out which binary archive is for the current platform.
since binary archives are not likely to be used for longer term storage, the problem would/could be ignored for this case. Of course someone would raise the issue about using serialization to pass data on the wire between different versions of the application.
This is all quite interesting. But I'm inclined to just accept the fact that I/we can't solve every problem. Since I burned myself on this in a bad way in May 2010, the number of cases where this problem has actually occurred has been very, very small. Somehow I'm thinking that it won't happen to Mr. King again, at least.
In Christ, Steven Watanabe
On Sat, Dec 23, 2017 at 4:54 PM, Steven Watanabe via Boost < boost@lists.boost.org> wrote:
AMDG
On 12/23/2017 02:25 PM, Robert Ramey via Boost wrote:
On 12/23/17 10:49 AM, Steven Watanabe via Boost wrote:
(We really want the format to be memorialised for all time, regardless of future changes to the test system). It should be pretty simple to set up for portable archive formats.
Right. I'm not even sure it could be done for binary archives given the evolution in compilers/libraries etc.
Actually, I think you could make it work, if you make the key a list of the sizes of all known built in types + various standard typedefs (size_t, ptrdiff_t, streamsize). New toolchains that change any of this will automatically go into a new bin. We can at least guarantee that the archives are readable between ABI compatible toolchains (which is the most that we can reasonably expect for binary archives).
The main problem that I see is sorting out which binary archive is for the current platform.
since binary archives are not likely to be used for longer term storage, the problem would/could be ignored for this case. Of course someone would raise the issue about using serialization to pass data on the wire between different versions of the application.
This is all quite interesting. But I'm inclined to just accept the fact that I/we can't solve every problem. Since I burned myself on this in a bad way in May 2010, the number of cases where this problem has actually occurred has been very, very small. Somehow I'm thinking that it won't happen to Mr. King again, at least.
In Christ, Steven Watanabe
Much of the issue I'm trying to fix could have been avoided if the serialization code in date_time had static assertions on the size of each field being written. I added that, and I recommend everyone else add the same to their serialization implementations.

Something interesting happened with my pull request. On my local msvc-14.1 system I added some test checks for binary serialization of date_time. Version 0 is 62 bytes and version 1 is 74 bytes. 12 bytes makes sense, since a field used three times was changed from boost::int32_t to boost::int64_t. That said, my builds on Travis CI all failed, claiming the size is 66 and not 62! See: https://travis-ci.org/boostorg/date_time/jobs/320724282#L1558

This leads me to wonder whether serialization is compatible across platforms, and whether it is *supposed* to be compatible across platforms. Is this a bug? Shouldn't binary serialization be the same size on every platform, assuming platform-specific types are not used? (Maybe that's the issue - perhaps not every type is platform-agnostic. I wonder if the date_time documentation makes any claims about cross-platform binary serialization, or if one is supposed to always use text or xml serialization to achieve this.)

- Jim
James E. King, III wrote:
This leads me to wonder whether serialization is compatible across platforms, and whether it is *supposed* to be compatible across platforms. Is this a bug? Shouldn't binary serialization be the same size on every platform, assuming platform-specific types are not used?
As far as I know, the built-in binary archives are completely platform-dependent, in both size and endianness. It should be possible to create portable ones based on Endian's buffer types, one just needs to choose an appropriate size for 'int' (32 bits would probably be fine here) and 'long' (32, 48, or 64? hard to say.) Another approach is to make the portable archives only take uintN_t. This takes care of the size, but since those types are typedefs, if you serialize an int, you still get a non-portable result. Myself, I prefer to use the basic types.
On 24-12-17 18:04, Peter Dimov via Boost wrote:
This leads me to wonder whether serialization is compatible across platforms, and whether it is *supposed* to be compatible across platforms. Is this a bug? Shouldn't binary serialization be the same size on every platform, assuming platform-specific types are not used?
That's documented http://www.boost.org/doc/libs/1_66_0/libs/serialization/doc/todo.html#portab... Anyways, there's EOS Portable Archive https://epa.codeplex.com/
On 12/24/17 9:07 AM, Seth via Boost wrote:
On 24-12-17 18:04, Peter Dimov via Boost wrote:
This leads me to wonder whether serialization is compatible across platforms, and whether it is *supposed* to be compatible across platforms. Is this a bug? Shouldn't binary serialization be the same size on every platform, assuming platform-specific types are not used?
That's documented http://www.boost.org/doc/libs/1_66_0/libs/serialization/doc/todo.html#portab...
Anyways, there's EOS Portable Archive https://epa.codeplex.com/
This is a well done package, which I believe was inspired by the portable binary archive example in the serialization library. I made efforts toward integrating this into the current serialization package, but it wasn't a trivial fit so I had to set it aside... for now. But I'm convinced that the serialization library needs a couple of enhancements:

a) a portable_binary_archive equivalent to EOS above
b) a JSON archive

I've shied away from promoting portable_binary_archive to "official" status, as the main issue was dealing with floating point in a portable way. This is a lot trickier than first meets the eye. But fortunately or unfortunately, I've found myself getting involved with this at a low level, so I might make moves in this direction. Lately, I've received more PRs from interested parties. This has inspired me to invest a little more effort in keeping the library relevant, at least until the standards committee were to design and implement an official version.

Robert Ramey
On 12/24/17 19:33, Robert Ramey via Boost wrote:
But I'm convinced that the serialization library needs a couple of enhancements:
a) portable_binary_archive equivalent to EOS above b) JSON archive
Both are available here: https://github.com/breese/trial.protocol
On 12/24/17 9:04 AM, Peter Dimov via Boost wrote:
James E. King, III wrote:
This leads me to wonder whether serialization is compatible across platforms, and whether it is *supposed* to be compatible across platforms. Is this a bug? Shouldn't binary serialization be the same size on every platform, assuming platform-specific types are not used?
As far as I know, the built-in binary archives are completely platform-dependent, in both size and endianness.
Correct. Only text-based archives are guaranteed to be portable across platforms, times, versions and whatever.
It should be possible to create portable ones based on Endian's buffer types, one just needs to choose an appropriate size for 'int' (32 bits would probably be fine here) and 'long' (32, 48, or 64? hard to say.)
Look at the portable archive example in the examples and documentation. It's all there. But it's based on a better idea than the one above. Robert Ramey
On Sun, Dec 24, 2017 at 6:04 PM, Peter Dimov via Boost <boost@lists.boost.org> wrote:
It should be possible to create portable ones based on Endian's buffer types, one just needs to choose an appropriate size for 'int' (32 bits would probably be fine here) and 'long' (32, 48, or 64? hard to say.)
48?? ;) How about variable-width encoding? Hard-coding the size of int might be problematic. -- Olaf
On 12/24/17 2:28 PM, Olaf van der Spek via Boost wrote:
On Sun, Dec 24, 2017 at 6:04 PM, Peter Dimov via Boost <boost@lists.boost.org> wrote:
It should be possible to create portable ones based on Endian's buffer types, one just needs to choose an appropriate size for 'int' (32 bits would probably be fine here) and 'long' (32, 48, or 64? hard to say.)
48?? ;)
How about variable-width encoding? Hard-coding the size of int might be problematic.
That's what the sample portable_binary_archive in the serialization library already does. The issue gets stickier for floating point numbers in the portable binary archive. Robert Ramey
On 12/25/17 01:28, Olaf van der Spek via Boost wrote:
On Sun, Dec 24, 2017 at 6:04 PM, Peter Dimov via Boost <boost@lists.boost.org> wrote:
It should be possible to create portable ones based on Endian's buffer types, one just needs to choose an appropriate size for 'int' (32 bits would probably be fine here) and 'long' (32, 48, or 64? hard to say.)
48?? ;)
How about variable-width encoding? Hard-coding the size of int might be problematic.
I think, the main problem is not encoding but what to do if the encoded value does not fit in the type on the decoding end.
On 12/25/17 1:05 AM, Andrey Semashev via Boost wrote:
On 12/25/17 01:28, Olaf van der Spek via Boost wrote:
I think, the main problem is not encoding but what to do if the encoded value does not fit in the type on the decoding end.
In the example portable binary archive, that is handled by an exception object defined for that purpose. Robert Ramey
On 23 December 2017 at 15:58, James E. King, III via Boost <boost@lists.boost.org> wrote:
I'm adding proper versioning right now, however Boost 1.66.0 will remain a problem since it serializes in/out as Version 0 with 64-bit values; and 1.65.1 serializes in/out Version 0 with 32-bit values so I'm not certain there will be a way to detect this issue.
If you want to add something to the release notes, let me know, or create a website pull request as normal. You could also add a patch if you want, and perhaps add something to the news feed. If we're going to create a point release, there are other changes on date_time master, so the best thing is probably to branch date_time at the boost-1.66.0 tag using: git checkout boost-1.66.0 -b fixes/1.66
participants (9)
- Andrey Semashev
- Bjorn Reese
- Daniel James
- James E. King, III
- Olaf van der Spek
- Peter Dimov
- Robert Ramey
- Seth
- Steven Watanabe