cross-platform binary serialization?


Ákos Maróy

8 Aug 2008

10:01 a.m.

Hi,

I'm working on a cross-platform project, where we're serializing data and sending it over TCP, using boost::serialization and boost::asio. As the data is quite big, and is mainly numeric, binary serialization provides a nice advantage over XML-based serialization, both in terms of performance and data size.

But I've found that the serialized data itself is not cross-platform. E.g. if I have a simple app that sends serialized data via a TCP connection, it works fine as long as both ends of the connection are on the same platform (say both are Linux x86_64, or both are Windows XP). But different platforms don't interoperate: the data sent by one platform is not accepted 'as is' by the other.

I wonder what provisions have to be made to achieve binary serialization that would work in such a context?

Akos


Mathieu Peyréga

8 Aug

10:05 a.m.

Hello,

have you checked the portable binary archive (in the examples of the serialization lib source directory)? I had a similar issue with non-portable serialized files, and it solved it!

By the way, I'm wondering how many such undocumented treasures are hiding in the amazing source code maze that is the boost source tree :-)

Regards,
Mathieu

Ákos Maróy wrote:

...

Hi,

I'm working on a cross-platform project, where we're serializing data and sending it over TCP, using boost::serialization and boost::asio. As the data is quite big, and is mainly numeric, binary serialization provides a nice advantage over XML-based serialization, both in terms of performance and data size.

But, I've found that the serialized data itself is not cross-platform. E.g. if I have a simple app, that sends serialized data via a TCP connection, it works fine as long as both ends of the connection are on the same platform (say both are Linux x86_64, or both are Windows XP). but they don't interact with each other - the data sent by one platform is not accepted 'as is' by the other.

I wonder what provisions have to be done to achieve binary serialization that would work in such a context?

Akos

Andrea Denzler

10:08 a.m.

If you send a byte, a byte should arrive. If you send a wchar_t, it has a different size on different platforms. If you send an integer, it may have a different endianness (byte order). Maybe this last point is your issue. You have the same problem with file formats.

Andrea

dizzy

10:20 a.m.

On Friday 08 August 2008 13:08:57 Andrea Denzler wrote:

...

If you send a byte, a byte should arrive. If you send a wchar_t it has a different size on different platforms. If you send a integer it has different endianess (byte order). Maybe this last point is your issue. You have the same problem with file formats.

I think he was expecting that binary_archive already has a fixed, portable binary layout. Apparently not (that is news to me too).

As for the other portable archive mentioned, I'm curious about its integral encoding efficiency. Google's Protocol Buffers seem to be very efficient in that regard (both in bytes used for the representation and in encoding/decoding speed): http://code.google.com/apis/protocolbuffers/docs/encoding.html

-- Mihai RUSU Email: dizzy@roedu.net "Linux is obsolete" -- AST

Ákos Maróy

10:59 a.m.

dizzy wrote:

...

On Friday 08 August 2008 13:08:57 Andrea Denzler wrote:

...
If you send a byte, a byte should arrive. If you send a wchar_t it has a different size on different platforms. If you send a integer it has different endianess (byte order). Maybe this last point is your issue. You have the same problem with file formats.

A possible solution could be to implement my own serializers for integral types that always use the very same bit pattern (endianness, size, etc.)?
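As a minimal sketch of what such a hand-written serializer for an integral type could look like: the value is written as exactly four bytes in a fixed (here big-endian) order, independent of the host. The names put_u32_be and get_u32_be are hypothetical, and this is not how boost::serialization or portable_binary_archive implements it.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: append a 32-bit unsigned value as exactly four bytes,
// most significant byte first. The shifts operate on the value, not on
// memory, so the output does not depend on the host's byte order.
inline void put_u32_be(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFFu));
}

// Hypothetical helper: read back four bytes written by put_u32_be,
// starting at offset pos (the caller must ensure pos + 4 <= in.size()).
inline std::uint32_t get_u32_be(const std::string& in, std::size_t pos) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos + i]);
    return v;
}
```

Both sides agree on the size (four bytes) and the order (big-endian) in advance, so a value written on one platform reads back identically on another.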

...

I think he was expecting that binary_archive is already fixed/portable binary layout. Apparently not (that is news for me too).

Yes, basically that was my expectation: that I serialize something, store it, de-serialize it using the _same_ implementation, but maybe compiled for a different platform, and it would work fine.

...

As for the other mentioned portable archive, I'm curious about integral encoding efficiency, Google's Protocol Buffers seem to be very efficient in that regard (both in bytes used for representation but also encoding/decoding operations speed) http://code.google.com/apis/protocolbuffers/docs/encoding.html

Interesting. I'll take a look.

Akos

Andrea Denzler

11:12 a.m.

For everything that is not a byte/char you have to check the encoding on different platforms. With integers you have the endianness problem. With floating-point types you may have different encodings too (I'm not sure whether all the latest compilers use the standard IEEE encoding, but I hope they do). So if you encode your data into a byte/char stream yourself, you will not have these problems. Note that you must solve this even if your application only saves files and you want them to be binary portable across platforms.

The Google protocol is very nice, but I wonder why the first example takes 3 bytes. Isn't it possible to encode it with 2 bytes only? Like with the string "testing", where the header that includes the string length (value 7) is encoded with 2 bytes.

Andrea

-----Original Message-----

a possible solution could be to implement my own serializers for integral types, that use the very same bitwise pattern? (endiannes, size, etc.)

Akos
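On the 3-byte question, a short sketch of the base-128 varint scheme described in the linked encoding document may help. It assumes the example in question is the one that stores the value 150: that value does not fit into 7 bits, so the varint alone needs two bytes, plus one key byte, whereas the length 7 of "testing" fits into a single varint byte. The function name encode_varint is hypothetical and this is not the Protocol Buffers library API.

```cpp
#include <cstdint>
#include <vector>

// Base-128 varint: seven payload bits per byte, least significant group
// first; the high bit of a byte is set when more bytes follow.
inline std::vector<std::uint8_t> encode_varint(std::uint64_t v) {
    std::vector<std::uint8_t> out;
    while (v >= 0x80u) {
        out.push_back(static_cast<std::uint8_t>((v & 0x7Fu) | 0x80u));
        v >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(v));  // last byte, high bit clear
    return out;
}
```

With this scheme encode_varint(150) yields the two bytes 0x96 0x01, while encode_varint(7) yields the single byte 0x07.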

Andrea Denzler

11:26 a.m.

I would also assume that serialization is completely cross-platform as far as the data structures are concerned. In any case, we need a library that solves these problems.

Jeff Flinn

11:37 a.m.

Ákos Maróy wrote:

...

Hi,

...

I wonder what provisions have to be done to achieve binary serialization that would work in such a context?

Look up portable_binary_archive, IIRC in the ...libs/serialization/example directory. Make sure you use the version from trunk, which has had some bugs fixed.

Jeff Flinn

Ákos Maróy

9 Aug

10:50 a.m.

Jeff,

...

lookup portable_binary_archive, IIRC in the ...libs/serialization/example directory. Make sure you use the version from Trunk which has had some bugs fixed.

Thanks, I'll take a look. Can you give me a Subversion URL? The SourceForge site is only displaying error messages :(

Akos

Ákos Maróy

12:54 p.m.

Jeff,

...

lookup portable_binary_archive, IIRC in the ...libs/serialization/example directory. Make sure you use the version from Trunk which has had some bugs fixed.

Meanwhile I found the svn URL for boost, at http://svn.boost.org/svn/boost/trunk

I'm looking at the sample, and I see the following comment:

// "Portable" input binary archive. It addresses integer size and endienness so
// that binary archives can be passed across systems. Note:floating point types
// not addressed here

So I guess floating point type support needs to be added to this implementation? What would be your suggestion for implementing floating point support? Use some portable binary representation for floating point types?

Akos

Andrea Denzler

1:56 p.m.

Most modern platforms use the IEEE 754 (1985) binary encoding for floating-point data (float/double). But I suppose there is still the (easy to handle) endianness issue. http://www.appinf.com/download/FPIssues.pdf

Ákos Maróy

11 Aug

7:24 a.m.

Andrea Denzler wrote:

...

Most modern platforms use the IEEE 754 (1985) encoding for binary data (float/double). But I suppose there is still the (easy to handle) endianess issue. http://www.appinf.com/download/FPIssues.pdf

And how would I go about handling the endianness issue? Is there a compile-time define in boost already that signals the endianness of floating point types? Or is the endianness of the floating point types the very same as for integer types?

I was looking at the implementation of the Qt class QDataStream, which serves a similar purpose. What they simply do is store the 4 bytes for floats in big-endian order (swapping if the system is little-endian). For doubles, they have a more elaborate mapping, handling four cases:

- normal little endian format
- swapped little endian format
- normal big endian format
- swapped big endian format

Their test is rather simple: they take a number whose double representation is equivalent to the character array "0123ABCD0123ABCD\0\0\0\0\0\0\0" in normal little endian format, and then use the pattern resulting on the target system to determine the format of the system itself. But this has to be done at compile time.

My question is: is there some similar compile / configure time test available in boost already, whose result I could use to add float and double support to the portable_binary_archive class, thus making it complete? Or do I have to implement my own tests for this purpose? (This would add complexity to my project, as so far I don't have configure-time tests of my own at all, but depend on boost and similar libraries to provide me with target platform details.)

Akos
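One way to sidestep a configure-time byte-order test, sketched here under explicit assumptions: copy the float's bit pattern into a same-sized unsigned integer and emit that integer with shifts, always most significant byte first. This assumes a 32-bit IEEE 754 float and that the host stores floats and 32-bit integers in the same byte order, which holds on common platforms but is exactly the property asked about above and is not guaranteed everywhere. The names put_float_be and get_float_be are hypothetical; this is neither Qt's nor boost's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <string>

static_assert(sizeof(float) == 4 && std::numeric_limits<float>::is_iec559,
              "sketch assumes a 32-bit IEEE 754 float");

// Assumption: float and std::uint32_t use the same byte order on this host.
inline void put_float_be(std::string& out, float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copy the bit pattern, no conversion
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((bits >> shift) & 0xFFu));
}

// Reads four bytes written by put_float_be, starting at offset pos.
inline float get_float_be(const std::string& in, std::size_t pos) {
    std::uint32_t bits = 0;
    for (std::size_t i = 0; i < 4; ++i)
        bits = (bits << 8) | static_cast<unsigned char>(in[pos + i]);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because the byte order is produced by arithmetic on the integer rather than by inspecting memory, no swap decision has to be made at compile time. The same idea extends to double with a 64-bit integer, subject to the same assumptions.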

Roman Perepelitsa

9:27 a.m.

Jeff Flinn <TriumphSprint2000 <at> hotmail.com> writes:

...

lookup portable_binary_archive, IIRC in the ...libs/serialization/example directory. Make sure you use the version from Trunk which has had some bugs fixed.

Even better, take a look at portable_binary_archive in the file vault: http://preview.tinyurl.com/5j46aq. It correctly handles floating point types as well as integer types. Roman Perepelitsa.

Ákos Maróy

9:37 a.m.

Roman,

...

Even better, take a look at portable_binary_archive in the file vault: http://preview.tinyurl.com/5j46aq. It correctly handles floating point types as well as integer types.

Thanks. How does this compare to the binary archive that's in the boost repository? There it has the following files:

portable_binary_archive.hpp
portable_binary_iarchive.cpp
portable_binary_iarchive.hpp
portable_binary_oarchive.cpp
portable_binary_oarchive.hpp

while yours is merely two header files:

portable_binary_iarchive.hpp
portable_binary_oarchive.hpp

Are these complete replacements for the above?

Akos

Roman Perepelitsa

3:19 p.m.

Ákos Maróy <akos <at> maroy.hu> writes:

...

How does this compare to the binary archive that's in the boost repository?

It's an implementation of a portable binary archive, while the thing you can find in the boost repository is just an example of how one could try to implement a portable binary archive, and is therefore incomplete.

...

there it has the following files:

portable_binary_archive.hpp

This one contains shared code for iarchive and oarchive.

...

portable_binary_iarchive.cpp portable_binary_iarchive.hpp

It's an implementation of iarchive.

...

portable_binary_oarchive.cpp portable_binary_oarchive.hpp

It's an implementation of oarchive.

...

while yours is merely two header files:

portable_binary_iarchive.hpp portable_binary_oarchive.hpp

Well, they are not mine :) There is no need for cpp files because implementation is inline. Also they don't use anything like portable_binary_archive.hpp, but it's an implementation detail anyway.

...

are these complete replacements for the above?

Yes. Roman Perepelitsa.

Ákos Maróy

3:23 p.m.

Roman,

...

...
are these complete replacements for the above?

Yes.

Thanks for the info.. Akos

Ákos Maróy

12 Aug

8:23 p.m.

Roman,

...

Even better, take a look at portable_binary_archive in the file vault: http://preview.tinyurl.com/5j46aq. It correctly handles floating point types as well as integer types.

I'm looking at the portable_binary_archive contents you pointed me to, and it seems to be a bit problematic. I see it was written by people mostly using MS Visual Studio (I guess from the #pragma once line), and the code throws a lot of warnings under gcc: a lot of signed / unsigned comparison warnings, for example, or checks whether an unsigned value is negative. (It also checks the BOOST_VERSION macro without including boost/version.hpp.)

I wonder how stable this code is, and if it is really used among multiple systems. Are you actually using this code?

Akos

Roman Perepelitsa

13 Aug

8:53 a.m.

Ákos Maróy <akos <at> maroy.hu> writes:

...

I'm looking at the portable_binary_archive contents you pointed me to, and it seems to be a bit problematic. I see it was written by people mostly using MS Visual Studio (I guess from the #pragma once line), and the code throws a lot of warnings under gcc: a lot of signed / unsigned comparison warnings, for example, or checks whether an unsigned value is negative. (It also checks the BOOST_VERSION macro without including boost/version.hpp.)

I see. I suppose you are right, original author used this code only with MSVC.

...

I wonder how stable this code is, and if it is really used among multiple systems. Are you actually using this code?

I don't. Cross-platform binary serialization is requested/discussed quite frequently on the boost-users mailing list and, as far as I know, there are only 2 implementations available: one in the file vault and another one in serialization/examples, the former being superior. That's why I pointed you to the version from the file vault. For more info you might want to search the boost-users archive or contact the author of the code (Christian Pfligersdorffer <christian.pfligersdorffer at eos.info>).

Roman Perepelitsa.

Pfligersdorffer, Christian

21 Aug

11:42 a.m.

Hi Roman and Akos!

...

Ákos Maróy <akos <at> maroy.hu> writes:

...
I'm looking at the portable_binary_archive contents you pointed me to, and it seems to be a bit problematic. I see it was written by people mostly using MS Visual Studio (I guess from the #pragma once line), and the code throws a lot of warnings under gcc: a lot of signed / unsigned comparison warnings, for example, or checks whether an unsigned value is negative. (It also checks the BOOST_VERSION macro without including boost/version.hpp.)

I see. I suppose you are right, original author used this code only with MSVC.

And gcc-4, which also supports #pragma once. The use of this construct and the warnings are simply a result of my striving to minimize code length. What I express in code is the algorithmic idea in the shortest possible way I can think of. Let the compiler complain about it; I don't care, because I _know_ what I'm doing and the warnings are spurious. They stem from using one metafunction for all integral types, be they signed or unsigned. Of course you could separate those, but only by introducing verbosity. Originally I even let the integral function treat the bool case, resulting in even more warnings. With the zero bool tweak in mind I decided to write an extra overload for bool. However, if you have suggestions that fix the warnings and come close to a minimal solution, I'll be happy to look into them.

I include enough boost headers which implicitly include version.hpp and the like, so I need not bother making this explicit.

...

...
I wonder how stable this code is, and if it is really used among multiple systems. Are you actually using this code?

I don't. Cross-platform binary serialization is requested/discussed quite frequently on the boost-users mailing list and, as far as I know, there are only 2 implementations available: one in the file vault and another one in serialization/examples, the former being superior. That's why I pointed you to the version from the file vault.

We do. Using the archives we transferred terabytes between x86 and ppc (32-bit only). Using boost-1.33.1. Other combinations have not been tested or I do not know of the results. PS: The portable binary archive that comes with the library examples should be complete, maybe just the comment was not removed. Robert Ramey often pointed out that this was sponsored by someone and will be in 1.36. I did not look at it yet, though. Best regards, -- Christian Pfligersdorffer Software Engineering http://www.eos.info

Ákos Maróy

24 Aug

8:41 a.m.

Christian,

...

And gcc-4, which also supports the #pragma once. The use of this

But the #pragma keyword is not for this. Even the MSDN page for #pragma says:

"Each implementation of C and C++ supports some features unique to its host machine or operating system ... The #pragma directives offer a way for each compiler to offer machine- and operating system-specific features"

See http://msdn.microsoft.com/en-us/library/d9x1s805(VS.80).aspx

Having to guard a header file against multiple inclusion is in no way a machine or OS dependent 'feature', thus it's not something you'd want to solve with #pragma. Also see chapter 24 of C++ Coding Standards by Herb Sutter and Andrei Alexandrescu, titled "Always write internal #include guards. Never write external #include guards.", http://www.gotw.ca/publications/c++cs.htm

Basically, using #pragma once is bad style.

...

I include enough boost headers which implicitly include version.hpp and the like so I need not bother making this explicit.

See chapter 23 of C++ Coding Standards by Herb Sutter and Andrei Alexandrescu, titled "Make header files self-sufficient." The fact that version.hpp gets included frequently doesn't mean it shouldn't be included explicitly in this header file. If a header uses a feature, it should include the header for that feature.

...

We do. Using the archives we transferred terabytes between x86 and ppc (32-bit only). Using boost-1.33.1. Other combinations have not been tested or I do not know of the results.

glad to hear.

...

PS: The portable binary archive that comes with the library examples should be complete, maybe just the comment was not removed. Robert Ramey often pointed out that this was sponsored by someone and will be in 1.36. I did not look at it yet, though.

Good to know :)

If you're interested, I can send you a version of your implementation that I changed along the following lines:

- added #ifdef guards / removed #pragma once
- made the files self-sufficient
- removed signed / unsigned comparison / conversion warnings from the save() functions of both iarchive and oarchive
- solved a portability issue with right-shifting signed values in the save() function of portable_binary_oarchive (right-shifting signed values is implementation dependent)

I'd be glad to send the changed code over, if you're interested.

Akos
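To illustrate the last point, here is a minimal sketch of one common way to avoid right-shifting a signed value: convert to the unsigned type of the same width first, which is well defined, and shift that instead. The function names are hypothetical and this is not the changed code referred to above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Split a signed 32-bit value into four bytes, least significant first.
// The conversion to unsigned is well defined (modulo 2^32), and a right
// shift of an unsigned value always shifts in zeros.
inline std::vector<std::uint8_t> to_bytes_le(std::int32_t v) {
    std::uint32_t u = static_cast<std::uint32_t>(v);
    std::vector<std::uint8_t> out;
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<std::uint8_t>(u & 0xFFu));
        u >>= 8;
    }
    return out;
}

// Reassemble the value; b must hold at least four bytes.
// std::int32_t is required to be two's complement without padding,
// so copying the bits back avoids an out-of-range integer conversion.
inline std::int32_t from_bytes_le(const std::vector<std::uint8_t>& b) {
    std::uint32_t u = 0;
    for (std::size_t i = 4; i-- > 0;)
        u = (u << 8) | b[i];
    std::int32_t v;
    std::memcpy(&v, &u, sizeof v);
    return v;
}
```

For example, -2 becomes the bytes FE FF FF FF on every conforming platform, whereas the result of -2 >> 8 on a signed type was left to the implementation by the C++ standards of the time.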

Bijan

3:59 p.m.

...

...
PS: The portable binary archive that comes with the library examples should be complete, maybe just the comment was not removed. Robert Ramey often pointed out that this was sponsored by someone and will be in 1.36. I did not look at it yet, though.

good to know :)

Under boost_1_36_0\libs\serialization\example there are some hpp and cpp files for portable binary format (portable_binary_archive.hpp, portable_binary_iarchive.cpp ...) and I am wondering how performant the code is. Bijan

Robert Ramey

9:50 p.m.

Bijan wrote:

...

...
...
PS: The portable binary archive that comes with the library examples should be complete, maybe just the comment was not removed. Robert Ramey often pointed out that this was sponsored by someone and will be in 1.36. I did not look at it yet, though.

good to know :)

Under boost_1_36_0\libs\serialization\example there are some hpp and cpp files for portable binary format (portable_binary_archive.hpp, portable_binary_iarchive.cpp ...) and I am wondering how performant the code is.

This was actually tested on various platforms with different 32/64-bit and endian combinations. And the test consisted of running ALL serialization tests, as is done with the "official" archive implementations.

To me, the main problem is that it's missing support for floating point types. Including such support in a definitive, portable manner would be a significant effort which so far no one has deigned to undertake.

Robert Ramey

Bo Peng

25 Aug

3:39 a.m.

...

This was actually tested on various platforms of different 32/64 endian combinations. And the test consisted of running ALL serialization tests as is done with the "official" archive implementations. To me, the main problem is that its missing support for floating point types. Including such support in a definitive, portable manner would be a significant effort which so far no one has deigned to undertake.

Please forgive my ignorance, but what is 'floating point types'? QDataStream simply dumps the internal representation of float or double numbers and swaps the bytes between big- and little-endian order if necessary...

Thanks.
Bo

Matthias Troyer

3:44 a.m.

On 24 Aug 2008, at 21:39, Bo Peng wrote:

...

...
This was actually tested on various platforms of different 32/64 endian combinations. And the test consisted of running ALL serialization tests as is done with the "official" archive implementations. To me, the main problem is that its missing support for floating point types. Including such support in a definitive, portable manner would be a significant effort which so far no one has deigned to undertake.

Please forgive my ignorance, but what is 'floating point types'? QDataStream simply dump the internal presentation of float or double numbers and swap them for big small endian coding if necessary...

The internal presentation is not the same on all platforms. Matthias

Roman Perepelitsa

6:44 a.m.

2008/8/24 Robert Ramey <ramey@rrsd.com>

...

...
Under boost_1_36_0\libs\serialization\example there are some hpp and cpp files for portable binary format (portable_binary_archive.hpp, portable_binary_iarchive.cpp ...) and I am wondering how performant the code is.

This was actually tested on various platforms with different 32/64-bit and endian combinations. And the test consisted of running ALL serialization tests, as is done with the "official" archive implementations. To me, the main problem is that it's missing support for floating point types. Including such support in a definitive, portable manner would be a significant effort which so far no one has deigned to undertake.

But portable binary archive in the boost vault portably supports floating points, doesn't it? Can it be merged with the version that is provided by official serialization library? Roman Perepelitsa.

Pfligersdorffer, Christian

8:36 a.m.

Hi everyone! On 24 Aug 2008, at 21:39, Bo Peng wrote:

...

...
This was actually tested on various platforms of different 32/64 endian combinations. And the test consisted of running ALL serialization tests as is done with the "official" archive implementations. To me, the main problem is that its missing support

...

...
for floating point types. Including such support in a definitive, portable manner would be a significant effort which so far no one has deigned to undertake.

Please forgive my ignorance, but what is 'floating point types'? QDataStream simply dump the internal presentation of float or double numbers and swap them for big small endian coding if necessary...

That's what I do too: using fp_utilities I dump the bit pattern and restore it later on. This works for "almost all" environments. However, the word "almost" is a problem for an ultraportable library such as boost. Andrea Denzler already mentioned the difficulties: IEEE 754 is not universally supported, NaN representations and endianness differ, and forget about long double when talking about portability.

Conclusion: the "definitive, portable manner" Robert talks about will be very hard to achieve. On the other hand, supporting the IEEE 754 types float and double is (almost) easy. Let's be pragmatic about that!

Regards,
-- Christian Pfligersdorffer Software Engineering http://www.eos.info

Bo Peng

2:03 p.m.

...

That's what I do too: using fp_utilities I dump the bit pattern and restore it later on. This works for "almost all" environments. However, the word "almost" is a problem for an ultraportable library such as boost. Andrea Denzler already mentioned the difficulties: IEEE 754 is not universally supported, NaN representations and endianness differ, and forget about long double when talking about portability. Conclusion: the "definitive, portable manner" Robert talks about will be very hard to achieve. On the other hand, supporting the IEEE 754 types float and double is (almost) easy. Let's be pragmatic about that!

Thank you very much for your explanation. My understanding now is that your version of portable binary serialization supports, like Qt, only IEEE 754 floating-point types, and the official version under boost/examples does not support floating point at all.

Does your implementation provide a mechanism to test for IEEE 754 floating point support? If I am going to use your library, I need to at least give a warning in such cases, something like "This system does not support the IEEE 754 floating point representation, so the created archive may not be read correctly on other systems". I would consider my program portable enough if this can be done.

Thank you very much.
Bo
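Independently of any particular archive implementation, standard C++ offers std::numeric_limits<T>::is_iec559, which reports whether a type conforms to IEC 559 (IEEE 754). A minimal sketch of such a check follows; the function name is hypothetical, and the size tests are an extra assumption of this sketch about what "compatible" should mean here.

```cpp
#include <climits>
#include <limits>

// Returns true when float and double claim IEEE 754 conformance and have
// the usual 32-bit and 64-bit sizes. Byte order is NOT covered by this
// check and has to be handled separately.
inline bool host_has_ieee754_float_and_double() {
    return std::numeric_limits<float>::is_iec559
        && std::numeric_limits<double>::is_iec559
        && sizeof(float) * CHAR_BIT == 32
        && sizeof(double) * CHAR_BIT == 64;
}
```

A program could call this once at startup and print a warning like the one proposed above when it returns false.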

Robert Ramey

5:04 p.m.

Pfligersdorffer, Christian wrote:

...

Hi everyone!

On 24 Aug 2008, at 21:39, Bo Peng wrote:

...
...
This was actually tested on various platforms of different 32/64 endian combinations. And the test consisted of running ALL serialization tests as is done with the "official" archive implementations. To me, the main problem is that its missing support

...
...
for floating point types. Including such support in a definitive, portable manner would be a significant effort which so far no one has deigned to undertake.

Please forgive my ignorance, but what is 'floating point types'? QDataStream simply dump the internal presentation of float or double numbers and swap them for big small endian coding if necessary...

That's what I do too: using fp_utilities I dump the bit pattern and restore it later on. This works for "almost all" environments. However, the word "almost" is a problem for an ultraportable library such as boost. Andrea Denzler already mentioned the difficulties: IEEE 754 is not universally supported, NaN representations and endianness differ, and forget about long double when talking about portability. Conclusion: the "definitive, portable manner" Robert talks about will be very hard to achieve. On the other hand, supporting the IEEE 754 types float and double is (almost) easy. Let's be pragmatic about that!

Regards,

This should make it apparent why I've never wanted to make a "portable binary archive" but left it as a demo or example. There is no way I could do this without making choices that some people would view as not matching what they envision a "portable binary archive" to be. This would leave me with a lifetime task of defending these choices forever on this and other lists.

I believe that making a "portable binary archive" is possible. In my view such an archive should be truly universally portable, as the text archives are. This is the standard I would expect to meet if I were to undertake it. However, meeting such a standard would require:

a) quite a bit of effort to address the variety of compilers, word sizes, endianness, etc.
b) quite a bit of testing on all these environments
c) quite a bit of research into floating point formats, NaNs, and other stuff.
d) quite a bit of detailed documentation to explain the compromises, decisions and rationale so the same battles don't have to be constantly re-fought.

If anyone wants to do this to make an "official archive" for the serialization library it would be a great thing. Such a person should be prepared to:

a) Do all of the above
b) Submit his archive for a formal or mini review so that other interested parties can comment on the submission and agree that it represents a consensus set of choices regarding the trade-offs in those areas.
c) Add the required documentation to the serialization documentation
d) Monitor the test results and take responsibility for keeping the code running in the face of changes in platforms.
e) Monitor the user/devel lists to address issues raised by users.

There is precedent for this. Matthias Troyer has done all this (and a bit more) for binary archives and it has worked out well.

So it's up to you. I don't know who "you" is here; it depends on who wants to step up.

Robert Ramey

Robert Mecklenburg

4:40 p.m.

Robert Ramey writes:

...

This should make it apparent why I've never wanted to make a "portable binary archive" but left it as a demo or example.

I'm sure many will think this is a totally stupid suggestion, but I can't resist making a fool of myself. ;-)

Since floats are the problem with portable binary archives, why not punt on this issue and render floating point types (only) in ascii? For many uses floating point is not the critical path, and binary archives solve many problems other than floating point: endian-ness, native integer size differences, etc. And those issues can be quite difficult to deal with otherwise.

This can be viewed as a special case of a standard technique: for highly non-standard data, use an independent format that translates easily into each proprietary format. In this case the independent format is simply ascii. In fact, since we are only rendering the characters "[-+e.0-9]", we could use a modified BCD or other compressed format to provide the compression that people typically expect from binary formats.

Thoughts?
-- Robert
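A minimal sketch of the ascii idea, assuming IEEE 754 doubles and a C library whose printf/strtod conversions are correctly rounded: printing with std::numeric_limits<double>::max_digits10 significant digits (17 for IEEE doubles) is enough for the text to parse back to the same value. The function names are hypothetical, and the decimal separator follows the current C locale, which a real archive would have to pin down.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Render a double as text with enough significant digits to round-trip.
inline std::string double_to_text(double x) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*g",
                  std::numeric_limits<double>::max_digits10, x);
    return buf;
}

// Parse text produced by double_to_text back into a double.
inline double text_to_double(const std::string& s) {
    return std::strtod(s.c_str(), nullptr);
}
```

The text form is noticeably longer than the 8 bytes of the binary double, so this trades size and conversion time for independence from the in-memory format.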

François Mauger

8:19 p.m.

Hi,

...

binary archive" but left it as a demo or example.

I'm sure many will think this is a totally stupid suggestion, but I can't resist making a fool of myself. ;-)

...

Since floats are the problem with portable binary archives, why not punt on this issue and render floating point types (only) in ascii.

For many uses floating point is not the critical path

I don't agree. In scientific applications one uses huge amounts (>>TB) of data in floating-point format (particularly doubles) and one needs to share data files on computing clusters while knowing nothing about the architecture of the systems used by end-users. Exactness (no loss of the initial precision) and I/O speed are very important. It is also crucial to minimize storage space, with additional compression capabilities (gzip and bz2 filters are ok for that).

...

and binary archives solve many problems other than floating point: endian-ness, native integer size differences, etc.

yes this part is a must.

...

And those issues can be quite difficult to deal with otherwise.

This can be viewed as a special case of a standard technique: for highly non-standard data using an independent format that translates easily into each proprietary format. In this case the independent format is simply ascii. In fact, since we are only rendering the characters "[-+e.0-9]" we could use a modified BCD or other compressed format to provide the compression that is typically what people assume in binary formats.

OK, this is only a set of 14 glyphs, so it could be hosted via short ints (with 2 bits unused). Consider a typical float (relative precision ~1e-7). If one needs to store pi as +0.3141592e+01 (ASCII), it is 14 characters (only 11 if one drops the leading '+' and the exponent '+0' chars for a positive mantissa and exponent) that could be serialized using 14/11 shorts, so this is 28/22 bytes. This has to be compared with 4 bytes for floats! This induces a typical increase of storage by a factor ~6, at the additional CPU cost of the underlying internal format conversion (a la sprintf). For me this is not acceptable. A similar argument holds for doubles. Moreover, as soon as you use an ASCII format to store a float/double, you must make a decision about the rounding of the last significant digit. In most applications this is not a real problem, because people don't care about ultimate numeric precision... but in some circumstances, reproducible and portable I/O at full numeric precision is absolutely necessary. Imagine X being 3.141592654... and stored as 3.14160 in some archive for computing sqrt(pi-X) on some other system. I'm not sure this behaviour is acceptable to scientists (at least not to me ;-) ). So for me the 'IEEE' format is the best approach one could use. Boost.archive is so simple and easy to use (with a little care) that I see no reason why we should not get an efficient portable binary archive with floats. Of course, for some very specific application that needs long doubles or other not-so-portable stuff, one could use the HDF5 library, but this is not as simple as boost, unfortunately. regards frc -- Francois Mauger Laboratoire de Physique Corpusculaire de Caen et Universite de Caen ENSICAEN - 6, Boulevard du Marechal Juin, 14050 CAEN Cedex, FRANCE e-mail: mauger@lpccaen.in2p3.fr tel.: (0/+33) 2 31 45 25 12 fax: (0/+33) 2 31 45 25 49

Bo Peng

26 Aug 26 Aug

2:42 a.m.

...

So for me 'IEEE' format is the best approach one could use. Boost.archive is so simple and easy to use (with a little care) that I see no reason why we should not get an efficient portable binary archive with floats.

If a 'truly portable binary archive' is so difficult to achieve, we do not have to call Christian's implementation 'portable'. I mean, why cannot we make the current 'binary archive' almost portable by 1. Make types such as integers portable, 2. Use IEEE745 format to make float and double almost portable, 3. Give a mechanism to test if a system is IEEE745 compatible, and mark if an archive is created on a compatible system. Most users would still treat this 'binary archive' as non-portable, as it is the case now. For other users who care about portability, they are given a function to test if an archive can be opened safely on another system. I would consider this as a great improvement over the existing 'binary archive' implementation. Cheers, Bo

Nat Goodspeed

27 Sep 27 Sep

1:41 p.m.

I do understand that this issue has already been debated to the point of a new implementation. I just want to ask about a side point I didn't quite understand earlier. François Mauger wrote:

...

...
In fact, since we are only rendering the characters "[-+e.0-9]" we could use a modified BCD or other compressed format to provide the compression that is typically what people assume in binary formats.

ok this is only a set of 14 glyphs so it could be hosted via short ints (with 2 bits unused)

? I think each of 14 glyphs could be represented in 4 bits, with 2 bit patterns left over.

...

consider a typical float (relative precision ~1e-7). If one need to store pi as +0.3141592e+01 (ASCII) it is 14 characters (only 11 is one saves leading'+' and exponent '+0' chars for >0 mantissa and exponent) that could be serialized using 14/11 shorts, so this is 28/22 bytes. This has to be compared with 4 bytes for floats!

It seems to me that 14 characters in the constrained glyph set could be represented with 14 4-bit "nybbles," or 7 bytes. It's still worse than 4 bytes, but by a factor < 2 rather than ~6. Please forgive me if I've misunderstood you.
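To make the 7-byte figure concrete, here is a minimal sketch of the nibble packing Nat describes. The code table (`kGlyphs`), the use of nibble 0xF as padding, and the function names are assumptions made for illustration; nothing like this exists in Boost.Serialization.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical 4-bit code table: a glyph's index in this string is its nibble value (0..13).
// Nibble values 14 and 15 are unused by glyphs; 0xF serves as padding here.
static const std::string kGlyphs = "0123456789+-.e";

// Pack a decimal float literal into 4-bit nibbles, two glyphs per byte.
std::vector<std::uint8_t> pack_glyphs(const std::string& text) {
    std::vector<std::uint8_t> packed((text.size() + 1) / 2, 0xFF);
    for (std::size_t i = 0; i < text.size(); ++i) {
        const std::size_t code = kGlyphs.find(text[i]);
        if (code == std::string::npos)
            throw std::invalid_argument("glyph outside [-+e.0-9]");
        std::uint8_t& byte = packed[i / 2];
        if (i % 2 == 0)
            byte = static_cast<std::uint8_t>((code << 4) | 0x0F);  // high nibble, low nibble = padding
        else
            byte = static_cast<std::uint8_t>((byte & 0xF0) | code); // fill in the low nibble
    }
    return packed;
}

// Reverse the packing; stops at the first nibble that is not a glyph code.
std::string unpack_glyphs(const std::vector<std::uint8_t>& packed) {
    std::string text;
    for (std::uint8_t byte : packed) {
        const unsigned nibbles[2] = { static_cast<unsigned>(byte >> 4),
                                      static_cast<unsigned>(byte & 0x0F) };
        for (unsigned nibble : nibbles) {
            if (nibble >= kGlyphs.size()) return text;  // padding (or invalid code) ends the text
            text += kGlyphs[nibble];
        }
    }
    return text;
}
```

With this table, `pack_glyphs("+0.3141592e+01")` yields 7 bytes for the 14 glyphs, compared with 4 bytes for a native float.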

François Mauger

8:59 p.m.

Dear Nat and boosters, Oops, rereading my message, I simply cannot figure out why I wrote such a stupid thing! You are absolutely right, and your calculation of a factor ~<2 rather than 6 is correct. So this hypothetical system is still bad, but not as bad as I claimed before. Thank you for the fix. And sorry for the trouble ("Au temps pour moi!"). regards frc --

...

I do understand that this issue has already been debated to the point of a new implementation. I just want to ask about a side point I didn't quite understand earlier.

François Mauger wrote:

...
...
In fact, since we are only rendering the characters "[-+e.0-9]" we could use a modified BCD or other compressed format to provide the compression that is typically what people assume in binary formats.

ok this is only a set of 14 glyphs so it could be hosted via short ints (with 2 bits unused)

? I think each of 14 glyphs could be represented in 4 bits, with 2 bit patterns left over.

...
consider a typical float (relative precision ~1e-7). If one need to store pi as +0.3141592e+01 (ASCII) it is 14 characters (only 11 is one saves leading'+' and exponent '+0' chars for >0 mantissa and exponent) that could be serialized using 14/11 shorts, so this is 28/22 bytes. This has to be compared with 4 bytes for floats!

It seems to me that 14 characters in the constrained glyph set could be represented with 14 4-bit "nybbles," or 7 bytes.

It's still worse than 4 bytes, but by a factor < 2 rather than ~6.

Please forgive me if I've misunderstood you.


Ákos Maróy

26 Aug 26 Aug

12:47 p.m.

Robert Mecklenburg wrote:

...

Since floats are the problem with portable binary archives, why not punt on this issue and render floating point types (only) in ascii.

For many uses floating point is not the critical path and binary archives solve many problems other than floating point: endian-ness, native integer size differences, etc. And those issues can be quite difficult to deal with otherwise.

yes, this is a wide range of applications, but for example where I'm using it, most of our data is in floats :) but you're right - it's best to get a result that's not perfect but works, first :) Akos

Daryle Walker

27 Aug 27 Aug

8:01 p.m.

On Aug 25, 2008, at 12:40 PM, Robert Mecklenburg wrote:

...

Robert Ramey writes:

...
This should make it apparent why I've never wanted to make a "portable binary archive" but left it as a demo or example.

I'm sure many will think this is a totally stupid suggestion, but I can't resist making a fool of myself. ;-)

Since floats are the problem with portable binary archives, why not punt on this issue and render floating point types (only) in ascii.

For many uses floating point is not the critical path and binary archives solve many problems other than floating point: endian-ness, native integer size differences, etc. And those issues can be quite difficult to deal with otherwise.

This can be viewed as a special case of a standard technique: for highly non-standard data using an independent format that translates easily into each proprietary format. In this case the independent format is simply ascii. In fact, since we are only rendering the characters "[-+e.0-9]" we could use a modified BCD or other compressed format to provide the compression that is typically what people assume in binary formats.

Thoughts?

I have an alternate suggestion: what about continued fractions? Turn the floating point value into a list of integers. This works no matter what f.p. systems the source and destination use. You just need a portable integer serialization format. 1. Serialize whether or not the float is in a NaN state as a Boolean. If it's true, then you're done. Receiving systems that don't support such states could either return a zero or throw. If it's false, keep going. 2. Serialize the sign as a Boolean. This counts even for zero values, if the f.p. uses the "negative zero" concept. Receiving systems that don't should ignore the read sign for zero values. Continue, but use the absolute value instead (even for infinity). 3. Serialize the exponent as an Integer. The exponent is the shift needed to bring the base value between one and two (including 1, excluding 2). So values of two and above get a positive shift, values under one use negative shifts, and those that happen to be in our implementation range use a zero shift. If the base value is initially zero or infinite, use zero as the shift amount. 4. Serialize the base value as a list of continued fraction components. The length is variable, so make sure to serialize it too! For a zero base value, the list consists of a single element of value zero. For infinity, use an empty list. For all other values, start with the 1 as the whole part then proceed with the rest of the components. (Since the f.p. state represents a binary fraction, I suggest not using subtraction/truncation and reciprocating in floating-point, but manipulating the virtual numerator and denominator as integers with division/modulus. If the f.p. radix isn't 2, you may want to do the exponent and continued fraction in the native radix, then convert afterwards.) Saving a serialization should start with the whole 1 and go down to smaller contributions. 
Loading back a serialization should read in the entire list first, then expand starting from the smallest/last contribution up to the whole 1. Example: -2.75 -> !is_nan, is_negative, 1.375 << 1; 1.375 is stored as [1.01100], which is 44/32, which is 11/8, which has a c.f. of [1; 2, 1, 2]; so you'll serialize {False, True, 1, {4; 1, 2, 1, 2}}. Example: NaN -> is_nan -> {True}; you mustn't look for any other component. -- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com
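Step 4 of the proposal can be sketched with integer division and modulus only, as Daryle suggests. The function names below are hypothetical; the sketch assumes the base value has already been reduced to a numerator/denominator pair (11/8 in the -2.75 example) and does not cover the NaN, sign, exponent or infinity handling of steps 1 to 3.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Expand num/den into continued fraction terms, whole part first,
// using only integer division and modulus (Euclid's algorithm).
std::vector<std::uint64_t> continued_fraction(std::uint64_t num, std::uint64_t den) {
    std::vector<std::uint64_t> terms;
    while (den != 0) {
        terms.push_back(num / den);
        const std::uint64_t rem = num % den;
        num = den;
        den = rem;
    }
    return terms;
}

// Rebuild the fraction, starting from the last (smallest) contribution
// and working up to the whole part. Returns {numerator, denominator}.
std::pair<std::uint64_t, std::uint64_t> evaluate(const std::vector<std::uint64_t>& terms) {
    if (terms.empty()) return std::make_pair(std::uint64_t(1), std::uint64_t(0)); // empty list: infinity
    std::uint64_t num = terms.back();
    std::uint64_t den = 1;
    for (std::size_t i = terms.size() - 1; i-- > 0; ) {
        const std::uint64_t next = terms[i] * num + den;  // a + 1/(num/den) = (a*num + den)/num
        den = num;
        num = next;
    }
    return std::make_pair(num, den);
}
```

For 11/8 the expansion is 1, 2, 1, 2, matching the example above, and `evaluate` on those four terms gives back 11 and 8.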

Bo Peng

8:38 p.m.

...

I have an alternate suggestion: what about continued fractions? Turn the floating point value into a list of integers. This works no matter what f.p. systems the source and destination use. You just need a portable integer serialization format.

Then why cannot we use IEEE745 format for float serialization on all systems? 1. If a system is IEEE745 compatible, serialize float numbers directly. This will work 99.9% of the time, and will be very efficient. 2. If a system is not IEEE745 compatible, then to serialize a float number we write the number as 32 or 64 contiguous bits in IEEE745 format, and to de-serialize a float number we read the number in IEEE745 format and write it in the native float format. An intermediate string representation can be used for the translations. I mean, these contiguous IEEE745 bits are equivalent to your "portable integer serialization format", but will be much more efficient on IEEE745 compatible systems. Cheers, Bo
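Case 1 of Bo's proposal might look roughly like the following sketch for a 32-bit float. The function names and the choice of little-endian byte order on the wire are assumptions for illustration, not part of any Boost archive. The sketch also assumes that the host stores floats and 32-bit integers with the same byte order, which holds on common platforms but is not guaranteed by the standard.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <limits>

// Refuse to compile on platforms where the "serialize directly" shortcut does not apply.
static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes a 32-bit IEEE 754 float");

// Copy the float's bit pattern into an integer, then emit the bytes in a fixed
// (little-endian) order so that the result does not depend on host endianness.
std::array<unsigned char, 4> encode_float(float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    std::array<unsigned char, 4> bytes;
    for (int i = 0; i < 4; ++i)
        bytes[i] = static_cast<unsigned char>((bits >> (8 * i)) & 0xFFu);
    return bytes;
}

// Reassemble the integer from the fixed byte order, then copy it back into a float.
float decode_float(const std::array<unsigned char, 4>& bytes) {
    std::uint32_t bits = 0;
    for (int i = 0; i < 4; ++i)
        bits |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because the bit pattern is copied rather than converted, the round trip involves no rounding; `encode_float(1.0f)` gives the bytes 00 00 80 3F.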

Pfligersdorffer, Christian

28 Aug 28 Aug

6:48 a.m.

Bo Peng on Wednesday, August 27, 2008 10:38 PM:

...

...
I have an alternate suggestion: what about continued fractions? Turn the floating point value into a list of integers. This works no matter what f.p. systems the source and destination use. You just need a portable integer serialization format.

Then why cannot we use IEEE745 format for float serialization on all systems?

1. If a systems is IEEE745 compatible, serialize float numbers directly. This will work 99.9% of the times, and will be very efficient. 2. If a system is not IEEE745 compatible, to serialize a float number, we write the number as 32 or 64 continuous bits, in IEEE745 format, to de-serialize a float number, we read the number in IEEE745 formats and write in native float format. An intermediate string representation can be used for the translations.

I mean, these continuous IEEE745 bits are equivalent to your "portable integer serialization format", but will be much more efficient on IEEE745 compatible systems.

A splendid idea! Btw it's IEEE754... ;) But yeah, let's take it for the portable binary archive's standard for floating point values. Machines using a different notion may contribute a conversion algorithm or simply throw an exception. However, the intermediate string approach would be a last resort in my view. Regards, -- Christian Pfligersdorffer Software Engineering http://www.eos.info

Daryle Walker

9:13 a.m.

On Aug 27, 2008, at 4:38 PM, Bo Peng wrote:

...

...
I have an alternate suggestion: what about continued fractions? Turn the floating point value into a list of integers. This works no matter what f.p. systems the source and destination use. You just need a portable integer serialization format.

Then why cannot we use IEEE745 format for float serialization on all systems?

Because I don't think that the IEEE-754 internal format is as stable as you think it is. (Looking at Wikipedia's entries on IEEE-754 and 754r, there is a standard conceptual format, but the problem is how the various implementations lay out the internal bit-wise format. Room for interpretation will doom your plan.)

...

1. If a systems is IEEE745 compatible, serialize float numbers directly. This will work 99.9% of the times, and will be very efficient. 2. If a system is not IEEE745 compatible, to serialize a float number, we write the number as 32 or 64 continuous bits, in IEEE745 format, to de-serialize a float number, we read the number in IEEE745 formats and write in native float format. An intermediate string representation can be used for the translations.

The string or "direct" conversions may introduce rounding errors.

...

I mean, these continuous IEEE745 bits are equivalent to your "portable integer serialization format", but will be much more efficient on IEEE745 compatible systems.

Is it worth potentially screwing 0.01% of your customers when you may not have to? -- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com

Bo Peng

12:56 p.m.

...

...
Then why cannot we use IEEE745 format for float serialization on all systems?

Because I don't think that the IEEE-754 internal format is as stable as you think it is. (Looking at Wikipedia's entries on IEEE-754 and 754r, there is a standard conceptual format, but problem is how do various implementations carry out the internal bit-wise format. Room for interpretation will doom your plan.)

Cannot we choose a stable, boost-specific format and consider all others incompatible? This is actually required if we are going to convert non-IEEE754 float numbers to a standard IEEE754 format by ourselves. I guess we can define a set of characteristic float numbers and their standard binary representation in boost. Only those systems that represent these numbers in the standard way can be considered IEEE754 compatible (subject to a big/little-endian swap) and can be archived directly.

...

The string or "direct" conversions may introduce rounding errors.

I guess all string-based conversions have rounding errors. I am using a text archive for its portability, and I think the situation will not be worse if I switch to an imperfect portable binary archive. Also, there might be platform-specific lossless conversion methods...

...

Is it worth potentially screwing 0.01% of your customers when you may not have to?

I guess more than 0.01% of boost/serialization users are suffering from not having a portable binary archive; I am one of them. Bo

Robert Ramey

27 Aug 27 Aug

9:29 p.m.

Daryle Walker wrote:

...

I have an alternate suggestion: what about continued fractions? Turn the floating point value into a list of integers. This works no matter what f.p. systems the source and destination use. You just need a portable integer serialization format.

which we already have.

...

...

I don't have enough time to review the suggestion in the detail it probably deserves. But a cursory look shows a lot of imagination. Robert Ramey

Matthias Troyer

28 Aug 28 Aug

12:52 a.m.

On 27 Aug 2008, at 15:29, Robert Ramey wrote:

...

Daryle Walker wrote:

...
I have an alternate suggestion: what about continued fractions? Turn the floating point value into a list of integers. This works no matter what f.p. systems the source and destination use. You just need a portable integer serialization format.

which we already have.

...
...

I'm don't have enough time to review the suggestion in the detail it probably deserves. But a cursory look shows a lot of imagination.

It will just be horribly inefficient. A text representation will be faster and more compact. Matthias

Daryle Walker

9:18 a.m.

On Aug 27, 2008, at 8:52 PM, Matthias Troyer wrote:

...

On 27 Aug 2008, at 15:29, Robert Ramey wrote:

...
Daryle Walker wrote:

...
I have an alternate suggestion: what about continued fractions? Turn the floating point value into a list of integers. This works no matter what f.p. systems the source and destination use. You just need a portable integer serialization format.

which we already have. [SNIP] I'm don't have enough time to review the suggestion in the detail it probably deserves. But a cursory look shows a lot of imagination.

It will just be horribly inefficient. A text representation will be faster and more compact.

But we're trying to avoid conversion rounding errors, which could happen even under text conversion. If you don't like continued fractions, then serialize the f.p. radix, the virtual numerator of the binary (or whatever) fraction (include any implicit leading 1 or whatever), and the power of the radix used for the virtual denominator. They're all integers, so feel free to use text if you want. -- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com
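The integer-only alternative in the last paragraph can be sketched with `std::frexp` and `std::ldexp`. The struct and function names are hypothetical, and the sketch assumes a finite `double` with radix 2 and at most 63 mantissa digits; the radix itself is therefore not stored here.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

static_assert(std::numeric_limits<double>::radix == 2 &&
              std::numeric_limits<double>::digits <= 63,
              "this sketch assumes a binary double whose mantissa fits in 63 bits");

// value == (negative ? -1 : 1) * numerator * 2^exponent, all three fields being integers.
struct FloatParts {
    bool negative;
    std::int64_t numerator;
    int exponent;
};

// Split a finite double into sign, integer numerator and power of two.
FloatParts decompose(double value) {
    int exp = 0;
    const double frac = std::frexp(std::fabs(value), &exp);      // frac in [0.5, 1), or 0
    const int digits = std::numeric_limits<double>::digits;       // 53 for IEEE 754 double
    // Scaling by 2^digits turns the fraction into an exact integer.
    const std::int64_t numerator = static_cast<std::int64_t>(std::ldexp(frac, digits));
    FloatParts parts = { std::signbit(value), numerator, exp - digits };
    return parts;
}

// Rebuild the double; exact because the numerator has at most `digits` significant bits.
double recompose(const FloatParts& parts) {
    const double magnitude = std::ldexp(static_cast<double>(parts.numerator), parts.exponent);
    return parts.negative ? -magnitude : magnitude;
}
```

The three fields can then go through whatever portable integer format is already available, and no decimal rounding is involved on either side.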

Pfligersdorffer, Christian

2 Sep 2 Sep

2:57 p.m.

Robert on Monday, August 25, 2008 7:04 PM:

...

This should make it apparent why I've never wanted to make a "portable binary archive" but left it as a demo or example. There is no way I could do this without making choices that some people would view as not being what they envision a "portable binary archive" to be. This would leave me with a lifetime task of defending these choices forever on this and other lists.

I think I understand now... the issue is a boiling pot :) I'm counting 50 replies to the topic already and there's no consensus. I do not think I want to be the one who tries to satisfy everybody. That's a daunting task... This is the boon and bane of the boost libraries. Regards, -- Christian Pfligersdorffer Software Engineering http://www.eos.info

Bo Peng

3:08 p.m.

...

I do not think I want to be the one who tries to satisfy everybody. That's a daunting task...

I will use your version of portable binary archive as long as there is a way to test its compatibility with the underlying system... Just something like "whoops, this system is not IEEE754 compatible so archives created on this system will not be portable". Bo

Pfligersdorffer, Christian

5 Sep 5 Sep

1:45 p.m.

Bo Peng on Tuesday, September 02, 2008 5:09 PM:

...

...
I do not think I want to be the one who tries to satisfy everybody. That's a daunting task...

I will use your version of portable binary archive as long as there is a way to test its compatibility with the underlying system... Just something like "whoops, this system is not IEEE754 compatible so archives created on this system will not be portable".

It's not straightforward to do such a test, but I will have a look at it for the next release. Johan Rade does a classification of floating point formats in his fp_utilities. I'll see if I can use that. Regards, -- Christian Pfligersdorffer Software Engineering http://www.eos.info

Johan Råde

6 Sep 6 Sep

10:06 a.m.

Pfligersdorffer, Christian wrote:

...

Bo Peng on Tuesday, September 02, 2008 5:09 PM:

...
...
I do not think I want to be the one who tries to satisfy everybody. That's a daunting task... I will use your version of portable binary archive as long as there is a way to test its compatibility with the underlying system... Just something like "whoops, this system is not IEEE754 compatible so archives created on this system will not be portable".

It's not straight forward to do such a test but I will have a look at it for the next release. Johan Rade does a classication of floating point formats in his fp_utilities. I'll see if I can use that.

In practice almost all platforms have float and double implementations that are IEEE 754 compliant enough to make it OK to save the bytes and load them again. The only exceptions I know of are: 1. some compilers have a setting where denormals, infinity and NaN are not used; 2. on VMS there is still support for the VAX floating point format. Condition 1 can be detected with numeric_limits<T>::has_denorm etc. Condition 2 can be detected as follows: #if defined(__vms) && defined(__DECCXX) && !__IEEE_FLOAT Also note that the bit patterns that represent quiet NaN on one platform can represent signaling NaN on other platforms. But very few C++ developers seem to care about the difference between quiet and signaling NaN. And forget about portable binary serialization of long double (unless you want to do a lot of work). --Johan
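The two checks Johan lists could be combined along the following lines. The macro `SKETCH_VAX_FLOAT` and the function `floats_look_portable` are hypothetical names; which `numeric_limits` members to test beyond `has_denorm` is an assumption about what the "etc." covers. Note also that `has_denorm` is deprecated as of C++23.

```cpp
#include <limits>

// Condition 2 from the post: VAX floating point format on VMS.
#if defined(__vms) && defined(__DECCXX) && !__IEEE_FLOAT
#  define SKETCH_VAX_FLOAT 1
#else
#  define SKETCH_VAX_FLOAT 0
#endif

// Condition 1 from the post, checked through numeric_limits: the type claims
// IEEE 754 (IEC 559) conformance and has denormals, infinity and quiet NaN.
template <typename T>
constexpr bool floats_look_portable() {
    return !SKETCH_VAX_FLOAT
        && std::numeric_limits<T>::is_iec559
        && std::numeric_limits<T>::radix == 2
        && std::numeric_limits<T>::has_infinity
        && std::numeric_limits<T>::has_quiet_NaN
        && std::numeric_limits<T>::has_denorm == std::denorm_present;
}
```

An archive could call `floats_look_portable<double>()` once, for instance inside a `static_assert` or before writing a header flag, and refuse to read or write raw floating point bytes when it returns false. This does not address the quiet versus signaling NaN difference mentioned above.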

Pfligersdorffer, Christian

17 Sep 17 Sep

12:43 p.m.

So, I drew the consequences from the quite lengthy discussion and released a new version of my portable binary archives on the boost vault: http://www.boostpro.com/vault/index.php?directory=serialization For integer serialization I removed the warnings, and for floating point serialization I added a couple of checks that the preconditions are met on the platform at hand. Thanks to everybody for your tips and for sharing your wisdom! Note that the archives were tested only against boost 1.33 (and 1.34 a little); so far I have not had time to look into the newer versions. Let me know if it works for you, -- Christian Pfligersdorffer Software Engineering http://www.eos.info Johan Rade on Saturday, September 06, 2008 12:07 PM:

...

Pfligersdorffer, Christian wrote:

...
Bo Peng on Tuesday, September 02, 2008 5:09 PM:

...
...
I do not think I want to be the one who tries to satisfy everybody. That's a daunting task... I will use your version of portable binary archive as long as there is a way to test its compatibility with the underlying system... Just something like "whoops, this system is not IEEE754 compatible so archives created on this system will not be portable".

It's not straight forward to do such a test but I will have a look at it for the next release. Johan Rade does a classication of floating point formats in his fp_utilities. I'll see if I can use that.

In practice almost all platforms have float and double implementations that are enough IEEE 754 compliant to make it ok to save the bytes and load them again.

The only exceptions I know of are: 1. some compilers have a setting where denormals, infinity and Nan are not used 2. on VMS there is still support for the VAX floating point format

Condition 1 can be detected by numeric_limits<T>::has_denorm etc. Condition 2 can be detected as follows #if defined(__vms) && defined(__DECCXX) && !__IEEE_FLOAT

Also note that the bit patterns that represent quiet NaN on one platform can represent signaling NaN on other platforms. But very few C++ developers seem to care about the difference between quiet and signaling NaN.

And forget about portable binary serialization of long double (unless you want to do a lot of work).

Frank Mori Hess

2 Sep 2 Sep

3:45 p.m.

On Tuesday 02 September 2008 10:57 am, Pfligersdorffer, Christian wrote:

...

Robert on Monday, August 25, 2008 7:04 PM:

...
This should make it apparent why I've never wanted to make a "portable binary archive" but left it as a demo or example. There is no way I could do this without making choices that some people would view as not being what they envision a "portable binary archive" to be. This would leave me with a lifetime task of defending these choices forever on this and other lists.

I think I understand now... the issue is a boiling pot :) I'm counting 50 replies to the topic already and there's no concensus. I do not think I want to be the one who tries to satisfy everybody. That's a daunting task...

This is boon and bane of the boost libraries.

I don't see how adding archive support for an existing standard portable binary format (like XDR) would be controversial. It's not like its existence would preclude the addition of another end-all beat-all portable binary format, if someone really decides they are determined to invent one.

Andrea Denzler

6 p.m.

Since we like to add wood to the fire.... I think that the main problem with serialization is that we use data types in our code that are not compatible across platforms. If int is 32 bits on one system and 64 bits on another, then of course we run into issues. The existing workarounds will work, but they may fail on other or new systems. What if int is 128 bits or 16 bits? The starting point of using incompatible data types is wrong. Using compatible data types internally is the first issue to solve: instead of defining something like int foo; it is better if we can use something like int<min_value,max_value> foo; On different platforms this will lead to different native C++ types, but I don't care about that. I want to be sure to have my range of values for foo. In that case serialization is relatively easy to solve. I might add some minor options, like performance versus small data size for a specific serialization. Sometimes I need fast code, other times I need a small data size. But this is just an extra option. The same should be used for floating point values, something like double<precision_digits,exponent_digits> foo; I define how many digits I need and want. When using foo I know the minimum precision available; maybe the C++ floating point type used has a greater precision, but I don't care about that. Again, when serializing I know how many digits are necessary, and probably the IEEE standard of the two integers is the best to use. But this is a choice for the library writer. I want a fully portable datatype/archive where a specific number of digits is guaranteed in any operation (+, -, power, serialization, etc). Of course compilation should fail if the specific platform/library can't handle such requirements, for example int<0,2^500> foo. IMHO... :) Andrea
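The integer half of Andrea's idea can be approximated with a small type selector. This is only a sketch: `ranged_int` is a hypothetical name, it picks among the signed fixed-width types from `<cstdint>`, and it does no range checking at run time. A range beyond `long long`, such as the 2^500 example, cannot even be written as a template argument, which is one way of failing at compile time.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Select the smallest signed fixed-width type that can hold every value in [Min, Max].
template <long long Min, long long Max>
struct ranged_int {
    static_assert(Min <= Max, "empty range");

    // True when every value in [Min, Max] is representable in T.
    template <typename T>
    struct fits : std::integral_constant<bool,
        (Min >= std::numeric_limits<T>::min()) && (Max <= std::numeric_limits<T>::max())> {};

    typedef typename std::conditional<fits<std::int8_t>::value,  std::int8_t,
            typename std::conditional<fits<std::int16_t>::value, std::int16_t,
            typename std::conditional<fits<std::int32_t>::value, std::int32_t,
                                                                  std::int64_t
            >::type>::type>::type type;
};
```

Since the selected type has a fixed width on every platform that provides it, `ranged_int<0, 100>::type foo;` always occupies one byte, and an archive only needs to deal with byte order.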

Pfligersdorffer, Christian

25 Aug 25 Aug

8:06 a.m.

boost-users-bounces@lists.boost.org on Sunday, August 24, 2008 11:51 PM:

...

Bijan wrote:

...
...
...
PS: The portable binary archive that comes with the library examples should be complete, maybe just the comment was not removed. Robert Ramey often pointed out that this was sponsored by someone and will be in 1.36. I did not look at it yet, though.

good to know :)

Under boost_1_36_0\libs\serialization\example there are some hpp and cpp files for portable binary format (portable_binary_archive.hpp, portable_binary_iarchive.cpp ...) and I am wondering how performant the code is.

This was actually tested on various platforms of different 32/64 endian combinations. And the test consisted of running ALL serialization tests as is done with the "official" archive implementations. To me, the main problem is that it's missing support for floating point types. Including such support in a definitive, portable manner would be a significant effort which so far no one has deigned to undertake.

Robert Ramey

Hi Robert! I am confused. Are you saying your "sponsored extension to the portable binary archive example" does not include floating point support? Please clarify what is in 1.36 and what is planned as this is entirely not clear to me - and others I'd say from various peoples' contributions. I'm sorry if I add to the portable binary archive confusion by using the same names as your examples. I do not completely go against renaming my classes if it helps. What do you mean? Regards, -- Christian Pfligersdorffer Software Engineering http://www.eos.info

Roma

8:17 a.m.

New subject: Building Boost libraries

Hi! I need to compile Boost now and don't know what libraries I really need. "The above example session will build static and shared non-debug multi-threaded variations of the libraries. To build all variations use --build-type=complete" The question is - what do I really need as just Boost user? Will non-debug variations be suitable for me while I will debug my program under msvc (step-by-step execution)? (I have no need to debug deeply into Boost libs) When will I need single-threaded variations? Or I will never meet it even if my program will contain just one thread? Thanks!

David Abrahams

4:28 p.m.

New subject: Building Boost libraries

on Mon Aug 25 2008, Roma <shmromacs-AT-yandex.ru> wrote:

...

Hi! I need to compile Boost now and don't know what libraries I really need.

"The above example session will build static and shared non-debug multi-threaded variations of the libraries. To build all variations use --build-type=complete"

The question is - what do I really need as just Boost user? Will non-debug variations be suitable for me while I will debug my program under msvc (step-by-step execution)? (I have no need to debug deeply into Boost libs) When will I need single-threaded variations? Or I will never meet it even if my program will contain just one thread?

Did you read http://boost.org/more/getting_started ? -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Roma

5:46 p.m.

New subject: Building Boost libraries

yes. I've even re-read it now. But it stays unclear for me if I can make non-complete build without hurting the way I create and debug programs. 25.08.08, 20:28, "David Abrahams" <dave@boostpro.com>:

...

on Mon Aug 25 2008, Roma <shmromacs-AT-yandex.ru> wrote:

...
Hi! I need to compile Boost now and don't know what libraries I really need.

"The above example session will build static and shared non-debug multi-threaded variations of the libraries. To build all variations use --build-type=complete"

The question is - what do I really need as just Boost user? Will non-debug variations be suitable for me while I will debug my program under msvc (step-by-step execution)? (I have no need to debug deeply into Boost libs) When will I need single-threaded variations? Or I will never meet it even if my program will contain just one thread? Did you read http://boost.org/more/getting_started ?

Pfligersdorffer, Christian

8 a.m.

Hi Ákos, I like your style of critics. Always put a reference to your opinion so it gets more weight. However I do not agree with everything you write. Ákos Maróy on Sunday, August 24, 2008 10:42 AM:

...

Christian,

...
And gcc-4, which also supports the #pragma once. The use of this

But the #pragma keyword is not for this. Even the MSDN page for #pragma says:

"Each implementation of C and C++ supports some features unique to its host machine or operating system ... The #pragma directives offer a way for each compiler to offer machine- and operating system-specific features"

see http://msdn.microsoft.com/en-us/library/d9x1s805(VS.80).aspx

having to guard a header file from inclusion is in no way a machine or OS dependent 'feature', thus it's not something you'd want to solve with #pragma.

I think in this point you're mistaken, since the pragma is simply a communication channel directly to your compiler. GCC offers not only machine or architecture pragmas but also e.g. diagnostic pragmas, the #pragma message or means to change optimization level per compilation unit.

...

also see chapter 24 from C++ Coding Standards by Herb Sutter and Andrei Alexandrescu, titled "Always write internal #include guards. Never write external #include guards.", http://www.gotw.ca/publications/c++cs.htm

I agree! But that's not in question.

...

basically using #pragma once is bad style.

I strongly disagree :) It's shorter, thus more elegant, does not induce myriads of names like the preprocessor guards and is even faster. GCC also saw that and de-deprecated it in versions 4.x.
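For readers weighing the two styles, here is a minimal sketch of what a classic include guard does. It is not taken from either poster's code: the header name point.hpp, the macro EXAMPLE_POINT_HPP and the function manhattan_norm are hypothetical. The header body is pasted twice to mimic a double #include; with #pragma once the whole #ifndef/#define/#endif triple would be replaced by a single line at the top of the file, on compilers that support it.

```cpp
#include <cassert>  // only needed for the quick check mentioned below

// Simulated contents of a hypothetical header "point.hpp", pasted twice to
// mimic what the preprocessor sees when the header is included twice.

// ---- first inclusion ----
#ifndef EXAMPLE_POINT_HPP   // guard macro name is hypothetical
#define EXAMPLE_POINT_HPP

struct Point {
    int x;
    int y;
};

#endif // EXAMPLE_POINT_HPP

// ---- second inclusion: skipped, because the macro is already defined ----
#ifndef EXAMPLE_POINT_HPP
#define EXAMPLE_POINT_HPP

struct Point {   // would be a redefinition error without the guard
    int x;
    int y;
};

#endif // EXAMPLE_POINT_HPP

// Uses the single surviving definition of Point.
inline int manhattan_norm(const Point& p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
```

A quick check such as assert(manhattan_norm(p) == 7) for a Point p = {3, -4} compiles and passes, which shows that only one definition of Point reached the compiler.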

...

...
I include enough boost headers which implicitly include version.hpp and the like so I need not bother making this explicit.

see chapter 23 from C++ Coding Standards by Herb Sutter and Andrei Alexandrescu, titled "Make header files self-sufficient.". the fact that you include version.hpp frequently doesn't mean it shouldn't be explicit in this header file. if a header uses a feature, it should include the header for that feature.

You got a point here. It probably doesn't hurt and makes the whereabouts of the BOOST_VERSION macro evident in my case.

...

if you're interested I can send you a version of your implementation I changed, along the following lines:

- added #ifdef guards / removed #pragma once
- made the files self-sufficient
- removed signed / unsigned comparison / conversion warnings from the save() functions of both iarchive and oarchive
- solved a portability issue with right-shifting signed values in the save() function of portable_binary_oarchive (right-shifting negative signed values is implementation-defined)

I'd be glad to send the changed code over, if you're interested.

Sure! I'm curious how you changed my code, send it right over please! :) Especially the right-shifting issue caught my interest! Do you have examples where right shifting does not repeat the sign bit? Let me see how you solved it.

Regards,
--
Christian Pfligersdorffer
Software Engineering
http://www.eos.info

Ákos Maróy

26 Aug

12:44 p.m.

Christian,

...

I like your style of criticism. Always put a reference to your opinion so it gets more weight. However, I do not agree with everything you write.

We don't have to :) But it's good to see the point of other people as well :)

...

Sure! I'm curious how you changed my code, send it right over please! :) Especially the right-shifting issue caught my interest! Do you have examples where right shifting does not repeat the sign bit? Let me see how you solved it.

Please see the sources attached. As for the right-shifting, I only tried it on Intel-based CPUs, so of course there it always works the same. But the C++ standard is quite clear: right-shifting negative signed values is implementation-defined with regard to the fill bit on the left, so one shouldn't count on it :)

Akos
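To illustrate the right-shift concern, here is a minimal sketch of one way to avoid shifting a signed value at all. It is not the attached patch: the function name bytes_needed and the use of long are assumptions made for the example. The idea is to take the magnitude in unsigned arithmetic first and shift only the unsigned copy, which always fills with zeros.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Number of bytes needed to hold the magnitude of a signed value,
// computed with shifts on an unsigned copy only.
// (bytes_needed is a hypothetical helper, not part of any archive class.)
inline std::size_t bytes_needed(long value) {
    // Take the magnitude in unsigned arithmetic: 0UL - x is well defined
    // (modulo 2^N) even for LONG_MIN, whereas -value could overflow.
    unsigned long magnitude = value < 0
        ? 0UL - static_cast<unsigned long>(value)
        : static_cast<unsigned long>(value);

    std::size_t count = 0;
    while (magnitude != 0) {
        magnitude >>= CHAR_BIT;   // unsigned shift: always zero-fills
        ++count;
    }
    return count;
}
```

By contrast, a loop of the form "while (v != 0) v >>= 8;" on a negative signed v never terminates on implementations that replicate the sign bit, since -1 >> 8 stays -1 there. Before C++20 the standard left the result of right-shifting a negative value implementation-defined.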

Pfligersdorffer, Christian

28 Aug

11:29 a.m.

boost-users-bounces@lists.boost.org on Tuesday, August 26, 2008 2:45 PM:

...

Christian,

...
I like your style of criticism. Always put a reference to your opinion so it gets more weight. However, I do not agree with everything you write.

We don't have to :) But it's good to see the point of other people as well :)

...
Sure! I'm curious how you changed my code, send it right over please! :) Especially the right-shifting issue caught my interest! Do you have examples where right shifting does not repeat the sign bit? Let me see how you solved it.

please see the sources attached.

As for the right-shifting, I only tried it on Intel-based CPUs, so of course there it always works the same. But the C++ standard is quite clear: right-shifting negative signed values is implementation-defined with regard to the fill bit on the left, so one shouldn't count on it :)

Akos, I see, you use make_signed and make_unsigned - didn't know those small gems :) Thanks for pointing that out. The workarounds in the oarchive are a bit clumsy, but hey, they root out all of the warnings with little effort.

temp = negative ? (typename boost::make_unsigned<T>::type) -((typename boost::make_signed<T>::type)t) : t;

Now *that's* a nut :) Quite tricky! But I understand now that both casts are necessary. (After trying to remove the second one and rethinking ;)

Regards,
--
Christian Pfligersdorffer
Software Engineering
http://www.eos.info
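The expression quoted above can be unpacked into a small stand-alone sketch. Two assumptions to flag: it uses std::make_signed and std::make_unsigned from <type_traits> (C++11) in place of the Boost traits so that it compiles without Boost, and the function name magnitude is hypothetical. Presumably the inner cast is there so that unary minus is never applied to an unsigned type when T is unsigned (some compilers warn about that), and the outer cast so that both branches of the conditional have the same unsigned type.

```cpp
#include <cassert>
#include <type_traits>

// Magnitude of t, returned as the unsigned counterpart of T.
// Works for signed and unsigned integer types T (but not bool).
template <typename T>
typename std::make_unsigned<T>::type magnitude(T t) {
    typedef typename std::make_unsigned<T>::type unsigned_type;
    typedef typename std::make_signed<T>::type signed_type;

    // Always false when T is unsigned, so the negation branch is
    // compiled but never executed in that case.
    const bool negative = t < 0;

    return negative
        // inner cast: negate a signed value, never an unsigned one
        // outer cast: give both branches the same unsigned type
        ? static_cast<unsigned_type>(-static_cast<signed_type>(t))
        : static_cast<unsigned_type>(t);
}
```

One caveat of this form: negating the most negative value of int or a wider type overflows in signed arithmetic, so a stricter variant would subtract in unsigned arithmetic instead, for example static_cast<unsigned_type>(0) - static_cast<unsigned_type>(t).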

55 comments

19 participants

participants (19)

Andrea Denzler
Bijan
Bo Peng
Daryle Walker
David Abrahams
dizzy
Frank Mori Hess
François Mauger
Jeff Flinn
Johan Råde
Mathieu Peyréga
Matthias Troyer
Nat Goodspeed
Pfligersdorffer, Christian
Robert Mecklenburg
Robert Ramey
Roma
Roman Perepelitsa
Ákos Maróy