[serialization] fast array serialization (10x speedup)

Hi Robert, Over the past week I got around to doing what I wanted to do for a long time, and implemented an improved serialization of contiguous arrays of fundamental types. The motivation was two-fold: i) to speed up the serialization of large data sets by factors of up to 10 ii) to allow implementation of serialization by MPI The problem with the serialization of large contiguous arrays of fundamental types (be it a C-array, a std::vector, a std::valarray, boost::multi_array, ...) is that the serialization function is called for each (of possible billions) elements, instead of only once for the whole array. I have attached a suite of three benchmark programs and timings I ran on a PowerbookG4 using Apple's version of the gcc-4 compiler. The benchmarks are a) vectortime: reading and writing a std::vector<double> with 10^7 elements to/from a file b) arraytime: reading and writing 1000 arrays double[10000] to/ from a file c) vectortime_memory: reading and writing a std::vector<double> with 10^7 elements to/from a memory buffer The short summary of the benchmarks is that Boost.Serialization is 5-10 times slower than direct reading or writing! With the fast array serialization modifications, discussed below, this slowdown is removed. Note that the codes were compiled with -O2. Without -O2 I have observed another factor of 10 in slowdown in some cases In order to implement the fast array serialization, I made the following changes to the serialization library: i) a new traits class template <class Archive, class Type> has_fast_array_serialization<Archive,Type>; which specifies whether an Archive has fast array serialization for a Type. The default implementation for this traits class is false, so that no change is needed for existing archives. ii) output archives supporting fast array serialization for a given Type T provide an additional member function save_array(T const * address, std:;size_t length); to save a contiguous array of Ts, containing length elements starting at the given address, and a similar function load_array(T * address, std:;size_t length); for input archives iii) serialization of C-arrays and std::vector<T> was changed to use fast array serialization for those archives and types where it is supported. I'm still working on serialization for std::valarray and boost::multi_array using the same features. iv) in addition, to support an MPI serialization archive (which is essentially done but still being tested), and to improve portability of archives, I introduced a new "strong" type BOOST_STRONG_TYPEDEF(std::size_t, container_size_type) for the serialization of the size of a container. The current implementation uses an unsigned int to store the size, which is problematic on machines with 32-bit int but 64 bit size_type . To stay compatible with old archives, the serialization into binary archives converts the size to an unsigned int, but this should be changed to another type, and the file version number bumped up to allow containers with more than 2^32 elements. The second motivation was MPI serialization, for which I need the size type of containers to be a type distinct from any other integer. The explanation is lengthy and I will provide the reason once the MPI archives are finished. v) also the polymporphic archives were changed, by adding save_array and load_array functions. Even for archives not supporting fast array serialization per se this should improve performance, since now only a single virtual function call is required for arrays, instead of one per element. 
The modifications are on the branch tagged "fast_array_serialization", and I have attached the diffs with respect to the main trunk. I have performed regression tests under darwin, using Apple's version of gcc-4. None of the changes should lead to any incompatibility with archives written with the current version of the serialization library, nor should it break any existing archive implementation. Regarding compatibility with non-conforming compilers the only issue I see is that I have used boost::enable_if to dispatch to either the standard or fast array serialization. We should discuss what to do for compilers that do not support SFINAE. My preferred solution would be to just disable fast array serialization for these compilers, to keep the changed to the code minimal. The other option would be to add another level of indirection and implement the dispatch without using SFINAE. Robert, could you take a look at the modifications, and would it be possibly the merge these modifications with the main trunk once you have finished your work for the 1.33.1 release? Best regards Matthias

I only took a very quick look at the diff file. I have a couple of questions: It looks like, for certain types (C++ arrays, vector<int>, etc.), we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous. Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way. Robert Ramey

Matthias Troyer wrote:
Hi Robert,
Over the past week I got around to doing what I wanted to do for a long time, and implemented an improved serialization of contiguous arrays of fundamental types. The motivation was two-fold:
i) to speed up the serialization of large data sets by factors of up to 10
ii) to allow implementation of serialization by MPI
The problem with the serialization of large contiguous arrays of fundamental types (be it a C-array, a std::vector, a std::valarray, a boost::multi_array, ...) is that the serialization function is called for each of the (possibly billions of) elements, instead of only once for the whole array. I have attached a suite of three benchmark programs and timings I ran on a PowerBook G4 using Apple's version of the gcc-4 compiler. The benchmarks are
a) vectortime: reading and writing a std::vector<double> with 10^7 elements to/from a file
b) arraytime: reading and writing 1000 arrays double[10000] to/from a file
c) vectortime_memory: reading and writing a std::vector<double> with 10^7 elements to/from a memory buffer
The short summary of the benchmarks is that Boost.Serialization is 5-10 times slower than direct reading or writing!
With the fast array serialization modifications discussed below, this slowdown is removed. Note that the codes were compiled with -O2. Without -O2 I have observed another factor of 10 slowdown in some cases.
In order to implement the fast array serialization, I made the following changes to the serialization library:
i) a new traits class
template <class Archive, class Type> has_fast_array_serialization<Archive,Type>;
which specifies whether an Archive has fast array serialization for a Type. The default implementation for this traits class is false, so that no change is needed for existing archives.
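As a rough sketch (the spelling below is illustrative, and sketch_binary_oarchive is a made-up archive, not part of the patch), the trait with a per-type opt-in could look like:

#include <boost/mpl/bool.hpp>

// defaults to false, so existing archives need no change
template <class Archive, class Type>
struct has_fast_array_serialization : boost::mpl::false_ {};

// an archive opts in per type; sketch_binary_oarchive is a placeholder
class sketch_binary_oarchive;

template <>
struct has_fast_array_serialization<sketch_binary_oarchive, double>
    : boost::mpl::true_ {};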
ii) output archives supporting fast array serialization for a given Type T provide an additional member function
save_array(T const * address, std::size_t length);
to save a contiguous array of Ts, containing length elements starting at the given address, and a similar function
load_array(T * address, std::size_t length);
for input archives.
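For a plain binary archive these members can be one-liners; a minimal sketch with made-up names (m_os standing for the underlying stream):

#include <cstddef>
#include <ostream>

class sketch_binary_oarchive {
    std::ostream & m_os;
public:
    explicit sketch_binary_oarchive(std::ostream & os) : m_os(os) {}

    // write the whole contiguous block in one call,
    // instead of one save() per element
    template <class T>
    void save_array(T const * address, std::size_t length)
    {
        m_os.write(reinterpret_cast<const char *>(address),
                   static_cast<std::streamsize>(length * sizeof(T)));
    }
};

load_array would mirror this with a single read call on the underlying input stream.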
iii) serialization of C-arrays and std::vector<T> was changed to use fast array serialization for those archives and types where it is supported. I'm still working on serialization for std::valarray and boost::multi_array using the same features.
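A sketch of how that dispatch can look, combining the trait from (i) with the save_array member from (ii) (the function shapes here are illustrative, not the actual diff):

#include <cstddef>
#include <vector>
#include <boost/mpl/bool.hpp>
#include <boost/utility/enable_if.hpp>

// the trait from point (i), default false
template <class Archive, class T>
struct has_fast_array_serialization : boost::mpl::false_ {};

// fast path: chosen only when the archive declares support for T
template <class Archive, class T>
typename boost::enable_if<has_fast_array_serialization<Archive, T> >::type
save(Archive & ar, const std::vector<T> & v, const unsigned int /* version */)
{
    ar << static_cast<unsigned int>(v.size());   // size first, as today
    if (!v.empty())
        ar.save_array(&v[0], v.size());          // one call for the whole block
}

// fallback: the existing element-wise loop
template <class Archive, class T>
typename boost::disable_if<has_fast_array_serialization<Archive, T> >::type
save(Archive & ar, const std::vector<T> & v, const unsigned int /* version */)
{
    ar << static_cast<unsigned int>(v.size());
    for (std::size_t i = 0; i != v.size(); ++i)
        ar << v[i];
}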
iv) in addition, to support an MPI serialization archive (which is essentially done but still being tested), and to improve portability of archives, I introduced a new "strong" type
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
for the serialization of the size of a container. The current implementation uses an unsigned int to store the size, which is problematic on machines with 32-bit int but 64-bit size_type. To stay compatible with old archives, the serialization into binary archives converts the size to an unsigned int, but this should be changed to another type, and the file version number bumped up, to allow containers with more than 2^32 elements.
The second motivation was MPI serialization, for which I need the size type of containers to be a type distinct from any other integer. The explanation is lengthy and I will provide the reason once the MPI archives are finished.
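As an illustration of why a distinct type helps (a sketch; sketch_oprimitive and its members are made up): with a strong typedef, container sizes participate in overload resolution on their own, so an archive can treat them specially while plain integers keep their usual path:

#include <cstddef>
#include <boost/strong_typedef.hpp>

BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)

class sketch_oprimitive {
public:
    void save(std::size_t)         { /* an ordinary integer */ }
    void save(container_size_type) { /* a container size; an MPI archive
                                        can recognize and handle it specially */ }
};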
v) also the polymorphic archives were changed, by adding save_array and load_array functions. Even for archives not supporting fast array serialization per se this should improve performance, since now only a single virtual function call is required for an array, instead of one per element.
The modifications are on the branch tagged "fast_array_serialization", and I have attached the diffs with respect to the main trunk. I have performed regression tests under darwin, using Apple's version of gcc-4. None of the changes should lead to any incompatibility with archives written with the current version of the serialization library, nor should it break any existing archive implementation.
Regarding compatibility with non-conforming compilers, the only issue I see is that I have used boost::enable_if to dispatch to either the standard or fast array serialization. We should discuss what to do for compilers that do not support SFINAE. My preferred solution would be to just disable fast array serialization for these compilers, to keep the changes to the code minimal. The other option would be to add another level of indirection and implement the dispatch without using SFINAE.
Robert, could you take a look at the modifications, and would it be possible to merge these modifications into the main trunk once you have finished your work for the 1.33.1 release?
Best regards
Matthias

On Oct 9, 2005, at 6:44 PM, Robert Ramey wrote:
I only took a very quick look at the diff file. I have a couple of questions:
It looks like, for certain types (C++ arrays, vector<int>, etc.), we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous.
Exactly.
Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way.
This would lead to code duplication, since we would need to overload the serialization of

array, vector, multi_array, valarray, Blitz array, ublas dense vectors, ublas dense matrices, MTL vectors, MTL matrices, ...

not only for the demo_fast_archive, but for all archives that need such an optimization. Archives that immediately come to my mind are

binary archives (as in your example), all possible portable binary archives, MPI serialization, PVM serialization, all possible polymorphic archives, ...

Thus we have the problem that M types of data structures can profit from the fast array serialization in N types of archives. Instead of providing MxN overloads for the serialization library, I propose to introduce just one traits class, and implement just M overloads for the serialization and N implementations of save_array/load_array. Your example is just M=1 (array) and N=1 (binary archive). If I understand you correctly, what you propose needs MxN overloads. With minor extensions to the serialization library, the same result can be achieved with a coding effort of M+N. Matthias

Matthias Troyer wrote:
On Oct 9, 2005, at 6:44 PM, Robert Ramey wrote:
I only took a very quick look at the diff file. I have a couple of questions:
It looks like, for certain types (C++ arrays, vector<int>, etc.), we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous.
Exactly.
Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way.
This would lead to code duplication since we would need to overload the serialization of
array, vector, multi_array, valarray, Blitz array, ublas dense vectors, ublas dense matrices, MTL vectors, MTL matrices, ...
I don't think this would lead to code duplication. Since the save/load functions are implemented as templates, code is only emitted for those overloads actually invoked by the user program.
not only for the demo_fast_archive, but for all archives that need such an optimization. Archives that immediately come to my mind are
binary archives (as in your example), all possible portable binary archives, MPI serialization, PVM serialization, all possible polymorphic archives, ...
Thus we have the problem that M types of data structures can profit from the fast array serialization in N types of archives. Instead of providing MxN overloads for the serialization library, I propose to introduce just one traits class, and implement just M overloads for the serialization and N implementations of save_array/load_array.
By making an "archive wrapper" similar to the one in demo_fast_archive one can make special provisions for M special data types. These will then automatically be applicable to all existing archives, including the polymorphic versions. Note that I realise that demo_fast_archive uses a class rather than a template. I did this to make the example clearer. But it could easily have been recast as a template.
Your example is just M=1 (array) and N=1 (binary archive). If I understand you correctly, what you propose needs M*N overloads. With minor extensions to the serialization library, the same result can be achieved with a coding effort of M+N.
I didn't mean to suggest that demo_fast_archive be used as is. My intention is to show that any existing archive can be extended with overloads for specific types. There is no need to alter the core of the library in order to do this. Archive classes "know" about specific types in only a few very special cases: NVP and some types used internally by the archive implementations. A key goal of the serialization library has been to maintain this so as to avoid MxN issues in the library itself. I believe:

a) that by using derivation similar to demo_fast_archive you can achieve all the goals you desire without modification of the library itself
b) that this approach will result in the smallest amount of additional coding effort
c) that the result will be applicable to current and future archives without any other coding changes

Robert Ramey

On Oct 9, 2005, at 7:36 PM, Robert Ramey wrote:
I believe:
a) that by using derivation similar to demo_fast_archive you can achieve all the goals you desire without modification of the library itself
b) that this approach will result in the smallest amount of additional coding effort
c) that the result will be applicable to current and future archives without any other coding changes
Sorry, but I still do not see how this can avoid an MxN problem, since for each of M archive classes I will need to overload the serialization of N classes. Maybe I am just not seeing a trick that you have in mind, but I would think that I

- need to overload the serialization for all N classes (array, vector, valarray, ...) for the demo_fast_archive
- again overload the serialization for all N classes for the fast portable binary archive
- again overload the serialization for all N classes for the MPI archive
- again overload the serialization for all N classes for the polymorphic archives

and so on... Can you tell me what I'm missing here? Matthias

Attached is a sketch of what I have in mind. It does compile without error on VC 7.1. With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to. Robert Ramey

Matthias Troyer wrote:
On Oct 9, 2005, at 7:36 PM, Robert Ramey wrote:
I believe:
a) that by using derivation similar to demo_fast_archive you can achieve all the goals you desire without modification of the library itself
b) that this approach will result in the smallest amount of additional coding effort
c) that the result will be applicable to current and future archives without any other coding changes
Sorry, but I still do not see how this can avoid an MxN problem, since for each of M archive classes I will need to overload the serialization of N classes. Maybe I am just not seeing a trick that you have in mind, but I would think that I
- need to overload the serialization for all N classes (array, vector, valarray, ...) for the demo_fast_archive
- again overload the serialization for all N classes for the fast portable binary archive
- again overload the serialization for all N classes for the MPI archive
- again overload the serialization for all N classes for the polymorphic archives

and so on...
Can you tell me what I'm missing here?
Matthias
[uuencoded attachment: fast_oarchive.hpp.cpp]

On Sun, Oct 09, 2005 at 02:15:44PM -0700, Robert Ramey wrote:
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
Hey, quick question. The fast_archive example has a function template save_override() that sends calls that aren't otherwise overridden in derived_t back to base_t:

template<class Base>
class fast_oarchive_impl : public Base
{
    // fall through to Base for any overrides not specified here
    template<class T>
    void save_override(T & t, BOOST_PFTO int){
        Base::save_override(t, 0);
    }
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&(t[0]), sizeof(int) * t.size());
    }
};

And I've done this for a few archive types. Works fine. The portable_binary_archive example does the same thing, except it does so for save(), like this:

class portable_binary_oarchive :
    public boost::archive::binary_oarchive_impl<portable_binary_oarchive>
{
    typedef portable_binary_oarchive derived_t;
    typedef boost::archive::binary_oarchive_impl<portable_binary_oarchive> base_t;

    // default fall through for any types not specified here
    template<class T>
    void save(const T & t){
        base_t::save(t);
    }
    void save(const unsigned int t){
        save_impl(t);
    }
};

Which I've also used. AFAICT, I could do what I've needed to do so far either way. What's the difference? Apologies in advance if I've missed something in the docs. -t

LOL - I'm amazed that I don't know the answer to this question off hand. After some reflection I think the answer is the following (for loading). Take for example text_iarchive. It inherits from basic_text_iprimitive (for input of primitive types) and it inherits from basic_text_iarchive for all other types. basic_text_iprimitive contains load functions for all primitive types. basic_iarchive.hpp contains load_override functions for all "special" types not handled by iserializer.hpp. interface_iarchive contains the interface described in the manual under Loading Archive Concept and directly calls through the interface to the most derived class of the implementation.

The chain of calls for ar >> x where x is a non-primitive type will be:

interface_iarchive::operator>>
-> text_iarchive::load_override
-> basic_text_iarchive<Archive>::load_override(t, 0)
-> archive::load(* this->This(), t); // in iserializer.hpp
...

The chain of calls for ar >> i where i is a primitive data type will be:

interface_iarchive::operator>>
-> text_iarchive::load_override
-> basic_text_iarchive<Archive>::load_override(t, 0)
-> archive::load(* this->This(), t); // in iserializer.hpp
-> load_non_pointer
-> load_primitive
-> ar.load(...) // back to archive to load primitive
-> text_iarchive::load
-> basic_text_iprimitive::load if not overridden

So load is overridden for primitives while load_override is used for others. Now looking at this, it seems that this could be made shorter, more efficient and more transparent, not to mention better explained in the Archive Implementation section of the documents. Maybe we'll do that when we have nothing else to do. Robert Ramey
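A toy model of the two hook points (all names below are made up; this is not library code) makes the chains concrete, shown here for the saving side: operator<< runs every type through save_override, while save() is normally reached only for primitives after the library's dispatch:

#include <iostream>

template <class Derived>
struct toy_interface_oarchive {
    Derived * This() { return static_cast<Derived *>(this); }
    template <class T>
    Derived & operator<<(const T & t) {
        This()->save_override(t, 0);   // first hook: sees every type
        return *This();
    }
};

struct toy_oarchive : toy_interface_oarchive<toy_oarchive> {
    template <class T>
    void save_override(const T & t, int) {
        // the library's dispatch (tracking, versioning, ...) would run here
        save(t);                       // second hook: reached for primitives
    }
    void save(int t)    { std::cout << "int: "    << t << '\n'; }
    void save(double t) { std::cout << "double: " << t << '\n'; }
    template <class T>
    void save(const T &) { std::cout << "composite\n"; }
};

int main() {
    toy_oarchive ar;
    ar << 1 << 2.5;
}

In this model, hooking save_override intercepts any type before the dispatch, while hooking save only changes how primitives are written - which would explain why the portable binary archive, which merely reshapes integers, overrides save, while type-specific shortcuts override save_override.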

Robert Ramey wrote:
LOL - I'm amazed that I don't know the answer to this question off hand.
Actually, since I wrote that response I went back and looked at the documentation. I found that the differing roles of save and save_override are explained (sort of). I'll try to clarify them better when I get to improving this section. Robert Ramey

On Oct 10, 2005, at 7:21 AM, troy d. straszheim wrote:
On Sun, Oct 09, 2005 at 02:15:44PM -0700, Robert Ramey wrote:
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
And I've done this for a few archive types. Works fine. The portable_binary_archive example does the same thing, except it does so for save(), like this:
class portable_binary_oarchive :
    public boost::archive::binary_oarchive_impl<portable_binary_oarchive>
{
    typedef portable_binary_oarchive derived_t;
    typedef boost::archive::binary_oarchive_impl<portable_binary_oarchive> base_t;

    // default fall through for any types not specified here
    template<class T>
    void save(const T & t){
        base_t::save(t);
    }
    void save(const unsigned int t){
        save_impl(t);
    }
};
Which I've also used. AFAICT, I could do what I've needed to do so far either way. What's the difference? Apologies in advance if I've missed something in the docs.
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend. Matthias

Matthias Troyer wrote:
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend.
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to. Robert Ramey

On Oct 15, 2005, at 10:33 PM, Robert Ramey wrote:
Matthias Troyer wrote:
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend.
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to.
There are two severe problems with this approach:

1. Most of my archives using fast array serialization would not be written as archive adaptors, since for MPI archives, PVM archives, and many others, it does not make any sense to write an archive without fast array serialization. These archives have to support fast array serialization from the start.

2. And here is the main problem: you propose that all serialization code (not only for std::vector but for all future classes, such as valarray, multi_array, ublas and MTL matrices) be written without concern for fast array serialization, and that I then provide overloads for all these classes in an adaptor. There are a number of reasons why this is either not good, or will not work at all:

a) it leads to a tight coupling of the archive classes to implementation details of all these libraries. The code to serialize a boost::multi_array should be with the multi_array library and not in my archive class.

b) the user of my library will have to include hundreds of lines of serialization code for all these classes, even if he never needs them. Contrast that with the inclusion of a few lines for save_array and load_array.

c) even worse: in the cases I referred to this usually cannot be implemented without being intrusive on the library whose datatype is being serialized. E.g. Boost.MultiArray, MTL or Blitz will have to be modified to allow serialization, since serialization is intrusive for most classes. The "adaptor" you are proposing will then also have to be intrusive!

Please note here that the "non-intrusive" serialization you show in the tutorial is still intrusive: you had to make the data members public to be able to implement the "non-intrusive" serialization. For classes that have getter and setter functions for all members, or where I can extract the constructor arguments from the class in a non-intrusive way, it is possible to write non-intrusive serialization. But for all other cases, there is no such thing as non-intrusive serialization. It is even worse in the case of Blitz arrays, which have their own built-in reference counted memory allocation scheme. Views in multi_arrays are similar. There is no non-intrusive way to serialize these data structures!

Since Boost.Serialization support has to be intrusive for these data structures, I believe that the intrusiveness should be kept to a minimum and only one serialization function be provided. Thus your statement
In general, I want no more coupling than is absolutely necessary. I don't think it's necessary here. You can get everything you want and more by using an archive adaptor.
is clearly incorrect. If the serialization library documentation tells, e.g. the MTL authors to serialize their arrays by looping over all elements, I will have to, after they implement their version, be intrusive on the MTL library to get direct array serialization in. Better to have them support it directly! And to answer:
Actually the cost is minimal if the archive does not support fast save/load_array. The has_fast_array_serialization.hpp header only consists of the default traits:
One still has to include the header. This violates the boost principle - don't pay for what you don't use. Actually the boost principle would be - don't even know about what you don't use - which is better anyway.
You already violate this principle much more severely in the serialization library. If I do not want object tracking and versioning for a text_oarchive of some objects, the code for tracking and versioning is still included by the serialization library. Robert, to focus the discussion and not get stuck in details let me stress a point that I had previously made at both reviews of your library, and that I still believe in: * A serialization library without built-in support for serialization of arrays is fundamentally flawed * I believe that this is the main issue we need to get sorted out first, since it is the fundamental point of disagreement. You write (I quote from your other e-mail):
default serialization of C array .... does have a universal default implementation.
and your default implementation of saving is to save the size as an unsigned int and then loop over all the elements, saving each one. My opinion is that this is the wrong approach! Instead a save_array function should be called, for which the default implementation would be just what you describe above, but the archive can overload it. Here are the reasons:

1.) 10x speedup, or more

2.) no need to provide intrusive overloads for all classes that want to use save_array

3.) prior art. There is a reason why

- MPI, PVM and other message passing libraries support array serialization
- XDR, used for remote procedure calls under Unix, has special support for arrays
- HDF5, a standard for large scientific data sets, operates directly on large arrays

To interface to all these libraries, and to achieve reasonable performance with them and with binary archives, the direct support for array serialization by the serialization library is essential. In the past you have claimed that there is no need for a special array serialization and wanted to see benchmarks. We now have benchmark numbers (not only from me) that show a roughly 10x or greater penalty. Furthermore we have cases where serialization is impossible without it. Matthias
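In code, the proposed default amounts to shipping the element-wise loop as the universal fallback and letting an archive replace it with a block operation (a sketch; sketch_binary_oarchive and its save_binary member are assumed names, not library code):

#include <cstddef>
#include <cstdio>

// the universal default: exactly the element-wise loop described above
template <class Archive, class T>
void save_array(Archive & ar, T const * address, std::size_t length)
{
    for (std::size_t i = 0; i != length; ++i)
        ar << address[i];
}

// an archive that can do better provides its own overload; the class below
// is an assumed stand-in for a binary archive with a save_binary member
class sketch_binary_oarchive {
public:
    void save_binary(const void * address, std::size_t count)
    {
        std::fwrite(address, 1, count, stdout);   // placeholder sink
    }
};

template <class T>
void save_array(sketch_binary_oarchive & ar, T const * address, std::size_t length)
{
    ar.save_binary(address, length * sizeof(T));  // whole block, one call
}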

Matthias Troyer wrote:
On Oct 15, 2005, at 10:33 PM, Robert Ramey wrote:
Matthias Troyer wrote:
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend.
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to.
There are two severe problems with this approach:
1. Most of my archives using fast array serialization would not be written as archive adaptors, since for MPI archives, PVM archives, and many others, it does not make any sense to write an archive without fast array serialization. These archives have to support fast array serialization from the start.
2. And here is the main problem: you propose that all serialization code (not only for std::vector but for all future classes, such as valarray, multi_array, ublas and MTL matrices) be written without concern for fast array serialization, and that I then provide overloads for all these classes in an adaptor. There are a number of reasons why this is either not good, or will not work at all:
a) it leads to a tight coupling of the archive classes to implementation details of all these libraries. The code to serialize a boost::multi_array should be with the multi_array library and not in my archive class.
b) the user of my library will have to include hundreds of lines of serialization code for all these classes, even if he never needs them. Contrast that with the inclusion of a few lines for save_array and load_array.
c) even worse: in the cases I referred to this usually cannot be implemented without being intrusive on the library whose datatype is being serialized. E.g. Boost.MultiArray, MTL or Blitz will have to be modified to allow serialization, since serialization is intrusive for most classes. The "adaptor" you are proposing will then also have to be intrusive!
I don't really agree with this, but it doesn't really matter. In this case just write your own fast binary archive and derive variations from it. You can skip the binary_archive and basic_binary archive altogether. This would be very easy.
Please note here that the "non-intrusive" serialization you show in the tutorial is still intrusive: you had to make the data members public to be able to implement the "non-intrusive" serialization. For classes that have getter and setter functions for all members, or where I can extract the constructor arguments from the class in a non-intrusive way, it is possible to write non-intrusive serialization. But for all other cases, there is no such thing as non-intrusive serialization. It is even worse in the case of Blitz arrays, which have their own built-in reference counted memory allocation scheme. Views in multi_arrays are similar. There is no non-intrusive way to serialize these data structures!
I've come to realize that some classes do not provide an interface sufficient to support serialization. shared_ptr has this problem, as does boost::any. I'm sure there are others as well. It's out of my hands.
Since Boost.Serialization support has to be intrusive for these data structures, I believe that the intrusiveness should be kept to a minimum and only one serialization function be provided.
If the serialization library documentation tells, e.g. the MTL authors to serialize their arrays by looping over all elements, I will have to, after they implement their version, be intrusive on the MTL library to get direct array serialization in. Better to have them support it directly!
fine - just make your own archive. I'm perfectly happy with this. The documentation can easily be changed so that for the archives included with the package the default serialization of arrays is ...
One still has to include the header. This violates the boost principle - don't pay for what you don't use. Actually the boost principle would be - don't even know about what you don't use - which is better anyway.
You already violate this principle much more severely in the serialization library. If I do not want object tracking and versioning for a text_oarchive of some objects, the code for tracking and versioning is still included by the serialization library.
The headers are included but the code isn't instantiated.
Robert, to focus the discussion and not get stuck in details let me stress a point that I had previously made at both reviews of your library, and that I still believe in:
* A serialization library without built-in support for serialization of arrays is fundamentally flawed *
I believe that this is the main issue we need to get sorted out first, since it is the fundamental point of disagreement.
You write (I quote from your other e-mail):
default serialization of C array .... does have a universal default implementation.
and your default implementation of saving is to save the size as an unsigned int and then loop over all the elements, saving each one.
My opinion is that this is the wrong approach! Instead a save_array function should be called, for which the default implementation would be just what you describe above, but the archive can overload it.
Here are the reasons:
1.) 10x speedup, or more
2.) no need to provide intrusive overloads for all classes that want to use save_array
3.) prior art. There is a reason why
- MPI, PVM and other message passing libraries support array serialization
- XDR, used for remote procedure calls under Unix, has special support for arrays
- HDF5, a standard for large scientific data sets, operates directly on large arrays
To interface to all these libraries, and to achieve reasonable performance with them and with binary archives, the direct support for array serialization by the serialization library is essential.
This is not an issue of efficiency. The instantiated code is the same regardless of where you put it. I have the general case in the core library and anyone is free to make his own archive for more specific cases - which this is. The native binary archive is actually quite small. It's only as big as it is because it supports a wide character interface. The idea of a wide char interface for the binary archive is dubious anyway. So it's simple just to make your own version of binary archives. The library supports and encourages that, and you don't have to change anything in the core to do that. I'm looking forward to seeing the final result. Robert Ramey

On Oct 19, 2005, at 8:26 AM, Robert Ramey wrote:
Matthias Troyer wrote:
On Oct 15, 2005, at 10:33 PM, Robert Ramey wrote:
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to.
[snip - discussion why serialization of multi_array and similar classes is always intrusive]
I've come to realize that some classes do not provide an interface sufficient to support serialization. shared_ptr has this problem, as does boost::any. I'm sure there are others as well. It's out of my hands.
Since Boost.Serialization support has to be intrusive for these data structures, I believe that the intrusiveness should be kept to a minimum and only one serialization function be provided.
If the serialization library documentation tells, e.g. the MTL authors to serialize their arrays by looping over all elements, I will have to, after they implement their version, be intrusive on the MTL library to get direct array serialization in. Better to have them support it directly!
fine - just make your own archive. I'm perfectly happy with this. The documentation can easily be changed so that for the archives included with the package the default serialization of arrays is ...
Robert, it seems that maybe I did not make myself clear enough, so let me stress the important point: for classes like multi_array, MTL matrices, Blitz arrays, ublas matrices, ... in almost all cases serialization has to be intrusive. So the question is, what should we recommend to the authors of these libraries: should we encourage them to use the save_array/load_array interface whenever it is possible? The only sensible answer is yes (see the sketch after this list), because

a) it will give them much faster archives (large matrices are almost always stored in binary format)

b) overloading the serialization in my own archive class, as you suggest, is impossible, since serialization has to be intrusive for these classes

c) it places all the serialization code of these classes with the library, and not inside my archive

d) it is easily extensible (unlike placing all serialization code using save_array/load_array in my archive classes)
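For illustration, here is roughly what a library author would write once, if the proposed hooks existed (dense_matrix and serialize_array are made-up stand-ins, not any library's actual API):

#include <cstddef>
#include <boost/serialization/access.hpp>

// hypothetical helper that dispatches on the proposed trait:
// block write if the archive supports it, element loop otherwise
template <class Archive, class T>
void serialize_array(Archive & ar, T * data, std::size_t n);

// stand-in for an MTL/ublas/Blitz-style class with contiguous, private storage
class dense_matrix {
    friend class boost::serialization::access;   // intrusive, but minimally so

    std::size_t rows_, cols_;
    double * data_;   // contiguous storage

    template <class Archive>
    void serialize(Archive & ar, const unsigned int /* version */)
    {
        ar & rows_ & cols_;
        // one line serves every archive: fast ones take the whole block,
        // the others fall back to the element-wise default
        serialize_array(ar, data_, rows_ * cols_);
    }
};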
One still has to include the header. This violates the boost principle - don't pay for what you don't use. Actually the boost principle would be - don't even know about what you don't use - which is better anyway.
You already violate this principle much more severely in the serialization library. If I do not want object tracking and versioning for a text_oarchive of some objects, the code for tracking and versioning is still included by the serialization library.
The headers are included but the code isn't instantiated.
Oh, and how about the explicit instantiations in the *.cpp files? If you are concerned with compile time, then for the instantiated functions the save_array/load_array code is actually better. For example, for the binary archives serializing an array with my version is a single call to save_array or load_array, while in the default version it is a for loop.
This is not an issue of efficiency. The instantiated code is the same regardless of where you put it.
Except that for multi_array, Blitz array and similar classes the serialization code is intrusive, and I cannot just put an overload into my archive class. By providing a save_array/load_array function, the authors of those libraries can provide efficient serialization whenever it makes sense, at negligible cost.
So it's simple just to make your own version of binary archives. The library supports and encourages that, and you don't have to change anything in the core to do that. I'm looking forward to seeing the final result.
Actually, the archives are available, but we have to change the serialization library:

- we have to add the fast array serialization traits class
- we have to encourage library authors to use save_array/load_array wherever possible, and has_fast_array_serialization.hpp will thus regularly be included
- we have to provide fast array serialization for the contiguous array-like STL classes such as std::vector and std::valarray, as well as for C-style arrays

Matthias

Robert, it has been nearly two weeks since I sent my last reply and I'm still waiting for an answer. To summarize, I explained why we should recommend to authors of serialization functions to use the proposed array serialization mechanism and why this cannot be done in a fast array serialization archive. Matthias

Robert Ramey wrote:
Matthias Troyer wrote:
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend.
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to.
IIUC (apologies if I am missing something), Matthias' argument is that

1. serialization of std::vector needs to be overridden in the fast_archive_adaptor class.
2. serialization of builtin arrays needs to be overridden in the fast_archive_adaptor class.
3. serialization of ublas::vector needs to be overridden in the fast_archive_adaptor class.
4. serialization of ublas::matrix needs to be overridden in the fast_archive_adaptor class.
5. serialization of mtl::vector needs to be overridden in the fast_archive_adaptor class.
6. serialization of mtl::matrix needs to be overridden in the fast_archive_adaptor class.
7. serialization of blitz::array needs to be overridden in the fast_archive_adaptor class.
8. serialization of custom_lib::fast_matrix needs to be overridden in the fast_archive_adaptor class. Unfortunately this is not so easy, as custom_lib::fast_matrix is maintained by someone else and the serialization functions need access to some classes buried deep in some private implementation header.
9. ...

So, N functions, most of which are trivial forwards to a save_array/load_array function. Cheers, Ian

On Oct 9, 2005, at 11:15 PM, Robert Ramey wrote:
Attached is a sketch of what I have in mind. It does compile without error on VC 7.1.
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
---- SUMMARY ---------

If I may summarize this solution as follows:

template<class Base>
class fast_oarchive_impl : public Base
{
public:
    ...
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&(t[0]), sizeof(int) * t.size());
    }

    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(const std::vector<T> & t, int){
        save_binary(&(t[0]), sizeof(T) * t.size());
        // this version not certified for more complex types !!!
        BOOST_STATIC_ASSERT(boost::is_primitive<T>::value);
        // or pointers either !!!
        BOOST_STATIC_ASSERT(! boost::is_pointer<T>::value);
    }
    ...
};

then I see several major disadvantages of this approach:

1.) it fixes the value types for which fast array serialization can be done

2.) for M types to be serialized and N archives there is an MxN problem in this approach

3.) it leads to a tight coupling between archives and all classes that can profit from fast array serialization (called "array-like classes" below), and makes the archive depend on implementation details of the array-like classes

4.) it is not easily extensible to new array-like classes

Let me elaborate on these points below and provide a possible solution to each of them. The simplest solution, as I see it, will be to

- provide an additional traits class has_fast_array_serialization
- have archives offering (the optional) fast array serialization provide a save_array member function in addition to save and save_binary
- make the dispatch to either save() or save_array() the responsibility of the serialization code of the class, and not the responsibility of the archive

These are minor extensions to the serialization library that do not break any existing code, that do not make it harder to write a new archive or a new serialize function, but that allow new types of archives and can give huge speedups for large data sets.

------- DETAILS -----------

Now the details.

ad 1.: you need to fix the types for which the fast version is used, either by providing explicit overloads for K types (such as int above), which would give a KxMxN problem, or by using a template-based approach. You are probably aware that the BOOST_STATIC_ASSERT in your example will cause this archive to fail to work with std::vectors of more complex types. One easy way to solve this is by restricting the applicability of the template, using boost::enable_if:

template<class T>
void save_override
(
    const std::vector<T> & t,
    int,
    typename boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
);

where the traits class has_fast_array_serialization<Base,T> specifies whether the fast version should be used for the type T with the archive Base. The reason to provide a traits class instead of hard-coding the types, and restricting them to primitive non-pointer types, is that the set of types that can use an optimized serialization depends on the archive type. A non-portable binary archive could support all POD types that contain no pointer members (e.g. the gps_position class in your example), while an MPI archive can support fast serialization for all default-constructible types. Hence a traits class depending on both the archive type and the value_type of the vector.
ad 2.: there is still an MxN problem: you propose to dispatch to save_binary:

void save_override(const std::vector<T> & t, int)
{
    save_binary(&(t[0]), sizeof(T) * t.size());
}

where the signature of save_binary is

void save_binary(void const *, std::size_t);

This is an acceptable solution for binary archives, and maybe a few others, but is NOT a general solution. To illustrate this, let me show how the fast saving is or might be implemented for some other archives.

a) a potential portable binary archive might need to do byte reordering:

void save_override(const std::vector<T> & t, int)
{
    save_with_reordering(&(t[0]), t.size());
}

where the save_with_reordering function will need type information to do the byte reordering, and might have a signature

template <class T>
void save_with_reordering(T const *, std::size_t);

b) an XDR archive, using XDR streams, needs to make a call to an XDR function and pass type information as well, as in

class xdr_oarchive {
    ...
    void save_override(const std::vector<int> & t, int)
    {
        xdr_vector(stream, (char*)&(t[0]), t.size(), sizeof(int), &xdr_int);
    }
    XDR* stream;
};

and a templated version could also be provided easily. Note that again I need the address, size and type information and cannot make this call from within save_binary. I have an archive implementation based on the UNIX xdr calls, and this is thus no hypothetical example.

c) let's next look at a packed MPI archive (of which I also have an implementation), where the override would be

// simplified version of MPI archive
class packed_mpi_oarchive {
    ...
    void save_override(const std::vector<int> & t, int)
    {
        MPI::Datatype datatype(MPI::INTEGER);
        datatype.Pack(&(t[0]), t.size(), buffer, buffer_size, position, communicator);
    }
    char* buffer;
    int buffer_size;
    int position;
    MPI::Comm& communicator;
};

and again, I need type information and cannot just call save_binary.

d) as a fourth example I want to mention that MPI allows for serialization by message passing without the need to pack the data into a buffer first; only the addresses and types of all data members need to be stored, to create a custom MPI type. An incomplete implementation (I have a complete implementation, based on the original idea by Daniel Egloff) would be:

class mpi_oarchive {
    ...
    template <class T>
    void save_override(const std::vector<T> & t, int)
    {
        register_member(&(t[0]), t.size());
    }
    template <class T>
    void register_member(T const* t, std::size_t l)
    {
        addresses.push_back(MPI::Get_address(t));
        sizes.push_back(l);
        types.push_back(mpi_type<T>::value);
    }
    std::vector<MPI::Aint> addresses;
    std::vector<int> sizes;
    std::vector<MPI::Datatype> types;
};

Note that again save_binary does not do the trick, since we need type information. For this reason my proposed solution is to dispatch to a save_array function for those types and archives supporting it:

template<class Base>
class fast_oarchive_impl : public Base
{
public:
    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override
    (
        const std::vector<T> & t,
        int,
        typename boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
    )
    {
        this->save_array(&(t[0]), t.size());
    }
    ...
};

where all archive classes provide a function like

void Archive::save_array(Type const *, std::size_t);

for all types for which the traits class has_fast_array_serialization<Archive,Type> is true. That way a single overload suffices for all the N=5 archive types presented above, and the MxN problem is solved.
Note also that archives not supporting this fast array serialization do not need to implement anything, as the default for has_fast_array_serialization<Archive,Type> is false.

ad 3.: your proposal leads to a tight coupling between archives and classes to be serialized. Consider what I would need to do to add support for some future MTL matrix type. Again I present a simplified example showing the problem:

template<class T>
void save_override(const mtl_dense_matrix<T> & m, int)
{
    T const * data = implementation_dependent_function_to_get_pointer(m);
    std::size_t length = implementation_dependent_function_to_get_size(m);
    save_binary(data, length * sizeof(T));
}

This introduces implementation details of the mtl_dense_matrix class into the archive, breaks orthogonality, and leads to a tight coupling. Changes in these implementation details of the mtl_dense_matrix might require changes to the archive classes. The solution is easy:

- some archives provide fast array serialization through the save_array member function
- let the MTL be responsible for serialization of its own classes, and use save_array where appropriate

ad 4.: in order to use fast array serialization with other classes such as

- std::vector
- std::valarray
- boost::multi_array
- uBlas vectors and matrices
- blitz::Array

save_override functions for ALL of these classes have to be added to the archive. This means that to support any new class, be it a new ublas matrix, future MTL matrices, Blitz++ arrays, ..., the archive class needs to be modified. This is clearly not a scalable design.

To summarize, with three minor extensions to the serialization library, none of which breaks any existing code, we can get 10x speedups for serialization of large data sets, enable new types of archives such as MPI archives, and all of that without introducing any of the four problems discussed here. Matthias

---- SUMMARY ---------
If I may summarize this solution as follows:
template<class Base>
class fast_oarchive_impl : public Base
{
public:
    ...
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&(t[0]), sizeof(int) * t.size());
    }

    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(const std::vector<T> & t, int){
        save_binary(&(t[0]), sizeof(T) * t.size());
        // this version not certified for more complex types !!!
        BOOST_STATIC_ASSERT(boost::is_primitive<T>::value);
        // or pointers either !!!
        BOOST_STATIC_ASSERT(! boost::is_pointer<T>::value);
    }
    ...
};
then I see several major disadvantages of this approach:
1.) it fixes the value types for which fast array serialization can be done
I worked around this by introducing an intermediary type ArchiveByteArray as follows:

struct BinaryArchiveByteArrayI {
    int count;
    void *ptr;
};

template<class T>
BinaryArchiveByteArrayI MakeArchiveInputByteArray(int count, T *t)
{
    BinaryArchiveByteArrayI res;
    res.count = count * sizeof(T);
    res.ptr = t;
    return res;
}

struct BinaryArchiveByteArrayO {
    int count;
    const void *ptr;
};

template<class T>
BinaryArchiveByteArrayO MakeArchiveOutputByteArray(int count, const T *t)
{
    BinaryArchiveByteArrayO res;
    res.count = count * sizeof(T);
    res.ptr = t;
    return res;
}

BOOST_CLASS_IMPLEMENTATION(BinaryArchiveByteArrayI, primitive_type);
BOOST_CLASS_IMPLEMENTATION(BinaryArchiveByteArrayO, primitive_type);

Then adding in the input / output archive:

void load(BinaryArchiveByteArrayI &ba);
void save(const BinaryArchiveByteArrayO &ba);

This should be able to cope with array types and anything else that takes a contiguous block of memory. Then there is one set of overloads per type, i.e. for std::vector:

template<class U, class Allocator>
inline void load(
    BinaryInputArchive & ar,
    std::vector<U, Allocator> &t,
    const unsigned int /* file_version */
){
    boost::mpl::if_<boost::is_pod<U>,
        Detail::VectorLoadPodImp,
        Detail::VectorLoadImp>::type(ar, t);
}

template<class U, class Allocator>
inline void save(
    BinaryOutputArchive& ar,
    const std::vector<U, Allocator> &t,
    const unsigned int /* file_version */
){
    boost::mpl::if_<boost::is_pod<U>,
        Detail::VectorSavePodImp,
        Detail::VectorSaveImp>::type(ar, t);
}

This could probably have a dispatch mechanism above it checking for an archive trait to dispatch to either the fast or default implementation. Martin

On Oct 10, 2005, at 2:09 PM, Martin Slater wrote:
then I see several major disadvantages of this approach:
1.) it fixes the value types for which fast array serialization can be done
I worked around this by introducing an intermediary type ArchiveByteArray as follows
...

struct BinaryArchiveByteArrayO {
    int count;
    const void *ptr;
};

template<class T>
BinaryArchiveByteArrayO MakeArchiveOutputByteArray(int count, const T *t) {
    BinaryArchiveByteArrayO res;
    res.count = count * sizeof(T);
    res.ptr = t;
    return res;
}
There is a problem here: the type information gets lost once you create a BinaryArchiveByteArrayO, but the implementation of save(BinaryArchiveByteArrayO const&) will require that information in the case of XDR, MPI and other archives.
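A sketch of why the element type matters (swap_bytes and save_binary stand in for archive internals here and are not existing library functions): a portable binary archive can reorder bytes per element only while it still knows sizeof(T); once the data has been erased to a (void*, byte count) pair, that knowledge is gone.

template<class T>
void save_array(T const * address, std::size_t count)
{
    for (std::size_t i = 0; i != count; ++i) {
        T tmp = address[i];
        swap_bytes(&tmp, sizeof(T));   // per-element byte reordering needs sizeof(T)
        save_binary(&tmp, sizeof(T));
    }
}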
...
template<class U, class Allocator>
inline void save(BinaryOutputArchive& ar, const std::vector<U, Allocator> &t, const unsigned int /* file_version */){
    boost::mpl::if_<boost::is_pod<U>, Detail::VectorSavePodImp, Detail::VectorSaveImp>::type(ar, t);
}
This is similar to the way I propose to implement vector serialization. Just look at the diffs I attached to my mail. The main difference is that instead of the mpl::if_ I use boost::enable_if.
This could probably have a dispatch mechanism above it checking for an archive trait to dispatch to either the fast or default implementation.
Exactly, a trait will be more flexible here, since your dispatch based on boost::is_pod<U> might be too narrow for some archives and too broad for others (some, such as XML archives, might never want to support it). Matthias
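For illustration, a sketch of the per-archive granularity such a trait allows (the archive names are placeholders; mpl::bool_ and boost::is_fundamental are real Boost components):

#include <boost/mpl/bool.hpp>
#include <boost/type_traits/is_fundamental.hpp>

class binary_oarchive;   // hypothetical archives, forward-declared
class xdr_oarchive;

// default: no archive supports fast array serialization
template <class Archive, class Type>
struct has_fast_array_serialization : boost::mpl::bool_<false> {};

// a native binary archive might enable it for all fundamental types ...
template <class Type>
struct has_fast_array_serialization<binary_oarchive, Type>
    : boost::is_fundamental<Type> {};

// ... while an XDR-style archive enables it only where an XDR filter exists
template <>
struct has_fast_array_serialization<xdr_oarchive, double>
    : boost::mpl::bool_<true> {};

// an XML archive simply leaves the default, and never takes the fast path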

Matthias Troyer wrote:
On Oct 9, 2005, at 11:15 PM, Robert Ramey wrote:
Attached is a sketch of what I have in mind. It does compile without error on VC 7.1
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
---- SUMMARY ---------
If I may summarize this solution as follows:
template<class Base>
class fast_oarchive_impl : public Base {
public:
    ...
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&t[0], sizeof(int) * t.size());
    }
    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(const std::vector<T> & t, int){
        save_binary(&t[0], sizeof(T) * t.size());
        // this version not certified for more complex types !!!
        BOOST_STATIC_ASSERT(boost::is_primitive<T>::value);
        // or pointers either !!!
        BOOST_STATIC_ASSERT(!boost::is_pointer<T>::value);
    }
    ...
};
A fair characterization. But my point isn't to suggest or promote a specific override. Rather, my point is to show that the library can be extended without altering the internals of the library itself. All the included archive classes have a similar structure:

/////////////////////////////////////////////////////////////////////////
// class basic_text_iarchive - read serialized objects from an input text stream
template<class Archive>
class basic_text_iarchive : public detail::common_iarchive<Archive> {
    ...
    // intermediate level to support override of operators
    // for templates in the absence of partial function
    // template ordering
    template<class T>
    void load_override(T & t, BOOST_PFTO int){
        archive::load(* this->This(), t);
    }
    ...
};

where archive::load is declared and defined in the file iserializer.hpp. This latter file includes all the "basic" functionality required for primitive types supported by C++. I have made huge efforts not to couple the code in iserializer.hpp to any other types (nvp might be an exception). Within iserializer.hpp the function archive::load dispatches to different implementations depending on traits of the type being serialized.

Now if one wants to handle a particular type in a special way (e.g. vector<T> where T is not a pointer), then one could augment iserializer.hpp. But one could just as well do:

/////////////////////////////////////////////////////////////////////////
// class basic_text_iarchive - read serialized objects from an input text stream
template<class Archive>
class basic_text_iarchive : public detail::common_iarchive<Archive> {
    ...
    // intermediate level to support override of operators
    // for templates in the absence of partial function
    // template ordering
    template<class T>
    void load_override(T & t, BOOST_PFTO int){
        // your own dispatch code here for particular cases.
        // fall through to default/universal implementation
        archive::load(* this->This(), t);
    }
    ...
};

There is no need to alter the default/universal/basic serialization implementation. Of course one doesn't have to do the above. Since the code uses the CRTP to call load_override in the most derived class, the class above can be left unchanged and the following can be included in the most derived class:

template<class T>
void load_override(T & t, BOOST_PFTO int){
    // your own dispatch code here for particular cases.
    // fall through to default/universal implementation
    basic_text_iarchive<Archive>::load_override(t, 0);
}

Adding your own dispatch code in the indicated place is exactly the same as incorporating your code into iserializer.hpp - except that your special dispatch code will only be included when requested and won't have to be bypassed conditionally with a new type trait.

The problem with the above is that it applies only to one specific archive class. So my proposal was to make an "archive adaptor" to permit your overrides to be added to any functioning archive class.
then I see several major disadvantages of this approach:
1.) it fixes the value types for which fast array serialization can be done
2.) for M types to be serialized and N archives there is an MxN problem in this approach.
3.) it leads to a tight coupling between archives and all classes that can profit from fast array serialization (called "array-like classes" below), and makes the archive depend on implementation details of the array-like classes
4.) it is not easily extensible to new array-like classes
I believe that the above points really refer to the specific override I used in my example. I have no issue at all with your particular overrides. In fact, I'm pleased that people are finding that the library can be extended to handle more specific cases. I just want to keep these add-ins as exactly that - optional additions. Your override can be as elaborate as you want, including your own trait - is_contiguous or whatever. We're writing the same code - it's just placed in different source modules. Yours places it in parts of the library that everyone uses, mine places it in separate header modules. The point is that we would have two orthogonal components to maintain.
Let me elaborate on these points below and provide a possible solution to each of them. The simplest solution, as I see it will be to
- provide an additional traits class has_fast_array_serialization
- archives offering the (optional) fast array serialization provide a save_array member function in addition to save and save_binary
- the dispatch to either save() or save_array() is the responsibility of the serialization code of the class, and not the responsibility of the archive
That sounds very good to me. Maybe I spoke too soon. I don't see how this would require any changes at all to the serialization library. I don't see why has_fast_array_serialization has to be part of the serialization library. Maybe all the code can be included in boost/serialization/fast_array.hpp? This header would be included by all the classes that use it and no others.
These are minor extensions to the serialization library that do not break any existing code and do not make it harder to write a new archive or a new serialize function, but they allow new types of archives and can give huge speedups for large data sets.
It's clear I'm missing something here. I'll have to look more deeply into this when I get a couple of other monkeys off my back.
------- DETAILS -----------
Now the details
ad 1.:
... no problem here.
ad 2.: ... a) a potential portable binary archive might need to do byte reordering: ... no problem
b) an XDR archive, using XDR streams needs to make a call to an XDR function,
... no problem
c) .. and again, I need type information and cannot just call save_binary.
.. no problem
d) .. no problem
For this reason my proposed solution is to dispatch to a save_array function for those types and archives supporting it:
template<class Base>
class fast_oarchive_impl : public Base {
public:
    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(
        const std::vector<T> & t,
        int,
        typename boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
    ){
        save_array(&(t[0]), t.size());
    }
    ...
};
I'm quite satisfied with this. My point is that none of this has to be part of the serialization library itself. It can be a separate module like serialization/variant.hpp is.
where all archive classes provide a function like
void Archive::save_array(Type const *, std::size_t)
for all types for which the trait has_fast_array_serialization<Archive,Type> is true.
Now here is where we're going to part company.
That way a single overload suffices for all the N=5 archive types presented above, and the MxN problem is solved. Note also that archives not supporting this fast array serialization do not need to implement anything, as the default for has_fast_array_serialization<Archive,Type> is false.
But maybe not. If has_fast_array_serialization<Archive,Type> is defined in boost/serialization/fast_array.hpp I'm still OK with it.
ad 3.:
This introduces implementation details of the mtl_dense_matrix class into the archive, breaks orthogonality, and leads to a tight coupling. Change in these implementation details of the mtl_dense_matrix might require changes to the archive classes.
we certainly want to avoid that!!!
The solution is easy:
- some archives provide fast array serialization through the save_array member function - let the MTL be responsible for serialization of it own classes, and use save_array where appropriate
Just great !!!
ad 4.: ...
I'll agree with that also
To summarize, with three minor extensions to the serialization library, none of which breaks any existing code, we can get 10x speedups for serialization of large data sets, enable new types of archives such as MPI archives, and all of that without introducing any of the four problems discussed here.
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapsulated in one (or more) separate optional headers. This would be the best of all possible worlds. Robert Ramey
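As a sketch of how that packaging could look (the header name follows the earlier suggestion; everything else here is hypothetical): the optional header holds the default trait and the dispatched overloads, and an archive opts in from its own header without touching the core library.

// boost/serialization/fast_array.hpp (hypothetical layout)
//
// (1) the default trait - false for every archive that says nothing:
template <class Archive, class Type>
struct has_fast_array_serialization : boost::mpl::bool_<false> {};
//
// (2) the enable_if-dispatched save/load overloads for vectors and
//     C-arrays, as shown elsewhere in this thread.

// An archive that wants the fast path specializes the trait in its own
// header and provides the member function the dispatch will call:
class my_binary_oarchive;   // hypothetical archive

template <class Type>
struct has_fast_array_serialization<my_binary_oarchive, Type>
    : boost::is_fundamental<Type> {};

// template<class T>
// void my_binary_oarchive::save_array(T const * address, std::size_t count)
// { save_binary(address, count * sizeof(T)); }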

On Oct 10, 2005, at 6:48 PM, Robert Ramey wrote:
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapsulated in one (or more) separate optional headers. This would be the best of all possible worlds.
It seems we are very close to a consensus. I agree with you that above would be the ideal solution. I believe that I fully understand now where the one or two small issues are that prevent this ideal solution, but as I said I want to explain this in detail when I'm awake. Matthias

On Oct 10, 2005, at 6:48 PM, Robert Ramey wrote:
That sounds very good to me. Maybe I spoke too soon. I don't see how this would require any changes at all to the serialization library. I don't see why has_fast_array_serialization has to be part of the serialization library. Maybe all the code can be included in boost/serialization/fast_array.hpp? This header would be included by all the classes that use it and no others.
[snip]
If has_fast_array_serialization<Archive,Type> is defined in boost/serialization/fast_array.hpp I'm still OK with it.
That's where it is defined in my proposal.
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapsulated in one (or more) separate optional headers. This would be the best of all possible worlds.
There are actually only very few modifications:

1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast array serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header, boost/serialization/array.hpp, as is done for all C++ classes.

2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?

3. I had to introduce a new strong typedef in basic_archive.hpp:

BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)

I remember that you suggested in the past that this should be done anyway. One reason is that using an unsigned int for the size of a container, as you do now, will not work on platforms with 32-bit int and 64-bit std::size_t: the size of a container can be more than 2^32. I don't always want to serialize std::size_t as the integer chosen by the specific implementation either, since that would again not be portable. By introducing a strong typedef, the archive implementation can decide how to serialize the size of a container.

The further modifications to the library in

boost/serialization/collections_load_imp.hpp
boost/serialization/collections_save_imp.hpp
boost/serialization/vector.hpp

were to change the collection serialization to use the container_size_type. I don't think that you will object to this.

There is actually another hidden reason for this strong typedef: efficient MPI serialization without the need to copy into a buffer requires that I can distinguish between special types used to describe the data structure (class id, object id, pointers, container sizes, ...) and plain data members.

Next I have done a few changes to archive implementations, the only important one of which is:

4. boost/archive/basic_binary_[io]archive.hpp serialize container_size_type as an unsigned int, as done till now. It might be better to bump the file version and serialize them as std::size_t.

All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymorphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just add the save_array/load_array features to the existing archives.

Of all the points above, I believe that you will not have anything against points 3 and 4, since you proposed something similar already in the past if I remember correctly. Issue 2 should also be noncontroversial, and the main discussion should thus be on issue 1: how one can improve the design of the [io]serializers to move the implementation of array serialization into a separate header.

Matthias
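A minimal sketch of why the strong typedef matters (the archive class below is illustrative only): container_size_type is a distinct type from std::size_t, so an archive can overload on exactly this type and choose its own on-disk representation.

BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)

class legacy_binary_oarchive /* : ... */ {
public:
    // ordinary values go through the generic path ...
    template<class T> void save(const T & t);

    // ... but container sizes are intercepted and written as the old
    // 32-bit unsigned int, preserving the existing file format
    void save(const container_size_type & t){
        const unsigned int x = static_cast<unsigned int>(t);
        save(x);
    }
};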

Just to chime in, the modifications I came up with for size_t are very similar. One must be able to treat size_t separately; #ifdefs and overloads of save() in the derived archive type won't cut it. I gather that this isn't big news.

These portability testing mods might come in handy if you're going to make these modifications. Robert, do you want this stuff? I'm worried that the integration could start to become a hassle. It's a lot of trivial changes to test modules, Jamfile tweaks, a modified tmpnam() and a modified remove(), the use of boost::random instead of std::rand() so that A is the same on all architectures, reseeding of the rngs in certain places, and the ubiquitous class "A" is either a portable one or a nonportable one depending on what archive you're testing. If you don't specify the "I want to test portability" flag, the tests run as they do now, except that they're unit tests, not test_main() tests.

The portability bit comes in when you specify --serialization-testdata-dir=/some/where. I've changed tmpnam() to a macro TESTFILE("unit_test_name") which returns /some/where/platform/version/compiler/archivetype.unit_test_name (for instance /path/to/Mac_OS/103300/GNU_C__version_4.0.0/portable_binary_archive.hpp.variant_A), and remove(), now finish("unit_test_name"), is a no-op when testing portability. This allows you to afterwards run a little utility to walk the filesystem at /some/where and compare the checksums of the corresponding archivetype.unittestnames. So one just points /some/where to a network disk, or writes a little script to do a remote copy, and then runs the comparison.

My hunch is that a checksum won't be a good comparison for xml and text archives due to variances in the underlying implementations of << for primitive types (I've only been hammering on a portable binary archive), but one could easily use a whitespace-ignorant "diff" or something. It isn't ideal, as whitespace differences could still conceivably trip things up, but fixing that would require extensive modifications to every unit test, and I wasn't going to do them if there was a reasonable chance the changes wouldn't be used.

Another problem is that it isn't easy to plug in your own archive type. One must add files to libs/serialization/test and hack around with the jamfiles. Needs a better interface.

Matthias, I'm curious as to what your testing strategy has been, how automated it is, and if you see such a scheme as being useful...

-t

On Tue, Oct 11, 2005 at 09:57:04AM +0200, Matthias Troyer wrote:
There are actually only very few modifications:
1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast array serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header, boost/serialization/array.hpp, as is done for all C++ classes.

2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
I remember that you suggested in the past that this should be done anyway. One reason is that using an unsigned int for the size of a container, as you do now, will not work on platforms with 32-bit int and 64-bit std::size_t: the size of a container can be more than 2^32. I don't always want to serialize std::size_t as the integer chosen by the specific implementation either, since that would again not be portable. By introducing a strong typedef, the archive implementation can decide how to serialize the size of a container.
The further modifications to the library in
boost/serialization/collections_load_imp.hpp boost/serialization/collections_save_imp.hpp boost/serialization/vector.hpp
were to change the collection serialization to use the container_size_type.
I don't think that you will object to this.
There is actually another hidden reason for this strong typedef: Efficient MPI serialization without need to copy into a buffer requires that I can distinguish between special types used to describe the data structure (class id, object id, pointers, container sizes, ...) and plain data members.
Next I have done a few changes to archive implementations, the only important one of which is:
4. boost/archive/basic_binary_[io]archive.hpp serialize container_size_type as an unsigned int as done till now. It might be better to bump the file version and serialize them as std::size_t.
All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymorphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just add the save_array/load_array features to the existing archives.
Of all the points above, I believe that you will not have anything against points 3 and 4 since you proposed something similar already in the past if I remember correctly.
Issue 2 should also be noncontroversial, and the main discussion should thus be on issue 1: how one can improve the design of the [io]serializers to move the implementation of array serialization into a separate header.
Matthias

troy d. straszheim wrote:
Just to chime in, the modifications I came up with for size_t are very similar. One must be able to treat size_t separately; #ifdefs and overloads of save() in the derived archive type won't cut it. I gather that this isn't big news.
These portability testing mods might come in handy if you're going to make these modifications. Robert, do you want this stuff? I'm worried that the integration could start to become a hassle. It's a lot of trivial changes to test modules, Jamfile tweaks, a modified tmpnam() and a modified remove(), the use of boost::random instead of std::rand() so that A is the same on all architectures, reseeding of the rngs in certain places, and the ubiquitous class "A" is either a portable one or a nonportable one depending on what archive you're testing. If you don't specify the "I want to test portability" flag, the tests run as they do now, except that they're unit tests, not test_main() tests.
I'm very interested in this. I would like to see all the improvements you've made incorporated into the tests - it's just that I'm bogged down on other stuff. I also understand you've made headway in moving to the unit test platform, which I also want to see. It sounds like you've addressed a lot of the pending issues regarding the tests and I want to see all this incorporated. Don't get discouraged. Of course we've got the problem that our consumption of testing resources has hit its limit. It seems to me that we're going to end up with a significantly enhanced Jamfile, or better yet several Jamfiles. Of course this takes time, but I would much like to see this done. When I get the current set of monkeys off my back this can be looked at.

To summarize, I would like to see:

a) tests converted to unit_tests
b) tests improved to eliminate security issues - this really requires making another new animal - a temporary C++ file stream - which might or might not currently exist somewhere.
c) investigation of stringstream. It doesn't seem that stringstream handles the codecvt facet. I don't know why this is, but that's one reason I depend on temporary files as much as I do. So that's an interesting issue to investigate.
d) a couple of different Jamfiles for selecting different kinds of testing:
   i) core library
   ii) add-ons - e.g. stl serializations, shared_ptr, etc.
   iii) portability - your name here
   iv) performance - only makes sense in release mode anyway, so it's handy for it to be separate.
   v) plug-in. I haven't been able to figure out how to make bjam do what I want here. But I haven't spent much time on it lately.

So, don't get discouraged because I haven't really had time to get to know your stuff better. Help me out with this and we'll get your name up there "in lights". Robert Ramey
The portability bit comes in when you specify --serialization-testdata-dir=/some/where. I've changed tmpnam() to a macro TESTFILE("unit_test_name") which returns /some/where/platform/version/compiler/archivetype.unit_test_name (for instance /path/to/Mac_OS/103300/GNU_C__version_4.0.0/portable_binary_archive.hpp.variant_A), and remove(), now finish("unit_test_name"), is a no-op when testing portability. This allows you to afterwards run a little utility to walk the filesystem at /some/where and compare the checksums of the corresponding archivetype.unittestnames. So one just points /some/where to a network disk, or writes a little script to do a remote copy, and then runs the comparison. My hunch is that a checksum won't be a good comparison for xml and text archives due to variances in the underlying implementations of << for primitive types (I've only been hammering on a portable binary archive), but one could easily use a whitespace-ignorant "diff" or something. It isn't ideal, as whitespace differences could still conceivably trip things up, but fixing that would require extensive modifications to every unit test, and I wasn't going to do them if there was a reasonable chance the changes wouldn't be used.
Keep this going - but let me have a chance to study it so we're on the same page. I want this stuff in there.
Another problem is that it isn't easy to plug in your own archive type. One must add files to libs/serialization/test and hack around with the jamfiles. Needs a better interface.
How about run_archive_test.sh? Doesn't that do it? Robert Ramey

Matthias Troyer wrote:
On Oct 10, 2005, at 6:48 PM, Robert Ramey wrote:
That sounds very good to me. Maybe I spoke too soon. I don't see how this would require any changes at all to the serialization library. I don't see why has_fast_array_serialization has to be part of the serialization library. Maybe all the code can be included in boost/serialization/fast_array.hpp? This header would be included by all the classes that use it and no others.
[snip]
If has_fast_array_serialization<Archive,Type> is defined in boost/serialization/fast_array.hpp I'm still OK with it.
That's where it is defined in my proposal.
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapsulated in one (or more) separate optional headers. This would be the best of all possible worlds.
There are actually only very few modifications:
1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast array serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header, boost/serialization/array.hpp, as is done for all C++ classes.
My intention was to include all types "built-in" to the C++ language in ?serializer.hpp, so that's why it includes C++ arrays. Separating this into another header would break this "concept" and would result in a small additional *.hpp file that would have to be included explicitly. So I would be against it in this case. On the other hand, I do appreciate ideas that remove pieces from the "core" of the library and turn them into more independent modules, so don't get discouraged here.
2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
That would mean that users are including save/load_array even if they don't want them or want to use their own versions. Oh - then the documentation has to be enhanced to explain all this internal behavior. I would prefer something like the following:

class my_class {
    std::vector<int> m_vi;
    ...
};

template<class Archive>
void my_class::serialize(Archive &ar, const unsigned int version){
    // standard way
    ar & m_vi;
    // or fast way which defaults to standard way in appropriate cases.
    save_array(ar, m_vi);
}

This
a) keeps the stl portion of the library smaller
b) leaves the user in control of what's going on
c) permits development of save/load array to be on an independent parallel track with everything else.

If we eventually discover that everyone is always using save/load array, then we can study whether we want to just enhance std::vector, etc. to include it as default functionality.
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
...
I don't think that you will object to this.
... I looked at this and right away I noticed that, as written, this would make all existing binary archives unreadable. The real fix for this is:

a) bump up the library version from 3 to 4
b) alter save/load of collections to use unsigned int or size_t depending upon the library version.

It has to be done this way since stl collections are marked as non-versioned. I was sort of reluctant to do this, as it seemed to me that it would be yet a while before anyone starts to save more than 2,000,000,000 objects with the serialization library, and it seemed wasteful for the native binary archive to include an extra 4 bytes containing 0's for every collection serialized. But maybe now (I mean 1.34) is the time to bite the bullet on this. I don't have a strong preference for either changing it or leaving it the same. If you need it, I would say fine - we'll do it.
4. boost/archive/basic_binary_[io]archive.hpp serialize container_size_type as an unsigned int as done till now. It might be better to bump the file version and serialize them as std::size_t.
agreed - no point in using size_t in the collection code just to throw it away.
All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymorphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just add the save_array/load_array features to the existing archives.
sounds like you can go along with me on this.
Of all the points above, I believe that you will not have anything against points 3 and 4 since you proposed something similar already in the past if I remember correctly.
I don't see a problem here.
Issue 2 should also be noncontroversial, and the main discussion should thus be on issue 1: how one can improve the design of the [io]serializers to move the implementation of array serialization into a separate header.
looks like we'll have to arm wrestle a little more here. In a way you've got me over a barrel. I continually advocate keeping the "core" of the library small by factoring out everything that can be factored out. Now you've got me on C++ arrays. Sort of. I'll re-iterate my view that the "core" library addresses serialization of all "built-in" language types - and no others.

The other part of your proposal really is an enhancement of the stl serializations, which are not part of the "core" part of the library. I'm confident that your enhancement will be quite useful to many people and I'm very happy to see that you've done it. That's not the same as forcing it on all users of the library. So I prefer to see your enhancement as an option explicitly invoked by the user. This has a number of advantages:

a) the user can see what he is doing. No hidden complex behavior.
b) some users might want just minimal code since their collections are small.
c) your enhancement will require significant documentation. This will be much easier if it is an optional add-on to the serialization library.
d) parallel or layered development is facilitated.

So to summarize the issues:

a) should C++ array serialization be in a separate header? I say no, you say yes.
b) should save/load array be incorporated into stl collection serialization to make its usage obligatory? I say no, you say yes.

Regardless of how these questions are answered, it's clear to me that your enhancements to stl containers will be available to users. Assuming that this is packaged as an optional add-on, as I would hope, the only questions remaining will be:

a) Should this be a totally separate library with its own documentation/tests/directory tree etc.? It should be separate but not totally so. Maybe a file or maybe a directory within serialization for save/load array, and within archive for the fast...archive adaptor. A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this. A separate documentation section in the documentation of the serialization library. Similar to the miscellanea. But the miscellanea holds things that are really separate, so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter). Note that your fast...archive will really be a new creature - an "archive adaptor". This merits a new section in the documentation in any case. This is a cool and useful technique and will encourage future great ideas which enhance the serialization library without making the core bigger.

b) Should such an optional enhancement be subject to some sort of review? I'm agnostic on this. I would be happy to just be the gatekeeper and accept it or make you alter it to my taste. Of course, there's no guarantee that anyone else would be happy with this. I'm willing to go along with whatever the consensus is regarding if and/or how this thing should be reviewed.

===============

A final note to others wishing to participate in the serialization library: welcome and good luck. Having said that, I will give a little advice on how to be most successful in getting my cooperation (encouragement you get for free).

a) Notice that I make huge efforts to keep the "core" library from growing. Anything that adds to it makes my job bigger - I have to make this job smaller.
b) I am out of the business of writing serializations for specific classes.
It's up to those who need it to get it done. Of course I'm always full of advice - sorry about that.
c) I am out of the business of writing new archive types. Instead, I want to improve the documentation to make this easier.
d) My efforts will be focused on implementation aspects of the "core" library. This includes:
   i) pending issues regarding dynamic loading/unloading of code that contains serialization for specific types. This includes testing of support for plug-ins.
   ii) making a good performance test and correcting any performance bottlenecks in the core library.
   iii) making testing more efficient.
   iv) fixing bugs.

I am getting a little "burned out" on the serialization library. It's only my obsessive nature that makes me continue to the bitter end, which I'm hoping I am approaching - at least asymptotically. I am extremely gratified by your efforts and those of others to enhance the library, and will do everything (subject to the above reservations) I can to encourage and support them. Nothing would make me happier than to see people spinning off their own improved serializations and archive classes and making the package the default choice for C++ object serialization. Robert Ramey

Hi Robert, Let me just answer a simple point tonight. You'll get my reply to the rest of the mail in the morning. On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
I looked at this and right away I noticed that, as written this would make all existing binary archives unreadable.
Have you looked at the following code I proposed to add for the binary_iarchives?

void load_override(container_size_type & t, int){
    // up to 2G objects
    unsigned int x;
    * this->This() >> x;
    t = container_size_type(x);
}

That, and the similar code for the output archives, should do the trick and keep backward compatibility, or am I mistaken? However, I fully agree (and actually proposed so myself in the original mail) to do:
The real fix for this is:
a) bump up the library version from 3 to 4
b) alter save/load of collections to use unsigned int or size_t depending upon the library version.
but I wanted to leave this to you, since this would be a change to the binary file format, which I believe should be done by you. More tomorrow when I'm awake again, Matthias

Whoops, I didn't see that. It would handle the backward compatibility issue. In any case I'd rather just get it over with and move to 64 bits on those platforms. We'll just implement the complete fix in the next version. Robert Ramey Matthias Troyer wrote:
Hi Robert,
Let me just answer a simple point tonight. You'll get my reply to the rest of the mail in the morning.
On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
I looked at this and right away I noticed that, as written this would make all existing binary archives unreadable.
Have you looked at the following code I proposed to add for the binary_iarchives:
void load_override(container_size_type & t, int){
    // up to 2G objects
    unsigned int x;
    * this->This() >> x;
    t = container_size_type(x);
}
That, and the similar code for the output archives, should do the trick and keep backward compatibility, or am I mistaken?
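A sketch of that output-side counterpart, assuming the same This() idiom as on the input side (this code is illustrative, not quoted from the patch): the 64-bit size is narrowed to the unsigned int the old format expects.

void save_override(const container_size_type & t, int){
    // up to 2G objects, mirroring the load side
    const unsigned int x = static_cast<unsigned int>(t);
    * this->This() << x;
}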
However, I fully agree (and actually proposed so myself in the original mail) to do:
The real fix for this is:
a) bump up the library version from 3 to 4
b) alter save/load of collections to use unsigned int or size_t depending upon the library version.
but I wanted to leave this to you, since this would be a change to the binary file format, which I believe should be done by you.
More tomorrow when I'm awake again,
Matthias

On Oct 11, 2005, at 11:48 PM, Robert Ramey wrote:
Matthias Troyer wrote:
I looked at this and right away I noticed that, as written this would make all existing binary archives unreadable.
Have you looked at the following code I proposed to add for the binary_iarchives:
[snip]
but I wanted to leave this to you, since this would be a change to the binary file format, which I believe should be done by you.
Whoops, I didn't see that. It would handle the backward compatibility issue. In any case I'd rather just get it over with and move to 64 bits on those platforms. We'll just implement the complete fix in the next version.
Great! Matthias

On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
Matthias Troyer wrote:
There are actually only very few modifications:
1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast array serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header, boost/serialization/array.hpp, as is done for all C++ classes.
My intention was to include all types "built-in" to the C++ language in ?serializer.hpp, so that's why it includes C++ arrays. Separating this into another header would break this "concept" and would result in a small additional *.hpp file that would have to be included explicitly. So I would be against it in this case. On the other hand, I do appreciate ideas that remove pieces from the "core" of the library and turn them into more independent modules, so don't get discouraged here.
Actually, the way I see it, you deal with the serialization of pointers, the class versioning, and object tracking there. All of these get serialized through special types, created by strong typedefs, and the archives can then override the serialization of these classes. Please correct me if I'm wrong, but the built-in types like int, double, etc. are actually all dispatched to the archive, and not serialized by the [io]serializer. The only exceptions seem to be pointers (which should be handled by the serialization library) and C-arrays. It would thus make sense to put the serialization of arrays into a separate header, just as you have done for std::vector, and as I will do soon for other classes.

However, there are also other options, as you pointed out: the archive classes could override the serialization of arrays. As long as this stays limited to arrays there will be no MxN scaling problem, but there is still a problem of code duplication, since each archive type implementing fast array serialization has to override the serialization of arrays. This is also error-prone, since we have to tell the implementors of archives supporting fast array serialization that they should not forget to override the serialization of built-in arrays.
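For illustration, a sketch of what such a separate boost/serialization/array.hpp could contain for built-in arrays, mirroring the vector treatment (output side shown; an input overload would call load_array analogously; the dispatch details are assumptions, not the actual patch):

template<class Archive, class T, std::size_t N>
void save(Archive & ar, const T (&a)[N], const unsigned int /* version */,
          typename boost::enable_if<
              boost::archive::has_fast_array_serialization<Archive,T>
          >::type * = 0)
{
    ar.save_array(&a[0], N);   // one call instead of N
}

template<class Archive, class T, std::size_t N>
void save(Archive & ar, const T (&a)[N], const unsigned int /* version */,
          typename boost::disable_if<
              boost::archive::has_fast_array_serialization<Archive,T>
          >::type * = 0)
{
    for (std::size_t i = 0; i != N; ++i)
        ar << a[i];            // default element-wise path
}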
2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
That would mean that users are including save/load_array even if they don't want them or want to use their own versions. Oh - then the documentation has to be enhanced to explain all this internal behavior.
Actually the cost is minimal if the archive does not support fast save/load_array. The has_fast_array_serialization.hpp header only consists of the default traits:

template <class Archive, class Type>
struct has_fast_array_serialization : public mpl::bool_<false> {};

and the serialization of a std::vector only contains this minimal extension:

template<class Archive, class U, class Allocator>
inline void save(
    Archive & ar,
    const std::vector<U, Allocator> &t,
    const unsigned int /* file_version */,
    typename boost::enable_if<
        boost::archive::has_fast_array_serialization<Archive,U> >::type* =0
){
    const boost::archive::container_size_type count(t.size());
    ar << BOOST_SERIALIZATION_NVP(count);
    if (count)
        ar.save_array(boost::detail::get_data(t), t.size());
}

The cost of parsing these few lines is negligible compared to the rest of the serialization library.
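For completeness, a sketch of the complementary overload chosen when the trait is false, so non-supporting archives keep the existing element-wise behaviour (boost::disable_if is the negation of the enable_if above; nvp wrapping of the elements is omitted for brevity):

template<class Archive, class U, class Allocator>
inline void save(
    Archive & ar,
    const std::vector<U, Allocator> &t,
    const unsigned int /* file_version */,
    typename boost::disable_if<
        boost::archive::has_fast_array_serialization<Archive,U> >::type* =0
){
    const boost::archive::container_size_type count(t.size());
    ar << BOOST_SERIALIZATION_NVP(count);
    for (std::size_t i = 0; i != t.size(); ++i)
        ar << t[i];   // the slow path: one archive call per element
}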
I would prefer something like the following:
class my_class {
    std::vector<int> m_vi;
    ...
};

template<class Archive>
void my_class::serialize(Archive &ar, const unsigned int version){
    // standard way
    ar & m_vi;
    // or fast way which defaults to standard way in appropriate cases.
    save_array(ar, m_vi);
}
This
a) keeps the stl portion of the library smaller
b) leaves the user in control of what's going on
c) permits development of save/load array to be on an independent parallel track with everything else.
I find this proposal unacceptable for the following reasons:

- it breaks the orthogonality between serialization and archives
- how the array representation of the vector gets serialized should be a concern of the archive and not of the user
- the user has to remember to always call save_array(ar, m_vi) instead of just serializing the vector directly. This is quite error-prone and will easily lead to sub-optimal code.
- the user has to know for which classes to call save_array and for which it is not needed. For vector it might be intuitive, but what about ublas matrix types: do you know which ublas matrix types can use fast array serialization and which ones cannot? Or, even worse, if the matrix type is a template parameter you have a worse problem.

Now to address your issues:

a) keeping the STL portion small: I don't see this as a valid point since, as you can see above, it increases the size of the STL serialization code by only a few lines.
b) "leave the user in control of what's going on": actually this is what breaks the orthogonality. The user should not influence the archive's internal behavior. The archive class should decide how to serialize the array, not the user. The user can pick between fast and slow array serialization by choosing a different archive class.
c) development on an independent track: the only interference we have is this one file, vector.hpp.
All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymporphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just addd the save_array/load_array features to the existing archives.
sounds like you can go along with me on this.
I'll do that, but I still think that it is the wrong way to go for the binary archives. The only difference are these five lines in basic_binary_iprimitive.hpp:

template<class T>
void load_array(T *address, std::size_t count)
{
    load_binary(address, count*sizeof(T));
}

but they give you a factor of 10 in performance! Why should anyone still use the other version? To save the compile time for 5 lines of code?
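The matching output side in basic_binary_oprimitive.hpp would be just as small (a sketch, mirroring the load_array quoted above):

template<class T>
void save_array(T const * address, std::size_t count)
{
    save_binary(address, count*sizeof(T));
}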
The other part of your proposal really is an enhancement of the stl serializations which are not part of the "core" part of the library. I'm confident that your enhancement will be quite useful to many people and I'm very happy to see that you've done it. That's not the same as forcing it on all users of the library. So I prefer to see your enhancement as an option explicitly invoked by the user. This has a number of advantages:
a) the user can see what he is doing. No hidden complex behavior.
The fast array serialization is an implementation detail of the archive, and need not be visible to the user at all. Note that none of the archive modifications I suggested change anything that is visible to the user, aside from a speedup in execution time. The syntax and semantics of the serialization are unchanged, as is the archive itself. Only the process of serialization is faster.
b) some users might want just minimal code since their collections are small.
we're talking about a few lines of code that can improve performance by a factor of 10 in a library that is many thousands of lines. Is that really an issue?
c) your enhancement will require significant documentation. This will be much easier if it is an optional add-on to the serialization library.
It requires one or two pages of documentation for archive developers, and none for users unless they implement serialization of array-like containers.
So to summarize the issues
a) should C++ array serialization be in a separate header? I say no, you say yes.
b) should save/load array be incorporated into stl collection serialization to make its usage obligatory? I say no, you say yes.
This point b) is where I will not budge, for the reasons explained in earlier e-mails. While I could maybe live with the fact that I have to override the C-style array serialization in all archives supporting fast array serialization, I will never do that for other classes, since this again opens the can of worms discussed previously.

Let me outline it again: if the vector.hpp serialization stays unchanged, I will have to override it in the archive. Next we'll implement the std::valarray serialization. What should we do? Support fast array serialization out of the box, or leave it to the archive implementor to override? We'll probably follow the example of std::vector and not support it. Now the archive also has to provide overrides for std::valarray, which can still be done. After that we'll implement serialization of ublas matrices. Following the above examples we will again not implement support for fast array serialization directly, to save a few lines of code. The consequence is even worse now: the archive implementation has to override the serialization of all ublas matrices, and will either be inefficient, or has to have knowledge of implementation details of the ublas matrices. We would be back at both an MxN problem and a tight coupling between archives and the implementation details of the classes to be serialized. We should avoid this at all cost!

So the real question here is: "Shall we recommend that the serialization of array-like data structures uses fast array serialization, by calling save/load_array when possible?" My clear answer is yes, and I will not budge on that. The serialization library is useless to me with a 10x performance hit. And many people I talked to do not use Boost.Serialization, but their own (otherwise inferior) solutions, for that reason. I just want to mention that vectors with billions of elements are typical sizes for many of our problems.

The real question is where to draw the line between using fast array serialization and not using it?

- I think we can agree that classes like multi_array or ublas matrices and vectors should be recommended to use it wherever possible.
- The same should be true for std::valarray.
- To stay consistent we should then also use it for std::vector.
- What about C-arrays? Since I rarely actually use them in raw form in my code, and never for large sizes, I have no strong personal preference. It would just be consistent, and would speed up serialization at negligible compile-time cost, to also use the fast option there, but if you veto it I could live with that.
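A sketch of the std::valarray case argued above, following the same dispatch pattern as the vector code (a hedged illustration, not the eventual implementation; the const_cast is needed because the const operator[] of std::valarray returns by value):

template<class Archive, class T>
void save(Archive & ar, const std::valarray<T> & t,
          const unsigned int /* version */,
          typename boost::enable_if<
              boost::archive::has_fast_array_serialization<Archive,T>
          >::type * = 0)
{
    const boost::archive::container_size_type count(t.size());
    ar << BOOST_SERIALIZATION_NVP(count);
    if (count)
        ar.save_array(&const_cast<std::valarray<T>&>(t)[0], t.size());
}

With the trait in place, the container's own serialization picks the fast path, and no archive ever needs to know valarray internals.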
Regardless of how these questions are answered, its clear to me that your enhancements to stl containers will be available to users. Assuming that this is packaged as an optional add-on as I would hope, the only questions remaining will be:
a) Should this be a totally separate library with its own documentation/tests/directory tree etc? It should be separate but not totally so:
Maybe a file or maybe a directory within serialization for save/load array, and within archive for the fast...archive adaptor.
A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this.
A separate documentation section in the documentation of the serialization library. Similar to the miscellanea. But the miscellanea holds things that are really separate, so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter).
Note that your fast...archive will really be a new creature - an "archive adaptor". This merits a new section in the documentation in any case. This is a cool and useful technique and will encourage future great ideas which enhance the serialization library without making the core bigger.
Actually it might be an adaptor only in the case of the binary archive. Other archives I mentioned (such as the MPI archive) will have to support fast array serialization directly to have any chance of being usable.
b) Should such an optional enhancement be subject to some sort of review? I'm agnostic on this. I would be happy to just be the gatekeeper and accept it or make you alter it to my taste.
I have no strong opinion on this. For now I need a mechanism for fast array serialization in place to be able to implement serialization of large arrays and matrices. Optimized archives could come later and could go through some kind of review, e.g. an MPI archive together with a possible future parallel programming library. Regarding the binary archives: if your concern is that it will make it harder for you to maintain, then I could, if you want, propose to submit the fast array version as a replacement for the existing one and take over its maintenance. That will make your task smaller and in the review of the submission we can hear if someone wants to keep the existing version. Matthias

On Wed, Oct 12, 2005 at 02:17:06PM +0200, Matthias Troyer wrote:
On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
I would prefer something like the following:
class my_class {
    std::vector<int> m_vi;
    ...
};

template<class Archive>
void my_class::serialize(Archive &ar, const unsigned int version){
    // standard way
    ar & m_vi;
    // or fast way which defaults to standard way in appropriate cases.
    save_array(ar, m_vi);
}
This
a) keeps the stl portion of the library smaller
b) leaves the user in control of what's going on
c) permits development of save/load array to be on an independent parallel track with everything else.
I find this proposal unacceptable for the following reasons:
- it breaks the orthogonality between serialization and archives
- how the array representation of the vector gets serialized should be a concern of the archive and not the user
- the user has to remember to always call save_array(ar,m_vi) instead of just serializing the vector directly. This is quite error-prone and will easily lead to sub-optimal code.
- the user has to know for which classes to call save_array and for which it will not be needed. For vector it might be intuitive, but what about ublas matrix types: do you know which ublas matrix types can use fast array serialization and which ones cannot? Or, even worse, if the matrix type is a template parameter you have a worse problem.
I'm not real fond of adding save_array() to the archive's interface either. When you first see the examples for the serialization library:

struct Something {
    double x, y, z;
    doesnt_matter_what_it_is_t d;

    template <class Archive>
    void serialize(Archive & ar, unsigned version) {
        ar & x;
        ar & y;
        ar & z;
        ar & d;
    }
};

you think: Wow. That's cool. It's so clean. And you can pass *anything* to the archive? It tracks the pointers and everything? Wow. When you later get to the nvp() stuff, the base_object<> and export macros, you react "ah, well, it can't all be magic. It's still supercool", and despite these base_object<>-type caveats you can still teach a monkey to put serialization routines into his classes (for me this is essential). To the monkey it makes sense that you have to explain things like your base classes to the serialization library. It won't make sense that you can pass the archive an int, a map, or a pointer to a variant, but for arrays you have to do something special.

If you forget an nvp() or a base_object(), your data isn't serialized correctly, or the code won't compile. The problems are easy to locate, as they appear early. save_array() wouldn't be like that. Things will serialize correctly but slowly, and then you have to go digging. Most importantly,

template <class Archive>
void serialize(Archive & ar, unsigned version) {
    ar & make_nvp("x", x);
    ar & make_nvp("y", y);
    ar & make_nvp("z", z);
    save_array(ar, make_nvp("some_array", some_array));
}

is just ugly. Sorry, but it is. It's a big wart on an otherwise extremely finely crafted interface. (I think the operator&() is elegant, for the record.)
A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this.
I'll volunteer (well, I already am) to help with testing. I'll help out with maintenance as well (I'm all gcc/linux/mac, no overlap with your testing). Whatever it takes to not have to save_array(). :) I'll also provide tests that verify that these changes are backwards compatible.
A separate documenation section in the documenation of the serialization library. Similar to the miscelleneas. But miscellaneas holds things that are really separate so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter).
I'll volunteer to help with docs as well, though hopefully the "special considerations for collections" would be focused on archive authors. I think this would be a useful exercise, after all this has gone through and I've delivered some kind of portable binary archive to my client. -t

troy d. straszheim wrote:
struct Something {
    double x, y, z;
    doesnt_matter_what_it_is_t d;

    template <class Archive>
    void serialize(Archive & ar, unsigned version) {
        ar & x;
        ar & y;
        ar & z;
        ar & d;
    }
};
You think: Wow. That's cool. It's so clean. And you can pass *anything* to the archive? It tracks the pointers and everything? Wow. When you later get to the nvp() stuff, the base_object<> and export macros, you react "ah, well, it can't all be magic. It's still supercool", and despite these base_object<>-type caveats you can still teach a monkey to put serialization routines into his classes (for me this is essential). To the monkey it makes sense that you have to explain things like your base classes to the serialization library. It won't make sense that you can pass the archive an int, a map, or a pointer to a variant, but for arrays you have to do something special.
If you forget an nvp() or a base_object(), your data isn't serialized correctly, or the code won't compile. The problems are easy to locate as the problems appear early. save_array() wouldn't be like that. Things will serialize correctly but slowly, and then you have to go digging.
Most importantly,
    template <class Archive>
    void serialize(Archive & ar, unsigned version)
    {
        ar & make_nvp("x", x);
        ar & make_nvp("y", y);
        ar & make_nvp("z", z);
        save_array(ar, make_nvp("some_array", some_array));
    }
is just ugly. Sorry, but it is. It's a big wart on an otherwise extremely finely crafted interface. (I think the operator&() is elegant, for the record.)
This is a very convincing argument. That is - I'm convinced. I very much liked the monkey analogy. Not to say programmers are monkeys. But serialization is something I'm using so I can get on with the true topic at hand, so it's important to me that it "just works" without using up my precious brain stack space. Now take a look at my first idea - a fast archive adaptor which would overload serialization of STL vector and C array. Ideally, application of the wrapper to inappropriate adaptees would result in a compile-time assertion, so as to preserve the monkey-proof aspect of the library. Damn, now I've forgotten what the objections were to it. I'll have to go back and check.
A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this.
I'll volunteer (well, I already am) to help with testing. I'll help out with maintenance as well (I'm all gcc/linux/mac, no overlap with your testing). I'll also provide tests that verify that these changes are backwards compatible.
We will get to that. I'm interested in incorporating your improved testing. But I do have one concern. I test with Windows platforms including Borland and MSVC. These can be quite different from just testing with gcc and can suck up a lot of time. It may not be a big issue here, but it means you'll have to be aware not to do anything toooo tricky. Since you're interested in this I would suggest making a few new directories in your personal boost/libs/serialization tree. I see each of these directories having its own Jamfile, so we could just invoke runtest from any of the test suites by changing to the desired directory.
a) old_test - change the current test directory to this
b) test - the current test with your changes to use the unit_test library. You might send me the source to one of your changed tests to see if I want to comment on it before too much effort is invested.
c) test_compatibility - includes your backward-compatibility tests
d) test_performance - I want to include a few tests to measure times for things like serializing different primitives, opening/closing archives, etc. This would be similar to the current setup, so I could generate a table which shows which combinations of features and archives are bottlenecks. It's the hope that this would help detect really dumb oversights like recreating an XML character translation table for each XML character serialized!
A separate documentation section in the documentation of the serialization library. Similar to the miscellanea. But the miscellanea holds things that are really separate, so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter).
I'll volunteer to help with docs, as well, though hopefully the "special considerations for collections" would be focused on archive authors. I think this would be a useful exercise, after all this has gone through and I've delivered some kind of portable binary archive to my client.
It's a tiny bit premature - Archive Implementation needs at least another pass. But I would envisage either one or two new sections:
a) Archive adaptors. This is a class that can be applied to any existing archive in order to modify some aspects of its behavior by hiding the base class functions with an overloaded implementation. Refers to the fast array archive as an example.
b) Fast array archive adaptor - description of how to use it.
Just my thoughts. Robert Ramey

Robert Ramey wrote:
troy d. straszheim wrote:
    struct Something
    {
        double x, y, z;
        doesnt_matter_what_it_is_t d;

        template <class Archive>
        void serialize(Archive & ar, unsigned version)
        {
            ar & x;
            ar & y;
            ar & z;
            ar & d;
        }
    };
You think Wow. That's cool. It's so clean. And you can pass *anything* to the archive? It tracks the pointers and everything? Wow. When you later get on to nvp() stuff, the base_object<> and export macros, you react "ah, well, it can't all be magic. It's still supercool", and despite these base_object<>-type caveats you can still teach a monkey to put serialization routines into his classes (for me this is essential). To the monkey it makes sense that you have to explain things like your base classes to the serialization library. It won't make sense that you can pass the archive an int, map, or a pointer to a variant, but for arrays you have to do something special.
If you forget an nvp() or a base_object(), your data isn't serialized correctly, or the code won't compile. The problems are easy to locate as the problems appear early. save_array() wouldn't be like that. Things will serialize correctly but slowly, and then you have to go digging.
Most importantly,
    template <class Archive>
    void serialize(Archive & ar, unsigned version)
    {
        ar & make_nvp("x", x);
        ar & make_nvp("y", y);
        ar & make_nvp("z", z);
        save_array(ar, make_nvp("some_array", some_array));
    }
is just ugly. Sorry, but it is. It's a big wart on an otherwise extremely finely crafted interface. (I think the operator&() is elegant, for the record.)
This is a very convincing argument. That is - I'm convinced. I very much liked the monkey analogy. Not to say programmers are monkeys. But serialization is something I'm using so I can get on with the true topic at hand, so it's important to me that it "just works" without using up my precious brain stack space.
Hmm. If you want '&' notation, then why not ar & make_array(some_array); or ar & make_named_array("some_array", some_array); ? Anyway, can you show a few examples of serializing arrays WITHOUT save_array? When would it ever be this simple anyway? For a start, unless the array has a known fixed size the save and load functions need to be split, and in the load function the array size needs to be read first and the array constructed. Matthias' proposal was for

    save_array(T const * address, std::size_t length);
    load_array(T * address, std::size_t length);

which are very low-level operations acting on already existing contiguous arrays. That is, they would be used in (for example) the implementation of save() and load() for std::vector. If you are going to make comparisons as to what the resulting code looks like, you should at least compare against code that actually does the same thing as what Matthias proposes. Cheers, Ian
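(To make the make_array idea concrete, here is a minimal sketch of what such a wrapper might look like; the names array_wrapper and make_array, and the explicit length parameter, are illustrative guesses rather than code from this thread:)

    #include <cstddef>

    // A thin wrapper that records where a contiguous block of T
    // lives and how many elements it holds.
    template <class T>
    struct array_wrapper
    {
        T * address;
        std::size_t count;
    };

    template <class T>
    array_wrapper<T> make_array(T * t, std::size_t n)
    {
        array_wrapper<T> w = { t, n };
        return w;
    }

The serialization library (or an archive) could then overload save()/load() for array_wrapper<T> and forward to save_array()/load_array() internally, so the user keeps writing ar & make_array(buffer, n); without ever seeing the low-level calls.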

On Sun, Oct 16, 2005 at 06:34:22PM +0200, Ian McCulloch wrote:
Hmm. If you want '&' notation, then why not
ar & make_array(some_array);
[snip]
If you are going to make comparisons as to what the resulting code looks like, you should at least compare against code that actually does the same thing as what Matthias proposes.
Naturally one should compare apples to apples. Still seems to me that I was. Matthias argued succinctly against putting a save_array() in the interface that archives expose to their clients: Matthias Troyer wrote:
- the user has to remember to always call save_array(ar,m_vi) instead of just serializing the vector directly. This is quite error-prone and will easily lead to sub-optimal code.
(see also neighboring material back in the thread). Matthias' proposal (as I understand it, and which I agree with) involves save/load_array(), but not where the user of an archive would see them. These calls to save/load_array() would be from the serialization library to the archive, not where somebody writing a routine to serialize their class could see them. -t

Matthias Troyer wrote:
On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
Actually, the way I see it, you deal with the serialization of pointers, the class versioning and object tracking there. All of these get serialized through special types, created by strong typedefs, and the archives can then override the serialization of these classes. Please correct me if I'm wrong, but the built-in types like int, double, etc. are actually all dispatched to the archive, and not serialized by the [io]serializer.
correct
The only exceptions seem to be pointers (which should be handled by the serialization library), and C-arrays.
and enums
It would thus make sense to put the serialization of arrays into a separate header, just as you have done for std::vector, and as I will do soon for other classes.
disagree - default serialization of C array is included here as it's a built-in type and it does have a universal default implementation.
However there are also other options as you pointed out: the archive classes could override the serialization of arrays. As long as this stays limited to arrays there will be no MxN scaling problem, but there is still a problem of code duplication, since each archive type implementing fast array serialization has to override the serialization of arrays.
Disagree - all overrides can be placed in an archive adaptor class which takes the adaptee class as a template argument. This is written once, as in my attachment above, and can then be applied to any working archive. Using something like my attachment above:
a) substitute your improved special overrides for vectors and arrays.
b) if possible, add some compile-time assertion that traps the cases where the adaptor is applied to a base class that isn't appropriate.
c) by hand, make mini class declarations equivalent to templated typedefs (since C++ doesn't have them) of all the combinations that you now know will work.
(A sketch of the shape of this idea follows.)
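(The following is a minimal sketch of such an adaptor, under stated assumptions: the names fast_array_oarchive_adaptor and fast_binary_oarchive are hypothetical, the base archive is assumed to be constructed from a std::ostream and to expose save_binary() as the binary archives do, and this is not the code from the attachment:)

    #include <cstddef>
    #include <iosfwd>
    #include <boost/archive/binary_oarchive.hpp>

    template <class Base>
    class fast_array_oarchive_adaptor : public Base
    {
    public:
        fast_array_oarchive_adaptor(std::ostream & os) : Base(os) {}

        using Base::operator&;  // keep the inherited overloads visible

        // (a) fast path: built-in arrays are written in one shot instead
        // of element by element; a fuller version would also cover
        // std::vector and constrain T to fundamental types
        template <class T, std::size_t N>
        fast_array_oarchive_adaptor & operator&(T (& t)[N])
        {
            this->save_binary(t, N * sizeof(T));
            return *this;
        }

        // (b) would go here: a compile-time assertion (BOOST_STATIC_ASSERT
        // on a suitable trait) that Base is an archive for which raw
        // binary output is appropriate
    };

    // (c) a hand-written "templated typedef":
    class fast_binary_oarchive
        : public fast_array_oarchive_adaptor<boost::archive::binary_oarchive>
    {
    public:
        fast_binary_oarchive(std::ostream & os)
            : fast_array_oarchive_adaptor<boost::archive::binary_oarchive>(os) {}
    };

One caveat worth noting: overloads added this way are only found while the static type in play is the adaptor itself, which is part of what the rest of this thread argues about.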
This is also error-prone, since we have to tell the implementors of archives supporting fast array serialization that they should not forget to override the serialization of built-in arrays.
nope, they can do one of the following:
a) use one of your "templated typedef" classes above
b) apply the fast_archive_adaptor to any other archive class they want.
Ideally, it would trap if the application wasn't appropriate, but that may not be worth implementing. It's just boilerplate code for any combination of adaptor/adaptee - you could even make it a macro if you wanted to.
2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
I do. It buries knowledge about the archive in the serialization of a type. What happens when someone comes along with "checked_archive" or "thread_safe_archive"? Are we going to decorate the implementation of all the serializations for each one?
That would mean that users are including save/load_array even if they don't want them or want to use their own versions. Oh - then documentation has to be enhanced to explain all this internal behavior.
Actually the cost is minimal if the archive does not support fast save/load_array. The has_fast_array_serialization.hpp header only consists of the default traits:
One still has to include the header. This violates the Boost principle - don't pay for what you don't use. Actually the Boost principle would be - don't even know about what you don't use - which is better anyway.
    template <class Archive, class Type>
    struct has_fast_array_serialization : public mpl::bool_<false> {};
and the serialization of a std::vector only contains this minimal extension:
    template <class Archive, class U, class Allocator>
    inline void save(
        Archive & ar,
        const std::vector<U, Allocator> & t,
        const unsigned int /* version */,
        typename boost::enable_if<
            boost::archive::has_fast_array_serialization<Archive, U>
        >::type * = 0
    ){
        const boost::archive::container_size_type count(t.size());
        ar << BOOST_SERIALIZATION_NVP(count);
        if (count)
            ar.save_array(boost::detail::get_data(t), t.size());
    }
The cost of parsing these few lines is negligible compared to the rest of the serialization library.
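(For orientation, the other two pieces would look roughly as follows; this is a sketch only - the archive name fast_binary_oarchive is hypothetical and the actual patch may differ in details. An archive opts in by specializing the trait, and the load side mirrors the save() above:)

    // In namespace boost::archive: opt in for a hypothetical archive,
    // for all fundamental types.
    class fast_binary_oarchive;  // forward declaration

    template <class Type>
    struct has_fast_array_serialization<fast_binary_oarchive, Type>
        : public mpl::bool_<boost::is_fundamental<Type>::value> {};

    // Load-side counterpart of the save() shown above (sketch):
    template <class Archive, class U, class Allocator>
    inline void load(
        Archive & ar,
        std::vector<U, Allocator> & t,
        const unsigned int /* version */,
        typename boost::enable_if<
            boost::archive::has_fast_array_serialization<Archive, U>
        >::type * = 0
    ){
        boost::archive::container_size_type count(0);
        ar >> BOOST_SERIALIZATION_NVP(count);
        t.resize(count);
        if (count)
            ar.load_array(boost::detail::get_data(t), t.size());
    }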
All this can easily be moved to the fast_array_archive_adaptor class, with no loss of generality or convenience or efficiency.
I would prefer something like the following:
    class my_class {
        std::vector<int> m_vi;
        ...
    };
    ...
I find this proposal unacceptable for the following reasons ...
OK a bad idea - I take it back.
Now to address your issues:
a) keeping the STL portion small: I don't see this as a valid point since, as you can see above, it increases the size of the STL serialization code only by a few lines.
It could conflict with someone else's extension/specialization. There is no downside to including it in the fast_...archive_adaptor class.
b) "leave the user in control of what's going on": actually this is breaking the orthogonality. The user should not influence the archives internal behavior. The archive class should decide how to serialize the array, not the user. The user can pick between fast and slow array serialization by choosing a different archive class.
Just fine by me. Either he chooses the original plain vanilla one or he chooses one to which your fast_...archive_adaptor has been applied. He can apply it himself or use your premade "templated typedef" classes if he's in a hurry.
c) development on an independent track: the only interference we have is this one file vector.hpp.
again, no downside in factoring your special features into your own special adaptor. Note that using the adaptor approach has another huge benefit. Suppose someone else comes up with another adaptor - a checked_archive adaptor which quadruple-checks the save/load by trapping NaN for floats, and who knows what else. One could then apply either or both adaptors to create a new archive with all the features - all without anyone writing any new code.
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add. Hate to tell you, Matthias, but someone, somewhere isn't going to like your changes for some reason. You can either debate the issue with them or you can factor your improvements so that they are optional. Believe me - the latter is going to save you a lot of time.
b) should save/load_array be incorporated into STL collection serialization to make its usage obligatory? I say no, you say yes.
This point b) is where I will not budge, for the reasons explained in earlier e-mails. While I could maybe live with the fact that I have to override the C-style array serialization in all archives supporting fast array serialization, I will never do that for other classes, since this again opens the can of worms discussed previously. Let me outline it again:
if the vector.hpp serialization stays unchanged, I will have to override it in the archive.
Next we'll implement the std::valarray serialization. What should we do? Support fast array serialization out of the box, or leave it to the archive implementor to override? We'll probably follow the example of std::vector and not support it. Now the archive also has to provide overrides for std::valarray, which can still be done.
After that we'll implement serialization of ublas matrices. Following the above examples we will again not implement support for fast array serialization directly, to save a few lines of code. The consequence is even worse now: the archive implementation has to override the serialization of all ublas matrices, and will either be inefficient or will have to know about implementation details of the ublas matrices.
all these should be in either one fast...adaptor or separate adaptors according to your taste.
We would be back at both an MxN problem, and will have tight coupling between archives and the implementation details of the classes to be serialized. We should avoid this at all cost!
Nope, we have at most one adaptor for each "special" type. The same adaptor applies to all archives (present and future) with which it is compatible.
So the real question here is:
"Shall we recommend that the serialization of array-like data structures uses fast array serialization by calling save/load_array when possible?"
My clear answer is yes, and I will not budge on that. The serialization library is useless to me with a 10x performance hit.
your adaptor will fix that for you.
And many people I talked to do not use Boost.Serialization but their own (otherwise inferior) solutions for that reason. I just want to mention that vectors with billions of elements are typical sizes for many of our problems.
and your adaptor will fix it for them as well.
The real question is where to draw the line between using fast array serialization and not using it?
- I think we can agree that classes like multi_array or ublas matrices and vectors should be recommended to use it wherever possible
the user will want to decide which adaptors to use.
- The same should be true for std::valarray.
yep
- To stay consistent we should then also use it for std::vector.
- What about C-arrays? Since I rarely actually use them in raw form in my code, and never for large sizes, I have no strong personal preference. It would just be consistent, and would speed up serialization at negligible compile-time cost, to also use the fast option there, but if you veto it I could live with it.
you can include it in or exclude it from your adaptor as you wish.
Actually it might be an adaptor only in the case of the binary archive. Other archives I mentioned (such as the MPI archive) will have to support fast array serialization directly to have any chance of being usable.
I would disagree with that. The MPI archive might have it built in, but it could just as well use the adaptor. All the magic happens at compile time - there is no run-time overhead. So the only considerations are design and flexibility.
Regarding the binary archives: if your concern is that it will make it harder for you to maintain, then I could, if you want, propose to submit the fast array version as a replacement for the existing one and take over its maintenance. That will make your task smaller and in the review of the submission we can hear if someone wants to keep the existing version.
In general, I want no more coupling than is absolutely necessary. I don't think it's necessary here. You can get everything you want and more by using an archive adaptor. Robert Ramey

"Robert Ramey" <ramey@rrsd.com> writes:
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add.
IIUC, you currently have some default implementation that's 10x slower than the one Matthias is proposing. Is there any good reason that the fast implementation shouldn't be the default? -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add.
IIUC, you currently have some default implementation that's 10x slower than the one Matthias is proposing. Is there any good reason that the fast implementation shouldn't be the default?
The current implementation is universal. The fast ... archive will only make a difference on those collections whose storage is contiguous. It's not even clear to me that std::vector's storage is guaranteed to be contiguous. It's not clear that it's even applicable to archives other than native binary. Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replace of the general solution with something more intricate and more fragile - at no improvement in performance. Robert Ramey

"Robert Ramey" <ramey@rrsd.com> writes:
David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add.
IIUC, you currently have some default implementation that's 10x slower than the one Matthias is proposing. Is there any good reason that the fast implementation shouldn't be the default?
The current implementation is universal. The fast ... archive will only make a difference on those collections whose storage is contiguous.
I don't care if the fast archive doesn't make things faster in some cases as long as it always works. Does it fail to work for some collections?
It's not even clear to me that std::vector's storage is guaranteed to be contiguous.
It is. http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#69
It's not clear that it's even applicable to archives other than native binary.
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replace of the general solution with something more intricate and more fragile -
I can't parse that. "That's not the same replace of the general solution...?"
at no improvement in performance.
No improvement in performance? Huh? -- Dave Abrahams Boost Consulting www.boost-consulting.com

I can't parse that. "That's not the same replace of the general solution...?"
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replacING the general solution with something more intricate and more fragile -
at no improvement in performance.
No improvement in performance? Huh?
All the options heretofore discussed result in exactly the same run-time performance. Robert Ramey

"Robert Ramey" <ramey@rrsd.com> writes:
I can't parse that. "That's not the same replace of the general solution...?"
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replacING the general solution with something more intricate and more fragile -
at no improvement in performance.
No improvement in performance? Huh?
All the options heretofore discussed result in exactly the same run-time performance.
?? Matthias is reporting a 10x speedup in some (rather important) cases. -- Dave Abrahams Boost Consulting www.boost-consulting.com

On Oct 19, 2005, at 8:29 PM, David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
I can't parse that. "That's not the same replace of the general solution...?"
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replacING the general solution with something more intricate and more fragile -
at no improvement in performance.
No improvement in performance? Huh?
All the options heretofore discussed result in exactly the same run-time performance.
?? Matthias is reporting a 10x speedup in some (rather important) cases.
Robert means that implementing the fast array serialization as a special overload in an archive, instead of in the serialize function, will give identical performance. But there are other problems with that approach. Matthias
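(For readers skimming: the "special overload in an archive" alternative Matthias refers to would look roughly like this sketch. The class and the choice of std::vector<double> are illustrative only; the point is that every archive wanting the fast path has to repeat such an override for every contiguous container type:)

    #include <cstddef>
    #include <iostream>
    #include <vector>

    // The archive special-cases the container itself, so vector.hpp
    // stays untouched, at the cost of per-archive duplication.
    class some_fast_oarchive
    {
        std::ostream & m_os;
    public:
        explicit some_fast_oarchive(std::ostream & os) : m_os(os) {}

        void save_binary(void const * p, std::size_t n)
        {
            m_os.write(static_cast<char const *>(p), n);
        }

        void save(std::vector<double> const & v)
        {
            const std::size_t count = v.size();
            save_binary(&count, sizeof(count));             // write the size
            if (count)
                save_binary(&v[0], count * sizeof(double)); // one shot
        }
    };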

On Oct 19, 2005, at 7:55 AM, Robert Ramey wrote:
David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add.
IIUC, you currently have some default implementation that's 10x slower than the one Matthias is proposing. Is there any good reason that the fast implementation shouldn't be the default?
The current implementation is universal. The fast ... archive will only make a difference on those collections whose storage is contiguous. It's not even clear to me that std::vector's storage is guaranteed to be contiguous.
It is guaranteed to be so by the standard.
It's not clear that it's even applicable to archives other than native binary.
Robert, in one of my previous e-mails I listed FOUR very different types of archives that need this feature, and not only for performance, but also for memory reasons. The MPI archive would, without support for array serialization, in addition need unacceptably large amounts of memory (up to 5 times the storage needed for the array when serializing arrays of int on 64-bit machines).
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replace of the general solution with something more intricate and more fragile - at no improvement in performance.
It's not a special case, since many archive types will directly support serialization of contiguous arrays, and some need it to be feasible at all. Regarding fragility: having to implement serialization of multi_array, ublas arrays, MTL, Blitz ... inside a fast array archive is much more fragile (if it were feasible at all, see my other mail), since changes to small implementation details of those libraries will break the code. There is no such danger with the std::vector serialization and C-style arrays. Matthias

Robert Ramey wrote:
I only took a very quick look at the diff file. I have a couple of questions:
It looks like for certain types (C++ arrays, vector<int>, etc.) we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous.
Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way.
If you check the post I made last week, I did just this for std::vectors of POD types; this went from 9.5 seconds for a 50 000 000 element vector of int to ~0.5 seconds. Very worthwhile speedup for a lot of common use cases. Martin
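(A minimal sketch of the kind of POD-vector specialization Martin describes, assuming a binary archive that exposes save_binary(); his actual code may well differ:)

    #include <cstddef>
    #include <vector>
    #include <boost/archive/binary_oarchive.hpp>

    // Write a vector of a POD type in one shot instead of element by element.
    template <class T>
    void save_pod_vector(boost::archive::binary_oarchive & ar,
                         std::vector<T> const & v)
    {
        const std::size_t count = v.size();
        ar << count;  // element count first, so the reader can resize
        if (count)
            ar.save_binary(&v[0], count * sizeof(T));
    }

Loading mirrors this: read the count, resize the vector, then call load_binary() on &v[0].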

On Oct 10, 2005, at 3:39 AM, Martin Slater wrote:
Robert Ramey wrote:
I only took a very quick look at the diff file. I have a couple of questions:
It looks like for certain types (C++ arrays, vector<int>, etc.) we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous.
Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way.
If you check the post I made last week, I did just this for std::vectors of POD types; this went from 9.5 seconds for a 50 000 000 element vector of int to ~0.5 seconds. Very worthwhile speedup for a lot of common use cases.
Indeed, this 20x speedup fits well with my observations of a 5-100x speedup depending on compiler optimization settings, archive types, etc. Matthias
participants (6)
- David Abrahams
- Ian McCulloch
- Martin Slater
- Matthias Troyer
- Robert Ramey
- troy d. straszheim