
I only took a very quick look at the diff file. I have a couple of questions. It looks like for certain types (C++ arrays, vector<int>, etc.) we want to use binary_save/load to leverage the fact that we can assume, in certain situations, that storage is contiguous. Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also, it would not require changing the serialization library in any way.

Robert Ramey

Matthias Troyer wrote:
Hi Robert,
Over the past week I got around to doing what I wanted to do for a long time, and implemented an improved serialization of contiguous arrays of fundamental types. The motivation was two-fold:
i) to speed up the serialization of large data sets by factors of up to 10
ii) to allow implementation of serialization by MPI
The problem with the serialization of large contiguous arrays of fundamental types (be it a C-array, a std::vector, a std::valarray, boost::multi_array, ...) is that the serialization function is called for each of the (possibly billions of) elements, instead of only once for the whole array. I have attached a suite of three benchmark programs and timings I ran on a PowerbookG4 using Apple's version of the gcc-4 compiler. The benchmarks are
a) vectortime: reading and writing a std::vector<double> with 10^7 elements to/from a file
b) arraytime: reading and writing 1000 arrays double[10000] to/from a file
c) vectortime_memory: reading and writing a std::vector<double> with 10^7 elements to/from a memory buffer
The short summary of the benchmarks is that Boost.Serialization is 5-10 times slower than direct reading or writing!
With the fast array serialization modifications discussed below, this slowdown is removed. Note that the codes were compiled with -O2; without -O2 I have observed another factor of 10 slowdown in some cases.
In order to implement the fast array serialization, I made the following changes to the serialization library:
i) a new traits class
template <class Archive, class Type> has_fast_array_serialization<Archive,Type>;
which specifies whether an Archive has fast array serialization for a Type. The default implementation for this traits class is false, so that no change is needed for existing archives.
ii) output archives supporting fast array serialization for a given Type T provide an additional member function
save_array(T const * address, std::size_t length);
to save a contiguous array of Ts, containing length elements starting at the given address, and a similar function
load_array(T * address, std::size_t length);
for input archives.
iii) serialization of C-arrays and std::vector<T> was changed to use fast array serialization for those archives and types where it is supported. I'm still working on serialization for std::valarray and boost::multi_array using the same features.
iv) in addition, to support an MPI serialization archive (which is essentially done but still being tested), and to improve portability of archives, I introduced a new "strong" type
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
for the serialization of the size of a container. The current implementation uses an unsigned int to store the size, which is problematic on machines with a 32-bit int but a 64-bit size_type. To stay compatible with old archives, the serialization into binary archives converts the size to an unsigned int, but this should be changed to another type, and the file version number bumped up to allow containers with more than 2^32 elements.
The second motivation was MPI serialization, for which I need the size type of containers to be a type distinct from any other integer. The explanation is lengthy and I will provide the reason once the MPI archives are finished.
v) also the polymorphic archives were changed, by adding save_array and load_array functions. Even for archives not supporting fast array serialization per se, this should improve performance, since now only a single virtual function call is required for an array, instead of one per element.
The modifications are on the branch tagged "fast_array_serialization", and I have attached the diffs with respect to the main trunk. I have performed regression tests under darwin, using Apple's version of gcc-4. None of the changes should lead to any incompatibility with archives written with the current version of the serialization library, nor should it break any existing archive implementation.
Regarding compatibility with non-conforming compilers, the only issue I see is that I have used boost::enable_if to dispatch to either the standard or the fast array serialization. We should discuss what to do for compilers that do not support SFINAE. My preferred solution would be to just disable fast array serialization for these compilers, to keep the changes to the code minimal. The other option would be to add another level of indirection and implement the dispatch without using SFINAE.
Robert, could you take a look at the modifications, and would it be possible to merge these modifications with the main trunk once you have finished your work for the 1.33.1 release?
Best regards
Matthias