[serialization] fast array serialization (10x speedup)

Hi Robert, Over the past week I got around to doing what I wanted to do for a long time, and implemented an improved serialization of contiguous arrays of fundamental types. The motivation was two-fold: i) to speed up the serialization of large data sets by factors of up to 10 ii) to allow implementation of serialization by MPI The problem with the serialization of large contiguous arrays of fundamental types (be it a C-array, a std::vector, a std::valarray, boost::multi_array, ...) is that the serialization function is called for each (of possible billions) elements, instead of only once for the whole array. I have attached a suite of three benchmark programs and timings I ran on a PowerbookG4 using Apple's version of the gcc-4 compiler. The benchmarks are a) vectortime: reading and writing a std::vector<double> with 10^7 elements to/from a file b) arraytime: reading and writing 1000 arrays double[10000] to/ from a file c) vectortime_memory: reading and writing a std::vector<double> with 10^7 elements to/from a memory buffer The short summary of the benchmarks is that Boost.Serialization is 5-10 times slower than direct reading or writing! With the fast array serialization modifications, discussed below, this slowdown is removed. Note that the codes were compiled with -O2. Without -O2 I have observed another factor of 10 in slowdown in some cases In order to implement the fast array serialization, I made the following changes to the serialization library: i) a new traits class template <class Archive, class Type> has_fast_array_serialization<Archive,Type>; which specifies whether an Archive has fast array serialization for a Type. The default implementation for this traits class is false, so that no change is needed for existing archives. ii) output archives supporting fast array serialization for a given Type T provide an additional member function save_array(T const * address, std:;size_t length); to save a contiguous array of Ts, containing length elements starting at the given address, and a similar function load_array(T * address, std:;size_t length); for input archives iii) serialization of C-arrays and std::vector<T> was changed to use fast array serialization for those archives and types where it is supported. I'm still working on serialization for std::valarray and boost::multi_array using the same features. iv) in addition, to support an MPI serialization archive (which is essentially done but still being tested), and to improve portability of archives, I introduced a new "strong" type BOOST_STRONG_TYPEDEF(std::size_t, container_size_type) for the serialization of the size of a container. The current implementation uses an unsigned int to store the size, which is problematic on machines with 32-bit int but 64 bit size_type . To stay compatible with old archives, the serialization into binary archives converts the size to an unsigned int, but this should be changed to another type, and the file version number bumped up to allow containers with more than 2^32 elements. The second motivation was MPI serialization, for which I need the size type of containers to be a type distinct from any other integer. The explanation is lengthy and I will provide the reason once the MPI archives are finished. v) also the polymporphic archives were changed, by adding save_array and load_array functions. Even for archives not supporting fast array serialization per se this should improve performance, since now only a single virtual function call is required for arrays, instead of one per element. 
The modifications are on the branch tagged "fast_array_serialization", and I have attached the diffs with respect to the main trunk. I have performed regression tests under darwin, using Apple's version of gcc-4. None of the changes should lead to any incompatibility with archives written with the current version of the serialization library, nor should it break any existing archive implementation. Regarding compatibility with non-conforming compilers the only issue I see is that I have used boost::enable_if to dispatch to either the standard or fast array serialization. We should discuss what to do for compilers that do not support SFINAE. My preferred solution would be to just disable fast array serialization for these compilers, to keep the changed to the code minimal. The other option would be to add another level of indirection and implement the dispatch without using SFINAE. Robert, could you take a look at the modifications, and would it be possibly the merge these modifications with the main trunk once you have finished your work for the 1.33.1 release? Best regards Matthias

I only took a very quick look at the diff file. I have a couple of questions: It looks like, for certain types (C++ arrays, vector<int>, etc.), we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous. Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way. Robert Ramey

Matthias Troyer wrote:
Hi Robert,
Over the past week I got around to doing what I wanted to do for a long time, and implemented an improved serialization of contiguous arrays of fundamental types. The motivation was two-fold:
i) to speed up the serialization of large data sets by factors of up to 10
ii) to allow implementation of serialization by MPI
The problem with the serialization of large contiguous arrays of fundamental types (be it a C-array, a std::vector, a std::valarray, a boost::multi_array, ...) is that the serialization function is called for each of the (possibly billions of) elements, instead of only once for the whole array. I have attached a suite of three benchmark programs and timings I ran on a PowerBook G4 using Apple's version of the gcc-4 compiler. The benchmarks are
a) vectortime: reading and writing a std::vector<double> with 10^7 elements to/from a file
b) arraytime: reading and writing 1000 arrays double[10000] to/from a file
c) vectortime_memory: reading and writing a std::vector<double> with 10^7 elements to/from a memory buffer
The short summary of the benchmarks is that Boost.Serialization is 5-10 times slower than direct reading or writing!
With the fast array serialization modifications discussed below, this slowdown is removed. Note that the codes were compiled with -O2. Without -O2 I have observed another factor of 10 slowdown in some cases.
In order to implement the fast array serialization, I made the following changes to the serialization library:
i) a new traits class
template <class Archive, class Type> has_fast_array_serialization<Archive,Type>;
which specifies whether an Archive has fast array serialization for a Type. The default implementation for this traits class is false, so that no change is needed for existing archives.
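As a rough sketch (the spelling below is illustrative, and sketch_binary_oarchive is a made-up archive, not part of the patch), the trait with a per-type opt-in could look like:

#include <boost/mpl/bool.hpp>

// defaults to false, so existing archives need no change
template <class Archive, class Type>
struct has_fast_array_serialization : boost::mpl::false_ {};

// an archive opts in per type; sketch_binary_oarchive is a placeholder
class sketch_binary_oarchive;

template <>
struct has_fast_array_serialization<sketch_binary_oarchive, double>
    : boost::mpl::true_ {};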
ii) output archives supporting fast array serialization for a given Type T provide an additional member function
save_array(T const * address, std::size_t length);
to save a contiguous array of Ts, containing length elements starting at the given address, and a similar function
load_array(T * address, std::size_t length);
for input archives.
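For a plain binary archive these members can be one-liners; a minimal sketch with made-up names (m_os standing for the underlying stream):

#include <cstddef>
#include <ostream>

class sketch_binary_oarchive {
    std::ostream & m_os;
public:
    explicit sketch_binary_oarchive(std::ostream & os) : m_os(os) {}

    // write the whole contiguous block in one call,
    // instead of one save() per element
    template <class T>
    void save_array(T const * address, std::size_t length)
    {
        m_os.write(reinterpret_cast<const char *>(address),
                   static_cast<std::streamsize>(length * sizeof(T)));
    }
};

load_array would mirror this with a single read call on the underlying input stream.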
iii) serialization of C-arrays and std::vector<T> was changed to use fast array serialization for those archives and types where it is supported. I'm still working on serialization for std::valarray and boost::multi_array using the same features.
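A sketch of how that dispatch can look, combining the trait from (i) with the save_array member from (ii) (the function shapes here are illustrative, not the actual diff):

#include <cstddef>
#include <vector>
#include <boost/mpl/bool.hpp>
#include <boost/utility/enable_if.hpp>

// the trait from point (i), default false
template <class Archive, class T>
struct has_fast_array_serialization : boost::mpl::false_ {};

// fast path: chosen only when the archive declares support for T
template <class Archive, class T>
typename boost::enable_if<has_fast_array_serialization<Archive, T> >::type
save(Archive & ar, const std::vector<T> & v, const unsigned int /* version */)
{
    ar << static_cast<unsigned int>(v.size());   // size first, as today
    if (!v.empty())
        ar.save_array(&v[0], v.size());          // one call for the whole block
}

// fallback: the existing element-wise loop
template <class Archive, class T>
typename boost::disable_if<has_fast_array_serialization<Archive, T> >::type
save(Archive & ar, const std::vector<T> & v, const unsigned int /* version */)
{
    ar << static_cast<unsigned int>(v.size());
    for (std::size_t i = 0; i != v.size(); ++i)
        ar << v[i];
}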
iv) in addition, to support an MPI serialization archive (which is essentially done but still being tested), and to improve portability of archives, I introduced a new "strong" type
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
for the serialization of the size of a container. The current implementation uses an unsigned int to store the size, which is problematic on machines with 32-bit int but 64-bit size_type. To stay compatible with old archives, the serialization into binary archives converts the size to an unsigned int, but this should be changed to another type, and the file version number bumped up, to allow containers with more than 2^32 elements.
The second motivation was MPI serialization, for which I need the size type of containers to be a type distinct from any other integer. The explanation is lengthy and I will provide the reason once the MPI archives are finished.
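As an illustration of why a distinct type helps (a sketch; sketch_oprimitive and its members are made up): with a strong typedef, container sizes participate in overload resolution on their own, so an archive can treat them specially while plain integers keep their usual path:

#include <cstddef>
#include <boost/strong_typedef.hpp>

BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)

class sketch_oprimitive {
public:
    void save(std::size_t)         { /* an ordinary integer */ }
    void save(container_size_type) { /* a container size; an MPI archive
                                        can recognize and handle it specially */ }
};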
v) also the polymorphic archives were changed, by adding save_array and load_array functions. Even for archives not supporting fast array serialization per se this should improve performance, since now only a single virtual function call is required for an array, instead of one per element.
The modifications are on the branch tagged "fast_array_serialization", and I have attached the diffs with respect to the main trunk. I have performed regression tests under darwin, using Apple's version of gcc-4. None of the changes should lead to any incompatibility with archives written with the current version of the serialization library, nor should it break any existing archive implementation.
Regarding compatibility with non-conforming compilers, the only issue I see is that I have used boost::enable_if to dispatch to either the standard or fast array serialization. We should discuss what to do for compilers that do not support SFINAE. My preferred solution would be to just disable fast array serialization for these compilers, to keep the changes to the code minimal. The other option would be to add another level of indirection and implement the dispatch without using SFINAE.
Robert, could you take a look at the modifications, and would it be possible to merge these modifications into the main trunk once you have finished your work for the 1.33.1 release?
Best regards
Matthias

On Oct 9, 2005, at 6:44 PM, Robert Ramey wrote:
I only took a very quick look at the diff file. I have a couple of questions:
It looks like, for certain types (C++ arrays, vector<int>, etc.), we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous.
Exactly.
Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way.
This would lead to code duplication, since we would need to overload the serialization of

array, vector, multi_array, valarray, Blitz array, ublas dense vectors, ublas dense matrices, MTL vectors, MTL matrices, ...

not only for the demo_fast_archive, but for all archives that need such an optimization. Archives that immediately come to my mind are

binary archives (as in your example), all possible portable binary archives, MPI serialization, PVM serialization, all possible polymorphic archives, ...

Thus we have the problem that M types of data structures can profit from the fast array serialization in N types of archives. Instead of providing MxN overloads for the serialization library, I propose to introduce just one traits class, and implement just M overloads for the serialization and N implementations of save_array/load_array. Your example is just M=1 (array) and N=1 (binary archive). If I understand you correctly, what you propose needs MxN overloads. With minor extensions to the serialization library, the same result can be achieved with a coding effort of M+N. Matthias

Matthias Troyer wrote:
On Oct 9, 2005, at 6:44 PM, Robert Ramey wrote:
I only took a very quick look at the diff file. I have a couple of questions:
It looks like, for certain types (C++ arrays, vector<int>, etc.), we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous.
Exactly.
Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way.
This would lead to code duplication since we would need to overload the serialization of
array, vector, multi_array, valarray, Blitz array, ublas dense vectors, ublas dense matrices, MTL vectors, MTL matrices, ...
I don't think this would lead to code duplication. Since the save/load functions are implemented as templates, code is only emitted for those overloads actually invoked by the user program.
not only for the demo_fast_archive, but for all archives that need such an optimization. Archives that immediately come to my mind are
binary archives (as in your example), all possible portable binary archives, MPI serialization, PVM serialization, all possible polymorphic archives, ...
Thus we have the problem that M types of data structures can profit from the fast array serialization in N types of archives. Instead of providing MxN overloads for the serialization library, I propose to introduce just one traits class, and implement just M overloads for the serialization and N implementations of save_array/load_array.
By making an "archive wrapper" similar to the one in demo_fast_archive one can make special provisions for M special data types. These will then automatically be applicable to all existing archives, including the polymorphic versions. Note that I realise that demo_fast_archive uses a class rather than a template. I did this to make the example clearer. But it could easily have been recast as a template.
Your example is just M=1 (array) and N=1 (binary archive). If I understand you correctly, what you propose needs M*N overloads. With minor extensions to the serialization library, the same result can be achieved with a coding effort of M+N.
I didn't mean to suggest that demo_fast_archive be used as is. My intention is to show that any existing archive can be extended with overloads for specific types. There is no need to alter the core of the library in order to do this. Archive classes "know" about specific types in only a few very special cases: NVP and some types used internally by the archive implementations. A key goal of the serialization library has been to maintain this so as to avoid MxN issues in the library itself. I believe:

a) that by using derivation similar to demo_fast_archive you can achieve all the goals you desire without modification of the library itself
b) that this approach will result in the smallest amount of additional coding effort
c) that the result will be applicable to current and future archives without any other coding changes

Robert Ramey

On Oct 9, 2005, at 7:36 PM, Robert Ramey wrote:
I believe:
a) that by using derivation similar to demo_fast_archive you can achieve all the goals you desire without modification of the library itself
b) that this approach will result in the smallest amount of additional coding effort
c) that the result will be applicable to current and future archives without any other coding changes
Sorry, but I still do not see how this can avoid an MxN problem, since for each of M archive classes I will need to overload the serialization of N classes. Maybe I am just not seeing a trick that you have in mind, but I would think that I

- need to overload the serialization for all N classes (array, vector, valarray, ...) for the demo_fast_archive
- again overload the serialization for all N classes for the fast portable binary archive
- again overload the serialization for all N classes for the MPI archive
- again overload the serialization for all N classes for the polymorphic archives

and so on... Can you tell me what I'm missing here? Matthias

Attached is a sketch of what I have in mind. It does compile without error on VC 7.1. With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to. Robert Ramey

Matthias Troyer wrote:
On Oct 9, 2005, at 7:36 PM, Robert Ramey wrote:
I believe:
a) that by using derivation similar to demo_fast_archive you can achieve all the goals you desire without modification of the library itself
b) that this approach will result in the smallest amount of additional coding effort
c) that the result will be applicable to current and future archives without any other coding changes
Sorry, but I still do not see how this can avoid an MxN problem, since for each of M archive classes I will need to overload the serialization of N classes. Maybe I am just not seeing a trick that you have in mind, but I would think that I
- need to overload the serialization for all N classes (array, vector, valarray, ...) for the demo_fast_archive
- again overload the serialization for all N classes for the fast portable binary archive
- again overload the serialization for all N classes for the MPI archive
- again overload the serialization for all N classes for the polymorphic archives

and so on...
Can you tell me what I'm missing here?
Matthias
[uuencoded attachment: fast_oarchive.hpp.cpp]

On Sun, Oct 09, 2005 at 02:15:44PM -0700, Robert Ramey wrote:
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
Hey, quick question. The fast_archive example has a function template save_override() that sends calls that aren't otherwise overridden in derived_t back to base_t:

template<class Base>
class fast_oarchive_impl : public Base
{
    // fall through to Base for any overrides not specified here
    template<class T>
    void save_override(T & t, BOOST_PFTO int){
        Base::save_override(t, 0);
    }
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&(t[0]), sizeof(int) * t.size());
    }
};

And I've done this for a few archive types. Works fine. The portable_binary_archive example does the same thing, except it does so for save(), like this:

class portable_binary_oarchive :
    public boost::archive::binary_oarchive_impl<portable_binary_oarchive>
{
    typedef portable_binary_oarchive derived_t;
    typedef boost::archive::binary_oarchive_impl<portable_binary_oarchive> base_t;

    // default fall through for any types not specified here
    template<class T>
    void save(const T & t){
        base_t::save(t);
    }
    void save(const unsigned int t){
        save_impl(t);
    }
};

Which I've also used. AFAICT, I could do what I've needed to do so far either way. What's the difference? Apologies in advance if I've missed something in the docs. -t

LOL - I'm amazed that I don't know the answer to this question off hand. After some reflection I think the answer is the following (for loading). Take for example text_iarchive. It inherits from basic_text_iprimitive (for input of primitive types) and it inherits from basic_text_iarchive for all other types. basic_text_iprimitive contains load functions for all primitive types. basic_iarchive.hpp contains load_override functions for all "special" types not handled by iserializer.hpp. interface_iarchive contains the interface described in the manual under Loading Archive Concept and directly calls through the interface to the most derived class of the implementation.

The chain of calls for ar >> x where x is a non-primitive type will be:

interface_iarchive::operator>>
-> text_iarchive::load_override
-> basic_text_iarchive<Archive>::load_override(t, 0)
-> archive::load(* this->This(), t); // in iserializer.hpp
...

The chain of calls for ar >> i where i is a primitive data type will be:

interface_iarchive::operator>>
-> text_iarchive::load_override
-> basic_text_iarchive<Archive>::load_override(t, 0)
-> archive::load(* this->This(), t); // in iserializer.hpp
-> load_non_pointer
-> load_primitive
-> ar.load(...) // back to archive to load primitive
-> text_iarchive::load
-> basic_text_iprimitive::load if not overridden

So load is overridden for primitives while load_override is used for others. Now looking at this, it seems that this could be made shorter, more efficient and more transparent, not to mention better explained in the Archive Implementation section of the documents. Maybe we'll do that when we have nothing else to do. Robert Ramey
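A toy model of the two hook points (all names below are made up; this is not library code) makes the chains concrete, shown here for the saving side: operator<< runs every type through save_override, while save() is normally reached only for primitives after the library's dispatch:

#include <iostream>

template <class Derived>
struct toy_interface_oarchive {
    Derived * This() { return static_cast<Derived *>(this); }
    template <class T>
    Derived & operator<<(const T & t) {
        This()->save_override(t, 0);   // first hook: sees every type
        return *This();
    }
};

struct toy_oarchive : toy_interface_oarchive<toy_oarchive> {
    template <class T>
    void save_override(const T & t, int) {
        // the library's dispatch (tracking, versioning, ...) would run here
        save(t);                       // second hook: reached for primitives
    }
    void save(int t)    { std::cout << "int: "    << t << '\n'; }
    void save(double t) { std::cout << "double: " << t << '\n'; }
    template <class T>
    void save(const T &) { std::cout << "composite\n"; }
};

int main() {
    toy_oarchive ar;
    ar << 1 << 2.5;
}

In this model, hooking save_override intercepts any type before the dispatch, while hooking save only changes how primitives are written - which would explain why the portable binary archive, which merely reshapes integers, overrides save, while type-specific shortcuts override save_override.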

Robert Ramey wrote:
LOL - I'm amazed that I don't know the answer to this question off hand.
Actually, since I wrote that response I went back and looked at the documentation. I found that the differing roles of save and save_override are explained (sort of). I'll try to clarify them better when I get to improving this section. Robert Ramey

On Oct 10, 2005, at 7:21 AM, troy d. straszheim wrote:
On Sun, Oct 09, 2005 at 02:15:44PM -0700, Robert Ramey wrote:
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
And I've done this for a few archive types. Works fine. The portable_binary_archive example does the same thing, except it does so for save(), like this:
class portable_binary_oarchive :
    public boost::archive::binary_oarchive_impl<portable_binary_oarchive>
{
    typedef portable_binary_oarchive derived_t;
    typedef boost::archive::binary_oarchive_impl<portable_binary_oarchive> base_t;

    // default fall through for any types not specified here
    template<class T>
    void save(const T & t){
        base_t::save(t);
    }
    void save(const unsigned int t){
        save_impl(t);
    }
};
Which I've also used. AFAICT, I could do what I've needed to do so far either way. What's the difference? Apologies in advance if I've missed something in the docs.
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend. Matthias

Matthias Troyer wrote:
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend.
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to. Robert Ramey

On Oct 15, 2005, at 10:33 PM, Robert Ramey wrote:
Matthias Troyer wrote:
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend.
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to.
There are two severe problems with this approach:

1. Most of my archives using fast array serialization would not be written as archive adaptors, since for MPI archives, PVM archives, and many others, it does not make any sense to write an archive without fast array serialization. These archives have to support fast array serialization from the start.

2. And here is the main problem: you propose that all serialization code (not only for std::vector but for all future classes, such as valarray, multi_array, ublas and MTL matrices) be written without concern for fast array serialization, and that I then provide overloads for all these classes in an adaptor. There are a number of reasons why this is either not good, or will not work at all:

a) it leads to a tight coupling of the archive classes to implementation details of all these libraries. The code to serialize a boost::multi_array should be with the multi_array library and not in my archive class.

b) the user of my library will have to include hundreds of lines of serialization code for all these classes, even if he never needs them. Contrast that with the inclusion of a few lines for save_array and load_array.

c) even worse: in the cases I referred to this usually cannot be implemented without being intrusive on the library whose datatype is being serialized. E.g. Boost.MultiArray, MTL or Blitz will have to be modified to allow serialization, since serialization is intrusive for most classes. The "adaptor" you are proposing will then also have to be intrusive!

Please note here that the "non-intrusive" serialization you show in the tutorial is still intrusive: you had to make the data members public to be able to implement the "non-intrusive" serialization. For classes that have getter and setter functions for all members, or where I can extract the constructor arguments from the class in a non-intrusive way, it is possible to write non-intrusive serialization. But for all other cases, there is no such thing as non-intrusive serialization. It is even worse in the case of Blitz arrays, which have their own built-in reference counted memory allocation scheme. Views in multi_arrays are similar. There is no non-intrusive way to serialize these data structures!

Since Boost.Serialization support has to be intrusive for these data structures, I believe that the intrusiveness should be kept to a minimum and only one serialization function be provided. Thus your statement
In general, I want no more coupling than is absolutely necessary. I don't think it's necessary here. You can get everything you want and more by using an archive adaptor.
is clearly incorrect. If the serialization library documentation tells, e.g. the MTL authors to serialize their arrays by looping over all elements, I will have to, after they implement their version, be intrusive on the MTL library to get direct array serialization in. Better to have them support it directly! And to answer:
Actually the cost is minimal if the archive does not support fast save/load_array. The has_fast_array_serialization.hpp header only consists of the default traits:
One still has to include the header. This violates the boost principle - don't pay for what you don't use. Actually the boost principle would be - don't even know about what you don't use - which is better anyway.
You already violate this principle much more severely in the serialization library. If I do not want object tracking and versioning for a text_oarchive of some objects, the code for tracking and versioning is still included by the serialization library. Robert, to focus the discussion and not get stuck in details let me stress a point that I had previously made at both reviews of your library, and that I still believe in: * A serialization library without built-in support for serialization of arrays is fundamentally flawed * I believe that this is the main issue we need to get sorted out first, since it is the fundamental point of disagreement. You write (I quote from your other e-mail):
default serialization of C array .... does have a universal default implementation.
and your default implementation of saving is to save the size as an unsigned int and then loop over all the elements, saving each one. My opinion is that this is the wrong approach! Instead a save_array function should be called, for which the default implementation would be just what you describe above, but the archive can overload it. Here are the reasons:

1.) 10x speedup, or more

2.) no need to provide intrusive overloads for all classes that want to use save_array

3.) prior art. There is a reason why

- MPI, PVM and other message passing libraries support array serialization
- XDR, used for remote procedure calls under Unix, has special support for arrays
- HDF5, a standard for large scientific data sets, operates directly on large arrays

To interface to all these libraries, and to achieve reasonable performance with them and with binary archives, the direct support for array serialization by the serialization library is essential. In the past you have claimed that there is no need for a special array serialization and wanted to see benchmarks. We now have benchmark numbers (not only from me) that show a roughly 10x or greater penalty. Furthermore we have cases where serialization is impossible without it. Matthias
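In code, the proposed default amounts to shipping the element-wise loop as the universal fallback and letting an archive replace it with a block operation (a sketch; sketch_binary_oarchive and its save_binary member are assumed names, not library code):

#include <cstddef>
#include <cstdio>

// the universal default: exactly the element-wise loop described above
template <class Archive, class T>
void save_array(Archive & ar, T const * address, std::size_t length)
{
    for (std::size_t i = 0; i != length; ++i)
        ar << address[i];
}

// an archive that can do better provides its own overload; the class below
// is an assumed stand-in for a binary archive with a save_binary member
class sketch_binary_oarchive {
public:
    void save_binary(const void * address, std::size_t count)
    {
        std::fwrite(address, 1, count, stdout);   // placeholder sink
    }
};

template <class T>
void save_array(sketch_binary_oarchive & ar, T const * address, std::size_t length)
{
    ar.save_binary(address, length * sizeof(T));  // whole block, one call
}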

Matthias Troyer wrote:
On Oct 15, 2005, at 10:33 PM, Robert Ramey wrote:
Matthias Troyer wrote:
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend.
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to.
There are two severe problems with this approach:
1. Most of my archives using fast array serialization would not be written as archive adaptors, since for MPI archives, PVM archives, and many others, it does not make any sense to write an archive without fast array serialization. These archives have to support fast array serialization from the start.
2. And here is the main problem: you propose that all serialization code (not only for std::vector but for all future classes, such as valarray, multi_array, ublas and MTL matrices) be written without concern for fast array serialization, and that I then provide overloads for all these classes in an adaptor. There are a number of reasons why this is either not good, or will not work at all:
a) it leads to a tight coupling of the archive classes to implementation details of all these libraries. The code to serialize a boost::multi_array should be with the multi_array library and not in my archive class.
b) the user of my library will have to include hundreds of lines of serialization code for all these classes, even if he never needs them. Contrast that with the inclusion of a few lines for save_array and load_array.
c) even worse: in the cases I referred to this usually cannot be implemented without being intrusive on the library whose datatype is being serialized. E.g. Boost.MultiArray, MTL or Blitz will have to be modified to allow serialization, since serialization is intrusive for most classes. The "adaptor" you are proposing will then also have to be intrusive!
I don't really agree with this, but it doesn't really matter. In this case just write your own fast binary archive and derive variations from it. You can skip the binary_archive and basic_binary archive altogether. This would be very easy.
Please note here that the "non-intrusive" serialization you show in the tutorial is still intrusive: you had to make the data members public to be able to implement the "non-intrusive" serialization. For classes that have getter and setter functions for all members, or where I can extract the constructor arguments from the class in a non-intrusive way, it is possible to write non-intrusive serialization. But for all other cases, there is no such thing as non-intrusive serialization. It is even worse in the case of Blitz arrays, which have their own built-in reference counted memory allocation scheme. Views in multi_arrays are similar. There is no non-intrusive way to serialize these data structures!
I've come to realize that some classes do not provide an interface sufficient to support serialization. shared_ptr has this problem, as does boost::any. I'm sure there are others as well. It's out of my hands.
Since Boost.Serialization support has to be intrusive for these data structures, I believe that the intrusiveness should be kept to a minimum and only one serialization function be provided.
If the serialization library documentation tells, e.g. the MTL authors to serialize their arrays by looping over all elements, I will have to, after they implement their version, be intrusive on the MTL library to get direct array serialization in. Better to have them support it directly!
fine - just make your own archive. I'm perfectly happy with this. The documentation can easily be changed so that for the archives included with the package the default serialization of arrays is ...
One still has to include the header. This violates the boost principle - don't pay for what you don't use. Actually the boost principle would be - don't even know about what you don't use - which is better anyway.
You already violate this principle much more severely in the serialization library. If I do not want object tracking and versioning for a text_oarchive of some objects, the code for tracking and versioning is still included by the serialization library.
The headers are included but the code isn't instantiated.
Robert, to focus the discussion and not get stuck in details let me stress a point that I had previously made at both reviews of your library, and that I still believe in:
* A serialization library without built-in support for serialization of arrays is fundamentally flawed *
I believe that this is the main issue we need to get sorted out first, since it is the fundamental point of disagreement.
You write (I quote from your other e-mail):
default serialization of C array .... does have a universal default implementation.
and your default implementation of saving is to save the size as an unsigned int and then loop over all the elements, saving each one.
My opinion is that this is the wrong approach! Instead a save_array function should be called, for which the default implementation would be just what you describe above, but the archive can overload it.
Here are the reasons:
1.) 10x speedup, or more
2.) no need to provide intrusive overloads for all classes that want to use save_array
3.) prior art. There is a reason why
- MPI, PVM and other message passing libraries support array serialization
- XDR, used for remote procedure calls under Unix, has special support for arrays
- HDF5, a standard for large scientific data sets, operates directly on large arrays
To interface to all these libraries, and to achieve reasonable performance with them and with binary archives, the direct support for array serialization by the serialization library is essential.
This is not an issue of efficiency. The instantiated code is the same regardless of where you put it. I have the general case in the core library and anyone is free to make his own archive for more specific cases - which this is. The native binary archive is actually quite small. It's only as big as it is because it supports a wide character interface. The idea of a wide char interface for the binary archive is dubious anyway. So it's simple just to make your own version of binary archives. The library supports and encourages that, and you don't have to change anything in the core to do that. I'm looking forward to seeing the final result. Robert Ramey

On Oct 19, 2005, at 8:26 AM, Robert Ramey wrote:
Matthias Troyer wrote:
On Oct 15, 2005, at 10:33 PM, Robert Ramey wrote:
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to.
[snip - discussion why serialization of multi_array and similar classes is always intrusive]
I've come to realize that some classes do not provide an interface sufficient to support serialization. shared_ptr has this problem, as does boost::any. I'm sure there are others as well. It's out of my hands.
Since Boost.Serialization support has to be intrusive for these data structures, I believe that the intrusiveness should be kept to a minimum and only one serialization function be provided.
If the serialization library documentation tells, e.g. the MTL authors to serialize their arrays by looping over all elements, I will have to, after they implement their version, be intrusive on the MTL library to get direct array serialization in. Better to have them support it directly!
fine - just make your own archive. I'm perfectly happy with this. The documentation can easily be changed so that for the archives included with the package the default serialization of arrays is ...
Robert, it seems that maybe I did not make myself clear enough, so let me stress the important point: for classes like multi_array, MTL matrices, Blitz arrays, ublas matrices, ... in almost all cases serialization has to be intrusive. So the question is, what should we recommend to the authors of these libraries: should we encourage them to use the save_array/load_array interface whenever it is possible? The only sensible answer is yes (see the sketch after this list), because

a) it will give them much faster archives (large matrices are almost always stored in binary format)

b) overloading the serialization in my own archive class, as you suggest, is impossible, since serialization has to be intrusive for these classes

c) it places all the serialization code of these classes with the library, and not inside my archive

d) it is easily extensible (unlike placing all serialization code using save_array/load_array in my archive classes)
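For illustration, here is roughly what a library author would write once, if the proposed hooks existed (dense_matrix and serialize_array are made-up stand-ins, not any library's actual API):

#include <cstddef>
#include <boost/serialization/access.hpp>

// hypothetical helper that dispatches on the proposed trait:
// block write if the archive supports it, element loop otherwise
template <class Archive, class T>
void serialize_array(Archive & ar, T * data, std::size_t n);

// stand-in for an MTL/ublas/Blitz-style class with contiguous, private storage
class dense_matrix {
    friend class boost::serialization::access;   // intrusive, but minimally so

    std::size_t rows_, cols_;
    double * data_;   // contiguous storage

    template <class Archive>
    void serialize(Archive & ar, const unsigned int /* version */)
    {
        ar & rows_ & cols_;
        // one line serves every archive: fast ones take the whole block,
        // the others fall back to the element-wise default
        serialize_array(ar, data_, rows_ * cols_);
    }
};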
One still has to include the header. This violates the boost principle - don't pay for what you don't use. Actually the boost principle would be - don't even know about what you don't use - which is better anyway.
You already violate this principle much more severely in the serialization library. If I do not want object tracking and versioning for a text_oarchive of some objects, the code for tracking and versioning is still included by the serialization library.
The headers are included but the code isn't instantiated.
Oh, and how about the explicit instantiations in the *.cpp files? If you are concerned with compile time, then for the instantiated functions the save_array/load_array code is actually better. For example, for the binary archives serializing an array with my version is a single call to save_array or load_array, while in the default version it is a for loop.
This is not an issue of efficiency. The instantiated code is the same regardless of where you put it.
Except that for multi_array, Blitz array and similar classes the serialization code is intrusive, and I cannot just put an overload into my archive class. By providing a save_array/load_array function, the authors of those libraries can provide efficient serialization whenever it makes sense, at negligible cost.
So it's simple just to make your own version of binary archives. The library supports and encourages that, and you don't have to change anything in the core to do that. I'm looking forward to seeing the final result.
Actually, the archives are available, but we have to change the serialization library:

- we have to add the fast array serialization traits class
- we have to encourage library authors to use save_array/load_array wherever possible, and has_fast_array_serialization.hpp will thus regularly be included
- we have to provide fast array serialization for the contiguous array-like STL classes such as std::vector and std::valarray, as well as for C-style arrays

Matthias

Robert, it has been nearly two weeks since I sent my last reply and I'm still waiting for an answer. To summarize, I explained why we should recommend to authors of serialization functions to use the proposed array serialization mechanism and why this cannot be done in a fast array serialization archive. Matthias

Robert Ramey wrote:
Matthias Troyer wrote:
For a portable binary archive this solution is perfect. For fast array serialization a similar approach has problems, as I outlined under points 3 and 4 in my response to Robert Ramey: you need to specifically overload _in the archive_ for all classes that want to make use of fast array serialization, thus introducing a tight coupling between the archive and the classes to be serialized, as well as making it hard to extend.
Here is where we're on different pages. You have to override serialization of vector etc. in only one place - in the fast_archive_adaptor class. Then the behavior is available to any class that the adaptor is applied to.
IIUC (apologies if I am missing something), Matthias' argument is that

1. serialization of std::vector needs to be overridden in the fast_archive_adaptor class.
2. serialization of builtin arrays needs to be overridden in the fast_archive_adaptor class.
3. serialization of ublas::vector needs to be overridden in the fast_archive_adaptor class.
4. serialization of ublas::matrix needs to be overridden in the fast_archive_adaptor class.
5. serialization of mtl::vector needs to be overridden in the fast_archive_adaptor class.
6. serialization of mtl::matrix needs to be overridden in the fast_archive_adaptor class.
7. serialization of blitz::array needs to be overridden in the fast_archive_adaptor class.
8. serialization of custom_lib::fast_matrix needs to be overridden in the fast_archive_adaptor class. Unfortunately this is not so easy, as custom_lib::fast_matrix is maintained by someone else and the serialization functions need access to some classes buried deep in some private implementation header.
9. ...

So, N functions, most of which are trivial forwards to a save_array/load_array function. Cheers, Ian

On Oct 9, 2005, at 11:15 PM, Robert Ramey wrote:
Attached is a sketch of what I have in mind. It does compile without error on VC 7.1.
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
---- SUMMARY ---------

If I may summarize this solution as follows:

template<class Base>
class fast_oarchive_impl : public Base
{
public:
    ...
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&(t[0]), sizeof(int) * t.size());
    }

    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(const std::vector<T> & t, int){
        save_binary(&(t[0]), sizeof(T) * t.size());
        // this version not certified for more complex types !!!
        BOOST_STATIC_ASSERT(boost::is_primitive<T>::value);
        // or pointers either !!!
        BOOST_STATIC_ASSERT(! boost::is_pointer<T>::value);
    }
    ...
};

then I see several major disadvantages of this approach:

1.) it fixes the value types for which fast array serialization can be done

2.) for M types to be serialized and N archives there is an MxN problem in this approach

3.) it leads to a tight coupling between archives and all classes that can profit from fast array serialization (called "array-like classes" below), and makes the archive depend on implementation details of the array-like classes

4.) it is not easily extensible to new array-like classes

Let me elaborate on these points below and provide a possible solution to each of them. The simplest solution, as I see it, will be to

- provide an additional traits class has_fast_array_serialization
- have archives offering (the optional) fast array serialization provide a save_array member function in addition to save and save_binary
- make the dispatch to either save() or save_array() the responsibility of the serialization code of the class, and not the responsibility of the archive

These are minor extensions to the serialization library that do not break any existing code, that do not make it harder to write a new archive or a new serialize function, but that allow new types of archives and can give huge speedups for large data sets.

------- DETAILS -----------

Now the details.

ad 1.: you need to fix the types for which the fast version is used, either by providing explicit overloads for K types (such as int above), which would give a KxMxN problem, or by using a template-based approach. You are probably aware that the BOOST_STATIC_ASSERT in your example will cause this archive to fail to work with std::vectors of more complex types. One easy way to solve this is by restricting the applicability of the template, using boost::enable_if:

template<class T>
void save_override
(
    const std::vector<T> & t,
    int,
    typename boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
);

where the traits class has_fast_array_serialization<Base,T> specifies whether the fast version should be used for the type T with the archive Base. The reason to provide a traits class instead of hard-coding the types, and restricting them to primitive non-pointer types, is that the set of types that can use an optimized serialization depends on the archive type. A non-portable binary archive could support all POD types that contain no pointer members (e.g. the gps_position class in your example), while an MPI archive can support fast serialization for all default-constructible types. Hence a traits class depending on both the archive type and the value_type of the vector.
ad 2.: there is still an MxN problem: you propose to dispatch to save_binary:

void save_override(const std::vector<T> & t, int)
{
    save_binary(&(t[0]), sizeof(T) * t.size());
}

where the signature of save_binary is

void save_binary(void const *, std::size_t);

This is an acceptable solution for binary archives, and maybe a few others, but is NOT a general solution. To illustrate this, let me show how the fast saving is or might be implemented for some other archives.

a) a potential portable binary archive might need to do byte reordering:

void save_override(const std::vector<T> & t, int)
{
    save_with_reordering(&(t[0]), t.size());
}

where the save_with_reordering function will need type information to do the byte reordering, and might have a signature

template <class T>
void save_with_reordering(T const *, std::size_t);

b) an XDR archive, using XDR streams, needs to make a call to an XDR function and pass type information as well, as in

class xdr_oarchive {
    ...
    void save_override(const std::vector<int> & t, int)
    {
        xdr_vector(stream, (char*)&(t[0]), t.size(), sizeof(int), &xdr_int);
    }
    XDR* stream;
};

and a templated version could also be provided easily. Note that again I need the address, size and type information and cannot make this call from within save_binary. I have an archive implementation based on the UNIX xdr calls, and this is thus no hypothetical example.

c) let's next look at a packed MPI archive (of which I also have an implementation), where the override would be

// simplified version of MPI archive
class packed_mpi_oarchive {
    ...
    void save_override(const std::vector<int> & t, int)
    {
        MPI::Datatype datatype(MPI::INTEGER);
        datatype.Pack(&(t[0]), t.size(), buffer, buffer_size, position, communicator);
    }
    char* buffer;
    int buffer_size;
    int position;
    MPI::Comm& communicator;
};

and again, I need type information and cannot just call save_binary.

d) as a fourth example I want to mention that MPI allows for serialization by message passing without the need to pack the data into a buffer first; only the addresses and types of all data members need to be stored, to create a custom MPI type. An incomplete implementation (I have a complete implementation, based on the original idea by Daniel Egloff) would be:

class mpi_oarchive {
    ...
    template <class T>
    void save_override(const std::vector<T> & t, int)
    {
        register_member(&(t[0]), t.size());
    }
    template <class T>
    void register_member(T const* t, std::size_t l)
    {
        addresses.push_back(MPI::Get_address(t));
        sizes.push_back(l);
        types.push_back(mpi_type<T>::value);
    }
    std::vector<MPI::Aint> addresses;
    std::vector<int> sizes;
    std::vector<MPI::Datatype> types;
};

Note that again save_binary does not do the trick, since we need type information. For this reason my proposed solution is to dispatch to a save_array function for those types and archives supporting it:

template<class Base>
class fast_oarchive_impl : public Base
{
public:
    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override
    (
        const std::vector<T> & t,
        int,
        typename boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
    )
    {
        this->save_array(&(t[0]), t.size());
    }
    ...
};

where all archive classes provide a function like

void Archive::save_array(Type const *, std::size_t);

for all types for which the traits class has_fast_array_serialization<Archive,Type> is true. That way a single overload suffices for all the N=5 archive types presented above, and the MxN problem is solved.
Note also that archives not supporting this fast array serialization do not need to implement anything, as the default for has_fast_array_serialization<Archive,Type> is false.

ad 3.: your proposal leads to a tight coupling between archives and classes to be serialized. Consider what I would need to do to add support for some future MTL matrix type. Again I present a simplified example showing the problem:

template<class T>
void save_override(const mtl_dense_matrix<T> & m, int)
{
    T const * data = implementation_dependent_function_to_get_pointer(m);
    std::size_t length = implementation_dependent_function_to_get_size(m);
    save_binary(data, length * sizeof(T));
}

This introduces implementation details of the mtl_dense_matrix class into the archive, breaks orthogonality, and leads to a tight coupling. Changes in these implementation details of the mtl_dense_matrix might require changes to the archive classes. The solution is easy:

- some archives provide fast array serialization through the save_array member function
- let the MTL be responsible for serialization of its own classes, and use save_array where appropriate

ad 4.: in order to use fast array serialization with other classes such as

- std::vector
- std::valarray
- boost::multi_array
- uBlas vectors and matrices
- blitz::Array

save_override functions for ALL of these classes have to be added to the archive. This means that to support any new class, be it a new ublas matrix, future MTL matrices, Blitz++ arrays, ..., the archive class needs to be modified. This is clearly not a scalable design.

To summarize, with three minor extensions to the serialization library, none of which breaks any existing code, we can get 10x speedups for serialization of large data sets, enable new types of archives such as MPI archives, and all of that without introducing any of the four problems discussed here. Matthias

---- SUMMARY ---------
If I may summarize this solution as follows:
template<class Base>
class fast_oarchive_impl : public Base
{
public:
    ...
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&(t[0]), sizeof(int) * t.size());
    }

    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(const std::vector<T> & t, int){
        save_binary(&(t[0]), sizeof(T) * t.size());
        // this version not certified for more complex types !!!
        BOOST_STATIC_ASSERT(boost::is_primitive<T>::value);
        // or pointers either !!!
        BOOST_STATIC_ASSERT(! boost::is_pointer<T>::value);
    }
    ...
};
then I see several major disadvantages of this approach:
1.) it fixes the value types for which fast array serialization can be done
I worked around this by introducing an intermediary type ArchiveByteArray as follows:

struct BinaryArchiveByteArrayI {
    int count;
    void *ptr;
};

template<class T>
BinaryArchiveByteArrayI MakeArchiveInputByteArray(int count, T *t)
{
    BinaryArchiveByteArrayI res;
    res.count = count * sizeof(T);
    res.ptr = t;
    return res;
}

struct BinaryArchiveByteArrayO {
    int count;
    const void *ptr;
};

template<class T>
BinaryArchiveByteArrayO MakeArchiveOutputByteArray(int count, const T *t)
{
    BinaryArchiveByteArrayO res;
    res.count = count * sizeof(T);
    res.ptr = t;
    return res;
}

BOOST_CLASS_IMPLEMENTATION(BinaryArchiveByteArrayI, primitive_type);
BOOST_CLASS_IMPLEMENTATION(BinaryArchiveByteArrayO, primitive_type);

Then adding in the input / output archive:

void load(BinaryArchiveByteArrayI &ba);
void save(const BinaryArchiveByteArrayO &ba);

This should be able to cope with array types and anything else that takes a contiguous block of memory. Then there is one set of overloads per type, i.e. for std::vector:

template<class U, class Allocator>
inline void load(
    BinaryInputArchive & ar,
    std::vector<U, Allocator> &t,
    const unsigned int /* file_version */
){
    boost::mpl::if_<boost::is_pod<U>,
        Detail::VectorLoadPodImp,
        Detail::VectorLoadImp>::type(ar, t);
}

template<class U, class Allocator>
inline void save(
    BinaryOutputArchive& ar,
    const std::vector<U, Allocator> &t,
    const unsigned int /* file_version */
){
    boost::mpl::if_<boost::is_pod<U>,
        Detail::VectorSavePodImp,
        Detail::VectorSaveImp>::type(ar, t);
}

This could probably have a dispatch mechanism above it checking for an archive trait to dispatch to either the fast or default implementation. Martin

On Oct 10, 2005, at 2:09 PM, Martin Slater wrote:
then I see several major disadvantages of this approach:
1.) it fixes the value types for which fast array serialization can be done
I worked around this by introducing an intermediary type ArchiveByteArray as follows
...

struct BinaryArchiveByteArrayO {
    int count;
    const void *ptr;
};

template<class T>
BinaryArchiveByteArrayO MakeArchiveOutputByteArray(int count, const T *t) {
    BinaryArchiveByteArrayO res;
    res.count = count * sizeof(T);
    res.ptr = t;
    return res;
}
There is a problem here: the type information gets lost once you create a BinaryArchiveByteArrayO, but the implementation of save(BinaryArchiveByteArrayO const&) will require that information in the case of XDR, MPI and other archives.
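A sketch of why the element type matters (swap_bytes and save_binary stand in for archive internals here and are not existing library functions): a portable binary archive can reorder bytes per element only while it still knows sizeof(T); once the data has been erased to a (void*, byte count) pair, that knowledge is gone.

template<class T>
void save_array(T const * address, std::size_t count)
{
    for (std::size_t i = 0; i != count; ++i) {
        T tmp = address[i];
        swap_bytes(&tmp, sizeof(T));   // per-element byte reordering needs sizeof(T)
        save_binary(&tmp, sizeof(T));
    }
}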
...
template<class U, class Allocator>
inline void save(BinaryOutputArchive& ar, const std::vector<U, Allocator> &t, const unsigned int /* file_version */){
    boost::mpl::if_<boost::is_pod<U>, Detail::VectorSavePodImp, Detail::VectorSaveImp>::type(ar, t);
}
This is similar to the way I propose to implement vector serialization. Just look at the diffs I attached to my mail. The main difference is that instead of the mpl::if_ I use boost::enable_if.
This could probably have a dispatch mechanism above it checking for an archive trait to dispatch to either the fast or default implementation.
Exactly, a trait will be more flexible here, since your dispatch based on boost::is_pod<U> might be too narrow for some archives and too broad for others (some, such as XML archives, might never want to support it). Matthias
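For illustration, a sketch of the per-archive granularity such a trait allows (the archive names are placeholders; mpl::bool_ and boost::is_fundamental are real Boost components):

#include <boost/mpl/bool.hpp>
#include <boost/type_traits/is_fundamental.hpp>

class binary_oarchive;   // hypothetical archives, forward-declared
class xdr_oarchive;

// default: no archive supports fast array serialization
template <class Archive, class Type>
struct has_fast_array_serialization : boost::mpl::bool_<false> {};

// a native binary archive might enable it for all fundamental types ...
template <class Type>
struct has_fast_array_serialization<binary_oarchive, Type>
    : boost::is_fundamental<Type> {};

// ... while an XDR-style archive enables it only where an XDR filter exists
template <>
struct has_fast_array_serialization<xdr_oarchive, double>
    : boost::mpl::bool_<true> {};

// an XML archive simply leaves the default, and never takes the fast path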

Matthias Troyer wrote:
On Oct 9, 2005, at 11:15 PM, Robert Ramey wrote:
Attached is a sketch of what I have in mind. It does compile without error on VC 7.1
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
---- SUMMARY ---------
If I may summarize this solution as follows:
template<class Base>
class fast_oarchive_impl : public Base {
public:
    ...
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&t[0], sizeof(int) * t.size());
    }
    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(const std::vector<T> & t, int){
        save_binary(&t[0], sizeof(T) * t.size());
        // this version not certified for more complex types !!!
        BOOST_STATIC_ASSERT(boost::is_primitive<T>::value);
        // or pointers either !!!
        BOOST_STATIC_ASSERT(!boost::is_pointer<T>::value);
    }
    ...
};
A fair characterization. But my point isn't to suggest or promote a specific override. Rather, my point is to show that the library can be extended without altering the internals of the library itself. All the included archive classes have a similar structure:

/////////////////////////////////////////////////////////////////////////
// class basic_text_iarchive - read serialized objects from an input text stream
template<class Archive>
class basic_text_iarchive : public detail::common_iarchive<Archive> {
    ...
    // intermediate level to support override of operators
    // for templates in the absence of partial function
    // template ordering
    template<class T>
    void load_override(T & t, BOOST_PFTO int){
        archive::load(* this->This(), t);
    }
    ...
};

where archive::load is declared and defined in the file iserializer.hpp. This latter file includes all the "basic" functionality required for primitive types supported by C++. I have made huge efforts not to couple the code in iserializer.hpp to any other types (nvp might be an exception). Within iserializer.hpp the function archive::load dispatches to different implementations depending on traits of the type being serialized.

Now if one wants to handle a particular type in a special way (e.g. vector<T> where T is not a pointer), then one could augment iserializer.hpp. But one could just as well do:

/////////////////////////////////////////////////////////////////////////
// class basic_text_iarchive - read serialized objects from an input text stream
template<class Archive>
class basic_text_iarchive : public detail::common_iarchive<Archive> {
    ...
    // intermediate level to support override of operators
    // for templates in the absence of partial function
    // template ordering
    template<class T>
    void load_override(T & t, BOOST_PFTO int){
        // your own dispatch code here for particular cases.
        // fall through to default/universal implementation
        archive::load(* this->This(), t);
    }
    ...
};

There is no need to alter the default/universal/basic serialization implementation. Of course one doesn't have to do the above. Since the code uses the CRTP to call load_override in the most derived class, the class above can be left unchanged and the following can be included in the most derived class:

template<class T>
void load_override(T & t, BOOST_PFTO int){
    // your own dispatch code here for particular cases.
    // fall through to default/universal implementation
    basic_text_iarchive<Archive>::load_override(t, 0);
}

Adding your own dispatch code in the indicated place is exactly the same as incorporating your code into iserializer.hpp - except that your special dispatch code will only be included when requested and won't have to be bypassed conditionally with a new type trait.

The problem with the above is that it applies only to one specific archive class. So my proposal was to make an "archive adaptor" to permit your overrides to be added to any functioning archive class.
then I see several major disadvantages of this approach:
1.) it fixes the value types for which fast array serialization can be done
2.) for M types to be serialized and N archives there is an MxN problem in this approach.
3.) it leads to a tight coupling between archives and all classes that can profit from fast array serialization (called "array-like classes" below), and makes the archive depend on implementation details of the array-like classes
4.) it is not easily extensible to new array-like classes
I believe that the above points really refer to the specific override I used in my example. I have no issue at all with your particular overrides. In fact, I'm pleased that people are finding that the library can be extended to handle more specific cases. I just want to keep these add-ins as exactly that - optional additions. Your override can be as elaborate as you want, including your own trait - is_contiguous or whatever. We're writing the same code - it's just placed in different source modules. Yours places it in parts of the library that everyone uses, mine places it in separate header modules. The point is that we would have two orthogonal components to maintain.
Let me elaborate on these points below and provide a possible solution to each of them. The simplest solution, as I see it will be to
- provide an additional traits class has_fast_array_serialization
- archives offering the (optional) fast array serialization provide a save_array member function in addition to save and save_binary
- the dispatch to either save() or save_array() is the responsibility of the serialization code of the class, and not the responsibility of the archive
That sounds very good to me. Maybe I spoke too soon. I don't see how this would require any changes at all to the serialization library. I don't see why has_fast_array_serialization has to be part of the serialization library. Maybe all the code can be included in boost/serialization/fast_array.hpp? This header would be included by all the classes that use it and no others.
These are minor extensions to the serialization library that do not break any existing code and do not make it harder to write a new archive or a new serialize function, but they allow new types of archives and can give huge speedups for large data sets.
It's clear I'm missing something here. I'll have to look more deeply into this when I get a couple of other monkeys off my back.
------- DETAILS -----------
Now the details
ad 1.:
... no problem here.
ad 2.: ... a) a potential portable binary archive might need to do byte reordering: ... no problem
b) an XDR archive, using XDR streams needs to make a call to an XDR function,
... no problem
c) .. and again, I need type information and cannot just call save_binary.
.. no problem
d) .. no problem
For this reason my proposed solution is to dispatch to a save_array function for those types and archives supporting it:
template<class Base>
class fast_oarchive_impl : public Base {
public:
    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(
        const std::vector<T> & t,
        int,
        typename boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
    ){
        save_array(&(t[0]), t.size());
    }
    ...
};
I'm quite satisfied with this. My point is that none of this has to be part of the serialization library itself. It can be a separate module like serialization/variant.hpp is.
where all archive classes provide a function like
void Archive::save_array(Type const *, std::size_t)
for all types for which the trait has_fast_array_serialization<Archive,Type> is true.
Now here is where we're going to part company.
That way a single overload suffices for all the N=5 archive types presented above, and the MxN problem is solved. Note also that archives not supporting this fast array serialization do not need to implement anything, as the default for has_fast_array_serialization<Archive,Type> is false.
But maybe not. If has_fast_array_serialization<Archive,Type> is defined in boost/serialization/fast_array.hpp I'm still OK with it.
ad 3.:
This introduces implementation details of the mtl_dense_matrix class into the archive, breaks orthogonality, and leads to a tight coupling. Change in these implementation details of the mtl_dense_matrix might require changes to the archive classes.
we certainly want to avoid that!!!
The solution is easy:
- some archives provide fast array serialization through the save_array member function - let the MTL be responsible for serialization of it own classes, and use save_array where appropriate
Just great !!!
ad 4.: ...
I'll agree with that also
To summarize, with three minor extensions to the serialization library, none of which breaks any existing code, we can get 10x speedups for serialization of large data sets, enable new types of archives such as MPI archives, and all of that without introducing any of the four problems discussed here.
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapsulated in one (or more) separate optional headers. This would be the best of all possible worlds. Robert Ramey
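As a sketch of how that packaging could look (the header name follows the earlier suggestion; everything else here is hypothetical): the optional header holds the default trait and the dispatched overloads, and an archive opts in from its own header without touching the core library.

// boost/serialization/fast_array.hpp (hypothetical layout)
//
// (1) the default trait - false for every archive that says nothing:
template <class Archive, class Type>
struct has_fast_array_serialization : boost::mpl::bool_<false> {};
//
// (2) the enable_if-dispatched save/load overloads for vectors and
//     C-arrays, as shown elsewhere in this thread.

// An archive that wants the fast path specializes the trait in its own
// header and provides the member function the dispatch will call:
class my_binary_oarchive;   // hypothetical archive

template <class Type>
struct has_fast_array_serialization<my_binary_oarchive, Type>
    : boost::is_fundamental<Type> {};

// template<class T>
// void my_binary_oarchive::save_array(T const * address, std::size_t count)
// { save_binary(address, count * sizeof(T)); }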

On Oct 10, 2005, at 6:48 PM, Robert Ramey wrote:
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapsulated in one (or more) separate optional headers. This would be the best of all possible worlds.
It seems we are very close to a consensus. I agree with you that above would be the ideal solution. I believe that I fully understand now where the one or two small issues are that prevent this ideal solution, but as I said I want to explain this in detail when I'm awake. Matthias

On Oct 10, 2005, at 6:48 PM, Robert Ramey wrote:
That sounds very good to me. Maybe I spoke too soon. I don't see how this would require any changes at all to the serialization library. I don't see why has_fast_array_serialization has to be part of the serialization library. Maybe all the code can be included in boost/serialization/fast_array.hpp? This header would be included by all the classes that use it and no others.
[snip]
If has_fast_array_serialization<Archive,Type> is defined in boost/serialization/fast_array.hpp I'm still OK with it.
That's where it is defined in my proposal.
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapsulated in one (or more) separate optional headers. This would be the best of all possible worlds.
There are actually only very few modifications:

1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast array serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header, boost/serialization/array.hpp, as is done for all C++ classes.

2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?

3. I had to introduce a new strong typedef in basic_archive.hpp:

BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)

I remember that you suggested in the past that this should be done anyway. One reason is that using an unsigned int for the size of a container, as you do now, will not work on platforms with 32-bit int and 64-bit std::size_t: the size of a container can be more than 2^32. I don't always want to serialize std::size_t as the integer chosen by the specific implementation either, since that would again not be portable. By introducing a strong typedef, the archive implementation can decide how to serialize the size of a container.

The further modifications to the library in

boost/serialization/collections_load_imp.hpp
boost/serialization/collections_save_imp.hpp
boost/serialization/vector.hpp

were to change the collection serialization to use the container_size_type. I don't think that you will object to this.

There is actually another hidden reason for this strong typedef: efficient MPI serialization without the need to copy into a buffer requires that I can distinguish between special types used to describe the data structure (class id, object id, pointers, container sizes, ...) and plain data members.

Next I have done a few changes to archive implementations, the only important one of which is:

4. boost/archive/basic_binary_[io]archive.hpp serialize container_size_type as an unsigned int, as done till now. It might be better to bump the file version and serialize them as std::size_t.

All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymorphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just add the save_array/load_array features to the existing archives.

Of all the points above, I believe that you will not have anything against points 3 and 4, since you proposed something similar already in the past if I remember correctly. Issue 2 should also be noncontroversial, and the main discussion should thus be on issue 1: how one can improve the design of the [io]serializers to move the implementation of array serialization into a separate header.

Matthias
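A minimal sketch of why the strong typedef matters (the archive class below is illustrative only): container_size_type is a distinct type from std::size_t, so an archive can overload on exactly this type and choose its own on-disk representation.

BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)

class legacy_binary_oarchive /* : ... */ {
public:
    // ordinary values go through the generic path ...
    template<class T> void save(const T & t);

    // ... but container sizes are intercepted and written as the old
    // 32-bit unsigned int, preserving the existing file format
    void save(const container_size_type & t){
        const unsigned int x = static_cast<unsigned int>(t);
        save(x);
    }
};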

Just to chime in, the modifications I came up with for size_t are very similar. One must be able to treat size_t separately; #ifdefs and overloads of save() in the derived archive type won't cut it. I gather that this isn't big news.

These portability testing mods might come in handy if you're going to make these modifications. Robert, do you want this stuff? I'm worried that the integration could start to become a hassle. It's a lot of trivial changes to test modules, Jamfile tweaks, a modified tmpnam() and a modified remove(), the use of boost::random instead of std::rand() so that A is the same on all architectures, reseeding of the rngs in certain places, and the ubiquitous class "A" is either a portable one or a nonportable one depending on what archive you're testing. If you don't specify the "I want to test portability" flag, the tests run as they do now, except that they're unit tests, not test_main() tests.

The portability bit comes in when you specify --serialization-testdata-dir=/some/where. I've changed tmpnam() to a macro TESTFILE("unit_test_name") which returns /some/where/platform/version/compiler/archivetype.unit_test_name (for instance /path/to/Mac_OS/103300/GNU_C__version_4.0.0/portable_binary_archive.hpp.variant_A), and remove(), now finish("unit_test_name"), is a no-op when testing portability. This allows you to afterwards run a little utility to walk the filesystem at /some/where and compare the checksums of the corresponding archivetype.unittestnames. So one just points /some/where to a network disk, or writes a little script to do a remote copy, and then runs the comparison.

My hunch is that a checksum won't be a good comparison for xml and text archives due to variances in the underlying implementations of << for primitive types (I've only been hammering on a portable binary archive), but one could easily use a whitespace-ignorant "diff" or something. It isn't ideal, as whitespace differences could still conceivably trip things up, but fixing that would require extensive modifications to every unit test, and I wasn't going to do them if there was a reasonable chance the changes wouldn't be used.

Another problem is that it isn't easy to plug in your own archive type. One must add files to libs/serialization/test and hack around with the jamfiles. Needs a better interface.

Matthias, I'm curious as to what your testing strategy has been, how automated it is, and if you see such a scheme as being useful...

-t

On Tue, Oct 11, 2005 at 09:57:04AM +0200, Matthias Troyer wrote:
There are actually only very few modifications:
1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast array serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header, boost/serialization/array.hpp, as is done for all C++ classes.

2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
I remember that you suggested in the past that this should be done anyway. One reason is that using an unsigned int for the size of a container, as you do now, will not work on platforms with 32-bit int and 64-bit std::size_t: the size of a container can be more than 2^32. I don't always want to serialize std::size_t as the integer chosen by the specific implementation either, since that would again not be portable. By introducing a strong typedef, the archive implementation can decide how to serialize the size of a container.
The further modifications to the library in
boost/serialization/collections_load_imp.hpp boost/serialization/collections_save_imp.hpp boost/serialization/vector.hpp
were to change the collection serialization to use the container_size_type.
I don't think that you will object to this.
There is actually another hidden reason for this strong typedef: Efficient MPI serialization without need to copy into a buffer requires that I can distinguish between special types used to describe the data structure (class id, object id, pointers, container sizes, ...) and plain data members.
Next I have done a few changes to archive implementations, the only important one of which is:
4. boost/archive/basic_binary_[io]archive.hpp serialize container_size_type as an unsigned int as done till now. It might be better to bump the file version and serialize them as std::size_t.
All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymorphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just add the save_array/load_array features to the existing archives.
Of all the points above, I believe that you will not have anything against points 3 and 4 since you proposed something similar already in the past if I remember correctly.
Issue 2 should also be noncontroversial, and the main discussion should thus be on issue 1: how one can improve the design of the [io]serializers to move the implementation of array serialization into a separate header.
Matthias

troy d. straszheim wrote:
Just to chime in, the modifications I came up with for size_t are very similar. One must be able to treat size_t separately; #ifdefs and overloads of save() in the derived archive type won't cut it. I gather that this isn't big news.
These portability testing mods might come in handy if you're going to make these modifications. Robert, do you want this stuff? I'm worried that the integration could start to become a hassle. It's a lot of trivial changes to test modules, Jamfile tweaks, a modified tmpnam() and a modified remove(), the use of boost::random instead of std::rand() so that A is the same on all architectures, reseeding of the rngs in certain places, and the ubiquitous class "A" is either a portable one or a nonportable one depending on what archive you're testing. If you don't specify the "I want to test portability" flag, the tests run as they do now, except that they're unit tests, not test_main() tests.
I'm very interested in this. I would like to see all the improvements you've made incorporated into the tests - it's just that I'm bogged down on other stuff. I also understand you've made headway in moving to the unit test platform, which I also want to see. It sounds like you've addressed a lot of the pending issues regarding the tests and I want to see all this incorporated. Don't get discouraged. Of course we've got the problem that our consumption of testing resources has hit its limit. It seems to me that we're going to end up with a significantly enhanced Jamfile, or better yet several Jamfiles. Of course this takes time, but I would much like to see this done. When I get the current set of monkeys off my back this can be looked at.

To summarize, I would like to see:

a) tests converted to unit_tests
b) tests improved to eliminate security issues - this really requires making another new animal - a temporary C++ file stream - which might or might not currently exist somewhere.
c) investigation of stringstream. It doesn't seem that stringstream handles the codecvt facet. I don't know why this is, but that's one reason I depend on temporary files as much as I do. So that's an interesting issue to investigate.
d) a couple of different Jamfiles for selecting different kinds of testing:
   i) core library
   ii) add-ons - e.g. stl serializations, shared_ptr, etc.
   iii) portability - your name here
   iv) performance - only makes sense in release mode anyway, so it's handy for it to be separate.
   v) plug-in. I haven't been able to figure out how to make bjam do what I want here. But I haven't spent much time on it lately.

So, don't get discouraged because I haven't really had time to get to know your stuff better. Help me out with this and we'll get your name up there "in lights". Robert Ramey
The portability bit comes in when you specify --serialization-testdata-dir=/some/where. I've changed tmpnam() to a macro TESTFILE("unit_test_name") which returns /some/where/platform/version/compiler/archivetype.unit_test_name (for instance /path/to/Mac_OS/103300/GNU_C__version_4.0.0/portable_binary_archive.hpp.variant_A), and remove(), now finish("unit_test_name"), is a no-op when testing portability. This allows you to afterwards run a little utility to walk the filesystem at /some/where and compare the checksums of the corresponding archivetype.unittestnames. So one just points /some/where to a network disk, or writes a little script to do a remote copy, and then runs the comparison. My hunch is that a checksum won't be a good comparison for xml and text archives due to variances in the underlying implementations of << for primitive types (I've only been hammering on a portable binary archive), but one could easily use a whitespace-ignorant "diff" or something. It isn't ideal, as whitespace differences could still conceivably trip things up, but fixing that would require extensive modifications to every unit test, and I wasn't going to do them if there was a reasonable chance the changes wouldn't be used.
Keep this going - but let me have a chance to study it so we're on the same page. I want this stuff in there.
Another problem is that it isn't easy to plug in your own archive type. One must add files to libs/serialization/test and hack around with the jamfiles. Needs a better interface.
How about run_archive_test.sh? Doesn't that do it? Robert Ramey

Matthias Troyer wrote:
On Oct 10, 2005, at 6:48 PM, Robert Ramey wrote:
That sounds very good to me. Maybe I spoke too soon. I don't see how this would require any changes at all to the serialization library. I don't see why has_fast_array_serialization has to be part of the serialization library. Maybe all the code can be included in boost/serialization/fast_array.hpp? This header would be included by all the classes that use it and no others.
[snip]
If has_fast_array_serialization<Archive,Type> is defined in boost/serialization/fast_array.hpp I'm still OK with it.
That's where it is defined in my proposal.
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapsulated in one (or more) separate optional headers. This would be the best of all possible worlds.
There are actually only very few modifications:
1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast array serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header, boost/serialization/array.hpp, as is done for all C++ classes.
My intention was to include all types "built-in" to the C++ language in ?serializer.hpp, so that's why it includes C++ arrays. Separating this into another header would break this "concept" and would result in a small additional *.hpp file that would have to be included explicitly. So I would be against it in this case. On the other hand, I do appreciate ideas that remove pieces from the "core" of the library and turn them into more independent modules, so don't get discouraged here.
2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
That would mean that users are including save/load_array even if they don't want them or want to use their own versions. Oh - then the documentation has to be enhanced to explain all this internal behavior. I would prefer something like the following:

class my_class {
    std::vector<int> m_vi;
    ...
};

template<class Archive>
void my_class::serialize(Archive &ar, const unsigned int version){
    // standard way
    ar & m_vi;
    // or fast way which defaults to standard way in appropriate cases.
    save_array(ar, m_vi);
}

This
a) keeps the stl portion of the library smaller
b) leaves the user in control of what's going on
c) permits development of save/load array to be on an independent parallel track with everything else.

If we eventually discover that everyone is always using save/load array, then we can study whether we want to just enhance std::vector, etc. to include it as default functionality.
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
...
I don't think that you will object to this.
... I looked at this and right away I noticed that, as written, this would make all existing binary archives unreadable. The real fix for this is:

a) bump up the library version from 3 to 4
b) alter save/load of collections to use unsigned int or size_t depending upon the library version.

It has to be done this way since stl collections are marked as non-versioned. I was sort of reluctant to do this, as it seemed to me that it would be yet a while before anyone starts to save more than 2,000,000,000 objects with the serialization library, and it seemed wasteful for the native binary archive to include an extra 4 bytes containing 0's for every collection serialized. But maybe now (I mean 1.34) is the time to bite the bullet on this. I don't have a strong preference for either changing it or leaving it the same. If you need it, I would say fine - we'll do it.
4. boost/archive/basic_binary_[io]archive.hpp serialize container_size_type as an unsigned int as done till now. It might be better to bump the file version and serialize them as std::size_t.
agreed - no point in using size_t in the collection code just to throw it away.
All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymorphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just add the save_array/load_array features to the existing archives.
sounds like you can go along with me on this.
Of all the points above, I believe that you will not have anything against points 3 and 4 since you proposed something similar already in the past if I remember correctly.
I don't see a problem here.
Issue 2 should also be noncontroversial, and the main discussion should thus be on issue 1: how one can improve the design of the [io]serializers to move the implementation of array serialization into a separate header.
looks like we'll have to arm wrestle a little more here. In a way you've got me over a barrel. I continually advocate keeping the "core" of the library small by factoring out everything that can be factored out. Now you've got me on C++ arrays. Sort of. I'll re-iterate my view that the "core" library addresses serialization of all "built-in" language types - and no others.

The other part of your proposal really is an enhancement of the stl serializations, which are not part of the "core" part of the library. I'm confident that your enhancement will be quite useful to many people and I'm very happy to see that you've done it. That's not the same as forcing it on all users of the library. So I prefer to see your enhancement as an option explicitly invoked by the user. This has a number of advantages:

a) the user can see what he is doing. No hidden complex behavior.
b) some users might want just minimal code since their collections are small.
c) your enhancement will require significant documentation. This will be much easier if it is an optional add-on to the serialization library.
d) parallel or layered development is facilitated.

So to summarize the issues:

a) should C++ array serialization be in a separate header? I say no, you say yes.
b) should save/load array be incorporated into stl collection serialization to make its usage obligatory? I say no, you say yes.

Regardless of how these questions are answered, it's clear to me that your enhancements to stl containers will be available to users. Assuming that this is packaged as an optional add-on, as I would hope, the only questions remaining will be:

a) Should this be a totally separate library with its own documentation/tests/directory tree etc.? It should be separate but not totally so. Maybe a file or maybe a directory within serialization for save/load array, and within archive for the fast...archive adaptor. A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this. A separate documentation section in the documentation of the serialization library. Similar to the miscellanea. But the miscellanea holds things that are really separate, so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter). Note that your fast...archive will really be a new creature - an "archive adaptor". This merits a new section in the documentation in any case. This is a cool and useful technique and will encourage future great ideas which enhance the serialization library without making the core bigger.

b) Should such an optional enhancement be subject to some sort of review? I'm agnostic on this. I would be happy to just be the gatekeeper and accept it or make you alter it to my taste. Of course, there's no guarantee that anyone else would be happy with this. I'm willing to go along with whatever the consensus is regarding if and/or how this thing should be reviewed.

===============

A final note to others wishing to participate in the serialization library: welcome and good luck. Having said that, I will give a little advice on how to be most successful in getting my cooperation (encouragement you get for free).

a) Notice that I make huge efforts to keep the "core" library from growing. Anything that adds to it makes my job bigger - I have to make this job smaller.
b) I am out of the business of writing serializations for specific classes.
It's up to those who need it to get it done. Of course I'm always full of advice - sorry about that.
c) I am out of the business of writing new archive types. Instead, I want to improve the documentation to make this easier.
d) My efforts will be focused on implementation aspects of the "core" library. This includes:
   i) pending issues regarding dynamic loading/unloading of code that contains serialization for specific types. This includes testing of support for plug-ins.
   ii) making a good performance test and correcting any performance bottlenecks in the core library.
   iii) making testing more efficient.
   iv) fixing bugs.

I am getting a little "burned out" on the serialization library. It's only my obsessive nature that makes me continue to the bitter end, which I'm hoping I am approaching - at least asymptotically. I am extremely gratified by your efforts and those of others to enhance the library, and will do everything (subject to the above reservations) I can to encourage and support them. Nothing would make me happier than to see people spinning off their own improved serializations and archive classes and making the package the default choice for C++ object serialization. Robert Ramey

Hi Robert, Let me just answer a simple point tonight. You'll get my reply to the rest of the mail in the morning. On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
I looked at this and right away I noticed that, as written this would make all existing binary archives unreadable.
Have you looked at the following code I proposed to add for the binary_iarchives?

void load_override(container_size_type & t, int){
    // up to 2G objects
    unsigned int x;
    * this->This() >> x;
    t = container_size_type(x);
}

That, and the similar code for the output archives, should do the trick and keep backward compatibility, or am I mistaken? However, I fully agree (and actually proposed so myself in the original mail) to do:
The real fix for this is:
a) bump up the library version from 3 to 4
b) alter save/load of collections to use unsigned int or size_t depending upon the library version.
but I wanted to leave this to you, since this would be a change to the binary file format, which I believe should be done by you. More tomorrow when I'm awake again, Matthias

Whoops, I didn't see that. It would handle the backward compatibility issue. In any case I'd rather just get it over with and move to 64 bits on those platforms. We'll just implement the complete fix in the next version. Robert Ramey Matthias Troyer wrote:
Hi Robert,
Let me just answer a simple point tonight. You'll get my reply to the rest of the mail in the morning.
On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
I looked at this and right away I noticed that, as written this would make all existing binary archives unreadable.
Have you looked at the following code I proposed to add for the binary_iarchives:
void load_override(container_size_type & t, int){
    // up to 2G objects
    unsigned int x;
    * this->This() >> x;
    t = container_size_type(x);
}
That, and the similar code for the output archives, should do the trick and keep backward compatibility, or am I mistaken?
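A sketch of that output-side counterpart, assuming the same This() idiom as on the input side (this code is illustrative, not quoted from the patch): the 64-bit size is narrowed to the unsigned int the old format expects.

void save_override(const container_size_type & t, int){
    // up to 2G objects, mirroring the load side
    const unsigned int x = static_cast<unsigned int>(t);
    * this->This() << x;
}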
However, I fully agree (and actually proposed so myself in the original mail) to do:
The real fix for this is:
a) bump up the library version from 3 to 4
b) alter save/load of collections to use unsigned int or size_t depending upon the library version.
but I wanted to leave this to you, since this would be a change to the binary file format, which I believe should be done by you.
More tomorrow when I'm awake again,
Matthias

On Oct 11, 2005, at 11:48 PM, Robert Ramey wrote:
Matthias Troyer wrote:
I looked at this and right away I noticed that, as written this would make all existing binary archives unreadable.
Have you looked at the following code I proposed to add for the binary_iarchives:
[snip]
but I wanted to leave this to you, since this would be a change to the binary file format, which I believe should be done by you.
Whoops, I didn't see that. It would handle the backward compatibility issue. In any case I'd rather just get it over with and move to 64 bits on those platforms. We'll just implement the complete fix in the next version.
Great! Matthias

On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
Matthias Troyer wrote:
There are actually only very few modifications:
1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast array serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header, boost/serialization/array.hpp, as is done for all C++ classes.
My intention was to include all types "built-in" to the C++ language in ?serializer.hpp, so that's why it includes C++ arrays. Separating this into another header would break this "concept" and would result in a small additional *.hpp file that would have to be included explicitly. So I would be against it in this case. On the other hand, I do appreciate ideas that remove pieces from the "core" of the library and turn them into more independent modules, so don't get discouraged here.
Actually, the way I see it, you deal with the serialization of pointers, the class versioning, and object tracking there. All of these get serialized through special types, created by strong typedefs, and the archives can then override the serialization of these classes. Please correct me if I'm wrong, but the built-in types like int, double, etc. are actually all dispatched to the archive, and not serialized by the [io]serializer. The only exceptions seem to be pointers (which should be handled by the serialization library) and C-arrays. It would thus make sense to put the serialization of arrays into a separate header, just as you have done for std::vector, and as I will do soon for other classes.

However, there are also other options, as you pointed out: the archive classes could override the serialization of arrays. As long as this stays limited to arrays there will be no MxN scaling problem, but there is still a problem of code duplication, since each archive type implementing fast array serialization has to override the serialization of arrays. This is also error-prone, since we have to tell the implementors of archives supporting fast array serialization that they should not forget to override the serialization of built-in arrays.
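For illustration, a sketch of what such a separate boost/serialization/array.hpp could contain for built-in arrays, mirroring the vector treatment (output side shown; an input overload would call load_array analogously; the dispatch details are assumptions, not the actual patch):

template<class Archive, class T, std::size_t N>
void save(Archive & ar, const T (&a)[N], const unsigned int /* version */,
          typename boost::enable_if<
              boost::archive::has_fast_array_serialization<Archive,T>
          >::type * = 0)
{
    ar.save_array(&a[0], N);   // one call instead of N
}

template<class Archive, class T, std::size_t N>
void save(Archive & ar, const T (&a)[N], const unsigned int /* version */,
          typename boost::disable_if<
              boost::archive::has_fast_array_serialization<Archive,T>
          >::type * = 0)
{
    for (std::size_t i = 0; i != N; ++i)
        ar << a[i];            // default element-wise path
}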
2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
That would mean that users are including save/load_array even if they don't want them or want to use their own versions. Oh - then the documentation has to be enhanced to explain all this internal behavior.
Actually the cost is minimal if the archive does not support fast save/load_array. The has_fast_array_serialization.hpp header only consists of the default traits:

template <class Archive, class Type>
struct has_fast_array_serialization : public mpl::bool_<false> {};

and the serialization of a std::vector only contains this minimal extension:

template<class Archive, class U, class Allocator>
inline void save(
    Archive & ar,
    const std::vector<U, Allocator> &t,
    const unsigned int /* file_version */,
    typename boost::enable_if<
        boost::archive::has_fast_array_serialization<Archive,U> >::type* =0
){
    const boost::archive::container_size_type count(t.size());
    ar << BOOST_SERIALIZATION_NVP(count);
    if (count)
        ar.save_array(boost::detail::get_data(t), t.size());
}

The cost of parsing these few lines is negligible compared to the rest of the serialization library.
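For completeness, a sketch of the complementary overload chosen when the trait is false, so non-supporting archives keep the existing element-wise behaviour (boost::disable_if is the negation of the enable_if above; nvp wrapping of the elements is omitted for brevity):

template<class Archive, class U, class Allocator>
inline void save(
    Archive & ar,
    const std::vector<U, Allocator> &t,
    const unsigned int /* file_version */,
    typename boost::disable_if<
        boost::archive::has_fast_array_serialization<Archive,U> >::type* =0
){
    const boost::archive::container_size_type count(t.size());
    ar << BOOST_SERIALIZATION_NVP(count);
    for (std::size_t i = 0; i != t.size(); ++i)
        ar << t[i];   // the slow path: one archive call per element
}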
I would prefer something like the following:
class my_class {
    std::vector<int> m_vi;
    ...
};

template<class Archive>
void my_class::serialize(Archive &ar, const unsigned int version){
    // standard way
    ar & m_vi;
    // or fast way which defaults to standard way in appropriate cases.
    save_array(ar, m_vi);
}
This
a) keeps the stl portion of the library smaller
b) leaves the user in control of what's going on
c) permits development of save/load array to be on an independent parallel track with everything else.
I find this proposal unacceptable for the following reasons:

- it breaks the orthogonality between serialization and archives
- how the array representation of the vector gets serialized should be a concern of the archive and not of the user
- the user has to remember to always call save_array(ar, m_vi) instead of just serializing the vector directly. This is quite error-prone and will easily lead to sub-optimal code.
- the user has to know for which classes to call save_array and for which it is not needed. For vector it might be intuitive, but what about ublas matrix types: do you know which ublas matrix types can use fast array serialization and which ones cannot? Or, even worse, if the matrix type is a template parameter you have a worse problem.

Now to address your issues:

a) keeping the STL portion small: I don't see this as a valid point since, as you can see above, it increases the size of the STL serialization code by only a few lines.
b) "leave the user in control of what's going on": actually this is what breaks the orthogonality. The user should not influence the archive's internal behavior. The archive class should decide how to serialize the array, not the user. The user can pick between fast and slow array serialization by choosing a different archive class.
c) development on an independent track: the only interference we have is this one file, vector.hpp.
All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymporphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just addd the save_array/load_array features to the existing archives.
sounds like you can go along with me on this.
I'll do that, but I still think that it is the wrong way to go for the binary archives. The only difference are these five lines in basic_binary_iprimitive.hpp:

template<class T>
void load_array(T *address, std::size_t count)
{
    load_binary(address, count*sizeof(T));
}

but they give you a factor of 10 in performance! Why should anyone still use the other version? To save the compile time for 5 lines of code?
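The matching output side in basic_binary_oprimitive.hpp would be just as small (a sketch, mirroring the load_array quoted above):

template<class T>
void save_array(T const * address, std::size_t count)
{
    save_binary(address, count*sizeof(T));
}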
The other part of your proposal really is an enhancement of the stl serializations which are not part of the "core" part of the library. I'm confident that your enhancement will be quite useful to many people and I'm very happy to see that you've done it. That's not the same as forcing it on all users of the library. So I prefer to see your enhancement as an option explicitly invoked by the user. This has a number of advantages:
a) the user can see what he is doing. No hidden complex behavior.
The fast array serialization is an implementation detail of the archive, and need not be visible to the user at all. Note that none of the archive modifications I suggested change anything that is visible to the user, aside from a speedup in execution time. The syntax and semantics of the serialization are unchanged, as is the archive itself. Only the process of serialization is faster.
b) some users might want just minimal code since their collections are small.
we're talking about a few lines of code that can improve performance by a factor of 10 in a library that is many thousands of lines. Is that really an issue?
c) your enhancement will require significant documentation. This will be much easier if it is an optional add-on to the serialization library.
It requires one or two pages of documentation for archive developers, and none for users unless they implement serialization of array-like containers.
So to summarize the issues
a) should C++ array serialization be in a separate header? I say no, you say yes.
b) should save/load array be incorporated into stl collection serialization to make its usage obligatory? I say no, you say yes.
This point b) is where I will not budge, for the reasons explained in earlier e-mails. While I could maybe live with the fact that I have to override the C-style array serialization in all archives supporting fast array serialization, I will never do that for other classes, since this again opens the can of worms discussed previously.

Let me outline it again: if the vector.hpp serialization stays unchanged, I will have to override it in the archive. Next we'll implement the std::valarray serialization. What should we do? Support fast array serialization out of the box, or leave it to the archive implementor to override? We'll probably follow the example of std::vector and not support it. Now the archive also has to provide overrides for std::valarray, which can still be done. After that we'll implement serialization of ublas matrices. Following the above examples we will again not implement support for fast array serialization directly, to save a few lines of code. The consequence is even worse now: the archive implementation has to override the serialization of all ublas matrices, and will either be inefficient, or has to have knowledge of implementation details of the ublas matrices. We would be back at both an MxN problem and a tight coupling between archives and the implementation details of the classes to be serialized. We should avoid this at all cost!

So the real question here is: "Shall we recommend that the serialization of array-like data structures uses fast array serialization, by calling save/load_array when possible?" My clear answer is yes, and I will not budge on that. The serialization library is useless to me with a 10x performance hit. And many people I talked to do not use Boost.Serialization, but their own (otherwise inferior) solutions, for that reason. I just want to mention that vectors with billions of elements are typical sizes for many of our problems.

The real question is where to draw the line between using fast array serialization and not using it?

- I think we can agree that classes like multi_array or ublas matrices and vectors should be recommended to use it wherever possible.
- The same should be true for std::valarray.
- To stay consistent we should then also use it for std::vector.
- What about C-arrays? Since I rarely actually use them in raw form in my code, and never for large sizes, I have no strong personal preference. It would just be consistent, and would speed up serialization at negligible compile-time cost, to also use the fast option there, but if you veto it I could live with that.
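A sketch of the std::valarray case argued above, following the same dispatch pattern as the vector code (a hedged illustration, not the eventual implementation; the const_cast is needed because the const operator[] of std::valarray returns by value):

template<class Archive, class T>
void save(Archive & ar, const std::valarray<T> & t,
          const unsigned int /* version */,
          typename boost::enable_if<
              boost::archive::has_fast_array_serialization<Archive,T>
          >::type * = 0)
{
    const boost::archive::container_size_type count(t.size());
    ar << BOOST_SERIALIZATION_NVP(count);
    if (count)
        ar.save_array(&const_cast<std::valarray<T>&>(t)[0], t.size());
}

With the trait in place, the container's own serialization picks the fast path, and no archive ever needs to know valarray internals.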
Regardless of how these questions are answered, its clear to me that your enhancements to stl containers will be available to users. Assuming that this is packaged as an optional add-on as I would hope, the only questions remaining will be:
a) Should this be a totally separate library with its own documentation/tests/directory tree etc? It should be separate but not totally so:
Maybe a file or maybe a directory within serialization for save/load array, and within archive for the fast...archive adaptor.
A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this.
A separate documentation section in the documentation of the serialization library. Similar to the miscellanea. But the miscellanea holds things that are really separate, so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter).
Note that your fast...archive will really be a new creature - an "archive adaptor". This merits a new section in the documentation in any case. This is a cool and useful technique and will encourage future great ideas which enhance the serialization library without making the core bigger.
Actually it might be an adaptor only in the case of the binary archive. Other archives I mentioned (such as the MPI archive) will have to support fast array serialization directly to have any chance of being usable.
b) Should such an optional enhancement be subject to some sort of review? I'm agnostic on this. I would be happy to just be the gatekeeper and accept it or make you alter it to my taste.
I have no strong opinion on this. For now I need a mechanism for fast array serialization in place to be able to implement serialization of large arrays and matrices. Optimized archives could come later and could go through some kind of review, e.g. an MPI archive together with a possible future parallel programming library. Regarding the binary archives: if your concern is that it will make it harder for you to maintain, then I could, if you want, propose to submit the fast array version as a replacement for the existing one and take over its maintenance. That will make your task smaller and in the review of the submission we can hear if someone wants to keep the existing version. Matthias

On Wed, Oct 12, 2005 at 02:17:06PM +0200, Matthias Troyer wrote:
On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
I would prefer something like the following:
class my_class {
    std::vector<int> m_vi;
    ...
};

template<class Archive>
void my_class::serialize(Archive &ar, const unsigned int version){
    // standard way
    ar & m_vi;
    // or fast way which defaults to standard way in appropriate cases.
    save_array(ar, m_vi);
}
This
a) keeps the stl portion of the library smaller
b) leaves the user in control of what's going on
c) permits development of save/load array to be on an independent parallel track with everything else.
I find this proposal unacceptable for the following reasons:
- it breaks the orthogonality between serialization and archives
- how the array representation of the vector gets serialized should be a concern of the archive and not the user
- the user has to remember to always call save_array(ar,m_vi) instead of just serializing the vector directly. This is quite error-prone and will easily lead to sub-optimal code.
- the user has to know for which classes to call save_array and for which it will not be needed. For vector it might be intuitive, but what about ublas matrix types: do you know which ublas matrix types can use fast array serialization and which ones cannot? Or, even worse, if the matrix type is a template parameter you have a worse problem.
I'm not real fond of adding save_array() to the archive's interface either. When you first see the examples for the serialization library:

struct Something {
    double x, y, z;
    doesnt_matter_what_it_is_t d;

    template <class Archive>
    void serialize(Archive & ar, unsigned version) {
        ar & x;
        ar & y;
        ar & z;
        ar & d;
    }
};

you think: Wow. That's cool. It's so clean. And you can pass *anything* to the archive? It tracks the pointers and everything? Wow. When you later get to the nvp() stuff, the base_object<> and export macros, you react "ah, well, it can't all be magic. It's still supercool", and despite these base_object<>-type caveats you can still teach a monkey to put serialization routines into his classes (for me this is essential). To the monkey it makes sense that you have to explain things like your base classes to the serialization library. It won't make sense that you can pass the archive an int, a map, or a pointer to a variant, but for arrays you have to do something special.

If you forget an nvp() or a base_object(), your data isn't serialized correctly, or the code won't compile. The problems are easy to locate, as they appear early. save_array() wouldn't be like that. Things will serialize correctly but slowly, and then you have to go digging. Most importantly,

template <class Archive>
void serialize(Archive & ar, unsigned version) {
    ar & make_nvp("x", x);
    ar & make_nvp("y", y);
    ar & make_nvp("z", z);
    save_array(ar, make_nvp("some_array", some_array));
}

is just ugly. Sorry, but it is. It's a big wart on an otherwise extremely finely crafted interface. (I think the operator&() is elegant, for the record.)
A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this.
I'll volunteer (well, I already am) to help with testing. I'll help out with maintenance as well (I'm all gcc/linux/mac, no overlap with your testing). Whatever it takes to not have to save_array(). :) I'll also provide tests that verify that these changes are backwards compatible.
A separate documenation section in the documenation of the serialization library. Similar to the miscelleneas. But miscellaneas holds things that are really separate so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter).
I'll volunteer to help with docs as well, though hopefully the "special considerations for collections" would be focused on archive authors. I think this would be a useful exercise, after all this has gone through and I've delivered some kind of portable binary archive to my client. -t

troy d. straszheim wrote:
struct Something {
    double x, y, z;
    doesnt_matter_what_it_is_t d;

    template <class Archive>
    void serialize(Archive & ar, unsigned version) {
        ar & x;
        ar & y;
        ar & z;
        ar & d;
    }
};
You think: Wow. That's cool. It's so clean. And you can pass *anything* to the archive? It tracks the pointers and everything? Wow. When you later get to the nvp() stuff, the base_object<> and export macros, you react "ah, well, it can't all be magic. It's still supercool", and despite these base_object<>-type caveats you can still teach a monkey to put serialization routines into his classes (for me this is essential). To the monkey it makes sense that you have to explain things like your base classes to the serialization library. It won't make sense that you can pass the archive an int, a map, or a pointer to a variant, but for arrays you have to do something special.
If you forget an nvp() or a base_object(), your data isn't serialized correctly, or the code won't compile. The problems are easy to locate as the problems appear early. save_array() wouldn't be like that. Things will serialize correctly but slowly, and then you have to go digging.
Most importantly,
    template <class Archive>
    void serialize(Archive & ar, unsigned version)
    {
        ar & make_nvp("x", x);
        ar & make_nvp("y", y);
        ar & make_nvp("z", z);
        save_array(ar, make_nvp("some_array", some_array));
    }
is just ugly. Sorry, but it is. It's a big wart on an otherwise extremely finely crafted interface. (I think the operator&() is elegant, for the record.)
This is a very convincing argument. That is - I'm convinced. I very much liked the monkey analogy. Not to say programmers are monkeys. But serialization is something I'm using so I can get on with the true topic at hand, so it's important to me that it "just works" without using up my precious brain stack space. Now take a look at my first idea - a fast archive adaptor which would overload serialization of STL vector and C array. Ideally, application of the wrapper to inappropriate adaptees would result in a compile-time assertion, so as to preserve the monkey-proof aspect of the library. Damn, now I've forgotten what the objections were to it. I'll have to go back and check.
A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this.
I'll volunteer (well, I already am) to help with testing. I'll help out with maintenance as well (I'm all gcc/linux/mac, no overlap with your testing). I'll also provide tests that verify that these changes are backwards compatible.
We will get to that. I'm interested in incorporating your improved testing. But I do have one concern. I test with Windows platforms including Borland and MSVC. These can be quite different from just testing with gcc and can suck up a lot of time. It may not be a big issue here, but it means you'll have to be aware not to do anything toooo tricky. Since you're interested in this I would suggest making a few new directories in your personal boost/libs/serialization tree. I see each of these directories having its own Jamfile, so we could just invoke runtest from any of the test suites by changing to the desired directory.
a) old_test - change the current test directory to this
b) test - the current test with your changes to use the unit_test library. You might send me the source to one of your changed tests to see if I want to comment on it before too much effort is invested.
c) test_compatibility - includes your backward-compatibility tests
d) test_performance - I want to include a few tests to measure times for things like serializing different primitives, opening/closing archives, etc. This would be similar to the current setup, so I could generate a table which shows which combinations of features and archives are bottlenecks. It's the hope that this would help detect really dumb oversights like recreating an XML character translation table for each XML character serialized!
A separate documentation section in the documentation of the serialization library. Similar to the miscellanea. But the miscellanea holds things that are really separate, so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter).
I'll volunteer to help with docs, as well, though hopefully the "special considerations for collections" would be focused on archive authors. I think this would be a useful exercise, after all this has gone through and I've delivered some kind of portable binary archive to my client.
It's a tiny bit premature - Archive Implementation needs at least another pass. But I would envisage either one or two new sections:
a) Archive adaptors. This is a class that can be applied to any existing archive in order to modify some aspects of its behavior by hiding the base class functions with an overloaded implementation. Refers to the fast array archive as an example.
b) Fast array archive adaptor - description of how to use it.
Just my thoughts. Robert Ramey

Robert Ramey wrote:
troy d. straszheim wrote:
    struct Something
    {
        double x, y, z;
        doesnt_matter_what_it_is_t d;

        template <class Archive>
        void serialize(Archive & ar, unsigned version)
        {
            ar & x;
            ar & y;
            ar & z;
            ar & d;
        }
    };
You think Wow. That's cool. It's so clean. And you can pass *anything* to the archive? It tracks the pointers and everything? Wow. When you later get on to nvp() stuff, the base_object<> and export macros, you react "ah, well, it can't all be magic. It's still supercool", and despite these base_object<>-type caveats you can still teach a monkey to put serialization routines into his classes (for me this is essential). To the monkey it makes sense that you have to explain things like your base classes to the serialization library. It won't make sense that you can pass the archive an int, map, or a pointer to a variant, but for arrays you have to do something special.
If you forget an nvp() or a base_object(), your data isn't serialized correctly, or the code won't compile. The problems are easy to locate as the problems appear early. save_array() wouldn't be like that. Things will serialize correctly but slowly, and then you have to go digging.
Most importantly,
    template <class Archive>
    void serialize(Archive & ar, unsigned version)
    {
        ar & make_nvp("x", x);
        ar & make_nvp("y", y);
        ar & make_nvp("z", z);
        save_array(ar, make_nvp("some_array", some_array));
    }
is just ugly. Sorry, but it is. It's a big wart on an otherwise extremely finely crafted interface. (I think the operator&() is elegant, for the record.)
This is a very convincing argument. That is - I'm convinced. I very much liked the monkey analogy. Not to say programmers are monkeys. But serialization is something I'm using so I can get on with the true topic at hand, so it's important to me that it "just works" without using up my precious brain stack space.
Hmm. If you want '&' notation, then why not ar & make_array(some_array); or ar & make_named_array("some_array", some_array); ? Anyway, can you show a few examples of serializing arrays WITHOUT save_array? When would it ever be this simple anyway? For a start, unless the array has a known fixed size the save and load functions need to be split, and in the load function the array size needs to be read first and the array constructed. Matthias' proposal was for

    save_array(T const * address, std::size_t length);
    load_array(T * address, std::size_t length);

which are very low-level operations acting on already existing contiguous arrays. That is, they would be used in (for example) the implementation of save() and load() for std::vector. If you are going to make comparisons as to what the resulting code looks like, you should at least compare against code that actually does the same thing as what Matthias proposes. Cheers, Ian
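(To make the make_array idea concrete, here is a minimal sketch of what such a wrapper might look like; the names array_wrapper and make_array, and the explicit length parameter, are illustrative guesses rather than code from this thread:)

    #include <cstddef>

    // A thin wrapper that records where a contiguous block of T
    // lives and how many elements it holds.
    template <class T>
    struct array_wrapper
    {
        T * address;
        std::size_t count;
    };

    template <class T>
    array_wrapper<T> make_array(T * t, std::size_t n)
    {
        array_wrapper<T> w = { t, n };
        return w;
    }

The serialization library (or an archive) could then overload save()/load() for array_wrapper<T> and forward to save_array()/load_array() internally, so the user keeps writing ar & make_array(buffer, n); without ever seeing the low-level calls.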

On Sun, Oct 16, 2005 at 06:34:22PM +0200, Ian McCulloch wrote:
Hmm. If you want '&' notation, then why not
ar & make_array(some_array);
[snip]
If you are going to make comparisons as to what the resulting code looks like, you should at least compare against code that actually does the same thing as what Matthias proposes.
Naturally one should compare apples to apples. Still seems to me that I was. Matthias argued succinctly against putting a save_array() in the interface that archives expose to their clients: Matthias Troyer wrote:
- the user has to remember to always call save_array(ar,m_vi) instead of just serializing the vector directly. This is quite error-prone and will easily lead to sub-optimal code.
(see also neighboring material back in the thread). Matthias' proposal (as I understand it, and which I agree with) involves save/load_array(), but not where the user of an archive would see them. These calls to save/load_array() would be from the serialization library to the archive, not where somebody writing a routine to serialize their class could see them. -t

Matthias Troyer wrote:
On Oct 11, 2005, at 6:45 PM, Robert Ramey wrote:
Actually, the way I see it, you deal with the serialization of pointers, the class versioning and object tracking there. All of these get serialized through special types, created by strong typedefs, and the archives can then override the serialization of these classes. Please correct me if I'm wrong, but the built-in types like int, double, etc. are actually all dispatched to the archive, and not serialized by the [io]serializer.
correct
The only exceptions seem to be pointers (which should be handled by the serialization library), and C-arrays.
and enums
It would thus make sense to put the serialization of arrays into a separate header, just as you have done for std::vector, and as I will do soon for other classes.
disagree - default serialization of C array is included here as it's a built-in type and it does have a universal default implementation.
However there are also other options as you pointed out: the archive classes could override the serialization of arrays. As long as this stays limited to arrays there will be no MxN scaling problem, but there is still a problem of code duplication, since each archive type implementing fast array serialization has to override the serialization of arrays.
Disagree - all overrides can be placed in an archive adaptor class which takes the adaptee class as a template argument. This is written once, as in my attachment above, and can then be applied to any working archive. Using something like my attachment above:
a) substitute your improved special overrides for vectors and arrays.
b) if possible, add some compile-time assertion that traps the cases where the adaptor is applied to a base class that isn't appropriate.
c) by hand, make mini class declarations equivalent to templated typedefs (since C++ doesn't have them) of all the combinations that you now know will work.
(A sketch of the shape of this idea follows.)
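(The following is a minimal sketch of such an adaptor, under stated assumptions: the names fast_array_oarchive_adaptor and fast_binary_oarchive are hypothetical, the base archive is assumed to be constructed from a std::ostream and to expose save_binary() as the binary archives do, and this is not the code from the attachment:)

    #include <cstddef>
    #include <iosfwd>
    #include <boost/archive/binary_oarchive.hpp>

    template <class Base>
    class fast_array_oarchive_adaptor : public Base
    {
    public:
        fast_array_oarchive_adaptor(std::ostream & os) : Base(os) {}

        using Base::operator&;  // keep the inherited overloads visible

        // (a) fast path: built-in arrays are written in one shot instead
        // of element by element; a fuller version would also cover
        // std::vector and constrain T to fundamental types
        template <class T, std::size_t N>
        fast_array_oarchive_adaptor & operator&(T (& t)[N])
        {
            this->save_binary(t, N * sizeof(T));
            return *this;
        }

        // (b) would go here: a compile-time assertion (BOOST_STATIC_ASSERT
        // on a suitable trait) that Base is an archive for which raw
        // binary output is appropriate
    };

    // (c) a hand-written "templated typedef":
    class fast_binary_oarchive
        : public fast_array_oarchive_adaptor<boost::archive::binary_oarchive>
    {
    public:
        fast_binary_oarchive(std::ostream & os)
            : fast_array_oarchive_adaptor<boost::archive::binary_oarchive>(os) {}
    };

One caveat worth noting: overloads added this way are only found while the static type in play is the adaptor itself, which is part of what the rest of this thread argues about.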
This is also error-prone, since we have to tell the implementors of archives supporting fast array serialization that they should not forget to override the serialization of built-in arrays.
nope, they can do one of the following:
a) use one of your "templated typedef" classes above
b) apply the fast_archive_adaptor to any other archive class they want.
Ideally, it would trap if the application wasn't appropriate, but that may not be worth implementing. It's just boilerplate code for any combination of adaptor/adaptee - you could even make it a macro if you wanted to.
2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
I do. It buries knowledge about the archive in the serialization of a type. What happens when someone comes along with "checked_archive" or "thread_safe_archive"? Are we going to decorate the implementation of all the serializations for each one?
That would mean that users are including save/load_array even if they don't want them or want to use their own versions. Oh - then documentation has to be enhanced to explain all this internal behavior.
Actually the cost is minimal if the archive does not support fast save/load_array. The has_fast_array_serialization.hpp header only consists of the default traits:
One still has to include the header. This violates the Boost principle - don't pay for what you don't use. Actually the Boost principle would be - don't even know about what you don't use - which is better anyway.
    template <class Archive, class Type>
    struct has_fast_array_serialization : public mpl::bool_<false> {};
and the serialization of a std::vector only contains this minimal extension:
    template <class Archive, class U, class Allocator>
    inline void save(
        Archive & ar,
        const std::vector<U, Allocator> & t,
        const unsigned int /* version */,
        typename boost::enable_if<
            boost::archive::has_fast_array_serialization<Archive, U>
        >::type * = 0
    ){
        const boost::archive::container_size_type count(t.size());
        ar << BOOST_SERIALIZATION_NVP(count);
        if (count)
            ar.save_array(boost::detail::get_data(t), t.size());
    }
The cost of parsing these few lines is negligible compared to the rest of the serialization library.
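(For orientation, the other two pieces would look roughly as follows; this is a sketch only - the archive name fast_binary_oarchive is hypothetical and the actual patch may differ in details. An archive opts in by specializing the trait, and the load side mirrors the save() above:)

    // In namespace boost::archive: opt in for a hypothetical archive,
    // for all fundamental types.
    class fast_binary_oarchive;  // forward declaration

    template <class Type>
    struct has_fast_array_serialization<fast_binary_oarchive, Type>
        : public mpl::bool_<boost::is_fundamental<Type>::value> {};

    // Load-side counterpart of the save() shown above (sketch):
    template <class Archive, class U, class Allocator>
    inline void load(
        Archive & ar,
        std::vector<U, Allocator> & t,
        const unsigned int /* version */,
        typename boost::enable_if<
            boost::archive::has_fast_array_serialization<Archive, U>
        >::type * = 0
    ){
        boost::archive::container_size_type count(0);
        ar >> BOOST_SERIALIZATION_NVP(count);
        t.resize(count);
        if (count)
            ar.load_array(boost::detail::get_data(t), t.size());
    }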
All this can easily be moved to the fast_array_archive_adaptor class, with no loss of generality or convenience or efficiency.
I would prefer something like the following:
    class my_class {
        std::vector<int> m_vi;
        ...
    };
    ...
I find this proposal unacceptable for the following reasons ...
OK a bad idea - I take it back.
Now to address your issues:
a) keeping the STL portion small: I don't see this as a valid point since, as you can see above, it increases the size of the STL serialization code only by a few lines.
It could conflict with someone else's extension/specialization. There is no downside to including it in the fast_...archive_adaptor class.
b) "leave the user in control of what's going on": actually this is breaking the orthogonality. The user should not influence the archives internal behavior. The archive class should decide how to serialize the array, not the user. The user can pick between fast and slow array serialization by choosing a different archive class.
Just fine by me. Either he chooses the original plain vanilla one or he chooses one to which your fast_...archive_adaptor has been applied. He can apply it himself or use your premade "templated typedef" classes if he's in a hurry.
c) development on an independent track: the only interference we have is this one file vector.hpp.
again, no downside in factoring your special features into your own special adaptor. Note that using the adaptor approach has another huge benefit. Suppose someone else comes up with another adaptor - a checked_archive adaptor which quadruple-checks the save/load by trapping NaN for floats, and who knows what else. One could then apply either or both adaptors to create a new archive with all the features - all without anyone writing any new code.
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add. Hate to tell you, Matthias, but someone, somewhere isn't going to like your changes for some reason. You can either debate the issue with them or you can factor your improvements so that they are optional. Believe me - the latter is going to save you a lot of time.
b) should save/load_array be incorporated into STL collection serialization to make its usage obligatory? I say no, you say yes.
This point b) is where I will not budge, for the reasons explained in earlier e-mails. While I could maybe live with the fact that I have to override the C-style array serialization in all archives supporting fast array serialization, I will never do that for other classes, since this again opens the can of worms discussed previously. Let me outline it again:
if the vector.hpp serialization stays unchanged, I will have to override it in the archive.
Next we'll implement the std::valarray serialization. What should we do? Support fast array serialization out of the box, or leave it to the archive implementor to override? We'll probably follow the example of std::vector and not support it. Now the archive also has to provide overrides for std::valarray, which can still be done.
After that we'll implement serialization of ublas matrices. Following the above examples we will again not implement support for fast array serialization directly, to save a few lines of code. The consequence is even worse now: the archive implementation has to override the serialization of all ublas matrices, and will either be inefficient or will have to know about implementation details of the ublas matrices.
all these should be in either one fast...adaptor or separate adaptors according to your taste.
We would be back at both an MxN problem, and will have tight coupling between archives and the implementation details of the classes to be serialized. We should avoid this at all cost!
Nope, we have at most one adaptor for each "special" type. The same adaptor applies to all archives (present and future) with which it is compatible.
So the real question here is:
"Shall we recommend that the serialization of array-like data structures uses fast array serialization by calling save/load_array when possible?"
My clear answer is yes, and I will not budge on that. The serialization library is useless to me with a 10x performance hit.
your adaptor will fix that for you.
And many people I talked to do not use Boost.Serialization but their own (otherwise inferior) solutions for that reason. I just want to mention that vectors with billions of elements are typical sizes for many of our problems.
and your adaptor will fix it for them as well.
The real question is where to draw the line between using fast array serialization and not using it?
- I think we can agree that classes like multi_array or ublas matrices and vectors should be recommended to use it wherever possible
the user will want to decide which adaptors to use.
- The same should be true for std::valarray.
yep
- To stay consistent we should then also use it for std::vector.
- What about C-arrays? Since I rarely actually use them in raw form in my code, and never for large sizes, I have no strong personal preference. It would just be consistent, and would speed up serialization at negligible compile-time cost, to also use the fast option there, but if you veto it I could live with it.
you can include it in or exclude it from your adaptor as you wish.
Actually it might be an adaptor only in the case of the binary archive. Other archives I mentioned (such as the MPI archive) will have to support fast array serialization directly to have any chance of being usable.
I would disagree with that. The MPI archive might have it built in, but it could just as well use the adaptor. All the magic happens at compile time - there is no run-time overhead. So the only considerations are design and flexibility.
Regarding the binary archives: if your concern is that it will make it harder for you to maintain, then I could, if you want, propose to submit the fast array version as a replacement for the existing one and take over its maintenance. That will make your task smaller and in the review of the submission we can hear if someone wants to keep the existing version.
In general, I want no more coupling than is absolutely necessary. I don't think it's necessary here. You can get everything you want and more by using an archive adaptor. Robert Ramey

"Robert Ramey" <ramey@rrsd.com> writes:
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add.
IIUC, you currently have some default implementation that's 10x slower than the one Matthias is proposing. Is there any good reason that the fast implementation shouldn't be the default? -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add.
IIUC, you currently have some default implementation that's 10x slower than the one Matthias is proposing. Is there any good reason that the fast implementation shouldn't be the default?
The current implementation is universal. The fast ... archive will only make a difference on those collections whose storage is contiguous. It's not even clear to me that std::vector's storage is guaranteed to be contiguous. It's not clear that it's even applicable to archives other than native binary. Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replace of the general solution with something more intricate and more fragile - at no improvement in performance. Robert Ramey

"Robert Ramey" <ramey@rrsd.com> writes:
David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add.
IIUC, you currently have some default implementation that's 10x slower than the one Matthias is proposing. Is there any good reason that the fast implementation shouldn't be the default?
The current implementation is universal. The fast ... archive will only make a difference on those collections whose storage is contiguous.
I don't care if the fast archive doesn't make things faster in some cases as long as it always works. Does it fail to work for some collections?
It's not even clear to me that std::vector's storage is guaranteed to be contiguous.
It is. http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#69
It's not clear that it's even applicable to archives other than native binary.
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replace of the general solution with something more intricate and more fragile -
I can't parse that. "That's not the same replace of the general solution...?"
at no improvement in performance.
No improvement in performance? Huh? -- Dave Abrahams Boost Consulting www.boost-consulting.com

I can't parse that. "That's not the same replace of the general solution...?"
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replacING the general solution with something more intricate and more fragile -
at no improvement in performance.
No improvement in performance? Huh?
All the options heretofore discussed result in exactly the same run-time performance. Robert Ramey

"Robert Ramey" <ramey@rrsd.com> writes:
I can't parse that. "That's not the same replace of the general solution...?"
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replacING the general solution with something more intricate and more fragile -
at no improvement in performance.
No improvement in performance? Huh?
All the options heretofore discussed result in exactly the same run-time performance.
?? Matthias is reporting a 10x speedup in some (rather important) cases. -- Dave Abrahams Boost Consulting www.boost-consulting.com

On Oct 19, 2005, at 8:29 PM, David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
I can't parse that. "That's not the same replace of the general solution...?"
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replacING the general solution with something more intricate and more fragile -
at no improvement in performance.
No improvement in performance? Huh?
All the options heretofore discussed result in exactly the same run-time performance.
?? Matthias is reporting a 10x speedup in some (rather important) cases.
Robert means that implementing the fast array serialization as a special overload in an archive, instead of in the serialize function, will give identical performance. But there are other problems with that approach. Matthias
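(For readers skimming: the "special overload in an archive" alternative Matthias refers to would look roughly like this sketch. The class and the choice of std::vector<double> are illustrative only; the point is that every archive wanting the fast path has to repeat such an override for every contiguous container type:)

    #include <cstddef>
    #include <iostream>
    #include <vector>

    // The archive special-cases the container itself, so vector.hpp
    // stays untouched, at the cost of per-archive duplication.
    class some_fast_oarchive
    {
        std::ostream & m_os;
    public:
        explicit some_fast_oarchive(std::ostream & os) : m_os(os) {}

        void save_binary(void const * p, std::size_t n)
        {
            m_os.write(static_cast<char const *>(p), n);
        }

        void save(std::vector<double> const & v)
        {
            const std::size_t count = v.size();
            save_binary(&count, sizeof(count));             // write the size
            if (count)
                save_binary(&v[0], count * sizeof(double)); // one shot
        }
    };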

On Oct 19, 2005, at 7:55 AM, Robert Ramey wrote:
David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
Why should anyone still use the other version? To save the compile time for 5 lines of code?
LOL - believe me, someone will want to do it differently. I can't say how or why - but believe me it will happen. The adaptor approach lets everyone add their own thing and lets everyone else pick and choose which combination of things they want to add.
IIUC, you currently have some default implementation that's 10x slower than the one Matthias is proposing. Is there any good reason that the fast implementation shouldn't be the default?
The current implementation is universal. The fast ... archive will only make a difference on those collections whose storage is contiguous. It's not even clear to me that std::vector's storage is guaranteed to be contiguous.
It is guaranteed to be so by the standard.
It's not clear that it's even applicable to archives other than native binary.
Robert, in one of my previous e-mails I listed FOUR very different types of archives that need this feature, and not only for performance, but also for memory reasons. The MPI archive would, without support for array serialization, in addition need unacceptably large amounts of memory (up to 5 times the storage needed for the array when serializing arrays of int on 64-bit machines).
Basically Matthias's enhancement is an exploitation of a special case. That's actually a very good idea. But that's not the same replace of the general solution with something more intricate and more fragile - at no improvement in performance.
It's not a special case, since many archive types will directly support serialization of contiguous arrays, and some need it to be feasible at all. Regarding fragility: having to implement serialization of multi_array, ublas arrays, MTL, Blitz ... inside a fast array archive is much more fragile (if it were feasible at all, see my other mail), since changes to small implementation details of those libraries will break the code. There is no such danger with the std::vector serialization and C-style arrays. Matthias

Robert Ramey wrote:
I only took a very quick look at the diff file. I have a couple of questions:
It looks like for certain types (C++ arrays, vector<int>, etc.) we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous.
Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way.
If you check the post I made last week, I did just this for std::vectors of POD types; this went from 9.5 seconds for a 50 000 000 element vector of int to ~0.5 seconds. Very worthwhile speedup for a lot of common use cases. Martin
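(A minimal sketch of the kind of POD-vector specialization Martin describes, assuming a binary archive that exposes save_binary(); his actual code may well differ:)

    #include <cstddef>
    #include <vector>
    #include <boost/archive/binary_oarchive.hpp>

    // Write a vector of a POD type in one shot instead of element by element.
    template <class T>
    void save_pod_vector(boost::archive::binary_oarchive & ar,
                         std::vector<T> const & v)
    {
        const std::size_t count = v.size();
        ar << count;  // element count first, so the reader can resize
        if (count)
            ar.save_binary(&v[0], count * sizeof(T));
    }

Loading mirrors this: read the count, resize the vector, then call load_binary() on &v[0].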

On Oct 10, 2005, at 3:39 AM, Martin Slater wrote:
Robert Ramey wrote:
I only took a very quick look at the diff file. I have a couple of questions:
It looks like for certain types (C++ arrays, vector<int>, etc.) we want to use binary_save/load to leverage the fact that we can assume in certain situations that storage is contiguous.
Note that there is an example in the package - demo_fast_archive - which does exactly this for C++ arrays. It could easily be extended to cover any other desired types. I believe that using this as a basis would achieve all you desire and more with a much smaller investment of effort. Also it would not require changing the serialization library in any way.
If you check the post I made last week, I did just this for std::vectors of POD types; this went from 9.5 seconds for a 50 000 000 element vector of int to ~0.5 seconds. Very worthwhile speedup for a lot of common use cases.
Indeed, this 20x speedup fits well with my observations of a 5-100x speedup depending on compiler optimization settings, archive types, etc. Matthias
participants (6)
- David Abrahams
- Ian McCulloch
- Martin Slater
- Matthias Troyer
- Robert Ramey
- troy d. straszheim