Re: [boost] [serialization] fast array serialization (10x speedup)

23 Nov 2005

      Matthias Troyer wrote:
...
Indeed this sounds like a lot of work and that's why this mechanism
for message passing was rarely used in the past. The hard part is to
manually build up the custom MPI datatype, i.e. to inform MPI about
what the offsets and types of the various data members in a struct
are.
This is where the serialization library fits in and makes the task
extraordinarily easy. Saving a data member with such an MPI archive
will register its address, type (as well as the number of identical
consecutive elements in an array) with the MPI library. Thus the
serialization library does all the hard work already!

I still don't see the 10x speedup in the subject. For an X[], the two
approaches are:

1. for each x in X[], "serialize" into an MPI descriptor
2. serialize X[0] into an MPI descriptor, construct an array descriptor from 
it

Conceptual issues with (2) aside (the external format of X is determined by 
X itself and you have no idea whether the structure of X[0] also describes 
X[1]), I'm not sure that there will be such a major speedup compared to the 
naive (1).
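
To make the comparison concrete, here is a minimal sketch of the two
approaches using a toy descriptor instead of real MPI datatypes. Entry,
ArrayDescriptor, X, describe, per_element and whole_array are all
hypothetical names invented for this illustration; in real MPI code the
per-element layout would be built with something like
MPI_Type_create_struct. The sketch assumes X has a fixed layout, which is
exactly the assumption questioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy stand-in for one entry of an MPI datatype.
struct Entry {
    std::size_t offset;  // byte offset from the start of the buffer
    std::size_t size;    // size of one value in bytes
    std::size_t count;   // number of consecutive values
};

// Hypothetical element type with a fixed layout.
struct X {
    int id;
    double value;
};

// "Serialize" one X into a descriptor by registering each member's address.
void describe(const X& x, const char* base, std::vector<Entry>& d) {
    const char* p_id = reinterpret_cast<const char*>(&x.id);
    const char* p_value = reinterpret_cast<const char*>(&x.value);
    d.push_back({static_cast<std::size_t>(p_id - base), sizeof x.id, 1});
    d.push_back({static_cast<std::size_t>(p_value - base), sizeof x.value, 1});
}

// Approach 1: visit every element; the descriptor grows with n.
std::vector<Entry> per_element(const X* xs, std::size_t n) {
    const char* base = reinterpret_cast<const char*>(xs);
    std::vector<Entry> d;
    for (std::size_t i = 0; i < n; ++i) describe(xs[i], base, d);
    return d;
}

// Approach 2: describe X[0] once and repeat that layout n times.
// Only valid if X[0]'s structure really describes every other element.
struct ArrayDescriptor {
    std::vector<Entry> element;  // layout of a single X
    std::size_t stride;          // distance between consecutive elements
    std::size_t count;           // number of elements
};

ArrayDescriptor whole_array(const X* xs, std::size_t n) {
    assert(n > 0);
    ArrayDescriptor a;
    describe(xs[0], reinterpret_cast<const char*>(xs), a.element);
    a.stride = sizeof(X);
    a.count = n;
    return a;
}
```

Approach 1 does work proportional to the number of elements when building
the descriptor, while approach 2 does constant work; whether that
difference matters next to the cost of the actual transfer is the open
question.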

Robert's point also deserves attention; a portable binary archive that 
writes directly into a socket eliminates the MPI middleman and will probably 
achieve performance similar to your two-pass MPI approach. It also 
supports versioned non-PODs and other nontrivial types. As an example, I 
have a type which is saved as

save( x ):

    save( x.anim.name() ); // std::string

and loaded as

load( x ):

    string tmp;
    load( tmp );
    x.set_animation( tmp );

Not everything is field-based.
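
For illustration, the pseudocode above could be fleshed out roughly as
follows. This is only a sketch: Animation, X, and the length-prefixed text
format written to a std::ostream are hypothetical stand-ins, not my real
type and not the Boost.Serialization archive interface. The point it shows
is that load goes through set_animation, which can do arbitrary work, so
there is no member address that could be registered in a descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical animation handle; only its name is ever saved.
class Animation {
public:
    explicit Animation(std::string name = "") : name_(name) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Hypothetical type that is saved and restored through its interface.
struct X {
    Animation anim;
    int frame = 0;

    // The setter has side effects beyond storing the name.
    void set_animation(const std::string& name) {
        anim = Animation(name);
        frame = 0;
    }
};

// Toy archive primitives: a string is written as "<length> <bytes>".
void save(std::ostream& os, const std::string& s) {
    os << s.size() << ' ' << s;
}

void load(std::istream& is, std::string& s) {
    std::size_t n = 0;
    is >> n;
    is.get();  // skip the single separator after the length
    s.resize(n);
    is.read(&s[0], static_cast<std::streamsize>(n));
    assert(is && "truncated input");
}

// save(x): only the animation name goes into the archive.
void save(std::ostream& os, const X& x) {
    save(os, x.anim.name());
}

// load(x): read into a temporary, then restore through the setter.
void load(std::istream& is, X& x) {
    std::string tmp;
    load(is, tmp);
    x.set_animation(tmp);
}
```

A round trip through a std::ostringstream and a std::istringstream restores
the animation name, but at no point does the archive see the address of a
member of X.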

Peter Dimov