
On Oct 9, 2005, at 11:15 PM, Robert Ramey wrote:
Attached is a sketch of what I have in mind. It does compile without error on VC 7.1
With this approach you would make one fast_oarchive adaptor class and one small and trivial *.hpp file for each archive it is adapted to.
---- SUMMARY ---------

If I may summarize this solution as follows:

template<class Base>
class fast_oarchive_impl : public Base {
public:
    ...
    // custom specializations
    void save_override(const std::vector<int> & t, int){
        save_binary(&(t[0]), sizeof(int) * t.size());
    }
    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(const std::vector<T> & t, int){
        save_binary(&(t[0]), sizeof(T) * t.size());
        // this version not certified for more complex types !!!
        BOOST_STATIC_ASSERT(boost::is_primitive<T>::value);
        // or pointers either !!!
        BOOST_STATIC_ASSERT(!boost::is_pointer<T>::value);
    }
    ...
};

then I see several major disadvantages of this approach:

1.) it fixes the value types for which fast array serialization can be done

2.) for M types to be serialized and N archives there is an MxN problem in this approach

3.) it leads to a tight coupling between archives and all classes that can profit from fast array serialization (called "array-like classes" below), and makes the archive depend on implementation details of the array-like classes

4.) it is not easily extensible to new array-like classes

Let me elaborate on these points below and provide a possible solution to each of them. The simplest solution, as I see it, would be to

- provide an additional traits class has_fast_array_serialization

- have archives offering (the optional) fast array serialization provide a save_array member function in addition to save and save_binary

- make the dispatch to either save() or save_array() the responsibility of the serialization code of the class, and not the responsibility of the archive

These are minor extensions to the serialization library that do not break any existing code and do not make it harder to write a new archive or a new serialize function, but they allow new types of archives and can give huge speedups for large data sets.
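To make the proposed design concrete, here is a minimal self-contained sketch of the three extensions. It uses std::enable_if and std::is_arithmetic as stand-ins for boost::enable_if and the proposed has_fast_array_serialization trait; the toy_binary_oarchive class and its counters are purely illustrative, not part of any real archive:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Stand-in for the proposed trait; defaults to false so that
// archives must opt in per value type.
template <class Archive, class T>
struct has_fast_array_serialization : std::false_type {};

// A toy archive that opts in to the fast path for arithmetic types.
struct toy_binary_oarchive {
    std::size_t bytes_fast = 0;  // bytes written through save_array
    std::size_t items_slow = 0;  // elements written one by one

    template <class T>
    void save(const T&) { ++items_slow; }       // element-wise fallback

    template <class T>
    void save_array(const T*, std::size_t n) {  // optional fast path
        bytes_fast += n * sizeof(T);
    }
};

template <class T>
struct has_fast_array_serialization<toy_binary_oarchive, T>
    : std::is_arithmetic<T> {};

// The class's serialization code -- not the archive -- chooses the path.
template <class Archive, class T>
typename std::enable_if<has_fast_array_serialization<Archive, T>::value>::type
serialize(Archive& ar, const std::vector<T>& v) {
    ar.save_array(v.data(), v.size());
}

template <class Archive, class T>
typename std::enable_if<!has_fast_array_serialization<Archive, T>::value>::type
serialize(Archive& ar, const std::vector<T>& v) {
    for (const T& x : v) ar.save(x);
}
```

With this split, a vector<int> goes through save_array in one call while a vector<std::string> falls back to element-wise saves, and archives that never define the trait get the fallback automatically.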
------- DETAILS -----------

Now the details.

ad 1.: you need to fix the types for which the fast version is used, either by providing explicit overloads for K types (such as int above), which would give a KxMxN problem, or by using a template-based approach. You are probably aware that the BOOST_STATIC_ASSERT in your example will cause this archive to fail to compile with std::vectors of more complex types. One easy way to solve this is to restrict the applicability of the template using boost::enable_if:

template<class T>
void save_override(
    const std::vector<T> & t,
    int,
    typename boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
);

where the traits class has_fast_array_serialization<Base,T> specifies whether the fast version should be used for the type T with the archive Base. The reason to provide a traits class, instead of hard-coding the types and restricting them to primitive non-pointer types, is that the set of types that can use an optimized serialization depends on the archive type. A non-portable binary archive could support all POD types that contain no pointer members (e.g. the gps_position class in your example), while an MPI archive can support fast serialization for all default-constructible types. Hence a traits class depending on both the archive type and the value_type of the vector.

ad 2.: there is still an MxN problem: you propose to dispatch to save_binary:

void save_override(const std::vector<T> & t, int) {
    save_binary(&(t[0]), sizeof(T) * t.size());
}

where the signature of save_binary is

void save_binary(void const *, std::size_t);

This is an acceptable solution for binary archives, and maybe a few others, but is NOT a general solution. To illustrate this, let me show how the fast saving is or might be implemented for some other archives.
a) a potential portable binary archive might need to do byte reordering:

void save_override(const std::vector<T> & t, int) {
    save_with_reordering(&(t[0]), t.size());
}

where the save_with_reordering function needs type information to do the byte reordering, and might have a signature

template <class T>
void save_with_reordering(T const *, std::size_t);

b) an XDR archive, using XDR streams, needs to make a call to an XDR function and pass type information as well, as in

class xdr_oarchive {
    ...
    void save_override(const std::vector<int> & t, int) {
        xdr_vector(stream, (char*)&(t[0]), t.size(), sizeof(int), (xdrproc_t)xdr_int);
    }
    XDR* stream;
};

and a templated version could also be provided easily. Note that again I need the address, size, and type information and cannot make this call from within save_binary. I have an archive implementation based on the UNIX xdr calls, so this is no hypothetical example.

c) let's next look at a packed MPI archive (of which I also have an implementation), where the override would be

// simplified version of MPI archive
class packed_mpi_oarchive {
    ...
    void save_override(const std::vector<int> & t, int) {
        MPI::Datatype datatype(MPI::INT);
        datatype.Pack(&(t[0]), t.size(), buffer, buffer_size, position, communicator);
    }
    char* buffer;
    int buffer_size;
    int position;
    MPI::Comm& communicator;
};

and again, I need type information and cannot just call save_binary.

d) as a fourth example I want to mention that MPI allows for serialization by message passing without the need to pack the data into a buffer first; only the addresses and types of all data members need to be stored, to create a custom MPI type. An incomplete implementation (I have a complete implementation, based on the original idea by Daniel Egloff) would be:

class mpi_oarchive {
    ...
    template <class T>
    void save_override(const std::vector<T> & t, int) {
        register_member(&(t[0]), t.size());
    }
    template <class T>
    void register_member(T const* t, std::size_t l) {
        addresses.push_back(MPI::Get_address(t));
        sizes.push_back(l);
        types.push_back(mpi_type<T>::value);
    }
    std::vector<MPI::Aint> addresses;
    std::vector<int> sizes;
    std::vector<MPI::Datatype> types;
};

Note that again save_binary does not do the trick, since we need type information. For this reason my proposed solution is to dispatch to a save_array function for those types and archives supporting it:

template<class Base>
class fast_oarchive_impl : public Base {
public:
    // here's a way to do it for all vectors in one shot
    template<class T>
    void save_override(
        const std::vector<T> & t,
        int,
        typename boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
    ) {
        save_array(&(t[0]), t.size());
    }
    ...
};

where all archive classes provide a function like

void Archive::save_array(Type const *, std::size_t);

for all types for which the trait has_fast_array_serialization<Archive,Type> is true. That way a single overload suffices for all N=5 archive types presented above, and the MxN problem is solved. Note also that archives not supporting this fast array serialization do not need to implement anything, as the default for has_fast_array_serialization<Archive,Type> is false.

ad 3.: your proposal leads to a tight coupling between archives and the classes to be serialized. Consider what I would need to do to add support for some future MTL matrix type. Again I present a simplified example showing the problem:

template<class T>
void save_override(const mtl_dense_matrix<T> & m, int) {
    T const * data = implementation_dependent_function_to_get_pointer(m);
    std::size_t length = implementation_dependent_function_to_get_size(m);
    save_binary(data, length);
}

This introduces implementation details of the mtl_dense_matrix class into the archive, breaks orthogonality, and leads to a tight coupling.
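The decoupled alternative can be sketched in a few lines. Everything here is hypothetical and self-contained: toy_oarchive, the simplified trait, and toy_dense_matrix stand in for a real archive, the proposed has_fast_array_serialization trait, and an MTL-style matrix. Only the matrix's own serialize code touches its storage layout; the archive knows nothing about it (real code would select the branch with enable_if rather than a runtime if, so that archives without save_array still compile):

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for the proposed trait; false by default.
template <class Archive, class T>
struct has_fast_array_serialization { static const bool value = false; };

// A toy archive exposing the optional save_array fast path.
struct toy_oarchive {
    std::size_t saved = 0;  // total elements written, either way
    template <class T> void save(const T&) { ++saved; }
    template <class T> void save_array(const T*, std::size_t n) { saved += n; }
};

// The toy archive opts in for double only.
template <>
struct has_fast_array_serialization<toy_oarchive, double> {
    static const bool value = true;
};

// A hypothetical dense matrix. Only the matrix itself knows that its
// storage is a contiguous row-major block.
class toy_dense_matrix {
    std::size_t rows_, cols_;
    std::vector<double> data_;
public:
    toy_dense_matrix(std::size_t r, std::size_t c)
        : rows_(r), cols_(c), data_(r * c) {}

    // The matrix's own serialize member dispatches to save_array when
    // the archive advertises support, and falls back to element-wise saves.
    template <class Archive>
    void serialize(Archive& ar) const {
        ar.save(rows_);
        ar.save(cols_);
        if (has_fast_array_serialization<Archive, double>::value)
            ar.save_array(data_.data(), data_.size());
        else
            for (double x : data_) ar.save(x);
    }
};
```

If the matrix later changes its internal representation, only its own serialize member changes; no archive class needs to be touched.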
Changes in these implementation details of the mtl_dense_matrix might require changes to the archive classes. The solution is easy:

- some archives provide fast array serialization through the save_array member function

- let the MTL be responsible for serialization of its own classes, and use save_array where appropriate

ad 4.: in order to use fast array serialization with other classes such as

- std::vector
- std::valarray
- boost::multi_array
- uBlas vectors and matrices
- blitz::Array

save_override functions for ALL of these classes have to be added to the archive. This means that to support any new class, be it a new uBlas matrix, future MTL matrices, Blitz++ arrays, ..., the archive class needs to be modified. This is clearly not a scalable design.

To summarize: with three minor extensions to the serialization library, none of which breaks any existing code, we can get 10x speedups for serialization of large data sets, enable new types of archives such as MPI archives, and all of that without introducing any of the four problems discussed here.

Matthias