
Matthias Troyer wrote:
Oh yes, there can be a huge difference. Let me just give a few reasons:
1) In the applications we are talking about, we regularly have to send huge contiguous arrays of numbers (stored e.g. in a matrix, vector, valarray or multi_array) over the network. The typical size is 100 million numbers and up; I'll stick with 100 million in the following. Storing these 100 million numbers already takes up 800 MB and nearly fills the machine's memory, which causes problems:
a) copying these numbers into a buffer using the serialization library needs another 800 MB of memory that might not be available
b) creating MPI data types for each member separately means storing at least 12 bytes per element (4 bytes each for the address, type and count), for a total of 1200 MB instead of just 12 bytes. Again we will have a memory problem
c) but the main issue is speed: serializing 100 million numbers one by one requires 100 million accesses to the network interface, while serializing the whole block at once requires just a single call, with the rest done by the hardware. The reason we cannot afford this overhead is that on modern high-performance networks
** the network bandwidth is the same as the memory bandwidth **
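
To make the contrast concrete, here is a minimal sketch (not from the original post; the function names and the use of plain MPI calls are illustrative) of shipping a large contiguous array of doubles either as a single block send or by packing it element by element, which is roughly the overhead pattern points (a)-(c) describe:

    #include <mpi.h>
    #include <vector>

    // Block send: one call describes the whole contiguous buffer, and the
    // network hardware can stream it directly at full bandwidth.
    void send_as_block(const std::vector<double>& v, int dest, MPI_Comm comm)
    {
        MPI_Send(v.data(), static_cast<int>(v.size()), MPI_DOUBLE,
                 dest, /*tag=*/0, comm);
    }

    // Element-by-element packing: needs a second buffer of comparable size
    // (problem (a)) and one library call per number (problem (c)) before a
    // single byte reaches the network.
    void send_packed(const std::vector<double>& v, int dest, MPI_Comm comm)
    {
        int packed_size = 0;
        MPI_Pack_size(static_cast<int>(v.size()), MPI_DOUBLE, comm, &packed_size);
        std::vector<char> buffer(packed_size);

        int position = 0;
        for (double x : v)
            MPI_Pack(&x, 1, MPI_DOUBLE, buffer.data(),
                     static_cast<int>(buffer.size()), &position, comm);

        MPI_Send(buffer.data(), position, MPI_PACKED, dest, /*tag=*/0, comm);
    }
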
This makes sense, thank you. I just want to note that contiguous arrays of double are handled equally well by either approach under discussion; an mpi_archive will obviously include an overload for double[]. I was interested in the POD case. A large array of 3x3 matrices wrapped in matrix3x3 structs would probably be a good example that illustrates your point (c) above. (a) and (b) can be avoided by issuing multiple MPI_Send calls for non-optimized sequence writes.
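
For the record, here is one hedged sketch of that POD case (the struct and function names are mine, not part of any proposed mpi_archive interface): an array of 3x3 matrices can still go out in a single send once the struct is described to MPI as nine contiguous doubles, whereas serializing each struct member by member falls back to the element-wise traffic described in (c).

    #include <mpi.h>
    #include <vector>

    // Hypothetical POD wrapper around nine doubles (no padding expected).
    struct matrix3x3 {
        double m[3][3];
    };

    // Describe matrix3x3 to MPI as nine contiguous MPI_DOUBLEs, then send
    // the whole array in one call.
    void send_matrices(const std::vector<matrix3x3>& a, int dest, MPI_Comm comm)
    {
        MPI_Datatype mat_type;
        MPI_Type_contiguous(9, MPI_DOUBLE, &mat_type);
        MPI_Type_commit(&mat_type);

        MPI_Send(a.data(), static_cast<int>(a.size()), mat_type,
                 dest, /*tag=*/0, comm);

        MPI_Type_free(&mat_type);
    }

Whether an archive-based design can recognize such a struct automatically and take this single-call path is exactly what the POD question above is about.
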