
Matthias Troyer wrote:
Oh yes, there can be a huge difference. Let me just give a few reasons:
1) In the applications we are talking about, we regularly have to send huge contiguous arrays of numbers (stored e.g. in a matrix, vector, valarray or multi_array) over the network. The typical size is 100 million numbers and up; I'll stick with 100 million in the following. Storing these 100 million numbers already takes up 800 MB and nearly fills the machine's memory, which causes problems:
a) copying these numbers into a buffer using the serialization library needs another 800 MB of memory that might not be available
b) creating MPI data types for each member separately means storing at least 12 bytes per element (4 bytes each for the address, type and count), for a total of 1200 MB instead of just 12 bytes. Again we will have a memory problem
c) but the main issue is speed: serializing 100 million numbers one by one requires 100 million accesses to the network interface, while serializing the whole block at once requires just a single call, with the rest done by the hardware. The reason we cannot afford this overhead is that on modern high-performance networks
** the network bandwidth is the same as the memory bandwidth **
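
To make the contrast concrete, here is a minimal sketch (not from the original post; the function names and the use of plain MPI calls are illustrative) of shipping a large contiguous array of doubles either as a single block send or by packing it element by element, which is roughly the overhead pattern points (a)-(c) describe:

    #include <mpi.h>
    #include <vector>

    // Block send: one call describes the whole contiguous buffer, and the
    // network hardware can stream it directly at full bandwidth.
    void send_as_block(const std::vector<double>& v, int dest, MPI_Comm comm)
    {
        MPI_Send(v.data(), static_cast<int>(v.size()), MPI_DOUBLE,
                 dest, /*tag=*/0, comm);
    }

    // Element-by-element packing: needs a second buffer of comparable size
    // (problem (a)) and one library call per number (problem (c)) before a
    // single byte reaches the network.
    void send_packed(const std::vector<double>& v, int dest, MPI_Comm comm)
    {
        int packed_size = 0;
        MPI_Pack_size(static_cast<int>(v.size()), MPI_DOUBLE, comm, &packed_size);
        std::vector<char> buffer(packed_size);

        int position = 0;
        for (double x : v)
            MPI_Pack(&x, 1, MPI_DOUBLE, buffer.data(),
                     static_cast<int>(buffer.size()), &position, comm);

        MPI_Send(buffer.data(), position, MPI_PACKED, dest, /*tag=*/0, comm);
    }
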
This makes sense, thank you. I just want to note that contiguous arrays of double are handled equally well by either approach under discussion; an mpi_archive will obviously include an overload for double[]. I was interested in the POD case. A large array of 3x3 matrices wrapped in matrix3x3 structs would probably be a good example that illustrates your point (c) above. (a) and (b) can be avoided by issuing multiple MPI_Send calls for non-optimized sequence writes.
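
For the record, here is one hedged sketch of that POD case (the struct and function names are mine, not part of any proposed mpi_archive interface): an array of 3x3 matrices can still go out in a single send once the struct is described to MPI as nine contiguous doubles, whereas serializing each struct member by member falls back to the element-wise traffic described in (c).

    #include <mpi.h>
    #include <vector>

    // Hypothetical POD wrapper around nine doubles (no padding expected).
    struct matrix3x3 {
        double m[3][3];
    };

    // Describe matrix3x3 to MPI as nine contiguous MPI_DOUBLEs, then send
    // the whole array in one call.
    void send_matrices(const std::vector<matrix3x3>& a, int dest, MPI_Comm comm)
    {
        MPI_Datatype mat_type;
        MPI_Type_contiguous(9, MPI_DOUBLE, &mat_type);
        MPI_Type_commit(&mat_type);

        MPI_Send(a.data(), static_cast<int>(a.size()), mat_type,
                 dest, /*tag=*/0, comm);

        MPI_Type_free(&mat_type);
    }

Whether an archive-based design can recognize such a struct automatically and take this single-call path is exactly what the POD question above is about.
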