Hello
There is a known performance problem with serializing a std::vector over MPI; it essentially prevents you from ever reaching the performance of C. The problem is on the receive side. When you receive a vector whose size you don't know, the receiver has to:

- get the number of elements of the vector
- resize the vector (which initializes every element)
- receive the data into the vector (reinitializing every element)

The C version of the idiom:

- gets the number of elements
- reserves (as opposed to resizes) the memory for the elements
- receives the data into the buffer (initializing each element only once)

This might make a small or a large performance difference, so profile!
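To make the difference concrete, here is a minimal sketch of the two receive idioms (my own illustration, not code from this thread), assuming the elements are doubles and the sender transmits the element count as a separate message first:

#include <mpi.h>
#include <vector>
#include <memory>

// Approach A: the common C++ idiom. resize() value-initializes every
// element, then MPI_Recv overwrites them, so the memory is written twice.
std::vector<double> recv_resize(int src, int tag) {
    unsigned long long n = 0;
    MPI_Recv(&n, 1, MPI_UNSIGNED_LONG_LONG, src, tag,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    std::vector<double> v;
    v.resize(n);                                   // first pass over the memory
    MPI_Recv(v.data(), static_cast<int>(n), MPI_DOUBLE, src, tag,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);   // second pass
    return v;
}

// Approach B: the C-style idiom. The buffer is allocated but left
// uninitialized, so MPI_Recv is the only pass over the memory.
std::unique_ptr<double[]> recv_raw(int src, int tag, unsigned long long& n) {
    MPI_Recv(&n, 1, MPI_UNSIGNED_LONG_LONG, src, tag,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    std::unique_ptr<double[]> buf(new double[n]);  // uninitialized storage
    MPI_Recv(buf.get(), static_cast<int>(n), MPI_DOUBLE, src, tag,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    return buf;
}

(Counts larger than INT_MAX would need chunking or a custom datatype; that is left out of the sketch.)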
According to the attached program there seems to be a much larger performance problem than initializing vector elements. The program first sends a vector of doubles using plain MPI, then sends another identical vector with boost::mpi, and prints how long each took in seconds. Note that boost::mpi also sends two messages for run-time sized containers.

For vectors of 1e6 items the program prints (the MPI rank is the first number):

mpi
0 resize: 0.0126891, send: 0.00988925, recv: 0
1 resize: 0.0131643, send: 0, recv: 0.00955247
boost::mpi
0 resize: 0.0096425, send: 0.279135, recv: 0
1 resize: 0, send: 0, recv: 0.295702

For vectors of 1e7 items:

mpi
0 resize: 0.0974027, send: 0.0538886, recv: 0
1 resize: 0.105708, send: 0, recv: 0.0456324
boost::mpi
0 resize: 0.0517177, send: 2.70333, recv: 0
1 resize: 0, send: 0, recv: 2.82339

And for vectors of 5e7 items:

mpi
0 resize: 0.590099, send: 0.226269, recv: 0
1 resize: 0.440719, send: 0, recv: 0.375706
boost::mpi
0 resize: 0.198448, send: 13.5335, recv: 0
1 resize: 0, send: 0, recv: 14.0518

The boost::mpi version is always at least 10 times slower. It also seems to run out of memory with a smaller number of items, implying that unnecessary copies of the data are created somewhere. Based on experience with more complex programs (e.g. http://dx.doi.org/10.1016/j.jastp.2014.08.012) I wouldn't recommend boost::mpi for high performance computing. Or, if this is user error, then high performance is at least easier to get with pure MPI...

I used boost-1.57.0, g++ (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7) and mpirun (Open MPI) 1.6.5.

Ilja
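P.S. For anyone who wants to try the comparison themselves, below is a rough, simplified sketch of the benchmark structure described above (my own reconstruction, not the attached program; it assumes the element count N is known on both ranks instead of being sent first, and only times the transfer itself):

#include <mpi.h>
#include <boost/mpi.hpp>
#include <boost/serialization/vector.hpp>
#include <vector>
#include <iostream>

int main(int argc, char* argv[]) {
    boost::mpi::environment env(argc, argv);
    boost::mpi::communicator world;
    const unsigned long long N = 1000000;   // 1e6 doubles, as in the first run

    // Plain MPI: rank 0 sends, rank 1 resizes and receives.
    if (world.rank() == 0) {
        std::vector<double> v(N, 1.0);
        double t = MPI_Wtime();
        MPI_Send(v.data(), static_cast<int>(N), MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        std::cout << "mpi 0 send: " << MPI_Wtime() - t << std::endl;
    } else if (world.rank() == 1) {
        std::vector<double> v;
        double t = MPI_Wtime();
        v.resize(N);
        double t_resize = MPI_Wtime() - t;
        t = MPI_Wtime();
        MPI_Recv(v.data(), static_cast<int>(N), MPI_DOUBLE, 0, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::cout << "mpi 1 resize: " << t_resize
                  << ", recv: " << MPI_Wtime() - t << std::endl;
    }

    // boost::mpi: the vector is serialized, and the receive side is resized
    // by the library itself.
    if (world.rank() == 0) {
        std::vector<double> v(N, 1.0);
        double t = MPI_Wtime();
        world.send(1, 1, v);
        std::cout << "boost::mpi 0 send: " << MPI_Wtime() - t << std::endl;
    } else if (world.rank() == 1) {
        std::vector<double> v;
        double t = MPI_Wtime();
        world.recv(0, 1, v);
        std::cout << "boost::mpi 1 recv: " << MPI_Wtime() - t << std::endl;
    }

    return 0;
}

With Open MPI this should build with something like mpic++, linking against boost_mpi and boost_serialization, and run with mpirun -n 2.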