
Hi Matthias,

On Fri, Sep 15, 2006 at 10:40:25PM +0200, Matthias Troyer wrote:
On Sep 15, 2006, at 9:12 PM, Markus Blatt wrote:
One thing that struck me at once is the absence of tags for the communication. In the MPI standard they are everywhere and are usually used to distinguish different communications (as these might arrive at the target process in arbitrary order).
I don't fully understand that. There are tags in all the point-to-point communications, which is where the MPI standard supports them.
Just forget about it. I was missing the tags in the collective communication, where there definitely are none in the MPI standard. Probably I should have gotten more sleep. Sorry.
- What is your evaluation of the implementation?
One thing I do not understand is why the collective operations use std::vector to collect the data. I always thought that there is no guarantee in the standard that the values are stored in contiguous memory (which is what MPI needs).
Section 23.2.4 of C++03 states "The elements of a vector are stored contiguously"
Thanks. That's a relief. I should definitely get a newer standard.
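So something like the following is indeed safe (a minimal sketch in plain MPI, assuming all ranks already agree on v.size()):

  #include <mpi.h>
  #include <vector>

  // Because std::vector stores its elements contiguously, its buffer
  // can be handed to MPI directly, without copying into a plain array.
  void broadcast_doubles(std::vector<double>& v, int root)
  {
    if (!v.empty())
      MPI_Bcast(&v[0], static_cast<int>(v.size()), MPI_DOUBLE,
                root, MPI_COMM_WORLD);
  }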
As I see it, the implementation uses the boost serialization framework to send data types that do not correspond to a standard MPI_Datatype. While this way it is still possible to send values between different architectures, I fear that there is (in some cases) a great efficiency loss: e.g. with non-PODs, broadcasts (and probably other collective operations) do not use MPI's collective operations, but independent asynchronous MPI sends and receives. This might prevent the application of special optimizations based on the underlying network topology (e.g. on a hypercube the broadcast takes only O(log2 P) steps, where P is the number of processes).
I believe that your comments are based on a misunderstanding. The goal of the library design was to actually make it efficient and useful for high performance applications. The library allows the use of custom MPI_Datatypes and actually creates them automatically whenever possible, using the serialization library. For those data types for which this is not possible (e.g. the pointers in a linked list) the serialization library is used to serialize the data structure, which is then packed into a message using MPI_Pack.
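Roughly, the usage looks like this (just a sketch; the type particle is made up for the example, and exact names may differ slightly in the snapshot under review):

  #include <boost/mpi.hpp>
  namespace mpi = boost::mpi;

  struct particle {                       // hypothetical fixed-layout type
    double x, y, z;
    int id;

    template<class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/)
    { ar & x & y & z & id; }
  };

  // Declare that a single, flat MPI_Datatype can describe a particle;
  // the library derives that datatype from serialize() and reuses it.
  BOOST_IS_MPI_DATATYPE(particle)

  int main(int argc, char* argv[])
  {
    mpi::environment env(argc, argv);
    mpi::communicator world;

    particle p = { 1.0, 2.0, 3.0, world.rank() };
    mpi::broadcast(world, p, 0);          // native MPI_Bcast with that datatype
  }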
The skeleton&content mechanism is essentially a simple way to create a custom MPI datatype. The "content" is just an MPI_Datatype created for the data members of the object you want to send.
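For a linked list, for example, the usage is roughly (again a sketch, with the same caveat about exact names):

  #include <boost/mpi.hpp>
  #include <list>
  namespace mpi = boost::mpi;

  int main(int argc, char* argv[])
  {
    mpi::environment env(argc, argv);
    mpi::communicator world;

    std::list<int> values(100);

    // Transmit the structure (size, layout) once; receivers resize
    // their list to match.
    mpi::broadcast(world, mpi::skeleton(values), 0);

    // The "content" is an MPI_Datatype addressing the data members in
    // place; it can be reused for repeated transfers of the same list.
    mpi::content c = mpi::get_content(values);
    mpi::broadcast(world, c, 0);
  }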
Another case is when sending, e.g. a std::vector or other array-like data structure of a type for which a custom MPI_Datatype can be constructed. In that case the serialization library is called once to construct the needed MPI_Datatype for one element in the array, and then communication is done using that MPI_Datatype.
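Continuing the sketch from above (same setup, same hypothetical particle type), sending such an array looks like:

  #include <boost/serialization/vector.hpp>
  #include <vector>

  void broadcast_particles(mpi::communicator& world)
  {
    std::vector<particle> buf(1000);
    // The datatype is built once for a single particle; the whole vector
    // is then transferred as 1000 elements of that datatype.
    mpi::broadcast(world, buf, 0);
  }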
You might be worried that the use of the serialization library leads to inefficiencies since in the released versions each element of an array was serialized independently. The recent changes, which are in the CVS HEAD, actually address this issue by providing optimizations for array-like data structures.
I hope these answers address the issues you had in mind. I can elaborate if you want.
The question came up when I looked into mpi/collectives/broadcast.hpp:

  // We're sending a type that does not have an associated MPI
  // datatype, so we'll need to serialize it. Unfortunately, this
  // means that we cannot use MPI_Bcast, so we'll just send from the
  // root to everyone else.
  template<typename T>
  void broadcast_impl(const communicator& comm, T& value, int root,
                      mpl::false_)

If this function gets called, the performance will definitely be suboptimal, as the root sends to everyone else individually. Is this only the case if no MPI_Datatype could be constructed (as for the linked list), or is it called whenever the boost serialization is used?

Regards,

Markus Blatt
--
DUNE -- The Distributed And Unified Numerics Environment <http://www.dune-project.org>