
Hi Matthias,

On Fri, Sep 15, 2006 at 10:40:25PM +0200, Matthias Troyer wrote:
On Sep 15, 2006, at 9:12 PM, Markus Blatt wrote:
One thing that struck me at once is the absence of tags for the communication. In the MPI standard they are everywhere and are usually used to distinguish different communications (as these might arrive at the target process in arbitrary order).
I don't fully understand that. There are tags in all the point-to-point communications, which is where the MPI standard supports them.
Just forget about it. I was missing the tags in the collective communication, where there definitely are none in the MPI standard. Probably I should have gotten more sleep. Sorry.
- What is your evaluation of the implementation?
One thing I do not understand is why the collective operations use std::vector to collect the data. I always thought that there is no guarantee in the standard that the values are stored in contiguous memory (which is what MPI needs).
Section 23.2.4 of C++03 states "The elements of a vector are stored contiguously"
Thanks. That's a relief. I should definitely get a newer standard.
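So something like the following is indeed safe (a minimal sketch in plain MPI, assuming all ranks already agree on v.size()):

  #include <mpi.h>
  #include <vector>

  // Because std::vector stores its elements contiguously, its buffer
  // can be handed to MPI directly, without copying into a plain array.
  void broadcast_doubles(std::vector<double>& v, int root)
  {
    if (!v.empty())
      MPI_Bcast(&v[0], static_cast<int>(v.size()), MPI_DOUBLE,
                root, MPI_COMM_WORLD);
  }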
As I see it, the implementation uses the boost serialization framework to send data types that do not correspond to a standard MPI_Datatype. While this way it is still possible to send values between different architectures, I fear that there is (in some cases) a great efficiency loss: e.g. with non-PODs, broadcasts (and probably other collective operations) do not use MPI's collective operations, but independent asynchronous MPI sends and receives. This might prevent the application of special optimizations based on the underlying network topology (e.g. on a hypercube the broadcast takes only O(log2 P) steps, where P is the number of processes).
I believe that your comments are based on a misunderstanding. The goal of the library design was to actually make it efficient and useful for high performance applications. The library allows the use of custom MPI_Datatypes and actually creates them automatically whenever possible, using the serialization library. For those data types for which this is not possible (e.g. the pointers in a linked list) the serialization library is used to serialize the data structure, which is then packed into a message using MPI_Pack.
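Roughly, the usage looks like this (just a sketch; the type particle is made up for the example, and exact names may differ slightly in the snapshot under review):

  #include <boost/mpi.hpp>
  namespace mpi = boost::mpi;

  struct particle {                       // hypothetical fixed-layout type
    double x, y, z;
    int id;

    template<class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/)
    { ar & x & y & z & id; }
  };

  // Declare that a single, flat MPI_Datatype can describe a particle;
  // the library derives that datatype from serialize() and reuses it.
  BOOST_IS_MPI_DATATYPE(particle)

  int main(int argc, char* argv[])
  {
    mpi::environment env(argc, argv);
    mpi::communicator world;

    particle p = { 1.0, 2.0, 3.0, world.rank() };
    mpi::broadcast(world, p, 0);          // native MPI_Bcast with that datatype
  }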
The skeleton&content mechanism is essentially a simple way to create a custom MPI datatype. The "content" is just an MPI_Datatype created for the data members of the object you want to send.
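For a linked list, for example, the usage is roughly (again a sketch, with the same caveat about exact names):

  #include <boost/mpi.hpp>
  #include <list>
  namespace mpi = boost::mpi;

  int main(int argc, char* argv[])
  {
    mpi::environment env(argc, argv);
    mpi::communicator world;

    std::list<int> values(100);

    // Transmit the structure (size, layout) once; receivers resize
    // their list to match.
    mpi::broadcast(world, mpi::skeleton(values), 0);

    // The "content" is an MPI_Datatype addressing the data members in
    // place; it can be reused for repeated transfers of the same list.
    mpi::content c = mpi::get_content(values);
    mpi::broadcast(world, c, 0);
  }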
Another case is when sending, e.g. a std::vector or other array-like data structure of a type for which a custom MPI_Datatype can be constructed. In that case the serialization library is called once to construct the needed MPI_Datatype for one element in the array, and then communication is done using that MPI_Datatype.
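Continuing the sketch from above (same setup, same hypothetical particle type), sending such an array looks like:

  #include <boost/serialization/vector.hpp>
  #include <vector>

  void broadcast_particles(mpi::communicator& world)
  {
    std::vector<particle> buf(1000);
    // The datatype is built once for a single particle; the whole vector
    // is then transferred as 1000 elements of that datatype.
    mpi::broadcast(world, buf, 0);
  }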
You might be worried that the use of the serialization library leads to inefficiencies since in the released versions each element of an array was serialized independently. The recent changes, which are in the CVS HEAD, actually address this issue by providing optimizations for array-like data structures.
I hope these answers address the issues you had in mind. I can elaborate if you want.
The question came up when I looked into mpi/collectives/broadcast.hpp:

  // We're sending a type that does not have an associated MPI
  // datatype, so we'll need to serialize it. Unfortunately, this
  // means that we cannot use MPI_Bcast, so we'll just send from the
  // root to everyone else.
  template<typename T>
  void broadcast_impl(const communicator& comm, T& value, int root,
                      mpl::false_)

If this function gets called, the performance will definitely be suboptimal, as the root sends to everyone else individually. Is this only the case if no MPI_Datatype could be constructed (as for the linked list), or is it called whenever the boost serialization is used?

Regards,

Markus Blatt
--
DUNE -- The Distributed And Unified Numerics Environment <http://www.dune-project.org>