
Hi, sorry for opening a new thread, but I only just joined the list. I was pointed to this review process at the beginning of the week and unfortunately did not find time to evaluate the design until now. Yesterday I took a short look at the documentation, and today I investigated some issues that struck me right away. Since I was not on the Boost list until today and am not really familiar with the Boost project as a whole, I apologize in advance if I get some things about Boost wrong for lack of insight. So let's start with my actual evaluation.
- What is your evaluation of the design?
The design is pretty straightforward, so I think everybody familiar with the C interface of MPI will feel at home at once. One thing that struck me immediately is the absence of tags for communication. In the MPI standard they are everywhere and are usually used to distinguish different communications (which may arrive at the target process in arbitrary order). How can programmers make sure that the message a process receives is the message they want to receive at this point of the program? If there is no such mechanism, I would consider this a serious flaw in the design.
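To illustrate the point: in plain MPI, the tag is what lets a receiver pick out a particular message regardless of the order in which messages were sent. A minimal sketch using the raw C API (not Boost.MPI; the tag names and data are made up for illustration, and it would need to be built with mpicxx and run with mpirun -n 2):

```cpp
#include <mpi.h>
#include <cstdio>

// Two logically different messages between the same pair of processes.
// The receiver distinguishes them purely by tag, not by arrival order.
const int TAG_COORDS  = 1;  // hypothetical tag for coordinate data
const int TAG_WEIGHTS = 2;  // hypothetical tag for weight data

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double coords[3]  = {0.0, 1.0, 2.0};
        double weights[3] = {0.5, 0.25, 0.25};
        MPI_Send(coords,  3, MPI_DOUBLE, 1, TAG_COORDS,  MPI_COMM_WORLD);
        MPI_Send(weights, 3, MPI_DOUBLE, 1, TAG_WEIGHTS, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double w[3], c[3];
        // Receive the weights first, even though the coordinates were
        // sent first: the tag selects which message matches.
        MPI_Recv(w, 3, MPI_DOUBLE, 0, TAG_WEIGHTS, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Recv(c, 3, MPI_DOUBLE, 0, TAG_COORDS, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        std::printf("received weights, then coordinates\n");
    }
    MPI_Finalize();
    return 0;
}
```

Without tags (or an equivalent matching mechanism), the receiver in this sketch could not tell the two messages apart.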
- What is your evaluation of the implementation?
One thing I do not understand is why the collective operations use std::vector to collect the data. I had always thought the standard gives no guarantee that the values are stored in contiguous memory (which is what MPI needs), though I believe the 2003 revision added such a guarantee. What if the user wants to communicate from/to a given data structure represented by a plain C array? This is very often the case in scientific computing, where you tend to reuse existing libraries (often implemented in C, or by a novice C++ programmer).

As far as I can see, the implementation uses the Boost serialization framework to send data types that do not correspond to a built-in MPI_Datatype. While this makes it possible to send values between different architectures, I fear that in some cases there is a great efficiency loss: e.g. for non-POD types, broadcasts (and probably other collective operations) do not use MPI's collective operations but independent asynchronous MPI sends and receives. This may bypass optimizations based on the underlying network topology (e.g. on a hypercube a broadcast takes only O(log2 P) steps, where P is the number of processes). This is a knock-out criterion for people doing High-Performance Computing, where efficiency is everything, unless there is a way to send at least one-dimensional arrays the usual way. But I fear that even then, what they gain by using Boost.MPI is not enough to persuade them to use it. Maybe it would be possible to use MPI_Pack or to create custom MPI_Datatypes? I have not tested my statements; this should probably be done to see whether I am right.
- What is your evaluation of the documentation?
The documentation is really good; I was able to understand how to use the framework right away.
- What is your evaluation of the potential usefulness of the library?
In its current state it will probably not be used by people doing High-Performance Computing, due to the efficiency concerns stated above. For people who do not need the best performance, the library is useful as it is, since it is by far easier to use than raw MPI for a C++ programmer.
- Did you try to use the library? With what compiler?
No, unfortunately there was no time.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I read through the documentation and then studied the issues that struck me in the source.
- Are you knowledgeable about the problem domain?
I have been programming seriously in High-Performance Computing since 2001. Currently I do research and implementation work on parallel iterative solvers (especially parallel algebraic multigrid methods).

Regarding whether I think Boost.MPI should be included in Boost: I don't think I am in a position to argue about that, as I do not have enough background knowledge about Boost. But: if you want to address people who are into High-Performance Computing, you should address the issues above before you incorporate it into Boost. If you just want people to get started with parallel programming more easily, go for it. Still, unless efficiency is guaranteed by Boost.MPI, those people will sooner or later have to hand-code and deal with MPI directly to get the efficiency they need.

I hope I could help in the evaluation process.

Best regards,

Markus Blatt

--
DUNE -- The Distributed And Unified Numerics Environment <http://www.dune-project.org>