
Hi John,

On Tue, Apr 21, 2009 at 7:46 AM, John Phillips <phillips@mps.ohio-state.edu> wrote:
> What MPI implementation are you using?
I'm using OpenMPI version 1.2.6 on Gentoo Linux.
> I ask because I believe that at least some of them already use optimizations of this sort when implementing reduce.
Oh, that's good to know.
> That is really more the right place for a communications scheme of this sort.
You may be right; I haven't looked at it that way. :-)
> Boost.MPI just uses the facilities for this that are provided by MPI itself. In principle, the MPI implementation should do the reduce in as efficient a way as it can. What does yours currently do?
Right. I have noticed that when you use the 'world' communicator, the reduce on the root node takes more time than the communication does. That led me to think the cost of reducing the data coming from all the other nodes is dominating the whole execution time. If the reductions happened in a hierarchical fashion, I'm thinking that should limit the sequential part of the computation and leverage the available parallelism in the system. I haven't looked at whether OpenMPI already does this, which is why I was asking whether it might be worth implementing at the Boost.MPI layer, although it would be nice if OpenMPI did it for me in the first place. ;-)

(I've put a rough sketch of the kind of hierarchical reduction I have in mind below my signature.)

--
Dean Michael Berris | Software Engineer, Friendster, Inc.
blog.cplusplus-soup.com | twitter.com/mikhailberis | linkedin.com/in/mikhailberis | profiles.friendster.com/mikhailberis | deanberris.com
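
P.S. Here's that sketch. It's a rough, untested illustration of a two-level reduction on top of Boost.MPI: split the world communicator into small groups, reduce within each group, then reduce the group leaders' partial results down to the global root. The group size of 4 and the plain sum over doubles are arbitrary placeholders for illustration, not anything from my actual code.

// Rough sketch only: reduce within groups of (up to) 4 ranks, then reduce
// the group leaders' partial sums down to world rank 0.
#include <boost/mpi.hpp>
#include <functional>
#include <iostream>

namespace mpi = boost::mpi;

int main(int argc, char* argv[]) {
  mpi::environment env(argc, argv);
  mpi::communicator world;

  double local_value = world.rank() + 1.0;  // stand-in for the real data

  // Level 1: split 'world' into groups of consecutive ranks and reduce
  // inside each group; rank 0 of each group ends up with the partial sum.
  const int group_size = 4;
  mpi::communicator group = world.split(world.rank() / group_size);

  double group_sum = 0.0;
  mpi::reduce(group, local_value, group_sum, std::plus<double>(), 0);

  // Level 2: the group leaders form their own communicator and reduce the
  // partial sums down to the global root (world rank 0).
  mpi::communicator leaders = world.split(group.rank() == 0 ? 0 : 1);

  if (group.rank() == 0) {
    double total = 0.0;
    mpi::reduce(leaders, group_sum, total, std::plus<double>(), 0);
    if (world.rank() == 0)
      std::cout << "total = " << total << '\n';
  }
}

Whether something like this actually beats a single reduce() on 'world' would of course depend on what OpenMPI already does internally for its collectives.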