
Hi Guys,

I'm writing quite a number of parallel-distributed applications in the day job, and I'm missing a facility which allows for "automagically" nesting my reductions. The reduction strategy I'm looking for is a networked approach where (if you can imagine a tree):

  0
  |-1
  | |-2
  | |-3
  |
  |-4
    |-5
    |-6

Nodes 2 and 3 send their data to node 1, nodes 5 and 6 send their data to node 4, and eventually nodes 1 and 4 send their (reduced) data to node 0.

My idea is that it should be simple to implement this without having to go through too much trouble with communicators -- or that it could be done automagically by hiding the communicator splits inside a special reduce implementation.

Would something like this be best implemented within Boost.MPI? Or is there a way of implementing this by composing the functionality already available in Boost.MPI?

The reason I ask is that sometimes the reduction step dominates the application's run time, especially when you have quite a number of nodes (90 or so). Being able to parallelize (or nest) the reduction would definitely help, but the cost of supporting that routine over a number of applications seems to warrant a special implementation of the Boost.MPI reduce.

TIA

-- 
Dean Michael Berris | Software Engineer, Friendster, Inc.
blog.cplusplus-soup.com | twitter.com/mikhailberis | linkedin.com/in/mikhailberis | profiles.friendster.com/mikhailberis | deanberris.com
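
To make the data flow in the tree concrete, here is a rough single-process sketch of the nested reduction. It is only an illustration and uses no MPI. The names `tree_reduce` and `example_parents` are made up for this example and are not part of Boost.MPI. It assumes each non-root rank has a lower-numbered parent (true for the numbering in the diagram) and that the reduction operation is associative and commutative.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper, not part of Boost.MPI: simulates a tree-shaped
// reduction inside one process. parent[r] is the rank that rank r sends its
// partial result to; the root (rank 0) has parent -1.
// Assumption: parent[r] < r for every non-root rank.
template <typename T, typename Op>
T tree_reduce(const std::vector<T>& local, const std::vector<int>& parent, Op op) {
    assert(!local.empty() && local.size() == parent.size());
    std::vector<T> partial(local);
    // Walk ranks from highest to lowest, so each rank has already folded in
    // all of its children before it "sends" to its own parent.
    for (std::size_t r = partial.size() - 1; r > 0; --r) {
        const int p = parent[r];
        assert(p >= 0 && static_cast<std::size_t>(p) < r);
        // In a distributed version this step would be a send on rank r and a
        // matching receive-and-combine on rank p.
        partial[p] = op(partial[p], partial[r]);
    }
    return partial[0];
}

// The seven-node topology from the diagram: 2,3 -> 1; 5,6 -> 4; 1,4 -> 0.
inline std::vector<int> example_parents() {
    return {-1, 0, 1, 1, 0, 4, 4};
}
```

With one value per node, say 1 through 7, `tree_reduce(values, example_parents(), std::plus<int>())` gives 28: nodes 1 and 4 each combine their own subtree first, and node 0 only combines two partial results. In a real run the two subtrees would reduce concurrently, which is the point of the nesting. How best to express the same shape with Boost.MPI itself (for instance with `communicator::split` and a `reduce` per group) is the open question above and is not shown here.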