
At Wed, 10 Nov 2010 02:05:43 -0500, Sid Sacek wrote:
On Sat, Nov 6, 2010 at 3:37 AM, Matthias Troyer <troyer@phys.ethz.ch> wrote:
Nice summary! -- Dave Abrahams
I've been meaning to ask you... (after reading your web page in a rushed manner)
How is your library different from RPC? Unless I missed the point, the library efficiently delivers data to a receiver so that it can process it offline. If you can, please point out some important points that I may not have grokked (I mean, other than the network and OS abstraction layer).
RPC is a *paradigm* in which you make Remote Procedure Calls. The idea is that you call some function foo(bar) over here, and it invokes a function called foo on some representation of bar's value on a remote machine.

MPI (which is not my library) is an *API* with a bunch of available implementations that offers *Message Passing*. That's just about movement of data, and the computation patterns aren't limited to those you can create with an imperative interface like RPC. Massively parallel computations tend to do well with a Bulk Synchronous Parallel (BSP) model, and MPI supports that paradigm very nicely.

Boost.MPI is a library that makes it much nicer to do MPI programming in C++.
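To make that concrete, here is a minimal Boost.MPI sketch (not Dave's code; it assumes Boost.MPI and an MPI implementation are installed, and the message text is made up) in which rank 0 sends a string to rank 1:

#include <boost/mpi.hpp>
#include <boost/serialization/string.hpp>
#include <iostream>
#include <string>

namespace mpi = boost::mpi;

int main(int argc, char* argv[])
{
    mpi::environment env(argc, argv);   // initializes MPI; finalizes on destruction
    mpi::communicator world;            // wraps MPI_COMM_WORLD

    if (world.rank() == 0) {
        std::string msg("hello from rank 0");
        world.send(1, 0, msg);          // destination rank, message tag, value
    } else if (world.rank() == 1) {
        std::string msg;
        world.recv(0, 0, msg);          // source rank, message tag, value to fill in
        std::cout << "rank 1 received: " << msg << std::endl;
    }
    return 0;
}

Run it with at least two processes (e.g. mpirun -np 2) and link against boost_mpi and boost_serialization. Note that there is no remote *call* here at all: the receiver decides what to do with the data, which is the message-passing side of the distinction above.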
Also, do you believe that efficiency in data transmission is important?
I'm sure it depends on the underlying technology and the problem you're applying it to.
Usually, the headers of serialized data add about 5% overhead, unless the data packets are small. The internet and local networks are about 10-gigabit, and are pushing into the 100-gigabit range now. By the time you get a lot of programmers coding to the library, the networks and CPUs will be so fast that I hardly think the small overheads will make any difference. One thing I've noticed for a decade now is that networks are an order of magnitude or two faster than computers; I mean, networks deliver data way faster than a computer can process it.
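For a rough sense of that fixed per-message cost, here is a small Boost.Serialization sketch (it only measures a generic archive header; it is not a claim about what Boost.MPI or any particular transport actually puts on the wire, and the struct is made up):

#include <boost/archive/binary_oarchive.hpp>
#include <iostream>
#include <sstream>

// A tiny payload, so any fixed archive header dominates the total size.
struct Point {
    double x, y, z;
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) { ar & x & y & z; }
};

int main()
{
    std::ostringstream os;
    {
        boost::archive::binary_oarchive oa(os);   // writes its header up front
        const Point p = {1.0, 2.0, 3.0};
        oa << p;
    }   // archive is flushed when it goes out of scope
    std::cout << "raw payload:     " << sizeof(Point) << " bytes\n"
              << "serialized size: " << os.str().size() << " bytes\n";
    return 0;
}

Link against boost_serialization. The relative gap between the two numbers shrinks as the payload grows, which is the point being made above about small packets.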
That's not commonly the case with systems built on MPI. Communication costs tend to be very significant in a system where every node needs to talk to an arbitrary set of other nodes, and that's a common pattern for HPC problems.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
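For the curious, a hedged sketch of that every-node-to-every-node pattern using Boost.MPI's all_to_all collective (the values exchanged are arbitrary; the point is only that each rank communicates with every other rank in one step):

#include <boost/mpi.hpp>
#include <boost/mpi/collectives.hpp>
#include <iostream>
#include <vector>

namespace mpi = boost::mpi;

int main(int argc, char* argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world;

    // Each rank prepares one integer destined for every rank (itself included).
    std::vector<int> outgoing(world.size());
    for (int dest = 0; dest < world.size(); ++dest)
        outgoing[dest] = world.rank() * 100 + dest;

    // One collective call: every rank sends to and receives from every other rank.
    std::vector<int> incoming;
    mpi::all_to_all(world, outgoing, incoming);

    std::cout << "rank " << world.rank() << " received "
              << incoming.size() << " values" << std::endl;
    return 0;
}

As the node count grows, both the number and the volume of messages in a step like this grow with it, which is why communication cost stays on the critical path for such systems.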