Re: [boost] Boost.MapReduce: what next?

31 Aug 2009


Cory Nelson said:
...
People have been splitting tasks
into multiple operations and combining the results on single machines
for ages -- MapReduce doesn't really offer any innovation there.
Well, it provides a very easy framework for implementing parallel algorithms. Multithreading is hard and often done very badly - MR simplifies the task tremendously.
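To illustrate the kind of simplification meant here, the following is a minimal single-machine sketch of the map/reduce pattern using only the C++11 standard library. It is not the Boost.MapReduce interface; the names Counts, map_chunk, reduce_into and word_count are hypothetical and chosen only for this example. The user writes two small sequential functions, and the driver is the only place that touches threads.

```cpp
#include <cstddef>
#include <future>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical names throughout; this is NOT the Boost.MapReduce API.
using Counts = std::map<std::string, std::size_t>;

// Map phase: count the words in one chunk, independently of every other chunk.
Counts map_chunk(const std::string& chunk) {
    Counts counts;
    std::istringstream in(chunk);
    std::string word;
    while (in >> word) ++counts[word];
    return counts;
}

// Reduce phase: merge one partial result into the running total.
void reduce_into(Counts& total, const Counts& partial) {
    for (const auto& kv : partial) total[kv.first] += kv.second;
}

// Driver: launch one asynchronous map task per chunk, then reduce the
// partial results sequentially on the calling thread.
Counts word_count(const std::vector<std::string>& chunks) {
    std::vector<std::future<Counts>> tasks;
    tasks.reserve(chunks.size());
    for (const auto& chunk : chunks) {
        // 'chunks' outlives every task because all futures are consumed below.
        tasks.push_back(std::async(std::launch::async,
                                   [&chunk] { return map_chunk(chunk); }));
    }
    Counts total;
    for (auto& task : tasks) reduce_into(total, task.get());
    return total;
}
```

Calling word_count with a vector of text chunks returns the merged word counts. The map and reduce functions contain no locking or thread management at all, which is the point being made above. A real library would also need to handle scheduling, partitioning of intermediate keys and a parallel reduce phase, none of which this sketch attempts.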
...
The
innovation, and the buzz about it, is that it offers a reliable,
general-purpose, and large-scale distributed implementation of this
very basic idea. If you can accomplish that in this library, I think
there will be _a lot_ more interest.
I think a lot of the MapReduce buzz also has to do with the services
tied to it that further ease common scalability bottlenecks, the big
ones being Google File System and BigTable.  It's really just part of
the bigger ecosystem.
Agreed - the difficulty is in defining where a library ends and the infrastructure begins. This library cannot (and should not, IMO) explode into a distributed file system (an extension to Boost.Filesystem) and a communications library (based on Boost.MPI or Boost.Asio). It is the MapReduce algorithm, intended to sit on top of other infrastructure to provide an overall solution.

-- Craig

Craig Henderson