Re: [boost] Boost.MapReduce: what next?

31 Aug 2009

On Mon, Aug 31, 2009 at 4:19 AM, Craig Henderson
<cdm.henderson@googlemail.com> wrote:
...
I am thinking now about how to progress my MapReduce library that is in the
Boost SandBox. I have completed the single-machine implementation; its
performance is comparable to other libraries such as Phoenix
(http://mapreduce.stanford.edu), and it has been tested by a few people on
this list.
There has, however, been little interest in the library so far from Boost
users/developers, which surprises me. I don't know if that is because, with
the single-machine limitation, people don't see any value that MR can bring
to multi-threaded programming.
So where do I go next with the library? Options that I see are:
...
2. Continue to develop the library in the sandbox into a multi-machine
implementation and work towards submitting that for formal review. Is there
interest in this?

I would say 2 is the best option.  People have been splitting tasks
into multiple operations and combining the results on single machines
for ages -- MapReduce doesn't really offer any innovation there.  The
innovation, and the buzz about it, is that it offers a reliable,
general-purpose, and large-scale distributed implementation of this
very basic idea.  If you can accomplish that in this library, I think
there will be _a lot_ more interest.

I think a lot of the MapReduce buzz also has to do with the services
tied to it that further ease common scalability bottlenecks, the big
ones being Google File System and BigTable.  MapReduce is really just
one part of a bigger ecosystem.

-- 
Cory Nelson
http://int64.org