
Hi Phil,
Quoting from the start of your docs:
"The Boost.MapReduce library is a MapReduce implementation across a plurality of CPU cores rather than machines."
Isn't that rather missing the point of what MapReduce is supposed to be about? If I'm limited to one machine, I can write parallel code using the full repertoire of techniques.
You can, and this is basically just one more such technique. Writing multithreaded applications can be difficult, and is often done badly, so this library provides a framework to do the donkey-work and allow the developer to concentrate on solving their problem. Other libraries already exist for single-machine map/reduce (google for "phoenix mapreduce"), and there's an evaluation paper on it at http://csl.stanford.edu/~christos/publications/2007.cmp_mapreduce.hpca.pdf
By re-designing my application to fit into the MapReduce pattern I can potentially scale it over multiple machines. But if I can't scale over multiple machines, why bother?
In that scenario, don't bother, indeed. But if you want an easy way to implement low-lock-contention multithreaded processing, then you might take a look.
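To give a rough idea of what "low-lock-contention" can mean on a single machine, here is a minimal sketch of the general pattern. It is not the Boost.MapReduce API; the names `Counts`, `map_chunk` and `word_count` are made up for this illustration. Each thread counts words into a map that only it writes to, and the partial results are merged on the calling thread after all workers have been joined, so no mutex is needed anywhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical names for illustration only; not part of any library.
using Counts = std::unordered_map<std::string, std::size_t>;

// Map phase for one chunk: count the words in lines [first, last).
// Writes only to `out`, which is owned by exactly one thread.
void map_chunk(const std::vector<std::string>& lines,
               std::size_t first, std::size_t last, Counts& out) {
    for (std::size_t i = first; i < last; ++i) {
        std::istringstream words(lines[i]);
        std::string word;
        while (words >> word) ++out[word];
    }
}

// Splits the input into one contiguous chunk per thread, runs the map
// phase in parallel, then reduces (sums) the per-thread results.
Counts word_count(const std::vector<std::string>& lines, unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<Counts> partial(num_threads);  // one private map per thread
    std::vector<std::thread> workers;
    const std::size_t chunk = (lines.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t first = std::min(lines.size(), t * chunk);
        const std::size_t last = std::min(lines.size(), first + chunk);
        workers.emplace_back(map_chunk, std::cref(lines), first, last,
                             std::ref(partial[t]));
    }
    for (auto& w : workers) w.join();

    // Reduce phase: runs on the calling thread after every worker has
    // finished, so the partial maps can be read without synchronisation.
    Counts total;
    for (const auto& p : partial)
        for (const auto& kv : p) total[kv.first] += kv.second;
    return total;
}
```

Calling `word_count(lines, std::thread::hardware_concurrency())` would use one worker per hardware thread, although that function may return 0 on some platforms, so a caller would need a fallback value. The trade-off in this sketch is extra memory for the per-thread maps in exchange for avoiding a shared, locked container.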
Are you planning to support scaling over multiple machines in the future?
Yes, I am designing and developing a distributed file system that aims to achieve this (see http://craighenderson.co.uk/blog/index.php/tag/distributed-file-system/); integration with any other DFS could do the same. The library is very much in its infancy, but I believe it is useful enough to be a part of Boost in its single-machine state.
Regards -- Craig