Re: [boost] Proposal: MapReduce library (single machine)

Hi, I've just joined this mailing list because of this particular discussion :-)
2) have a plan so that the library can eventually range from working on 1 multi-core system to distributed
This is the eventual goal, and I have designed the library to be extensible so that this can be done in the future. With the policy-based design, making use of the open source projects that you mention should be non-intrusive; the library allows the user to supply a handler for the intermediate files (to use kosmosfs or hypertable, for example)
I'm going to give it a spin and report back. As it happens, I already have a half-baked distributed backend for both hypertable and KFS that lacks a mapreduce library. This is interesting! Can't stop thinking about it now :-) best, Mateusz

Mateusz, did you make any progress? I have completed the single-machine implementation; its performance is comparable to other libraries such as Phoenix (http://mapreduce.stanford.edu), and it has been tested by a few people on this list. I am now thinking about how to take the library further and am interested in your experiences. Thanks -- Craig
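For anyone following the thread, the idea of a user-supplied handler for intermediate files can be pictured roughly as below. This is only a minimal sketch of policy-based design in general, not the proposed library's actual interface: the names `in_memory_store`, `word_count_job`, `insert` and `for_each_key` are all hypothetical, and the store keeps data in memory rather than in files. A KFS- or Hypertable-backed store would, under this assumption, just be another type exposing the same two members.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical intermediate-store policy (not the library's API):
// keeps the map phase output in memory, grouped by key.
struct in_memory_store {
    std::map<std::string, std::vector<int>> data;

    // Record one (key, value) pair emitted by the map phase.
    void insert(const std::string& key, int value) { data[key].push_back(value); }

    // Visit every key together with all values recorded for it.
    template <typename Fn>
    void for_each_key(Fn fn) const {
        for (const auto& kv : data) fn(kv.first, kv.second);
    }
};

// Hypothetical job class: the intermediate store is a compile-time policy,
// so a different backend can be swapped in without touching the job logic.
template <typename IntermediateStore = in_memory_store>
class word_count_job {
public:
    explicit word_count_job(IntermediateStore store = IntermediateStore())
        : store_(std::move(store)) {}

    std::map<std::string, int> run(const std::vector<std::string>& documents) {
        // Map phase: emit (word, 1) for every whitespace-separated word.
        for (const auto& doc : documents) {
            std::istringstream words(doc);
            std::string word;
            while (words >> word) store_.insert(word, 1);
        }
        // Reduce phase: sum the values collected for each key.
        std::map<std::string, int> result;
        store_.for_each_key(
            [&result](const std::string& key, const std::vector<int>& values) {
                int total = 0;
                for (int v : values) total += v;
                result[key] = total;
            });
        return result;
    }

private:
    IntermediateStore store_;
};
```

With this shape, `word_count_job<>` uses the in-memory store, while something like `word_count_job<my_kfs_store>` (again, a hypothetical type) would route the intermediate data elsewhere without any change to the map and reduce steps.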
participants (2)
- Craig Henderson
- Mateusz Berezecki