
Hi, I've just joined this mailing list because of this particular discussion :-)
2) have a plan so that the library can eventually range from working on 1 multi-core system to distributed
This is the eventual goal, and I have designed the library to be extensible so that this can be done in the future. With the policy-based design, making use of the open source projects that you mention should be non-intrusive; the library allows the user to supply a handler for the intermediate files (to use kosmosfs or hypertable, for example).
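To check that I read the policy idea correctly, here is a minimal sketch of how I imagine a pluggable intermediate-file handler could work. All names in it (`in_memory_store`, `word_count_job`, `put`, `all`) are placeholders I made up for illustration; they are not the library's actual interface, and a KFS- or hypertable-backed handler is only assumed to be able to expose the same two members.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical storage policy: keeps intermediate (key, value) pairs in memory.
// A distributed backend would be a different type with the same two members.
struct in_memory_store {
    void put(const std::string& key, int value) { data_[key].push_back(value); }
    const std::map<std::string, std::vector<int>>& all() const { return data_; }

private:
    std::map<std::string, std::vector<int>> data_;
};

// Hypothetical job type: the storage policy is a template parameter, so
// swapping the backend does not touch the map/reduce logic.
template <typename IntermediateStore = in_memory_store>
class word_count_job {
public:
    explicit word_count_job(IntermediateStore store = IntermediateStore())
        : store_(std::move(store)) {}

    // Map phase: emit (word, 1) for every whitespace-separated word.
    void map(const std::string& text) {
        std::istringstream in(text);
        std::string word;
        while (in >> word) store_.put(word, 1);
    }

    // Reduce phase: sum the values stored under each key.
    std::map<std::string, int> reduce() const {
        std::map<std::string, int> result;
        for (const auto& kv : store_.all()) {
            int sum = 0;
            for (int v : kv.second) sum += v;
            result[kv.first] = sum;
        }
        return result;
    }

private:
    IntermediateStore store_;
};
```

With something shaped like this, `word_count_job<>` would run on the default in-memory handler, while `word_count_job<some_kfs_store>` (again, a made-up name) would route the intermediate data elsewhere without any change to the job itself.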
I'm going to give it a spin and report back. As it happens, I already have a half-baked distributed backend for both hypertable and KFS that lacks a mapreduce library. This is interesting! Can't stop thinking about it now :-)

best,
Mateusz