[accumulators] combine/parallel_reduce

Hi,

I've just started to use the accumulators library. I must say that I like it a lot.

However, I found one feature that I'm missing in the interface. In the time of multicore processors, with Threading Building Blocks at hand, it would be nice to be able to use accumulators in parallel computations.

To do so, one would need to combine partial computations. In other words, we need accumulator_set::combine(). This would combine the partial computation state from another accumulator_set. This operation would enable the accumulators to be used with parallel_reduce.

I'm not an expert in statistical computations, but as far as I can see, this should be doable quite easily.

Best regards,
Pavol
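To make the request concrete, here is a minimal sketch of what a combinable accumulator could look like for the simplest statistics (count, sum, min, max). The names `PartialStats`, `add`, `combine`, and `reduce_stats` are hypothetical and are not part of Boost.Accumulators; the reduction is written as a plain recursive split so that it runs without any threading library, assuming that the two halves could instead be handed to separate tasks by something like parallel_reduce.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical partial state. This is NOT a Boost.Accumulators type.
struct PartialStats {
    std::size_t count = 0;
    double sum = 0.0;
    double min = std::numeric_limits<double>::infinity();
    double max = -std::numeric_limits<double>::infinity();

    // Fold a single sample into the state.
    void add(double x) {
        ++count;
        sum += x;
        min = std::min(min, x);
        max = std::max(max, x);
    }

    // Merge the state accumulated over another, disjoint part of the data.
    void combine(const PartialStats& other) {
        count += other.count;
        sum += other.sum;
        min = std::min(min, other.min);
        max = std::max(max, other.max);
    }

    double mean() const {
        assert(count > 0);
        return sum / static_cast<double>(count);
    }
};

// Divide-and-conquer reduction over [first, last). The two recursive calls
// touch disjoint data, so a parallel runtime could execute them concurrently;
// here they simply run one after the other.
PartialStats reduce_stats(const double* first, const double* last, std::size_t grain) {
    assert(grain > 0);
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (n <= grain) {
        PartialStats s;
        for (; first != last; ++first) s.add(*first);
        return s;
    }
    const double* mid = first + n / 2;
    PartialStats left = reduce_stats(first, mid, grain);
    PartialStats right = reduce_stats(mid, last, grain);
    left.combine(right);
    return left;
}

// Convenience overload for a whole vector.
PartialStats reduce_stats(const std::vector<double>& v, std::size_t grain) {
    return reduce_stats(v.data(), v.data() + v.size(), grain);
}
```

Calling `reduce_stats(samples, 1024)` on a vector of samples returns a single `PartialStats` for the whole range. The only requirement on `combine` is that merging two partial states gives the same result as accumulating all samples into one state.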

Pavol Droba wrote:
> Hi,
> I've just started to use the accumulators library. I must say that I like it a lot.

Great, thanks!

> However, I found one feature that I'm missing in the interface. In the time of multicore processors, with Threading Building Blocks at hand, it would be nice to be able to use accumulators in parallel computations.
> To do so, one would need to combine partial computations. In other words, we need accumulator_set::combine(). This would combine the partial computation state from another accumulator_set.
> This operation would enable the accumulators to be used with parallel_reduce.
> I'm not an expert in statistical computations, but as far as I can see, this should be doable quite easily.

Makes sense, and if I recall correctly, this has been discussed before. I even seem to recall Matthias saying that something like this is planned. Matthias?

At any rate, could you open a feature request trac ticket for this so we don't lose it? Thanks.

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com

On 18 Aug 2008, at 07:47, Eric Niebler wrote:
> Pavol Droba wrote:
>> Hi, I've just started to use the accumulators library. I must say that I like it a lot.
> Great, thanks!
>> However, I found one feature that I'm missing in the interface. In the time of multicore processors, with Threading Building Blocks at hand, it would be nice to be able to use accumulators in parallel computations. To do so, one would need to combine partial computations. In other words, we need accumulator_set::combine(). This would combine the partial computation state from another accumulator_set. This operation would enable the accumulators to be used with parallel_reduce. I'm not an expert in statistical computations, but as far as I can see, this should be doable quite easily.
> Makes sense, and if I recall this has been discussed before. I even seem to recall Matthias saying that something like this is planned. Matthias?

Yes, it is planned for the next year. While this is doable for simple estimators (sum, mean, variance, moments, min, max), it is complicated for others (binning analysis for correlated samples: there we need a different data structure for combined accumulators) and impossible for still others (e.g. some median estimators). Since we plan to use the accumulator library on potentially up to 100'000 or more cores, we need to think of a good scalable design before starting the implementation.

> At any rate, could you open a feature request trac ticket for this so we don't lose it? Thanks.

Matthias
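For the "simple estimators" case, mean and variance are the least obvious ones to merge, since a partial variance is taken around a partial mean. The sketch below uses the well-known pairwise update (often attributed to Chan, Golub, and LeVeque) on top of Welford's single-sample update. The `Moments` type and its members are hypothetical names chosen for illustration, not the library's interface, and the variance shown is the population variance (division by n).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mergeable state for mean and variance.
// This is NOT a Boost.Accumulators type.
struct Moments {
    std::size_t n = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the current mean

    // Welford's single-sample update.
    void add(double x) {
        ++n;
        const double delta = x - mean;
        mean += delta / static_cast<double>(n);
        m2 += delta * (x - mean);
    }

    // Pairwise merge of two partial states computed over disjoint samples.
    void combine(const Moments& other) {
        if (other.n == 0) return;
        if (n == 0) { *this = other; return; }
        const double na = static_cast<double>(n);
        const double nb = static_cast<double>(other.n);
        const double total = na + nb;
        const double delta = other.mean - mean;
        // The extra term corrects for the two parts having different means.
        m2 += other.m2 + delta * delta * na * nb / total;
        mean += delta * nb / total;
        n += other.n;
    }

    // Population variance (divides by n, not n - 1).
    double variance() const {
        assert(n > 0);
        return m2 / static_cast<double>(n);
    }
};
```

For example, accumulating 1, 2, 3 into one `Moments` and 4, 5, 6, 7, 8 into another, then calling `combine`, yields the same mean and variance (up to rounding) as accumulating all eight values into a single object.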

Matthias Troyer wrote:
> On 18 Aug 2008, at 07:47, Eric Niebler wrote:
>> Pavol Droba wrote:
>>> Hi, I've just started to use the accumulators library. I must say that I like it a lot.
>> Great, thanks!
>>> However, I found one feature that I'm missing in the interface. In the time of multicore processors, with Threading Building Blocks at hand, it would be nice to be able to use accumulators in parallel computations. To do so, one would need to combine partial computations. In other words, we need accumulator_set::combine(). This would combine the partial computation state from another accumulator_set. This operation would enable the accumulators to be used with parallel_reduce. I'm not an expert in statistical computations, but as far as I can see, this should be doable quite easily.
>> Makes sense, and if I recall this has been discussed before. I even seem to recall Matthias saying that something like this is planned. Matthias?
> Yes, it is planned for the next year. While this is doable for simple estimators (sum, mean, variance, moments, min, max), it is complicated for others (binning analysis for correlated samples: there we need a different data structure for combined accumulators) and impossible for still others (e.g. some median estimators).

I see, yet even if it won't work for all of the accumulators, the feature could be very important.

> Since we plan to use the accumulator library on potentially up to 100'000 or more cores, we need to think of a good scalable design before starting the implementation.

This looks really interesting.

Best regards,
Pavol
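As a small illustration of why combining cannot work for every accumulator, consider the median. If the only state kept per chunk is the chunk's median, no combine function can be correct, because two chunks with the same median can still produce different medians once merged with a third. The sketch below shows this with an exact, sort-based median; the helper names `median` and `merged` are made up for this example, and it says nothing about how any particular median estimator in the library stores its state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Exact median of a non-empty sample (takes a copy because it sorts).
double median(std::vector<double> v) {
    assert(!v.empty());
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return n % 2 ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Concatenate two samples, as if both chunks had gone to one accumulator.
std::vector<double> merged(std::vector<double> a, const std::vector<double>& b) {
    a.insert(a.end(), b.begin(), b.end());
    return a;
}

// A fixed left chunk and two right chunks that share the same median (4).
const std::vector<double> chunk_a{1, 2, 100};
const std::vector<double> chunk_b1{3, 4, 5};
const std::vector<double> chunk_b2{0, 4, 200};

// median(chunk_b1) == median(chunk_b2), yet
// median(merged(chunk_a, chunk_b1)) != median(merged(chunk_a, chunk_b2)),
// so the merged median is not a function of the two partial medians alone.
```

Any combine that sees only the pair of medians (2 and 4) would have to return the same value in both cases, while the true answers differ. A combinable median therefore needs richer per-chunk state than the estimate itself.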

Eric Niebler wrote:
<snip>
>> To do so, one would need to combine partial computations. In other words, we need accumulator_set::combine(). This would combine the partial computation state from another accumulator_set.
>> This operation would enable the accumulators to be used with parallel_reduce.
>> I'm not an expert in statistical computations, but as far as I can see, this should be doable quite easily.
> Makes sense, and if I recall this has been discussed before. I even seem to recall Matthias saying that something like this is planned. Matthias?
> At any rate, could you open a feature request trac ticket for this so we don't lose it? Thanks.

Done.

Regards,
Pavol