Re: [boost] [Review] Review of the Accumulators library begins today Jan 29

31 Jan 2007


      On Jan 31, 2007, at 7:47 PM, Eric Niebler wrote:
...
james.jones@firstinvestors.com wrote:
...
From: Matthias Troyer <troyer@phys.ethz.ch>
...
These were easy. Trickier are robust estimators of variance and
moments, and even trickier will be median estimators, where I currently
do not see how this could be done without storing most of the
time series.
Isn't this already a problem with sequences? Suppose you're
storing the current median, and a new value comes along. What's
the new median - without checking all the previous values? There
are statistical estimates for this, but I don't know any exact way
other than essentially resorting the data and checking the new
median.
To calculate an exact median, you're correct. Of course, nothing stops
you from writing an accumulator that calculates the exact median by
storing all the samples seen so far. There are approximate median
algorithms that don't need to do that, though, and that's what
Matthias is referring to. Those might be tricky to combine.
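
The exact-median approach described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `exact_median_accumulator` and its interface are hypothetical and are not part of the Accumulators library under review. It keeps every sample in two heaps, so memory grows with the number of samples, which is exactly the cost being discussed.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <stdexcept>
#include <vector>

// Hypothetical illustration; NOT the Boost.Accumulators API.
// Stores all samples: the smaller half in a max-heap, the larger
// half in a min-heap, so the median is always at the heap tops.
class exact_median_accumulator {
public:
    // Feed one sample.
    void operator()(double x) {
        if (lower_.empty() || x <= lower_.top())
            lower_.push(x);
        else
            upper_.push(x);

        // Rebalance so that lower_ holds either the same number of
        // samples as upper_ or exactly one more.
        if (lower_.size() > upper_.size() + 1) {
            upper_.push(lower_.top());
            lower_.pop();
        } else if (upper_.size() > lower_.size()) {
            lower_.push(upper_.top());
            upper_.pop();
        }
    }

    std::size_t count() const { return lower_.size() + upper_.size(); }

    // Exact median of everything seen so far.
    double median() const {
        if (lower_.empty())
            throw std::logic_error("median of an empty sample");
        if (lower_.size() > upper_.size())
            return lower_.top();                      // odd count
        return 0.5 * (lower_.top() + upper_.top());   // even count
    }

private:
    std::priority_queue<double> lower_;  // max-heap: smaller half
    std::priority_queue<double, std::vector<double>,
                        std::greater<double>> upper_;  // min-heap: larger half
};
```

Feeding 1, 5, 9 gives a median of 5; adding 2 moves it to 3.5. The new value depends on the neighbours of the old median, which is why keeping only the current median is not enough.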
Correct. Approximate algorithms such as the p_square methods are
impossible to combine. Many of the algorithms for correlated
sequences of data are also hard or impossible to combine.

Matthias