
On 1/31/07, james.jones@firstinvestors.com <james.jones@firstinvestors.com> wrote:
From: Matthias Troyer <troyer@phys.ethz.ch>
These were easy. Robust estimators of variance and moments are trickier, and median estimators will be trickier still: I currently do not even see how this could be done without storing most of the time series.
Isn't this already a problem with sequences? Suppose you're storing the current median, and a new value comes along. What's the new median, without checking all the previous values? There are statistical estimates for this, but I don't know any exact way other than essentially re-sorting the data and checking the new median.
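To make the point above concrete, here is a minimal Python sketch of an exact running median. It avoids a full re-sort on each update by keeping two heaps, but it still has to keep every value seen so far, which is the storage cost being discussed. The class name `RunningMedian` and its methods are hypothetical, chosen for illustration; this is not how the Accumulator library does it.

```python
import heapq


class RunningMedian:
    """Exact running median. Hypothetical helper: keeps every value seen."""

    def __init__(self):
        self._low = []   # max-heap (values negated) holding the smaller half
        self._high = []  # min-heap holding the larger half

    def add(self, x):
        # Push onto the lower half, then move its largest element up,
        # so every element of _low is <= every element of _high.
        heapq.heappush(self._low, -x)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        # Rebalance: _low holds the same number of elements as _high,
        # or exactly one more.
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low:
            raise ValueError("no data")
        if len(self._low) > len(self._high):
            return -self._low[0]
        return (-self._low[0] + self._high[0]) / 2

    def stored(self):
        # Number of values retained: grows linearly with the input.
        return len(self._low) + len(self._high)
```

Each `add` costs O(log n) time, but `stored()` always equals the number of values added, so memory grows with the length of the series.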
Good point. Combining partial results may not be possible for all accumulated statistics or there may be a cost to making a statistic "combinable". Perhaps a refinement of the Accumulator concept would be needed. The idea of the Accumulator library is to implement online algorithms; is there a corresponding word for online algorithms for which partial results can be combined?

- Ben

PS For reference, these two pages may be of use:
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Algorithm_I...
http://en.wikipedia.org/wiki/Sum_of_normal_distributions
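As an illustration of a statistic that can be made "combinable", here is a minimal Python sketch for mean and variance. Each accumulator carries a count, a running mean, and the sum of squared deviations, and two partial results are merged with the standard pairwise update formula. The class `MeanVariance` and its `combine` method are hypothetical names for this example, not part of the Accumulator library, and the sketch assumes the sample (n - 1) variance is wanted.

```python
class MeanVariance:
    """Online mean/variance whose partial results can be merged (hypothetical)."""

    def __init__(self):
        self.n = 0       # number of values seen
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # sum of squared deviations from the mean

    def add(self, x):
        # Welford-style single-value update.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def combine(self, other):
        # Merge two partial results without revisiting the data.
        out = MeanVariance()
        out.n = self.n + other.n
        if out.n == 0:
            return out
        delta = other.mean - self.mean
        out.mean = self.mean + delta * other.n / out.n
        out.m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / out.n
        return out

    def variance(self):
        # Sample variance; needs at least two values.
        if self.n < 2:
            raise ValueError("need at least two values")
        return self.m2 / (self.n - 1)
```

The cost of combinability here is small: one extra number (`n`) beyond what is needed to report the result. For the median, no comparably small summary is known to give an exact answer.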