On 16 Jan 2008, at 16:49, Milosz Marian Hulboj wrote:
Hello,
I was looking through the documentation of boost.accumulators for calculating the variance in one pass. It seems that for the so-called 'immediate variance' some approximation is being introduced. However, I think that it is not necessary (unless there is a hidden purpose?) and the incremental calculation can be done according to the substitution schema described here: http://planetmath.org/encyclopedia/OnePassAlgorithmToComputeSampleVariance.h...
If I remember the original design discussion correctly, the reason was the following: the recurrence relation you refer to is more complex and, in addition, needs a special case for the first sample. One thus needs to test each time whether the sample count is 1 or larger, and do more computations. Since the key requirement of the library was to have the highest possible efficiency, a simplified approximate equation was used that does not require an if and is faster. This is actually okay, since the error introduced is small compared to the unavoidable sampling error. You should have no problem, however, implementing your own variant of the variance, e.g. variance_(accurate_immediate), using the equation you refer to, if you should for some reason need the extra accuracy.

Matthias