
Michael Stevens wrote:
I am very interested in this framework, so I have started to take a look. General accumulators are something I could make use of myself. I really like the conceptual design of the framework, and how it allows accumulators to be inter-dependent.
Great! Glad you like it.
After a quick browse through the documentation I decided to take a look at the code. In particular I was interested in the numerics.
I think the current implementation has some serious numerical weaknesses. I looked at two algorithms, 'sum' and 'variance':
In 'sum' I expected to see a compensated summation; this is numerically a lot better than just adding the numbers together.
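For reference, the standard scheme here is Kahan summation, which carries a running correction term for the low-order bits lost in each addition. A minimal standalone sketch of the idea (my own illustration, not code from the library):

    #include <vector>

    // Kahan (compensated) summation: "comp" holds the low-order
    // bits that were lost when the last sample was added to "sum".
    double compensated_sum(std::vector<double> const &data)
    {
        double sum = 0.0, comp = 0.0;
        for (double x : data)
        {
            double y = x - comp;    // apply the previous correction
            double t = sum + y;     // low-order bits of y are lost here
            comp = (t - sum) - y;   // recover exactly what was lost
            sum = t;
        }
        return sum;
    }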
'sum' is one I implemented. I'm not surprised to learn there are better approaches. The framework allows for different implementation strategies for the statistics, though. Using the extensibility features, you can define your own "compensated_sum" accumulator and declare that it satisfies the "sum" feature (so that "compensated_sum" and "sum" are indistinguishable from the POV of dependency resolution), and even come up with clever syntax for it, like:

    accumulator_set< double, features< sum(compensated) > > acc;

You might even try writing "compensated_sum" yourself and submitting it, just to see what happens. :-)

The questions of what the default "sum" should do, and what alternate implementations should be provided, are open.
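Roughly, such an extension might look like the following, modeled on the extension examples in the documentation. This is an untested sketch: the names (compensated_sum_impl, tag::compensated_sum) are placeholders, and the extra step of registering it as an implementation of the "sum" feature is omitted.

    #include <boost/accumulators/accumulators.hpp>

    namespace boost { namespace accumulators {

    namespace impl {
        // Kahan-compensated sum, written as an accumulator.
        template<typename Sample>
        struct compensated_sum_impl : accumulator_base
        {
            typedef Sample result_type;

            template<typename Args>
            compensated_sum_impl(Args const &args)
              : sum(args[sample | Sample()]), comp()
            {}

            template<typename Args>
            void operator ()(Args const &args)
            {
                Sample y = args[sample] - comp;
                Sample t = sum + y;
                comp = (t - sum) - y;   // low-order bits lost in t
                sum = t;
            }

            result_type result(dont_care) const { return sum; }

        private:
            Sample sum, comp;
        };
    }

    namespace tag {
        struct compensated_sum : depends_on<>
        {
            typedef impl::compensated_sum_impl<mpl::_1> impl;
        };
    }

    namespace extract {
        extractor<tag::compensated_sum> const compensated_sum = {};
    }
    using extract::compensated_sum;

    }} // namespace boost::accumulators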
The 'variance' accumulator has a lazy calculation of variance using the formula \sigma_n^2 = M_n^{(2)} - \mu_n^2. This formula is specifically cited for its poor performance in the presence of rounding error; indeed, it may even return negative results.
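A better-behaved alternative is Welford's online update, which accumulates the squared deviations from the running mean directly and so avoids the cancellation in M_n^{(2)} - \mu_n^2. A minimal sketch of the recurrence (my illustration, not a patch):

    #include <cstddef>

    // Welford's online algorithm: numerically stable running variance.
    struct welford_variance
    {
        std::size_t n = 0;
        double mean = 0.0;
        double m2 = 0.0;   // running sum of squared deviations from the mean

        void operator()(double x)
        {
            ++n;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);   // note: uses the updated mean
        }

        // Population variance, matching \sigma_n^2 above; never negative.
        double variance() const { return n ? m2 / n : 0.0; }
    };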
Any chance of getting your statistics guys to take a look at the numerics of the solutions? If people were to use the library as is, they would be in for nasty surprises!
I'll forward this message off to the stats guys. This would certainly be a good issue to re-raise once the review starts.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com