
Cromwell Enage wrote:
What is your evaluation of the design?
Simple and straightforward. Its extensibility is a big plus. I am not a statistician, however, so I cannot judge its usability in that regard.
I'm curious about the design of weighted samples in an accumulator_set. Can operations other than multiplication be applied to the weight?
Each accumulator is free to do anything it sees fit with the weight parameter, including ignoring it completely. But weighted samples have a well-understood meaning in statistics, and giving them a different meaning would probably lead to confusion.
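To make the two cases concrete, here is a minimal sketch in plain C++. It is not the library's code: weighted_mean_sketch and count_sketch are hypothetical names invented for this illustration. The first gives the weight its conventional statistical meaning (a sample with weight w counts as w copies of the value), and the second ignores the weight altogether.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; these are not Accumulators library types.
struct weighted_mean_sketch
{
    double weighted_sum = 0.0;  // running sum of w * x
    double total_weight = 0.0;  // running sum of w

    // Conventional meaning: a sample with weight w counts as w copies of x.
    void add(double x, double w = 1.0)
    {
        weighted_sum += w * x;
        total_weight += w;
    }

    double mean() const
    {
        assert(total_weight > 0.0);  // undefined until a sample is added
        return weighted_sum / total_weight;
    }
};

// An accumulator that accepts a weight but ignores it completely.
struct count_sketch
{
    std::size_t n = 0;

    void add(double /*x*/, double /*w*/ = 1.0) { ++n; }
};
```

Adding the value 1.0 with weight 3.0 and the value 5.0 with weight 1.0 gives a weighted mean of (3*1 + 1*5) / 4 = 2, while the count is simply 2.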
I also echo John Maddock's request for pushing the elements of a sequence to an accumulator. By logical extension, the ability to add objects of the same accumulator_set type together should be considered as well.
The first is a simple extension. The second, no less useful, is less simple, but not impossible, AFAIK.
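As a rough sketch of both requests, assuming a toy accumulator that tracks only a count and a sum: mean_sketch, push_range and operator+= below are hypothetical names made up for this illustration, not the library's interface. Pushing a range is just a loop over the single-sample call. Combining two accumulators needs a merge rule for the internal state, which is trivial for a count and a sum but may not be for every statistic.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; not the Accumulators library interface.
struct mean_sketch
{
    std::size_t count = 0;
    double sum = 0.0;

    // Push a single sample.
    void operator()(double x)
    {
        ++count;
        sum += x;
    }

    // Push every element of the range [first, last).
    template <typename Iterator>
    void push_range(Iterator first, Iterator last)
    {
        for (; first != last; ++first)
            (*this)(*first);
    }

    // Merge another accumulator's state into this one. For a count and a
    // sum, merging is plain addition of the two states.
    mean_sketch& operator+=(mean_sketch const& other)
    {
        count += other.count;
        sum += other.sum;
        return *this;
    }

    double mean() const
    {
        assert(count > 0);  // undefined until a sample is pushed
        return sum / static_cast<double>(count);
    }
};
```

Pushing {1, 2, 3} into one object and {4, 5} into another, then merging the second into the first, gives the same state as pushing all five values into a single object.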
What is your evaluation of the implementation?
Nicely done. I'm sure it's first-rate.
What is your evaluation of the documentation?
Aack.
In addition to the gentler introductory tutorial previously suggested by others, I would also like to see a motivation and/or rationale that is more explanatory than the "old adage", e.g. "Why Not Just A for Loop?" or "Going Beyond std::accumulate".
Agreed.
Usually, I expect the reference documentation to be categorized by class and/or function instead of by header. Staring at a long list of #includes, even as well organized as they are, does not raise my confidence in my ability to comprehend the inner workings of a library like this one.
I actually agree with this, but it's not a problem specific to the Accumulators reference section. I'm using the standard Doxygen/BoostBook integration that is part of Boost's documentation tool chain. It has been observed before that the header-based categorization is less than ideal, but nobody has stepped up to improve it. This would be an opportunity for someone to make a huge contribution to Boost. Takers?
What is your evaluation of the potential usefulness of the library?
Even outside the field of statistics, I sense its great value in large-scale applications. However, for small-scale programs (like the neural network example I recently added to my as-yet-unannounced automata library), it's hard to beat the equivalent for loops in terms of readability and efficiency.
Hand-coded loops are the gold standard for performance, that's true, but ideally a higher-level abstraction should be more readable, not less. And templates let us have the abstraction without the penalty. IMO, it's a matter of familiarity. People new to STL might feel that a for-loop is more readable than a call to std::transform(), for instance, but not me.
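For comparison, here is the same operation written both ways. This is only an illustrative sketch: square, squares_with_loop, squares_with_transform and check_equivalent are names invented for the example. Both functions produce the same output, so which one reads better is largely a matter of what the reader is used to.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper used by both versions.
inline double square(double x) { return x * x; }

// Hand-coded loop.
std::vector<double> squares_with_loop(std::vector<double> const& in)
{
    std::vector<double> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = square(in[i]);
    return out;
}

// The same operation expressed with std::transform.
std::vector<double> squares_with_transform(std::vector<double> const& in)
{
    std::vector<double> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), square);
    return out;
}

// Both versions are expected to agree element for element.
void check_equivalent(std::vector<double> const& in)
{
    assert(squares_with_loop(in) == squares_with_transform(in));
}
```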
Did you try to use the library?
Tried and succeeded.
With what compiler?
GCC 3.4.5 (MinGW special)
Did you have any problems?
Not at this time, no.
How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I'd say a quick reading. Enough to comprehend the tutorials, then a few excursions within the reference material.
Are you knowledgeable about the problem domain?
I am familiar with the basics, so I'm not that intimidated by "kurtosis" and other finer details.
Do you think the library should be accepted as a Boost library?
The shape of its documentation is its biggest weakness right now, but it is outweighed by the robustness of its design. I know that the documentation can be improved and I trust that it will be improved. I vote yes.
Thanks, Cromwell.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com