Re: [boost] Review of the Accumulators Library

The design is complex, but not needlessly so. Take the "pass an array to a function" design, and think about how you might answer the following questions:
- What if you wanted the function to calculate 5 different statistics, specified by you, and you wanted to call the function once, not 5 times?
- What if some of those statistics shared some intermediate partial results? You would want the function to only calculate them once, right?
Yes, I see your point. Maybe you could show and compare the two styles, and where each is appropriate, in the "getting started" portion of the documentation.
John Maddock's statistics library, which is also in the "review queue", presumably will have some overlap with the "accumulators" statistics library. Do you see this as an issue that would need to be addressed? How familiar are you with John's library? I wonder what the result would be if two of Boost's best statisticians/mathematicians got together on a design (one can only dream).
I'm also interested in your "data-series" library. I downloaded it a few weeks ago, but it had some dependencies on libraries not currently in the Boost distribution and I was unable to compile it. I did quickly read the documentation, however, and it looks to be built upon some of the same concepts as the "accumulators" library.
I suspect that the ideas behind the two libraries (accumulators and data-series) evolved together, given their similar characteristics in how they handle intermediate values (5-day moving average of a 10-day moving average kind of stuff). I have a background in data-series analysis and I'll make an effort to contribute in a more meaningful way when that one comes up for review.

Tom Brinkman wrote:
John Maddock's statistics library, which is also in the "review queue", presumably will have some overlap with the "accumulators" statistics library. Do you see this as an issue that would need to be addressed? How familiar are you with John's library? I wonder what the result would be if two of Boost's best statisticians/mathematicians got together on a design (one can only dream).
There's no overlap between the two: Eric's library concentrates on data processing, Paul's and mine on the tests you do once you have the results from that processing. The obvious thing at some point in the future is a "bridge" between the two that lets you do the tests straight from the raw data. In other words, these two libraries are each building blocks upon which higher-level libraries can be built.
John.

Tom Brinkman wrote:
The design is complex, but not needlessly so. Take the "pass an array to a function" design, and think about how you might answer the following questions:
- What if you wanted the function to calculate 5 different statistics, specified by you, and you wanted to call the function once, not 5 times?
- What if some of those statistics shared some intermediate partial results? You would want the function to only calculate them once, right?
Yes, I see your point. Maybe you could show and compare the two styles and where each is appropriate in the "getting started" portion of the documentation.
Yes, I think a discussion along these lines would be a good addition to the documentation.
John Maddock's statistics library, which is also in the "review queue", presumably will have some overlap with the "accumulators" statistics library. Do you see this as an issue that would need to be addressed? How familiar are you with John's library? I wonder what the result would be if two of Boost's best statisticians/mathematicians got together on a design (one can only dream).
As John already mentioned, the potential for synergy between our libraries is big. Stay tuned.
I'm also interested in your "data-series" library. I downloaded it a few weeks ago, but it had some dependencies on libraries not currently in the Boost distribution and I was unable to compile it. I did quickly read the documentation, however, and it looks to be built upon some of the same concepts as the "accumulators" library.
Strange, I thought I got Time_series to compile with Boost 1.33.1. I'll look into that. You're right, the two libraries share design elements. They even share some code.
I suspect that the ideas behind the two libraries (accumulators and data-series) evolved together, given their similar characteristics in how they handle intermediate values (5-day moving average of a 10-day moving average kind of stuff). I have a background in data-series analysis and I'll make an effort to contribute in a more meaningful way when that one comes up for review.
Both libraries were the brain-child of Daniel Egloff of ZKB, and they were both designed by Daniel, Dave Abrahams, Matthias Troyer and myself. It's no surprise that they are similar. There was good feedback about Time_series here -- looks like I should get it in the review queue before the queue gets any longer.
--
Eric Niebler
Boost Consulting
www.boost-consulting.com
participants (3)
- Eric Niebler
- John Maddock
- Tom Brinkman