
Hi,

I did not have a chance to participate in the review because I was busy at the time (as is usual lately). I did like the submission when it originally appeared. And now, after reading the docs some more, I can only congratulate Eric on yet another good addition to Boost.

Being the author of a similar (though obviously not as powerful) component myself, I find two features omitted. I must admit that I may easily have missed something. But here we go.

1. Accumulator composition

It would be a pity if I needed to write a new accumulator any time I need to combine features implemented by two existing ones. For example:

min/max average value
min/max change rate

2. Timing policy

In many of my real-life projects that require some statistical value, it almost always needs to be combined with the time of the event. For example:

When did a particular value reach its maximum?
When did a particular value reach its minimum?
When was a particular value last changed?

or, more specifically:

When did the 10 sec throughput average reach its maximum?

A similar idea could be applied to any statistic that models some extreme value. IMO the framework should support this in the form of a timing policy somewhere.

Another general comment. I personally would find an interface oriented around a single changing variable more convenient and more widely applicable (as opposed to the set of samples). A variable could change in many ways (not only by addition or subtraction, and even those could be done more conveniently with operator overloading). Essentially, what I am looking for is something like this:

tracked_var<....> v;
v += 10;
int i = v + 1;
v -= 5;
v *= 2;
cout << average( v ); cout << max( v ); cout << min( v );
cout << max( average( v ) ) << " @" << min( average( v ) ).time();

Regards,
Gennadiy

Gennadiy Rozental wrote:
Hi,
I did not have a chance to participate in the review because I was busy at the time (as is usual lately). I did like the submission when it originally appeared. And now, after reading the docs some more, I can only congratulate Eric on yet another good addition to Boost.
Thanks!
Being the author of a similar (though obviously not as powerful) component myself, I find two features omitted. I must admit that I may easily have missed something. But here we go.
1. Accumulator composition
It would be a pity if I needed to write a new accumulator any time I need to combine features implemented by two existing ones. For example:
min/max average value
min/max change rate
Sorry, I don't understand. A data series has one average. What is the max average?
2. Timing policy. In many of my real-life projects that require some statistical value, it almost always needs to be combined with the time of the event. For example:
When did a particular value reach its maximum?
When did a particular value reach its minimum?
When was a particular value last changed?
or, more specifically:
When did the 10 sec throughput average reach its maximum?
A similar idea could be applied to any statistic that models some extreme value. IMO the framework should support this in the form of a timing policy somewhere.
You could implement this with accumulators either using covariate data, where the times are covariate with the samples, or by using a std::pair< sample, time > as the value type of the accumulator, and defining an appropriate sort criterion.
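As a rough illustration of the second option (a std::pair< sample, time > value type with a sort criterion that looks only at the sample), here is a minimal standard-library sketch. It does not use Boost.Accumulators; the names TimedSample, by_value and timed_max are hypothetical and exist only for this example.

```cpp
#include <chrono>
#include <utility>

// Hypothetical illustration only; none of these names come from Boost.Accumulators.
using Clock = std::chrono::steady_clock;
using TimedSample = std::pair<double, Clock::time_point>;

// Sort criterion: order timed samples by value only, ignoring the time.
struct by_value {
    bool operator()(const TimedSample& a, const TimedSample& b) const {
        return a.first < b.first;
    }
};

// Minimal running maximum over (sample, time) pairs.
class timed_max {
public:
    void operator()(const TimedSample& s) {
        // Strict comparison: on ties the earliest sample is kept.
        if (!has_value_ || by_value{}(best_, s)) {
            best_ = s;
            has_value_ = true;
        }
    }
    bool empty() const { return !has_value_; }
    double value() const { return best_.first; }
    Clock::time_point time() const { return best_.second; }

private:
    TimedSample best_{};
    bool has_value_ = false;
};
```

In this sketch the caller supplies the time together with each sample, which is the "covariate" style; the accumulator itself never reads a clock.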
Another general comment. I personally would find an interface oriented around a single changing variable more convenient and more widely applicable (as opposed to the set of samples). A variable could change in many ways (not only by addition or subtraction, and even those could be done more conveniently with operator overloading). Essentially, what I am looking for is something like this:
tracked_var<....> v;
v += 10;
int i = v + 1;
v -= 5;
v *= 2;
cout << average( v ); cout << max( v ); cout << min( v );
cout << max( average( v ) ) << " @" << min( average( v ) ).time();
Interesting. Each mutating operation on v is considered a new sample? This is a less powerful interface (no way to express covariate data; e.g., where is the time of each sample specified?), but might be cleaner for some applications. It would be pretty simple to implement such an interface on top of accumulator_set.
--
Eric Niebler
Boost Consulting
www.boost-consulting.com
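To make the idea of "each mutating operation is a new sample" concrete, here is a small sketch of a tracked_var wrapper layered over an accumulator. To keep it self-contained, the accumulator is a hypothetical stand-in (simple_stats) rather than accumulator_set, and the stats() accessor is an assumption of this example, not the free-function syntax proposed above.

```cpp
#include <algorithm>
#include <cstddef>

// Stand-in for a real accumulator set (hypothetical, standard library only).
template <typename T>
class simple_stats {
public:
    void operator()(const T& x) {
        if (count_ == 0) {
            min_ = max_ = x;
        } else {
            min_ = std::min(min_, x);
            max_ = std::max(max_, x);
        }
        sum_ += x;
        ++count_;
    }
    std::size_t count() const { return count_; }
    T min() const { return min_; }
    T max() const { return max_; }
    double mean() const {
        return count_ ? static_cast<double>(sum_) / static_cast<double>(count_) : 0.0;
    }

private:
    std::size_t count_ = 0;
    T sum_{}, min_{}, max_{};
};

// Every mutating operator updates the value and records it as a sample.
// The default-constructed initial value is not recorded.
template <typename T>
class tracked_var {
public:
    tracked_var() = default;
    tracked_var& operator=(const T& x)  { value_ = x;  return record(); }
    tracked_var& operator+=(const T& x) { value_ += x; return record(); }
    tracked_var& operator-=(const T& x) { value_ -= x; return record(); }
    tracked_var& operator*=(const T& x) { value_ *= x; return record(); }

    // Reading the current value does not record a sample.
    operator const T&() const { return value_; }
    const simple_stats<T>& stats() const { return stats_; }

private:
    tracked_var& record() { stats_(value_); return *this; }
    T value_{};
    simple_stats<T> stats_;
};
```

With this sketch, the sequence v += 10; v -= 5; v *= 2; records the samples 10, 5 and 10, while int i = v + 1; only reads the current value.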

On 16 Feb 2007, at 12:19, Eric Niebler wrote:
Gennadiy Rozental wrote:
Another general comment. I personally would find an interface oriented around a single changing variable more convenient and more widely applicable (as opposed to the set of samples). A variable could change in many ways (not only by addition or subtraction, and even those could be done more conveniently with operator overloading). Essentially, what I am looking for is something like this:
tracked_var<....> v;
v += 10;
int i = v + 1;
v -= 5;
v *= 2;
cout << average( v ); cout << max( v ); cout << min( v );
cout << max( average( v ) ) << " @" << min( average( v ) ).time();
Interesting. Each mutating operation on v is considered a new sample? This is a less powerful interface (no way to express covariate data; e.g., where is the time of each sample specified?), but might be cleaner for some applications. It would be pretty simple to implement such an interface on top of accumulator_set.
One problem I see with this is: how does the tracked_var know when to record the current value if it does not change? Do I need to do v += 0. if I want to enforce a sample even when the value does not change?
Matthias

On 16 Feb 2007, at 14:57, Matthias Schabel wrote:
One problem I see with this is: how does the tracked_var know when to record the current value if it does not change? Do I need to do
v += 0.
if I want to enforce a sample even when the value does not change?
How about :
v = v;
Ugly, since this means that any copy might trigger a sample. If one does things like that, I would prefer explicit sampling:
v.sample();
but then one could just as well write
acc(v);
Matthias

"Matthias Troyer" <troyer@phys.ethz.ch> wrote in message news:2BF31F01-A3AB-4B1F-9ACB-970CDF56632D@phys.ethz.ch...
On 16 Feb 2007, at 12:19, Eric Niebler wrote:
Gennadiy Rozental wrote:
Another general comment. I personally would find an interface oriented around a single changing variable more convenient and more widely applicable (as opposed to the set of samples). A variable could change in many ways (not only by addition or subtraction, and even those could be done more conveniently with operator overloading). Essentially, what I am looking for is something like this:
tracked_var<....> v;
v += 10;
int i = v +1;
v -= 5;
v *= 2;
cout << average( v ); cout << max( v ); cout << min( v );
cout << max( average( v ) ) << " @" << min( average( v ) ).time();
Interesting. Each mutating operation on v is considered a new sample? This is a less powerful interface (no way to express covariate data; e.g., where is the time of each sample specified?), but might be cleaner for some applications. It would be pretty simple to implement such an interface on top of accumulator_set.
One problem I see with this is: how does the tracked_var know when to record the current value if it does not change? Do I need to do
v += 0.
if I want to enforce a sample even when the value does not change?
How do you do it with the current interface? IMO it's a comparatively rare case, and both v += 0 and v = v should work.
Gennadiy

On 16 Feb 2007, at 17:30, Gennadiy Rozental wrote:
One problem I see with this is: how does the tracked_var know when to record the current value if it does not change? Do I need to do
v += 0.
if I want to enforce a sample even when the value does not change?
How do you do it with the current interface?
IMO it's a comparatively rare case, and both v += 0 and v = v should work.
It's actually a pretty common case. If you perform a Monte Carlo simulation and you reject an update to your configuration, you will need to measure the current value again.
Matthias
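The Monte Carlo point can be illustrated with a small deterministic sketch: when a proposed update is rejected, the unchanged current value still has to be recorded, otherwise the mean is biased. The helper run_chain, the struct mc_means and the hard-coded accept/reject flags are all hypothetical and stand in for a real simulation.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Means computed under two sampling rules (hypothetical illustration).
struct mc_means {
    double every_step;     // sample after every step, accepted or rejected
    double accepted_only;  // sample only when the value actually changed
};

// Each proposal is (proposed value, accepted?).
inline mc_means run_chain(double start,
                          const std::vector<std::pair<double, bool>>& proposals) {
    double current = start;
    double sum_all = 0.0, sum_acc = 0.0;
    std::size_t n_all = 0, n_acc = 0;
    for (const auto& p : proposals) {
        if (p.second) {            // accepted: the configuration changes
            current = p.first;
            sum_acc += current;
            ++n_acc;
        }
        sum_all += current;        // rejected steps re-measure the old value
        ++n_all;
    }
    return { n_all ? sum_all / static_cast<double>(n_all) : 0.0,
             n_acc ? sum_acc / static_cast<double>(n_acc) : 0.0 };
}
```

For the proposals (2, accepted), (5, rejected), (5, rejected), (4, accepted) the per-step samples are 2, 2, 2, 4, so the two rules give different means. This is why an interface that samples only on change needs some way to force a repeat sample.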

"Matthias Troyer" <troyer@phys.ethz.ch> wrote in message news:CD2EF89F-C4F5-4DC7-B880-3168CAAC948E@phys.ethz.ch...
On 16 Feb 2007, at 17:30, Gennadiy Rozental wrote:
One problem I see with this is: how does the tracked_var know when to record the current value if it does not change? Do I need to do
v += 0.
if I want to enforce a sample even when the value does not change?
How do you do it with the current interface?
IMO it's a comparatively rare case, and both v += 0 and v = v should work.
It's actually a pretty common case. If you perform a Monte Carlo simulation and you reject an update to your configuration, you will need to measure the current value again.
Well, probably in your problem domain, not mine ;) I would guess that on a complete map of projects that need to keep statistics of some values, Monte Carlo simulation may not be that prominent. Anyway, IMO almost any solution would be good enough for me. v.update() would work too.
Gennadiy

"Matthias Troyer" <troyer@phys.ethz.ch> wrote in message news:B503A931-963F-4F5B-8DCF-C6617D48EAAD@phys.ethz.ch...
On 16 Feb 2007, at 19:24, Gennadiy Rozental wrote:
Anyway, IMO almost any solution would be good enough for me. v.update() would work too.
Why is
acc(v);
a problem if
v.update();
is fine?
I never said it's a problem. But from what I understand it's a completely different thing. acc(0) would be closer to v += 0; acc(v) is more like v *= 2;
Gennadiy

On 16 Feb 2007, at 19:47, Gennadiy Rozental wrote:
"Matthias Troyer" <troyer@phys.ethz.ch> wrote in message news:B503A931-963F-4F5B-8DCF-C6617D48EAAD@phys.ethz.ch...
On 16 Feb 2007, at 19:24, Gennadiy Rozental wrote:
Anyway, IMO almost any solution would be good enough for me. v.update() would work too.
Why is
acc(v);
a problem if
v.update();
is fine?
I never said it's a problem. But from what I understand it's a completely different thing. acc(0) would be closer to v += 0; acc(v) is more like v *= 2;
Now I'm really confused. acc is the accumulator, recording features like sum, min, max, mean, median, variance. In what sense is acc(0) like v += 0 and acc(v) like v *= 2????

"Matthias Troyer" <troyer@phys.ethz.ch> wrote in message news:2F2BAA4A-9540-49EF-80F5-3E22D0A6B615@phys.ethz.ch...
On 16 Feb 2007, at 19:47, Gennadiy Rozental wrote:
"Matthias Troyer" <troyer@phys.ethz.ch> wrote in message news:B503A931-963F-4F5B-8DCF-C6617D48EAAD@phys.ethz.ch...
On 16 Feb 2007, at 19:24, Gennadiy Rozental wrote:
Anyway, IMO almost any solution would be good enough for me. v.update() would work too.
Why is
acc(v);
a problem if
v.update();
is fine?
I never said it's a problem. But from what I understand it's a completely different thing. acc(0) would be closer to v += 0; acc(v) is more like v *= 2;
Now I'm really confused. acc is the accumulator, recording features like sum, min, max, mean, median, variance. In what sense is acc(0) like v += 0 and acc(v) like v *= 2????
Umm, I may be wrong. Here is the correct correspondence (<==> means equivalent):
accumulator_set<....> acc; tracked_var<....> v;
acc(s1); <==> v = s1;
acc(s2); <==> v = s2;
acc(s3); <==> v = s3;
v += s1 doesn't really have anything corresponding in the current interface. Maybe something like this:
T value = 0;
accumulator_set<....> acc; tracked_var<....> v;
value += s1; acc(value); <==> v += s1;
value += s2; acc(value); <==> v += s2;
value += s3; acc(value); <==> v += s3;
value -= s4; acc(value); <==> v -= s4;
value += 0; acc(value); <==> v += 0;
If you think of this as collecting samples of some variable's values, v += 0 looks pretty natural. If you think of it as accumulating a set of values, acc(value) looks better. It doesn't matter to me either way. It still does the same thing.
Gennadiy

One problem I see with this is: how does the tracked_var know when to record the current value if it does not change? Do I need to do
v += 0.
if I want to enforce a sample even when the value does not change?
There's no guarantee for an arbitrary type that operator+= doesn't change v...
Matthias

"Eric Niebler" <eric@boost-consulting.com> wrote in message news:45D611ED.1050300@boost-consulting.com...
Gennadiy Rozental wrote:
Hi,
I did not have a chance to participate in the review because I was busy at the time (as is usual lately). I did like the submission when it originally appeared. And now, after reading the docs some more, I can only congratulate Eric on yet another good addition to Boost.
Thanks!
Being the author of a similar (though obviously not as powerful) component myself, I find two features omitted. I must admit that I may easily have missed something. But here we go.
1. Accumulator composition
It would be a pity if I needed to write a new accumulator any time I need to combine features implemented by two existing ones. For example:
min/max average value
min/max change rate
Sorry, I don't understand. A data series has one average. What is the max average?
You seem to be looking at this from the perspective of a physical experiment. I am mostly dealing with real-time data. I track samples of a particular variable over a period of time. Both the variable itself and all statistics derived from the variable's samples are functions of time. The value of a statistic (in the above example, the average) is a derived variable, to which I could apply another statistic (in the above example, I want to track its maximum and minimum). This is what I mean by accumulator composition. It is very similar to function composition: Average(v) is a function of v, Max(x) is a function of x, and Max(Average(v)) is a composition.
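The composition Max(Average(v)) described here can be sketched as an outer statistic that is fed the inner statistic's current result after every sample. This is a minimal standard-library illustration; running_mean, running_max and composed are hypothetical names, not part of the library under review.

```cpp
#include <cstddef>

// Inner statistic: running average of all samples seen so far.
class running_mean {
public:
    void operator()(double x) { sum_ += x; ++n_; }
    double result() const { return n_ ? sum_ / static_cast<double>(n_) : 0.0; }

private:
    double sum_ = 0.0;
    std::size_t n_ = 0;
};

// Outer statistic: running maximum of whatever it is fed.
class running_max {
public:
    void operator()(double x) {
        if (!seen_ || x > max_) { max_ = x; seen_ = true; }
    }
    double result() const { return max_; }

private:
    double max_ = 0.0;
    bool seen_ = false;
};

// Outer(Inner(v)): after each sample, Inner's current result becomes
// a new sample for Outer.
template <typename Outer, typename Inner>
class composed {
public:
    void operator()(double x) {
        inner_(x);
        outer_(inner_.result());
    }
    double result() const { return outer_.result(); }
    const Inner& inner() const { return inner_; }

private:
    Inner inner_;
    Outer outer_;
};
```

For the samples 4, 10, 1 the running average takes the values 4, 7, 5, so composed<running_max, running_mean> reports 7 while the final average is 5.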
2. Timing policy. In many of my real-life projects that require some statistical value, it almost always needs to be combined with the time of the event. For example:
When did a particular value reach its maximum?
When did a particular value reach its minimum?
When was a particular value last changed?
or, more specifically:
When did the 10 sec throughput average reach its maximum?
A similar idea could be applied to any statistic that models some extreme value. IMO the framework should support this in the form of a timing policy somewhere.
You could implement this with accumulators either using covariate data, where the times are covariate with the samples, or by using a std::pair< sample, time > as the value type of the accumulator, and defining an appropriate sort criterion.
I am not an expert in your library, so could you present a working example? One note, though: time is different from weight. A weight should be supplied along with each value. An accumulator should be able to deduce the current time by itself, without the user's involvement (based on some timing policy). Unless I misunderstand things, it should be enough for you to support a policy-based last_update_time accumulator and the ability to compose accumulators that I refer to in item 1 to solve all timing needs. last_update_time keeps track of the last time a sample was added to the set. To get the time a variable reached its maximum, I would create a composition from the last_update_time and max accumulators.
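One way to read the timing-policy request is an accumulator that takes the clock as a template parameter and reads the time itself whenever its extreme value changes. The sketch below assumes that reading; steady_clock_policy, manual_clock_policy and max_with_time are hypothetical names, and nothing here reflects an existing feature of the library.

```cpp
#include <chrono>

// Timing policy: any type with a time_type and a static now().
struct steady_clock_policy {
    using time_type = std::chrono::steady_clock::time_point;
    static time_type now() { return std::chrono::steady_clock::now(); }
};

// Deterministic policy for tests: a counter advanced by hand.
struct manual_clock_policy {
    using time_type = long;
    static time_type& tick() { static time_type t = 0; return t; }
    static time_type now() { return tick(); }
};

// Running maximum that remembers when the maximum was reached.
template <typename T, typename TimingPolicy = steady_clock_policy>
class max_with_time {
public:
    using time_type = typename TimingPolicy::time_type;

    void operator()(const T& x) {
        if (!seen_ || max_ < x) {
            max_ = x;
            when_ = TimingPolicy::now();  // time is read here, not passed by the caller
            seen_ = true;
        }
    }
    const T& value() const { return max_; }
    time_type time() const { return when_; }

private:
    T max_{};
    time_type when_{};
    bool seen_ = false;
};
```

The caller only ever writes m(sample); swapping the policy changes where the time comes from without touching the call sites.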
Another general comment. I personally would find an interface oriented around a single changing variable more convenient and more widely applicable (as opposed to the set of samples). A variable could change in many ways (not only by addition or subtraction, and even those could be done more conveniently with operator overloading). Essentially, what I am looking for is something like this:
tracked_var<....> v;
v += 10;
int i = v +1;
v -= 5;
v *= 2;
cout << average( v ); cout << max( v ); cout << min( v );
cout << max( average( v ) ) << " @" << min( average( v ) ).time();
Interesting. Each mutating operation on v is considered a new sample? This is a less powerful interface (no way to express covariate data;
Not necessarily: v += make_pair(5,2) should work. On the other hand, this interface allows one to refer to the "current value" of the variable, which simplifies operations performed with it.
e.g., where is the time of each sample specified?),
No. I expect time to be collected automatically.
but might be cleaner for some applications. It would be pretty simple to implement such an interface on top of accumulator_set.
I would like to see this as part of your framework.
Gennadiy

participants (4)
- Eric Niebler
- Gennadiy Rozental
- Matthias Schabel
- Matthias Troyer