Re: [math][accumulators] Empirical distribution function

Hi,

On Sun, June 19, 2011 22:36, er wrote:
> Hope this can serve as a basis for a conversation:

I'm assisting Eric with the maintenance of Accumulators. I've had a look through the code at the above link, and would like to offer the following comments (if I have misunderstood anything, please let me know).

My basic concern with the code is that a map is used to store the counts of the data-points that have been added (the map keys are the data-points, the map values are the counts). With real-world floating-point data it is rare for two data-points to be exactly equal, so in practice the map would hold a separate key-value pair for each data-point q_i, of the form (key = q_i, value = 1). This is inefficient, because all the counts will be 1. Also, memory usage will grow linearly with the number of data-points accumulated, which doesn't seem to be in keeping with the spirit of the Accumulators library.

For these reasons, I'm not convinced that the code should be added to the library in its current state.

Simon.
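Concretely, the pattern being critiqued looks something like the sketch below (hypothetical code for illustration; the actual code at the link is not reproduced here, and empirical_cdf is a made-up name). With continuous data nearly every count stays at 1, and the map grows by one node per sample:

#include <cstddef>
#include <iostream>
#include <map>

// Empirical CDF kept as a map from data-point to count.
class empirical_cdf
{
public:
    void add(double x) { ++counts_[x]; ++n_; }

    // F(x) = (number of samples <= x) / n
    double cdf(double x) const
    {
        std::size_t below = 0;
        for (auto it = counts_.begin();
             it != counts_.end() && it->first <= x; ++it)
            below += it->second;
        return n_ ? static_cast<double>(below) / n_ : 0.0;
    }

private:
    std::map<double, std::size_t> counts_; // keys: data-points, values: counts
    std::size_t n_ = 0;
};

int main()
{
    empirical_cdf F;
    for (double x : {0.3, 1.7, 1.7, 2.9})
        F.add(x);
    std::cout << F.cdf(1.7) << '\n'; // prints 0.75
}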

On 8/8/11 7:06 PM, Simon West wrote:
> Hi,
>
> On Sun, June 19, 2011 22:36, er wrote:
>> Hope this can serve as a basis for a conversation:
>
> I'm assisting Eric with the maintenance of Accumulators. I've had a look through the code at the above link, and would like to offer the following comments (if I have misunderstood anything, please let me know).
>
> My basic concern with the code is that a map is used to store the counts of the data-points that have been added (the map keys are the data-points, the map values are the counts). With real-world floating-point data it is rare for two data-points to be exactly equal, so in practice the map would hold a separate key-value pair for each data-point q_i, of the form (key = q_i, value = 1). This is inefficient, because all the counts will be 1. Also, memory usage will grow linearly with the number of data-points accumulated, which doesn't seem to be in keeping with the spirit of the Accumulators library.
>
> For these reasons, I'm not convinced that the code should be added to the library in its current state.
>
> Simon.

Thanks for following up. But it was just, I quote, a "basis for a conversation" at the request of a user (Denis Arnaud), and it failed to go anywhere at the time. Yes, I realize this is not in the spirit of Accumulators, which is to compute statistics iteratively, so that memory usage is fixed given the number of features. This approach would therefore only be suitable for a distribution whose domain is finite. I had some thought of merging this idea with another that I gave a shot at a while back (look for "chi-square table" in boost.users), but which in hindsight I'd do a bit differently. So for now, not much to add, and thanks.
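For the finite-domain case just mentioned, here is a minimal sketch of a fixed-memory variant, assuming the domain is carved into a known number of equal-width bins (binned_cdf is an illustrative name, not Boost.Accumulators API). The counts vector is sized once up front, so memory stays constant however many samples are accumulated:

#include <cstddef>
#include <iostream>
#include <vector>

// Counts over a fixed set of bins covering [lo, hi); memory is O(bins),
// independent of the number of samples.
class binned_cdf
{
public:
    binned_cdf(double lo, double hi, std::size_t bins)
        : lo_(lo), width_((hi - lo) / bins), counts_(bins, 0), n_(0) {}

    void add(double x)
    {
        double t = (x - lo_) / width_;
        std::size_t i = t <= 0.0 ? 0 : static_cast<std::size_t>(t);
        if (i >= counts_.size()) i = counts_.size() - 1; // clamp out-of-range
        ++counts_[i];
        ++n_;
    }

    // Empirical CDF evaluated at the right edge of bin i.
    double cdf(std::size_t i) const
    {
        std::size_t below = 0;
        for (std::size_t j = 0; j <= i && j < counts_.size(); ++j)
            below += counts_[j];
        return n_ ? static_cast<double>(below) / n_ : 0.0;
    }

private:
    double lo_, width_;
    std::vector<std::size_t> counts_;
    std::size_t n_;
};

int main()
{
    binned_cdf F(0.0, 10.0, 10);
    for (double x : {0.3, 1.7, 1.7, 2.9})
        F.add(x);
    std::cout << F.cdf(1) << '\n'; // mass of bins [0,1) and [1,2): 0.75
}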

Hi all, just one comment:

> My basic concern with the code is that a map is used to store the counts of the data-points that have been added (the map keys are the data-points, the map values are the counts). With real-world floating-point data it is rare for two data-points to be exactly equal, so in practice the map would hold a separate key-value pair for each data-point q_i, of the form (key = q_i, value = 1). This is inefficient, because all the counts will be 1. Also, memory usage will grow linearly with the number of data-points accumulated, which doesn't seem to be in keeping with the spirit of the Accumulators library.

The application looks like an analysis of data stored in a histogram. In that case it is possible to use a map with floating-point keys: std::map takes a comparison predicate as a template parameter, and that predicate can define equivalence of floating-point keys through a tolerance specific to the problem domain. In theory, another option for such applications is a multimap.

Regards,
Vadim Stadnik
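A sketch of the predicate-based map Vadim describes (tolerant_less and the eps value are illustrative choices): keys that differ by less than the tolerance compare equivalent and land in the same entry. One caveat: such a comparator only satisfies std::map's strict-weak-ordering requirement when the accumulated keys are well separated relative to the tolerance, since tolerance "equality" is not transitive, so the tolerance really does have to come from the problem domain, as Vadim says.

#include <cstddef>
#include <iostream>
#include <map>

// Less-than with a tolerance: a and b compare equivalent when |a - b| <= eps.
struct tolerant_less
{
    double eps;
    bool operator()(double a, double b) const { return a < b - eps; }
};

int main()
{
    std::map<double, std::size_t, tolerant_less> counts(tolerant_less{0.01});

    // 1.0, 1.001 and 1.002 are within tolerance of the first key inserted,
    // so they all increment the same entry.
    for (double x : {1.0, 1.001, 1.002, 5.0})
        ++counts[x];

    for (const auto& kv : counts)
        std::cout << kv.first << " -> " << kv.second << '\n';
    // prints: 1 -> 3
    //         5 -> 1
}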
participants (3)

- er
- Simon West
- Vadim Stadnik