
Hi,

On Sun, June 19, 2011 22:36, er wrote:
> Hope this can serve as a basis for a conversation:
I'm assisting Eric with the maintenance of Accumulators. I've had a look through the code at the above link and would like to offer the following comments (if I have misunderstood anything, please let me know).

My basic concern with the code is that a map is used to store the counts of the data-points that have been added (the map keys are the data-points, the map values are the counts). In real-world floating-point data it is rare for two data-points to be exactly the same, so in practice the map would hold a single key-value pair for each data-point q_i, of the form (key=q_i, value=1). This is inefficient, because all the map values will be 1 and so carry no information. Also, the memory usage will grow linearly with the number of data-points accumulated, which doesn't seem to be in keeping with the spirit of the Accumulators library.

For these reasons, I'm not convinced that the code should be added to the library in its current state.

Simon.
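
P.S. To make the concern concrete, here is a minimal sketch of the pattern I am describing. It is not the code at the link: the function name count_samples is hypothetical, and I am assuming a std::map<double, std::size_t> keyed on the data-points themselves.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for the approach under discussion:
// each data-point is a map key, and the mapped value is its count.
std::map<double, std::size_t> count_samples(const std::vector<double>& samples)
{
    std::map<double, std::size_t> counts;
    for (double q : samples)
    {
        // operator[] value-initialises a missing entry to 0,
        // so the first sighting of q yields a count of 1.
        ++counts[q];
    }
    return counts;
}
```

If every sample is distinct, as is typical for floating-point measurements, the returned map has one entry per sample and every count is 1, so the storage is proportional to the number of samples. The counts only pay for themselves when values repeat exactly.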