
Hi Robert,

Robert Kubrick wrote:
> I changed the number of samples to 10, matching the original cache_size
> parameter. It works better.

First of all, the behaviour of the density feature is not a bug. If you set cache_size=10, then the first 10 samples are used to determine the bin bounds. That is, the minimum and maximum are extracted from those first 10 samples, and then the bin positions are created according to this rule:

    bin_positions[i] = minimum + (i - 1) * (maximum - minimum) / num_bins

Only after this procedure does the calculation of the statistics begin. That is why the bin values and indexes are not initialized while the sample size is smaller than the cache_size.
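To make the caching behaviour concrete, here is a minimal sketch of the rule described above in plain C++. This is not the actual Boost.Accumulators implementation (which additionally keeps under- and overflow bins); the class and member names are hypothetical and exist only to illustrate why nothing is initialized until the cache is full.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simplified sketch of the density caching rule: the first cache_size
// samples are buffered; once the cache is full, min and max are
// extracted and the bin edges follow
//   bin_positions[i] = minimum + (i - 1) * (maximum - minimum) / num_bins
// (one-based i in the formula above; zero-based indices here).
class density_sketch {
public:
    density_sketch(std::size_t cache_size, std::size_t num_bins)
        : cache_size_(cache_size), num_bins_(num_bins) {}

    void operator()(double sample) {
        if (cache_.size() < cache_size_) {
            cache_.push_back(sample);
            if (cache_.size() == cache_size_) init_bins();
            return;  // still filling the cache: no statistics yet
        }
        add_to_bin(sample);
    }

    bool initialized() const { return !bin_positions_.empty(); }
    const std::vector<double>& bin_positions() const { return bin_positions_; }
    const std::vector<std::size_t>& bin_counts() const { return bin_counts_; }

private:
    void init_bins() {
        const auto [lo, hi] = std::minmax_element(cache_.begin(), cache_.end());
        const double minimum = *lo, maximum = *hi;
        bin_positions_.resize(num_bins_ + 1);
        for (std::size_t i = 0; i <= num_bins_; ++i)
            bin_positions_[i] = minimum + i * (maximum - minimum) / num_bins_;
        bin_counts_.assign(num_bins_, 0);
        for (double s : cache_) add_to_bin(s);  // cached samples count too
    }

    void add_to_bin(double sample) {
        // clamp into [0, num_bins - 1]; the real accumulator keeps
        // separate under/overflow bins instead
        std::size_t i = 0;
        while (i + 1 < num_bins_ && sample >= bin_positions_[i + 1]) ++i;
        ++bin_counts_[i];
    }

    std::size_t cache_size_, num_bins_;
    std::vector<double> cache_;
    std::vector<double> bin_positions_;
    std::vector<std::size_t> bin_counts_;
};
```

With cache_size=10 and fewer than 10 samples pushed, `initialized()` stays false, which is exactly the behaviour you observed.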
> A couple of questions:
> 1) I have a real-time application that keeps track of some statistics based
> on the events received, so the values cannot be pre-determined. Statistics
> are reset by the application at regular time intervals. What happens if I
> leave the cache_size parameter at 1 and then keep adding samples? Is there
> a serious performance penalty?

Leaving the cache_size at one does not make sense, because then min = max and the bin positions will all be the same. I suggest you leave cache_size at two and push two artificial values, acc(min = your choice) and acc(max = your choice). This will have no statistical impact worth worrying about if your sample size is large enough (say, > 100).
And no, there is no serious performance penalty.
> 2) Is there a way to reset the acc object in the example?

What do you mean by that? If you want to set the statistics back to zero, then you definitely need to destroy the object and instantiate a new one.
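The reset-by-reinstantiation idiom looks like this in a minimal sketch. The `counter` type below is a stand-in, not the Boost API: there is no reset() member, so clearing the statistics means replacing the object with a freshly constructed one (assigning a new temporary for automatic storage, or delete/new for heap-allocated objects). The same idiom typically works for an accumulator_set as well.

```cpp
#include <cstddef>

// Toy accumulator illustrating "deallocate and instantiate a new one".
// Names are hypothetical; only the reset idiom matters here.
struct counter {
    std::size_t n = 0;
    double sum = 0.0;
    void operator()(double x) { ++n; sum += x; }
};
```

Usage:

```cpp
counter acc;
acc(1.0);
acc(2.0);
acc = counter();  // statistics are now back to zero
```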
> Or do I have to deallocate the object and instantiate a new one to
> generate a new sampling?

Yes, see above.

Best regards,
Kim