
Kevin Martin wrote:
I think you should rename it categorical distribution though as it
only samples a single trial from the distribution. I believe the
Thanks, done so.
It's called discrete_distribution in the new standard. Also, the alias algorithm is more efficient. See attached. (Note that I haven't tried to make this implementation numerically bulletproof.)
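For readers without the attachment, the alias method can be sketched roughly as follows. This is not the attached implementation, just a minimal illustration of the idea (Vose's variant) under my own assumptions: the class name AliasTable and its members are hypothetical, the weights are assumed non-negative with a positive sum, and, like the attachment, it makes no claim to being numerically bulletproof. Setup is O(n) and each draw costs one integer and one real uniform variate.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical illustration of the alias method; not the attached code.
class AliasTable {
public:
    explicit AliasTable(const std::vector<double>& weights)
        : prob_(weights.size()), alias_(weights.size()) {
        const std::size_t n = weights.size();
        assert(n > 0);
        const double total = std::accumulate(weights.begin(), weights.end(), 0.0);
        assert(total > 0.0);

        // Scale the weights so that the average column height is exactly 1.
        std::vector<double> scaled(n);
        std::vector<std::size_t> small, large;
        for (std::size_t i = 0; i < n; ++i) {
            scaled[i] = weights[i] * n / total;
            (scaled[i] < 1.0 ? small : large).push_back(i);
        }

        // Top up each under-full column with mass from an over-full one.
        while (!small.empty() && !large.empty()) {
            const std::size_t s = small.back(); small.pop_back();
            const std::size_t l = large.back(); large.pop_back();
            prob_[s] = scaled[s];
            alias_[s] = l;
            scaled[l] = (scaled[l] + scaled[s]) - 1.0;
            (scaled[l] < 1.0 ? small : large).push_back(l);
        }

        // Leftovers are full columns (up to rounding error).
        for (std::size_t i : large) { prob_[i] = 1.0; alias_[i] = i; }
        for (std::size_t i : small) { prob_[i] = 1.0; alias_[i] = i; }
    }

    // Draw one index: pick a column uniformly, then keep it or take its alias.
    template <class Engine>
    std::size_t operator()(Engine& eng) const {
        std::uniform_int_distribution<std::size_t> pick(0, prob_.size() - 1);
        std::uniform_real_distribution<double> unit(0.0, 1.0);
        const std::size_t i = pick(eng);
        return unit(eng) < prob_[i] ? i : alias_[i];
    }

private:
    std::vector<double> prob_;        // probability of keeping column i
    std::vector<std::size_t> alias_;  // index returned otherwise
};
```

Usage would be along the lines of constructing an AliasTable from a std::vector<double> of weights and calling it with a std::mt19937. Note that the weights do not need to be sorted for this construction.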
discrete does not sort the weights, so sorting has to be done beforehand to compare the two on an equal basis. I've added a small test file to compare categorical and discrete: discrete is 10x faster than categorical at initialization and equally fast at sampling. However, the weights that I work with usually come like this: w <- exp(lw + offset), where offset is chosen so that sum(w) < inf, and that sum depends on the ordering of w (machine precision). So sorting has to be carried out first, before finding offset...
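To make the sort-then-offset step concrete, here is a rough sketch of one way to do it. It is an assumption on my part, not the code from the test file: the helper name normalized_weights is hypothetical, and the offset is taken to be minus the largest log-weight, which is one common choice that keeps every exp() at or below 1. The log-weights are sorted ascending so that the smallest terms are accumulated first, and the result comes back in sorted order rather than in the caller's original order.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper: turn log-weights lw into normalized weights.
// Assumes lw is non-empty and all entries are finite.
std::vector<double> normalized_weights(std::vector<double> lw) {
    assert(!lw.empty());

    // Sort first, so the offset and the summation order are both well defined.
    std::sort(lw.begin(), lw.end());

    // Assumed offset: shift so the largest log-weight maps to exp(0) = 1.
    const double offset = -lw.back();

    std::vector<double> w(lw.size());
    std::transform(lw.begin(), lw.end(), w.begin(),
                   [offset](double x) { return std::exp(x + offset); });

    // Ascending order means the small terms are added before the large ones.
    const double sum = std::accumulate(w.begin(), w.end(), 0.0);
    for (double& x : w) x /= sum;
    return w;
}
```

With this choice the sum is always between 1 and the number of weights, so it cannot overflow, and the returned vector can be handed directly to a sampler that expects plain weights.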