[random] uniform_int is biased

I've observed that the sum of one million [-1..+1] random numbers produced by boost::uniform_int<> is consistently negative, usually between -3000 and -5000. Looking at the source reveals that the [0..N) -> [0..2] case, where N is 2^32, appears to be done with return (r / F) % 3, where F is 2^24. So, unless I'm missing something, this only uses 8 bits of the engine, and since 256 is not evenly divisible by 3, there is a slight bias of 1/256 which corresponds to -3906.25. Leaving aside the aggressive trimming of the low-order bits, which I believe is unnecessary when using mt19937 (is there ever a reason to use any other engine?) I still don't understand why we don't just use return r / K; where K is N / 3 and the values that exceed the target range are rejected to eliminate the bias entirely. This appears to simplify the code since (a) the separate "do we need rejection" test no longer needs to be done and is folded into the main case and (b) it uses the high-order bits, so it should work for LCGs as is.
participants (1)
-
Peter Dimov