[random] uniform_int is biased

14 Jan 2008

      I've observed that the sum of one million [-1..+1] random numbers produced 
by boost::uniform_int<> is consistently negative, usually between -3000 
and -5000.

Looking at the source reveals that the [0..N) -> [0..2] case, where N is 
2^32, appears to be done with return (r / F) % 3, where F is 2^24. So, 
unless I'm missing something, this only uses 8 bits of the engine, and since 
256 is not evenly divisible by 3, there is a slight bias of 1/256 which 
corresponds to -3906.25.

Leaving aside the aggressive trimming of the low-order bits, which I believe 
is unnecessary when using mt19937 (is there ever a reason to use any other 
engine?) I still don't understand why we don't just use

    return r / K;

where K is N / 3 and the values that exceed the target range are rejected to 
eliminate the bias entirely. This appears to simplify the code since (a) the 
separate "do we need rejection" test no longer needs to be done and is 
folded into the main case and (b) it uses the high-order bits, so it should 
work for LCGs as is.

Peter Dimov

tags

participants (1)