
Bugs in boost random number distributions: Configuration: Boost 1.32.0 Linux 2.6.16, Fedora Core 4 gcc 4.0.2 Preamble #include <boost/random.hpp> using namespace boost; typedef boost::mt19937 random_generator; random_generator r(1); (1) uniform_int<char>(low, high) produces values outside [low, high]. E.g., variate_generator<random_generator&, uniform_int<char> > unichar(r, uniform_int<char>(-100, 100)); (2) uniform_int<short>(low, high) produces values outside [low, high]. E.g., variate_generator<random_generator&, uniform_int<short int> > unishort(r, uniform_int<short int>(-20000, 20000)); (3) uniform_int<long long>(low, high) produces a distribution with the wrong mean. E.g., variate_generator<random_generator&, uniform_int<long long> > unill(r, uniform_int<long long>(0LL, 0xffffffffffLL)); has a mean of about 2.0e9 instead of 5.5e11. (4) uniform_int<long long>(low, high) sometimes takes a LONG time. E.g., variate_generator<random_generator&, uniform_int<long long> > unillint(r, uniform_int<long long>(0LL, 0x10000000000LL)); unillint(); takes about 40 seconds on a 800 MHz Intel PC. (5) uniform_01<random_generator, float> uni01(r) sometimes returns a result outside [0,1). Specifically it can return 1.0f because (2^32-1)/2^32 rounds to 1. I haven't delved into the code to see where all these problems arise. However, it appears that getting some of these examples "right" is complicated because the underlying random generators that boost uses can have arbitrary min() and max() values. In the case of mt19937 (and some of the other random generators) which provides 32 bits of random data, the implementation can be made considerably simpler and faster. See for example http://charles.karney.info/random/ This is specialized to using the Mersenne-Twister generator. However, uniform integer, uniform real, and Bernoulli distributions are then efficient and exact (under the assumption that mt19937 is a perfect source of random bits). Thus the "right" number of random bits are delivered for float, double, and long double distributions; the uniform integer distribution is exact and faster than boost's inexact uniform_smallint; and so on. Perhaps the boost libraries should include specializations for those generators which return a whole number of random bits. Here is the code illustrating the bugs above... ================================================================ /* g++ -O2 -funroll-loops -o boost-testb boost-testb.cpp */ #include <iostream> #include <ctime> #include <boost/random.hpp> #ifdef BOOST_NO_STDC_NAMESPACE namespace std { using ::time; } #endif using namespace std; using namespace boost; typedef mt19937 random_generator; template<typename T> void testint(random_generator& r, size_t n, T x, T y) { double d = 0; variate_generator<random_generator&, uniform_int<T> > uniint(r, uniform_int<T>(x, y)); for (size_t i = 0; i < n; ++i) { T z = uniint(); d += z; if (z < x || z > y) { cout << "Out of bounds " << int(z) << endl; return; } } d /= double(n); if ( abs(d - (double(x)+double(y))/2.0) / ((double(y)-double(x))/sqrt(12.0*n)) > 10.0) { cout << "Mean wrong " << d << endl; } } template<typename T> void testreal(random_generator& r, size_t n) { T x = 0, y = 1; double d = 0; uniform_01<random_generator, T> unireal(r); for (size_t i = 0; i < n; ++i) { volatile T z = unireal(); d += z; if (z < x || z >= y) { cout << "Out of bounds " << z << endl; return; } } d /= double(n); if ( abs(d - (double(x)+double(y))/2.0) / ((double(y)-double(x))/sqrt(12.0*n)) > 10.0) { cout << "Mean wrong " << d << endl; } } int main() { random_generator r(1); testint<char>(r, 100, -100, 100); testint<short int>(r, 100, -20000, 20000); testint<long long>(r, 1000, 0, 0xffffffffffLL); testreal<float>(r, 100000000); variate_generator<random_generator&, uniform_int<long long> > unillint(r, uniform_int<long long>(0LL, 0x10000000000LL)); int t0 = time(0); unillint(); cout << " " << time(0)-t0 << "s per unillint()" << endl; return 0; } -- Charles Karney <ckarney@sarnoff.com> Sarnoff Corporation, Princeton, NJ 08543-5300 URL: http://charles.karney.info Tel: +1 609 734 2312 Fax: +1 609 734 2662