New subject: [random] random_device only available on which platforms?

4 Aug 2006

      Bugs in boost random number distributions:

Configuration:
    Boost 1.32.0
    Linux 2.6.16, Fedora Core 4
    gcc 4.0.2

Preamble
    #include <boost/random.hpp>
    using namespace boost;
    typedef boost::mt19937 random_generator;
    random_generator r(1);

(1) uniform_int<char>(low, high) produces values outside [low, high].  E.g.,

    variate_generator<random_generator&, uniform_int<char> >
      unichar(r, uniform_int<char>(-100, 100));

(2) uniform_int<short>(low, high) produces values outside [low, high].  E.g.,

    variate_generator<random_generator&, uniform_int<short int> >
      unishort(r, uniform_int<short int>(-20000, 20000));

(3) uniform_int<long long>(low, high) produces a distribution with the
wrong mean. E.g.,

    variate_generator<random_generator&, uniform_int<long long> >
      unill(r, uniform_int<long long>(0LL, 0xffffffffffLL));

has a mean of about 2.0e9 instead of 5.5e11.

(4) uniform_int<long long>(low, high) sometimes takes a LONG time.  E.g.,

    variate_generator<random_generator&, uniform_int<long long> >
      unillint(r, uniform_int<long long>(0LL, 0x10000000000LL));
    unillint();

takes about 40 seconds on a 800 MHz Intel PC.

(5) uniform_01<random_generator, float> uni01(r) sometimes returns a
result outside [0,1).  Specifically it can return 1.0f because
(2^32-1)/2^32 rounds to 1.

I haven't delved into the code to see where all these problems arise.
However, it appears that getting some of these examples "right" is
complicated because the underlying random generators that boost uses can
have arbitrary min() and max() values.

In the case of mt19937 (and some of the other random generators) which
provides 32 bits of random data, the implementation can be made
considerably simpler and faster.  See for example

    http://charles.karney.info/random/

This is specialized to using the Mersenne-Twister generator.  However,
uniform integer, uniform real, and Bernoulli distributions are then
efficient and exact (under the assumption that mt19937 is a perfect
source of random bits).  Thus the "right" number of random bits are
delivered for float, double, and long double distributions; the uniform
integer distribution is exact and faster than boost's inexact
uniform_smallint; and so on.

Perhaps the boost libraries should include specializations for those
generators which return a whole number of random bits.

Here is the code illustrating the bugs above...

================================================================

/* g++  -O2 -funroll-loops -o boost-testb boost-testb.cpp */
#include <iostream>
#include <ctime>
#include <boost/random.hpp>

#ifdef BOOST_NO_STDC_NAMESPACE
namespace std {
  using ::time;
}
#endif

using namespace std;
using namespace boost;

typedef mt19937 random_generator;

template<typename T> void testint(random_generator& r, size_t n, T x, T y) {
  double d = 0;
  variate_generator<random_generator&, uniform_int<T> >
    uniint(r, uniform_int<T>(x, y));
  for (size_t i = 0; i < n; ++i) {
    T z = uniint();
    d += z;
    if (z < x || z > y) {
      cout << "Out of bounds " << int(z) << endl;
      return;
    }
  }
  d /= double(n);
  if ( abs(d - (double(x)+double(y))/2.0) /
       ((double(y)-double(x))/sqrt(12.0*n)) > 10.0) {
    cout << "Mean wrong " << d << endl;
  }
}

template<typename T> void testreal(random_generator& r, size_t n) {
  T x = 0, y = 1;
  double d = 0;
  uniform_01<random_generator, T> unireal(r);
  for (size_t i = 0; i < n; ++i) {
    volatile T z = unireal();
    d += z;
    if (z < x || z >= y) {
      cout << "Out of bounds " << z << endl;
      return;
    }
  }
  d /= double(n);
  if ( abs(d - (double(x)+double(y))/2.0) /
       ((double(y)-double(x))/sqrt(12.0*n)) > 10.0) {
    cout << "Mean wrong " << d << endl;
  }
}

int main()
{
  random_generator r(1);

  testint<char>(r, 100, -100, 100);
  testint<short int>(r, 100, -20000, 20000);
  testint<long long>(r, 1000, 0, 0xffffffffffLL);
  testreal<float>(r, 100000000);

  variate_generator<random_generator&, uniform_int<long long> >
    unillint(r, uniform_int<long long>(0LL, 0x10000000000LL));
  int t0 = time(0);
  unillint();
  cout << "    " << time(0)-t0 << "s   per unillint()"
	    << endl;

  return 0;
}

-- 
Charles Karney <ckarney@sarnoff.com>
Sarnoff Corporation, Princeton, NJ 08543-5300

URL: http://charles.karney.info
Tel: +1 609 734 2312
Fax: +1 609 734 2662

Bugs in boost random number distributions

Charles Karney

Bronek Kozicki

Charles Karney

Damien Fisher

Charles Karney

Neal Becker

Jeremy Day

Charles Karney

Stephan Tolksdorf

AlisdairM

Charles Karney

AlisdairM

Stephan Tolksdorf

Jens Maurer

Beman Dawes

John Maddock

tags

participants (10)