There's a new version at: https://github.com/DEShawResearch/Random123-Boost that I think addresses your concerns. Click on the link to see the new README. On Fri, May 2, 2014 at 11:43 AM, Steven Watanabe <watanabesj@gmail.com>wrote:
AMDG
On 04/29/2014 10:00 AM, John Salmon wrote:
I've dusted off the code I wrote a couple of years ago and republished
it,
now on github:
<snip>
CBRNGs overcome all these problems. With CBRNGs, the code looks like this (see libs/random/examples/counter_based_example.cpp for a fully worked example):
using namespace boost::random; typedef threefry<4, uint32_t> Prf; normal_distribution nd; Prf::key_type key = {seed}; Prf prf(key); // seed may come from command line for(size_t i=0; i<atoms.size(); ++i){ float boltzmannfactor = sqrt(kT/atoms[i].mass); Prf::domain_type d = {atoms[i].id, timestep, THERMALIZE_CTXT}; counter_based_urng<Prf> cbrng(prf, d); nd.reset(); atoms[i].vx = boltzmannfactor*nd(cbrng); atoms[i].vy = boltzmannfactor*nd(cbrng); atoms[i].vz = boltzmannfactor*nd(cbrng); }
Is there a good reason why this shouldn't be written as:
for(...) { ... std::seed_seq s{ seed, atoms[i].id, timestep, THERMALIZE_CTXT }; counter_based_engine<Prf> cbrng(s); ... }
Seed_seq is the wrong tool for the job. It's a middleman who charges a hefty commission (time and space) and then delivers your product slightly damaged. You give it values that you have carefully chosen so that they're *guaranteed* to be unique. Seed_seq then transforms them into values that are *probably* unique, but there are no guarantees. If you use it a lot, the Birthday Paradox will eat away at that "probably" and after a few billion uses you'll be in for a nasty surprise: two of your generators will produce exactly the same output. Seed_seq is an attempt to work around the problem that many generators (and some of the most popular ones) are difficult to initialize in a way that results in independent streams. The CBRNGs are designed to give you independent streams without the overhead or risk of a seed_seq. I think you'll like the new example code better. From the new README: typedef threefry<4, uint32_t> Prf; counter_based_engine<Prf, 32> cbeng(seed); for(size_t i=0; i<Natoms; ++i){ float boltzmannfactor = sqrt(kT/atoms[i].mass); normal_distribution nd(0., rmsvelocity); Prf::domain_type base = {atoms[i].id, timestep, THERMALIZE_CTXT}; cbeng.restart(base); atoms[i].vx = boltzmannfactor*nd(cbeng); atoms[i].vy = boltzmannfactor*nd(cbeng); atoms[i].vz = boltzmannfactor*nd(cbeng); } You can seed the counter_based_engine directly with an arithmetic seed, or, if you prefer, you can use the the much larger key_type, or you can pre-initialize a Prf and use that. In addition to seed()-ing (which meets all the standard expectations), counter_based_engine can be 'restart()-ed' with a new 'base counter'. This is how you associate different, independent sequences with whatever program elements you like: threads, atoms, whatever.
I don't really like making the details of the algorithm (i.e. key and domain) part of the primary interface.
I think it's important that they be available, but in the new version, you don't have to use them if you don't want to. A user *may* limit himself to using standard seeding methods and avoid restart(). Their one remaining advantage is that they offer constant-time discard(). The additional features that rely on the details of the underlying Prf are entirely optional.
Let's consider the code changes between the two fragments:
<snip>
- Since it models a URNG, cbrng can be passed as an argument to the normal_distribution, nd. In order to make each atom independent,
nd.reset()
is called each time through the loop.
<snip>
FWIW, I don't really think that reset was a good idea in the first place, and it's currently a no-op in all distributions. To run the loop in parallel, you'd need to make a copy of nd in every thread, anyway.
Fair enough. In the example, I've moved the normal distribution into the loop and removed reset().
counter_based_urng ------------------
The counter_based_urng class uses the pseudo-random property of PRFs to perform random number generation in accordance with the requirements of a UniformRandomNumberGenerator. <snip>
counter_based_engine --------------------
The counter_based_engine class is a templated adapter class that models a bona fide RandomNumberEngine. <snip>
Is there really a good reason to make two separate templates? A RandomNumberEngine is a URNG, by definition.
There was, but not any more. There is now only a counter_based_engine. It takes a template parameter that determines the length of individual sequences and the number of permissible 'base counters'. It has a 'restart()' method that resets the 'base counter' to begin a new sequence. And it has a few constructor() and seed() overloads that extend the capabilities of a generic Engine.