Re: [boost] [random] seeding with arbitrary integers

21 Oct 2009

      Does the documentation say that two different seeds will produce 
different sequences?  If so, is there a stated guarantee when they will 
diverge?  In general, unless you understand the guts of the algorithm -- 
including how a seed is turned into an initial state -- or have been 
given a guarantee by someone who presumably does, you should not assume 
that different seeds will definitely produce different results.  You 
should be able, however, to assume that two rngs seeded with two 
arbitrarily chosen seeds have a very low probability of remaining in 
sync for very long.  One mark of good quality for an implementation of 
an RNG is that the arbitrariness of the choice of those seeds doesn't 
have to be very high: seeds that are small, or negative if interpreted 
as signed, or powers of two or whatever don't cause problems.

When writing RNGs I generally do something like xor the seed with an 
arbitrary mask so that rather unarbitrary seeds become much more 
arbitrary ones.  That is a better solution than adding the minimum value 
to the seed, since it will not map small values into other slightly 
less-small values.  That circumstance makes it much more likely that two 
carelessly chosen seeds will map to the same place.

Addressing this issue, by the way, was most of my "contribution" to the 
initially-published/benchmark implementation of MT19937, that resulted 
in my name plastered all over the Net. A lot of visibility for an hour 
or so of code cleanup.   MT19937  passes very stringent randomness 
tests, both theoretical and pragmatic, under the assumption that the 
starting point is chosen randomly.  There are regions of the state 
space, though, where its behavior is very far from random.  The state 
space is so large, however, the probability of a random starting point 
hitting such a region is vanishingly small and can be ignored -- IF the 
seed is chosen randomly.  In particular if the 19937 bits in the state 
vector have very few 1 bits then the next generation of the state vector 
will also have very few 1 bits (though generally more than before).  
Certain casual choices for the 32 bit seed in the initially circulated 
version of the code could result in just this situation.  I suggested 
some changes (frankly, its been a while and I don't remember exactly 
what) so that one would have to work very hard to figure out a 32 bit 
seed value that would result in one ending up in this Random Number 
Sargasso Sea.

Topher Cooper

Steven Watanabe wrote:
...
AMDG
Currently, some PRNGs cannot be seeded with certain values.
For example, the seed for linear_congruential cannot be
a multiple of the modulus if the offset is zero.  lagged_fibonacci
also cannot be seeded with 0, because it uses minstd_rand0.
Since linear_congruential already tries to wrap the seed to fit
in the range, I'd like to change it so that any seed works.
The other generator that needs to be changed is linear_feedback_shift.
linear_feedback_shift cannot be seeded with a small integer.
What I've done is to add the minimum value to the seed, if
the seed is too small.  I'm not sure that this is a good solution,
since this means that there are a few (and only a few) seeds
that result in identical generators.  Thoughts?  Would it be
better to wrap the values to make them fit in range instead?
At least this would be consistent with linear_congruential.
I've attached a patch with these changes and test cases.
See also: http://svn.boost.org/trac/boost/ticket/3516
which prompted this investigation.
In Christ,
Steven Watanabe
------------------------------------------------------------------------
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost