[Random] Mersenne Twister initialization problem [PATCH]

Hi. The Mersenne Twister RNG was updated by the original author in 2002, because of a small problem in the seeding procedure, which was that the most significant bit of the seed was not well reflected to the state vector. Read http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html for more details. The version implemented in Boost is the original 1998 one, which contains the problem. The attached patch implements the new seeding procedure, which uses something different than the linear congruential RNG to initialize the state vector. I haven't identified the new procedure to match any generator already implemented in Boost, so I just wrote it there (please correct me if I'm wrong). I have compared the output to the official implementation output, and it's identical. If the patch is OK, could it be applied? Thanks. -- Tiago de Paula Peixoto <tiago@forked.de>

Tiago de Paula Peixoto wrote:
Hi.
The Mersenne Twister RNG was updated by the original author in 2002, because of a small problem in the seeding procedure, which was that the most significant bit of the seed was not well reflected to the state vector. Read http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html for more details.
The version implemented in Boost is the original 1998 one, which contains the problem. The attached patch implements the new seeding procedure, which uses something different than the linear congruential RNG to initialize the state vector. I haven't identified the new procedure to match any generator already implemented in Boost, so I just wrote it there (please correct me if I'm wrong).
I have compared the output to the official implementation output, and it's identical.
This will cause the sequences to differ in the case of default initialization, correct? This will cause test vectors generated by the old implementation to mismatch with new implementation. If so, please note this in the documentation.

On Tue, 2005-01-25 at 09:58 -0500, Neal Becker wrote:
Hi.
The Mersenne Twister RNG was updated by the original author in 2002, because of a small problem in the seeding procedure, which was that the most significant bit of the seed was not well reflected to the state vector. Read http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html for more details.
The version implemented in Boost is the original 1998 one, which contains the problem. The attached patch implements the new seeding procedure, which uses something different than the linear congruential RNG to initialize the state vector. I haven't identified the new procedure to match any generator already implemented in Boost, so I just wrote it there (please correct me if I'm wrong).
I have compared the output to the official implementation output, and it's identical.
This will cause the sequences to differ in the case of default initialization, correct? This will cause test vectors generated by the old implementation to mismatch with new implementation. If so, please note this in the documentation.
Not only in the case of default initialization, but every time, since the new way the seed value populates the state vector is totally different. The only case the output sequence will be the same, is when the user explicitly uses a linear congruential generator, with the same parameters as the current implementation, as a seeding procedure. So it's trivial to revert to the current behavior, if the user wants backwards compatibility: mt19937 rng; linear_congruential<uint32_t, 69069, 0, 0, 0> gen(seed_value); rng.seed(gen); This is identical to the following, in the current (broken) implementation: mt19937 rng; rng.seed(seed_value); And yes, I agree, this should be noted in the documentation. -- Tiago de Paula Peixoto <tiago@forked.de>

The Mersenne Twister RNG was updated by the original author in 2002, because of a small problem in the seeding procedure, which was that the most significant bit of the seed was not well reflected to the state vector. Read http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html for more details.
The version implemented in Boost is the original 1998 one, which contains the problem. The attached patch implements the new seeding procedure, which uses something different than the linear congruential RNG to initialize the state vector. I haven't identified the new procedure to match any generator already implemented in Boost, so I just wrote it there (please correct me if I'm wrong).
The problem we have here is that the behavior of our mersenne twister is now part of the C++ Technical Report 1, we should really report this as a defect, and get it changed there if this is indeed a problem. John.

On Tue, 2005-01-25 at 16:07 +0000, John Maddock wrote:
The Mersenne Twister RNG was updated by the original author in 2002, because of a small problem in the seeding procedure, which was that the most significant bit of the seed was not well reflected to the state vector. Read http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html for more details.
The version implemented in Boost is the original 1998 one, which contains the problem. The attached patch implements the new seeding procedure, which uses something different than the linear congruential RNG to initialize the state vector. I haven't identified the new procedure to match any generator already implemented in Boost, so I just wrote it there (please correct me if I'm wrong).
The problem we have here is that the behavior of our mersenne twister is now part of the C++ Technical Report 1, we should really report this as a defect, and get it changed there if this is indeed a problem.
Regarding the severity of the problem, read the author's page above for more details. What it amounts to, in a nutshell (as I understand it), is that because the old seeding procedure doesn't reflect well the most significant bit of the seed value to the state vector, two seed values that are close to each other may generate series which are also close. This undermines the overall (very good) quality of the generator. It is a minor problem, because it doesn't happen for most pair of seeds, and therefore shouldn't happen very often, but it's a problem nevertheless. Also the version of the algorithm with the new seeding procedure is considered the "standard" by the original authors, and may be what users expect to find. And as a "bonus", the new procedure allows for seed values of zero. The current implementation in boost aborts at runtime if a zero seed is used. Since I'm not a boost developer, I don't think there's anything else I can do, besides posting the problem here with the fix. How difficult is it to change this in the C++ Technical Report? Is there any chance of this fix reaching CVS before then? Should I add this to the sourceforge bug system, so it can be tracked? Thanks. -- Tiago de Paula Peixoto <tiago@forked.de>

Tiago de Paula Peixoto wrote:
The Mersenne Twister RNG was updated by the original author in 2002, because of a small problem in the seeding procedure, which was that the most significant bit of the seed was not well reflected to the state vector. Read http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html for more details.
Boost.Random has the updated seeding algorithm since April 2005. It is now mentioned in Boost's main index.htm changelog, too. The specification in the C++ ISO Working Group's Draft Library Technical Report has been changed as of April 2005. Jens Maurer
participants (4)
-
Jens Maurer
-
John Maddock
-
Neal Becker
-
Tiago de Paula Peixoto