Sean Luke wrote:
> Pelle Evensen wrote:
>> Randy Casstevens wrote:
>>> Do some random seeds give better quality results for
>>> MersenneTwisterFast? I have heard that random seeds should be odd
>>> numbers, large numbers, and/or prime numbers, but does it matter for
>>> MersenneTwisterFast?
>>>
>> In short, for any good quality PRNG (the MersenneTwister would be
>> one) the choice of seed does not matter at all as far as "quality" goes.
>
> Actually, many very high-grade RNGs are known to have certain
> pathological seed cases. The crucial issue is: do we know what those
> cases are? If so, we're fine -- just avoid those seed choices.
Using the definition from "Definition 1" of
http://www.iro.umontreal.ca/~lecuyer/myftp/papers/cacm90.pdf (P.
L'Ecuyer, ``Random Numbers for Simulation'', Communications of the ACM,
33 (1990), 85--98),
I distinguish between the *state*, S, and the *seed*, s_0. My point was
that if the API is well designed, s_0 is indeed in S but the set of
possible (reasonable, if you wish) s_0 is a strict subset of S. Well
designed in the sense that the implementer of the PRNG at hand most
likely can make sure that any s_0 is transformed into a unique initial
state for the generator where pathological behaviours are avoided.
> Mersenne Twister is no exception: and it's considered among best of
> the best! MT's internal state is an array of about 600 32-bit
> integers. Its seeding pathology is to fill that array initially with
> lots of zeros. It takes Mersenne Twister a while to work its way out
> of that state, during which time it tends to produce lots of
> relatively poorly random numbers.
And this is exactly why one should distinguish between seed and state;
you may want to be able to access the state to make sure that you can
clone a stream or resume a previous simulation but one should not
manipulate it by direct means, if that makes sense.
The "best of the best" is as usual depending on what you want to use it
for. It does fail some randomness tests;
See page 29 of
http://www.iro.umontreal.ca/~lecuyer/myftp/papers/testu01.pdf
It is possibly also slightly slower (the measurements shown in the
TestU01 paper are from C-code) and has a shorter period than some other
generators tested on pages 28-29.
> This turned out to be a problem for early seeding algorithms common
> among the Mersenne Twister community. The current standard seeding
> algorithm is to take a seed and hand it off to a simple but relatively
> random RNG (in this case, a particular linear congruential generator
> in one of Knuth's books), and pump that generator 600-ish times. Each
> output of the generator gets put into a different slot in the Mersenne
> Twister array. That gives you a nice random collection of numbers,
> and an acceptably small number of initial zeros.
>
> FWIW, my MersenneTwisterFast code follows the current standards.
So this is the seed expansion or seed -> state transformation which
(hopefully) makes sure that you can not set up the MT in any
particularly bad state.
>> The only thing to possibly worry about is if you start many different
>> simulations with nearby seeds. For some generators (depending on how
>> the seed is expanded to the internal state of the generator) this
>> will be a problem and for some others, it won't be. What the state of
>> the Mersenne Twister as implemented in MASON is, I don't know.
>
> Generally MT doesn't have this problem, partly because as an algorithm
> it's pretty robust to that issue, and partly because the Knuth seeding
> procedure described above creates very different initial internal
> array states from very similar random numbers. Overall, most high
> grade RNGs are robust against similar seeds.
A peculiarity about the MersenneTwisterFast class is that the
constructor taking a long and the setSeed(long) (I guess for some
compatibility with java.util.Random) ignores the topmost 32 bits. Since
there is nothing magical about the seed expansion, and Java handles 64
bit arithmetic just fine, why not use a 64-bit generator or two separate
32 bit generators to initialize the state (mt[...]) in an interleaved
fashion? It should still be trivial to guarantee that we won't get a bad
state.
Yet another peculiarity is that all conversions from int to the lower
precision types is made by shifting instead of bitmasking. Is this
faster for most CPU's? If there is an explicit conversion, the masking
will probably take place anyway. I don't know if the javac compiler or
JIT-compiler is clever enough to recognize that the msb:s are all zero.
For a linear congruential generator this makes sense but is there some
results (published or not) that supports the hypothesis that the msb:s
from MT19937 are any "worse" than the lsb:s?
>> A better (or at least safer) approach than the Mersenne Twister would
>> be to have a generator that generates distinct streams. This article
>> discusses the desirability of distinct streams;
>> http://www.iro.umontreal.ca/~lecuyer/myftp/papers/rstream.pdf
> Having a generator which is forkable is a nice feature. But I would
> be very hesitant to rely on an algorithm that doesn't have fairly
> provable features: and the paper you cited is actually a technical
> report for a code library, not an algorithmic description. My
> suggestion about random numbers: BE VERY CAREFUL ABOUT DEVIATING FROM
> WELL-STUDIED ALGORITHMS. I don't get a good feeling about the lack of
> formality in this paper.
I should have pointed to this paper instead;
http://www.iro.umontreal.ca/~lecuyer/myftp/papers/streams00.pdf
<http://www.iro.umontreal.ca/%7Elecuyer/myftp/papers/streams00.pdf> P.
L'Ecuyer, R. Simard, E. J. Chen, and W. D. Kelton, ``An
Objected-Oriented Random-Number Package with Many Long Streams and
Substreams'', Operations Research, 50, 6 (2002), 1073--1075.
This provides an algorithmic description. I don't think anyone should
worry too much about the lack of formality in
http://www.iro.umontreal.ca/~lecuyer/myftp/papers/rstream.pdf. L'Ecuyer
may be the most prolific writer of journal papers on PRNG:s and PRNG
testing. A few of his papers are co-authored with Matsumoto of MT-fame.
Sorry for pointing to the wrong paper; it was mostly the reasoning and
motivation for independent streams I was after.
Sean Luke: Have you measured the performance effects of having the
MersenneTwisterFast implement some sensible interface? (Yeah, so
java.util.Random should *really* be an interface, not a class)?
If one is to do exact replication of some existing program written in a
different environment it may be much easier to replace the PRNG in MASON
than doing it in the original program. Being able to replace the
generator with a quasi-random generator or something easily observed as
deterministic would also simplify some testing and debugging.
> Stream protocols etc. are often implementable directly on top of
> well-regarded generators. MT would be particularly good for this
> given its famously gigantic period. It might be worthwhile to
> investigate the current RNG literature for formal protocols.
Yes, if any one is aware of fast methods of transforming an arbitrary
(but good) generator into one with independent streams, I'd very much
like to hear about it.
It can be done trivially but *not fast* by using any (reasonably strong)
symmetric block cipher like this;
Let $R$ be a simulation-quality PRNG outputting $n$-bit words.
Let $E_k(m)$ be encryption of $k$-bit message $m$ under the key $k$.
Note that $n \le k$ for this to work.
Now for each output of $R$, we run it through $E$ and since we do expect
all permutations to be equally likely, the streams would be independent,
of course under the assumption that the original generator has a period
shorter than $2^k$.
I will hopefully soon find the time to try this with reduced versions of
some common block ciphers. It may very well be that the function call
overhead is so large that can still be considered fast if one does batch
generation of numbers, in a way similar to what the MT19937
implementation does in MASON.
Even though the state space of MT19937 is huge, we don't have any
theoretical guarantees that two different seeds don't make us use two
sequences that are (partially) overlapping.
/Pelle
|