February 2009


Sean Luke
Wed, 4 Feb 2009 23:47:40 -0500

Pelle Evensen wrote:

> Using the definition from "Definition 1" of (P. L'Ecuyer, ``Random
> Numbers for Simulation'', Communications of the ACM, 33 (1990),
> 85--98), I distinguish between the *state*, S, and the *seed*, s_0.

Perhaps this is getting pedantic. By "seed" I mean "the data with which
you initialize the generator". By "state" I mean "the internal state of
the generator at any particular point."

> The "best of the best" is as usual depending on what you want to use it
> for.

I meant that it is very highly regarded, as the paper you cite seems to
suggest as well.

> A peculiarity about the MersenneTwisterFast class is that the
> constructor taking a long and the setSeed(long) (I guess for some
> compatibility with java.util.Random) ignores the topmost 32 bits.

This was very much intentional: it's algorithmically identical to the
standard Mersenne Twister algorithm (see,
and so it generates identical output for comparison.

You can always seed MTF's array directly. Or make up your own
constructor in a subclass.

BTW, java.util.Random only uses the bottom 48 bits.
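
As a small illustration of the point about seed bits, the sketch below checks that two java.util.Random instances whose seeds differ only above bit 47 produce the same stream, and shows what keeping only the low 32 bits of a long looks like. The class and method names (SeedBitsDemo, sameStream, low32) are made up for this example, and low32 is only an illustration of truncation, not a copy of MersenneTwisterFast's seeding code.

```java
import java.util.Random;

final class SeedBitsDemo {
    /** True if two java.util.Random instances built from the seeds agree on the first n ints. */
    static boolean sameStream(long seedA, long seedB, int n) {
        Random a = new Random(seedA);
        Random b = new Random(seedB);
        for (int i = 0; i < n; i++) {
            if (a.nextInt() != b.nextInt()) {
                return false;
            }
        }
        return true;
    }

    /** Keeps only the low 32 bits of a long seed (illustrative truncation, not MTF's own code). */
    static int low32(long seed) {
        return (int) seed;
    }

    public static void main(String[] args) {
        long seed = 0x123456789ABCL;               // fits in 48 bits
        long highBitsChanged = seed | (1L << 60);  // differs only above bit 47

        // java.util.Random discards everything above bit 47, so the streams match.
        System.out.println(sameStream(seed, highBitsChanged, 1000));

        // A 32-bit truncation also discards bit 40, so both seeds collapse to one int.
        System.out.println(low32(seed) == low32(seed | (1L << 40)));
    }
}
```

If the upper bits of a long seed matter to you, this is the kind of check that shows whether a given generator actually sees them.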

> Yet another peculiarity is that all conversions from int to the lower
> precision types is made by shifting instead of bitmasking. Is this
> faster for most CPU's? If there is an explicit conversion, the masking
> will probably take place anyway. I don't know if the javac compiler or
> JIT-compiler is clever enough to recognize that the msb:s are all zero.

That I can't say. MTF is the way it is because it's based on earlier
code which was in turn based on , well-regarded code. I am inclined to
be conservative about changes. Try modding it and tell me what you find.
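
For anyone who does try modding it, one thing to keep in mind is that shifting and masking do not select the same bits. The sketch below is not taken from MersenneTwisterFast; the names (NarrowingDemo, byShift, byMask, byCast) are hypothetical and it only shows which bits of a 32-bit word each style of narrowing keeps. It says nothing about which one is faster.

```java
final class NarrowingDemo {
    /** Narrow by shifting: keeps the 8 most significant bits of the 32-bit word. */
    static byte byShift(int word) {
        return (byte) (word >>> 24);
    }

    /** Narrow by masking: keeps the 8 least significant bits of the word. */
    static byte byMask(int word) {
        return (byte) (word & 0xFF);
    }

    /** A plain narrowing cast keeps the same low 8 bits as the mask. */
    static byte byCast(int word) {
        return (byte) word;
    }

    public static void main(String[] args) {
        int word = 0xAB0000CD;
        // prints "ab cd": the shift took the top byte, the mask took the bottom byte
        System.out.printf("%02x %02x%n", byShift(word) & 0xFF, byMask(word) & 0xFF);
    }
}
```

So swapping one style for the other changes the output stream, which matters if the goal is to stay comparable with a reference implementation.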

> Sean Luke: Have you measured the performance effects of having the
> MersenneTwisterFast implement some sensible interface? (Yeah, so
> java.util.Random should *really* be an interface, not a class)?
> If one is to do exact replication of some existing program written in a
> different environment it may be much easier to replace the PRNG in MASON
> than doing it in the original program. Being able to replace the
> generator with a quasi-random generator or something easily observed as
> deterministic would also simplify some testing and debugging.

Maybe some history is useful.

MersenneTwisterFast, and its sibling MersenneTwister, were created for
ECJ, my evolutionary computation package, in 1997, based on earlier code
by Michael Lecuyer (no relationship to Pierre L’Ecuyer I believe). I
first created the MersenneTwister class (you can get it in my ECJ
package -- I don't distribute it with MASON). MersenneTwister is a
drop-in subclass replacement for java.util.Random.

I then created a class called MersenneTwisterFast, which is just like
MersenneTwister except for three features which enabled it to be quite
significantly faster than MersenneTwister:

- its methods are all hard-inlined (making them huge)
- it is unsynchronized
- it is not a subclass of java.util.Random

Over the course of the next ten years, I found that I *never once used*
MersenneTwister; it existed really to maintain a simple version of the
code. So when I built MASON, I used MTF rather than MT, and haven't
really found a problem with that yet.

At this stage I'm hesitant to refactor all of MASON (and likely ECJ)
just to allow for experimentation with alternative RNGs. RNGs are at
the core of stochastic systems and it's where I am most conservative.
But here's an alternative approach for you: why not subclass MTF and
override its methods? It looks evil but it's not really: just override
every method with the small ones used in the original MersenneTwister
class code, and then write the next() method to the specification of
your own generator. You just pass in an instance of your subclass when
you create the MASON simulation rather than creating an MTF instance.
Believe it or not, I was considering your ilk :-) when I set up MASON so
that users could provide their own RNG instance. I figured someone
might want to subclass MTF.
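
A rough sketch of that subclassing approach follows. To keep it self-contained it does not use the real classes: BaseGenerator is a hypothetical stand-in for MersenneTwisterFast, Simulation is a hypothetical stand-in for a MASON simulation that accepts an RNG instance, and WeylGenerator is a deliberately simple, obviously deterministic replacement generator. The small method bodies follow the java.util.Random style of building everything from one next(bits) method.

```java
/** Hypothetical stand-in for MersenneTwisterFast: a concrete class, not an interface. */
class BaseGenerator {
    private long state;

    BaseGenerator(long seed) {
        state = seed;
    }

    public int nextInt() {
        // Placeholder 64-bit LCG step; the real class has its own large inlined methods.
        state = state * 6364136223846793005L + 1442695040888963407L;
        return (int) (state >>> 32);
    }

    public boolean nextBoolean() {
        return nextInt() < 0;
    }

    public double nextDouble() {
        return (nextInt() >>> 8) / (double) (1 << 24);
    }
}

/** Subclass that overrides every public method with a small one built on next(bits). */
class WeylGenerator extends BaseGenerator {
    private int state = 0;

    WeylGenerator() {
        super(0L); // the base class state is never used
    }

    /** The replacement generator: a Weyl sequence, returning its top `bits` bits. */
    protected int next(int bits) {
        state += 0x9E3779B9;
        return state >>> (32 - bits);
    }

    @Override public int nextInt() { return next(32); }

    @Override public boolean nextBoolean() { return next(1) != 0; }

    @Override public double nextDouble() {
        // Same construction as java.util.Random: 26 + 27 bits scaled into [0, 1).
        return (((long) next(26) << 27) + next(27)) * 0x1.0p-53;
    }
}

/** Hypothetical simulation that is handed its generator instead of creating one. */
class Simulation {
    final BaseGenerator random;

    Simulation(BaseGenerator random) {
        this.random = random;
    }

    int rollDie() {
        return 1 + Math.floorMod(random.nextInt(), 6);
    }
}

final class SubclassDemo {
    public static void main(String[] args) {
        Simulation sim = new Simulation(new WeylGenerator());
        System.out.println(sim.rollDie());
    }
}
```

The simulation code only ever sees the base type, so no refactoring is needed on its side; the cost is that every public method of the base class has to be overridden, or the unoverridden ones will quietly keep using the original generator.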

BTW: one disadvantage of using an interface rather than a direct class
is that the methods of the implementing class cannot be inlined. But of
course MTF's methods are big and couldn't get inlined anyway, so that's
kinda moot.

> Even though the state space of MT19937 is huge, we don't have any
> theoretical guarantees that two different seeds don't make us use two
> sequences that are (partially) overlapping.

??? To the contrary, I can guarantee you that you'll have a zillion of
them! Because 2^19937 - 1 (MT's period) is a much much much much much
bigger number than (2^32). In fact you'd expect some very long strings
repeating in that period, since 2^19937 > (2^32)^623.
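
The arithmetic can be checked directly with BigInteger. This is only a sanity check of the inequality quoted above, using the period 2^19937 - 1; the class and method names are made up for the example.

```java
import java.math.BigInteger;

final class PeriodArithmetic {
    static final BigInteger TWO = BigInteger.valueOf(2);

    /** MT19937's period, 2^19937 - 1. */
    static BigInteger period() {
        return TWO.pow(19937).subtract(BigInteger.ONE);
    }

    /** Number of distinct 32-bit seeds, 2^32. */
    static BigInteger seeds() {
        return TWO.pow(32);
    }

    public static void main(String[] args) {
        BigInteger bound = seeds().pow(623);   // (2^32)^623

        // (2^32)^623 is exactly 2^19936, one power of two below 2^19937.
        System.out.println(bound.equals(TWO.pow(19936)));

        // The period exceeds that bound.
        System.out.println(period().compareTo(bound) > 0);

        // The period needs 19937 bits to write down; a seed needs 32.
        System.out.println(period().bitLength());
    }
}
```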

In fact, if you want to guarantee that two generators, with long
periods, don't create overlapping sequences, that's another way of
saying that you do NOT want them to be random. So if this is a concern
to you then you may want to revise your experimental methodology.