Pelle Evensen wrote:


> Using the definition from  "Definition 1" of
> http://www.iro.umontreal.ca/~lecuyer/myftp/papers/cacm90.pdf (P. 
> L'Ecuyer, ``Random Numbers for Simulation'', Communications of the ACM, 
> 33 (1990), 85--98),
> I distinguish between the *state*, S, and the *seed*, s_0. 

Perhaps this is getting pedantic.  By "seed" I mean "the data with which 
you initialize the generator".  By "state" I mean "the internal state of 
the generator at any particular point."


> The "best of the best" is as usual depending on what you want to use it 
> for.

I meant that it is very highly regarded, as the paper you cite seems to 
suggest as well.


> A peculiarity about the MersenneTwisterFast class is that the 
> constructor taking a long and the setSeed(long)  (I guess for some 
> compatibility with java.util.Random) ignores the topmost 32 bits.

This was very much intentional: it's algorithmically identical to the 
standard Mersenne Twister algorithm (see 
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.c), 
and so it generates identical output for comparison.

You can always seed MTF's array directly.  Or make up your own 
constructor in a subclass.

BTW, java.util.Random only uses the bottom 48 bits.
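As a rough illustration of the 48-bit point, here is a minimal sketch using only java.util.Random from the standard library.  The class name SeedBitsDemo and the helper sameStream are made up for this example.  It compares the output of two generators whose seeds differ only above bit 47, and then two whose seeds differ in bit 0.  MTF itself is not used here, since it is not part of the standard library; per the quoted text, the analogous cutoff for its long constructor would be at bit 32.

```java
// Sketch: java.util.Random keeps only the low 48 bits of its seed,
// so two seeds that differ only above bit 47 give the same stream.
// SeedBitsDemo and sameStream are hypothetical names for this example.
class SeedBitsDemo {
    // Returns true if both seeds yield the same first `count` ints.
    static boolean sameStream(long seedA, long seedB, int count) {
        java.util.Random a = new java.util.Random(seedA);
        java.util.Random b = new java.util.Random(seedB);
        for (int i = 0; i < count; i++) {
            if (a.nextInt() != b.nextInt()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long low = 12345L;
        long highBitChanged = low | (1L << 48); // differs only in bit 48
        long lowBitChanged = low ^ 1L;          // differs in bit 0
        System.out.println("high bit changed, same stream? "
                + sameStream(low, highBitChanged, 1000));
        System.out.println("low bit changed, same stream?  "
                + sameStream(low, lowBitChanged, 1000));
    }
}
```

If you need all 64 bits of a long to matter, you have to fold the upper bits into the seed yourself before handing it to the generator.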


> Yet another peculiarity is that all conversions from int to the lower 
> precision types is made by shifting instead of bitmasking. Is this 
> faster for most CPU's? If there is an explicit conversion, the masking 
> will probably take place anyway. I don't know if the javac compiler or 
> JIT-compiler is clever enough to recognize that the msb:s are all zero.

That I can't say.  MTF is the way it is because it's based on earlier 
code which was in turn based on well-regarded code.  I am inclined to 
be conservative about changes.  Try modding it and tell me what you find.
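To make the shifting-versus-masking question concrete, here is a small sketch.  It assumes the conversions in question look like (byte)(y >>> 24), which is my reading of the quoted description rather than a copy of the MTF source.  NarrowingDemo, byShift and byMask are hypothetical names.

```java
// Sketch: two ways to narrow a 32-bit word to a byte.
// The names here are hypothetical and not taken from MTF.
class NarrowingDemo {
    // "Shifting" style: keep the 8 most significant bits.
    static byte byShift(int y) {
        return (byte) (y >>> 24);
    }

    // "Masking" style: keep the 8 least significant bits.
    // The cast alone already truncates to the low 8 bits in Java,
    // so the explicit mask only documents the intent.
    static byte byMask(int y) {
        return (byte) (y & 0xFF);
    }

    public static void main(String[] args) {
        int y = 0xCAFEBABE;
        System.out.printf("shift: %02x%n", byShift(y) & 0xFF); // ca
        System.out.printf("mask:  %02x%n", byMask(y) & 0xFF);  // be
    }
}
```

The two styles pick different bits of the same word, so swapping one for the other changes the output stream even if both are equally uniform.  Whether one is faster on a given JVM is something you would have to measure.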



> Sean Luke: Have you measured the performance effects of having the 
> MersenneTwisterFast implement some sensible interface? (Yeah, so 
> java.util.Random should *really* be an interface, not a class)?
>
> If one is to do exact replication of some existing program written in a 
> different environment it may be much easier to replace the PRNG in MASON 
> than doing it in the original program. Being able to replace the 
> generator with a quasi-random generator or something easily observed as 
> deterministic would also simplify some testing and debugging.


Maybe some history is useful.

MersenneTwisterFast, and its sibling MersenneTwister, were created for 
ECJ, my evolutionary computation package, in 1997, based on earlier code 
by Michael Lecuyer (no relationship to Pierre L’Ecuyer I believe).  I 
first created the MersenneTwister class (you can get it in my ECJ 
package -- I don't distribute it with MASON).  MersenneTwister is a 
drop-in subclass replacement for java.util.Random.

I then created a class called MersenneTwisterFast, which is just like 
MersenneTwister except for three features which enabled it to be quite 
significantly faster than MersenneTwister:

	- its methods are all hard-inlined (making them huge)
	- it is unsynchronized
	- it is not a subclass of java.util.Random

Over the course of the next ten years, I found that I *never once used* 
MersenneTwister; it existed really to maintain a simple version of the 
code.  So when I built MASON, I used MTF rather than MT, and haven't 
really found a problem with that yet.

At this stage I'm hesitant to refactor all of MASON (and likely ECJ) 
just to allow for experimentation with alternative RNGs.  RNGs are at 
the core of stochastic systems and it's where I am most conservative. 
But here's an alternative approach for you: why not subclass MTF and 
override its methods?  It looks evil but it's not really: just override 
every method with the small ones used in the original MersenneTwister 
class code, and then write the next() method to the specification of 
your own generator.  You just pass in an instance of your subclass when 
you create the MASON simulation rather than creating a MT instance. 
Believe it or not, I was considering your ilk :-) when I set up MASON so 
that users could provide their own RNG instance.  I figured someone 
might want to subclass MTF.
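The subclassing approach might look roughly like the sketch below.  Everything in it is an assumption made for illustration: StandInGenerator is a hypothetical stand-in for MTF with only three methods, and CounterGenerator is a deliberately trivial generator that just counts upward so its output is easy to recognize as deterministic.  A real subclass of MTF would need to override every public nextXxx method in the same way.

```java
// Hypothetical stand-in for MersenneTwisterFast.  Only the idea of
// "big public methods with no shared next()" is borrowed from it.
class StandInGenerator {
    public int nextInt() { return 4; }            // placeholder body
    public boolean nextBoolean() { return true; } // placeholder body
    public double nextDouble() { return 0.5; }    // placeholder body
}

// A replacement generator: every public method is rerouted through
// one small next(bits) method, in the style of java.util.Random.
class CounterGenerator extends StandInGenerator {
    private int counter = 0;

    // The single place where "randomness" is produced.  This toy
    // version returns 0, 1, 2, ... truncated to the requested bits.
    protected int next(int bits) {
        int value = counter++;
        // (1 << 32) wraps to 1 in Java, so treat 32 bits specially.
        return bits == 32 ? value : value & ((1 << bits) - 1);
    }

    @Override public int nextInt() {
        return next(32);
    }

    @Override public boolean nextBoolean() {
        return next(1) != 0;
    }

    @Override public double nextDouble() {
        // Same 26 + 27 bit construction that java.util.Random uses.
        return (((long) next(26) << 27) + next(27)) / (double) (1L << 53);
    }
}
```

Code that holds a reference of the base type then calls the overridden methods without knowing the difference:

```java
class CounterGeneratorDemo {
    public static void main(String[] args) {
        StandInGenerator random = new CounterGenerator();
        System.out.println(random.nextInt()); // 0
        System.out.println(random.nextInt()); // 1
    }
}
```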

BTW: one disadvantage of using an interface rather than a direct class 
is that the methods of the extending subclass cannot be inlined.  But of 
course MTF's methods are big and couldn't get inlined anyway, so that's 
kinda moot.



> Even though the state space of MT19937 is huge, we don't have any 
> theoretical guarantees that two different seeds don't make us use two 
> sequences that are (partially) overlapping.

??? To the contrary, I can guarantee you that you'll have a zillion of 
them!  Because 2^19937 - 1 (MT's period) is a much much much much much 
bigger number than 2^32.  In fact you'd expect some very long strings 
repeating in that period, since 2^19937 - 1 > (2^32)^623.
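The sizes involved are easy to check with BigInteger.  This is only a sketch of the arithmetic; the class and constant names are made up for the example, and it takes MT19937's period to be 2^19937 - 1.

```java
// Sketch: compare MT19937's period with (2^32)^623 using exact
// integer arithmetic.  All names here are hypothetical.
class PeriodArithmetic {
    static java.math.BigInteger pow2(int exponent) {
        return java.math.BigInteger.ONE.shiftLeft(exponent);
    }

    // Period of MT19937: 2^19937 - 1.
    static final java.math.BigInteger PERIOD =
            pow2(19937).subtract(java.math.BigInteger.ONE);

    // Number of distinct 32-bit seeds: 2^32.
    static final java.math.BigInteger SEEDS_32 = pow2(32);

    // Number of distinct runs of 623 consecutive 32-bit outputs:
    // (2^32)^623 = 2^19936.
    static final java.math.BigInteger BLOCK = SEEDS_32.pow(623);

    public static void main(String[] args) {
        System.out.println("period bits: " + PERIOD.bitLength());
        System.out.println("block bits:  " + BLOCK.bitLength());
        System.out.println("period > block? " + (PERIOD.compareTo(BLOCK) > 0));
    }
}
```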

In fact, if you want to guarantee that two generators, with long 
periods, don't create overlapping sequences, that's another way of 
saying that you do NOT want them to be random.  So if this is a concern 
to you then you may want to revise your experimental methodology.

Sean