ECJ-INTEREST-L Archives

January 2008

ECJ-INTEREST-L@LISTSERV.GMU.EDU

Options: Use Monospaced Font
Show Text Part by Default
Show All Mail Headers

Message: [<< First] [< Prev] [Next >] [Last >>]
Topic: [<< First] [< Prev] [Next >] [Last >>]
Author: [<< First] [< Prev] [Next >] [Last >>]

Print Reply
Subject:
From:
Michael Wilson <[log in to unmask]>
Reply To:
ECJ Evolutionary Computation Toolkit <[log in to unmask]>
Date:
Thu, 17 Jan 2008 10:13:22 +0000
Content-Type:
text/plain
Parts/Attachments:
text/plain (78 lines)
Loretta Macklem wrote:
> I am working with ECJ and I have made a shell script to do 30 runs in a
> row. I am running into a seeding issue. It is always using the same
> seed. Is there a way to set the seed to use a random number?

The ec.util.MersenneTwister class has some deficiencies that have caused
us to develop an improved version for our project. One of these is the
default seed method, which is just;

public MersenneTwister() {
   this(System.currentTimeMillis());
}

If you create several instances in quick succession you get identical
RNGs, and System.currentTimeMillis() doesn't have much entropy anyway.
Here's the default seed method in our version;

   private final static long ISEED_MULT = 0x5DEECE66DL;
   private final static long ISEED_INC = 0xBL;
   private final static long ISEED_MASK = (1L<<48)-1L;

   private static volatile long iseed = 0xE5524F178035A3C6L;

   private long defaultSeed() {

     final long xseed = ((iseed*ISEED_MULT)+ISEED_INC)&ISEED_MASK;
     final long nanos = System.nanoTime(); iseed = xseed;
     return (nanos>>>32)^(nanos<<32)^System.currentTimeMillis()^
            xseed^(((long)super.hashCode())<<32)^
            Runtime.getRuntime().freeMemory();

   }

It combines entropy from the current real time, the number of clock
ticks since last startup, the new object's memory address (via
super.hashCode()), the amount of free memory in the JVM and the
output of an embedded static RNG which uses the same algorithm as
java.util.Random. This produces a reasonable amount of entropy
for experimental purposes (we have a couple of 'high quality' seeding
methods too but they take longer to run). Feel free to replace the
default seeding algorithm in MersenneTwister and/or MersenneTwisterFast
with this one. Unfortunately the MersenneTwister setSeed(long seed)
method only uses the low 32 bits of the seed supplied, so for use in
the current ECJ class this version will give slightly better results;

   private final static long ISEED_MULT = 0x5DEECE66DL;
   private final static long ISEED_INC = 0xBL;
   private final static long ISEED_MASK = (1L<<48)-1L;

   private static volatile long iseed = 0xE5524F178035A3C6L;

   /**
    * Constructor using an improved default seed.
    */
   public MersenneTwister() {

     final long xseed = ((iseed*ISEED_MULT)+ISEED_INC)&ISEED_MASK;
     iseed = xseed;
     this(System.nanoTime()^System.currentTimeMillis()^xseed^
          super.hashCode()^Runtime.getRuntime().freeMemory());

   }

Access to iseed isn't atomic or synchronized, but that's ok because
super.hashCode() and System.nanoTime() still pretty much guarantee
uniqueness even in the unlikely event of overlapping updates.

We've made various other improvements in performance and statistical
quality over the current ECJ RNG classes (plus we've added lots of
extra utility methods). If anyone is interested I'll try to find the
time to do a proper performance comparison and an open-source release
of the code.

Michael Wilson
Systems Engineer
Purple Frog Text Ltd
+44 (0)114 2015707

ATOM RSS1 RSS2