ECJ-INTEREST-L@LISTSERV.GMU.EDU


Subject: Re: random seeding
From:
Date: Mon, 21 Jan 2008 16:19:10 -0500
Content-Type: text/plain

On Jan 21, 2008, at 6:05 AM, Michael Wilson wrote:

> By all means, please independently replicate my tests.

I did. And I am getting totally different results. What system did you use to compute this?

I downloaded 'ent', a randomness tester from Fourmilab (http://www.fourmilab.ch/random/), and ran it on ec.util.MersenneTwisterFast's output (ec.util.MersenneTwister generated an identical file, of course -- though I double-checked anyway). You didn't specify the seed, so I tried a few.

Here's MersenneTwister seeded with 1000, starting cold:

Entropy = 7.999983 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 230.74, and randomly would exceed this value 75.00 percent of the times.
Arithmetic mean value of data bytes is 127.5000 (127.5 = random).
Monte Carlo value for Pi is 3.141094856 (error 0.02 percent).
Serial correlation coefficient is -0.000300 (totally uncorrelated = 0.0).

Here's MersenneTwister seeded with 1:

Entropy = 7.999981 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 265.11, and randomly would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.5013 (127.5 = random).
Monte Carlo value for Pi is 3.140317256 (error 0.04 percent).
Serial correlation coefficient is 0.000126 (totally uncorrelated = 0.0).

Here's MT seeded with the current time, six tests:

1.
Entropy = 7.999981 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 266.75, and randomly would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.5005 (127.5 = random).
Monte Carlo value for Pi is 3.141267657 (error 0.01 percent).
Serial correlation coefficient is -0.000034 (totally uncorrelated = 0.0).

2.
Entropy = 7.999982 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 255.09, and randomly would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.4979 (127.5 = random).
Monte Carlo value for Pi is 3.143893258 (error 0.07 percent).
Serial correlation coefficient is -0.000742 (totally uncorrelated = 0.0).

3.
Entropy = 7.999981 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 256.70, and randomly would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.4844 (127.5 = random).
Monte Carlo value for Pi is 3.141654057 (error 0.00 percent).
Serial correlation coefficient is 0.000353 (totally uncorrelated = 0.0).

4.
Entropy = 7.999984 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 217.05, and randomly would exceed this value 95.00 percent of the times.
Arithmetic mean value of data bytes is 127.5044 (127.5 = random).
Monte Carlo value for Pi is 3.141740457 (error 0.00 percent).
Serial correlation coefficient is -0.000518 (totally uncorrelated = 0.0).

5.
Entropy = 7.999982 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 252.58, and randomly would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.4936 (127.5 = random).
Monte Carlo value for Pi is 3.142441257 (error 0.03 percent).
Serial correlation coefficient is 0.000094 (totally uncorrelated = 0.0).

6.
Entropy = 7.999983 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 236.90, and randomly would exceed this value 75.00 percent of the times.
Arithmetic mean value of data bytes is 127.4783 (127.5 = random).
Monte Carlo value for Pi is 3.141865257 (error 0.01 percent).
Serial correlation coefficient is 0.000283 (totally uncorrelated = 0.0).

So generally we get chi-square "exceed" percentages of 50 (extremely good), occasionally around 75, and one hovering at 95. Overall, very good according to Fourmilab.

You indicated that MT was doing worse than java.util.Random. That's surprising to me, because my understanding is that java.util.Random is rather well known to have poor randomness qualities -- see for example http://alife.co.uk/nonrandom/ for an astonishing result. So here are the results for java.util.Random seeded with 1000:

Entropy = 7.999992 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 107.78, and randomly would exceed this value 99.99 percent of the times.
Arithmetic mean value of data bytes is 127.5021 (127.5 = random).
Monte Carlo value for Pi is 3.140214056 (error 0.04 percent).
Serial correlation coefficient is -0.000008 (totally uncorrelated = 0.0).

Eesh, bad bad chi-square. Here are some java.util.Random results with the current time:

1.
Entropy = 7.999993 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 100.67, and randomly would exceed this value 99.99 percent of the times.
Arithmetic mean value of data bytes is 127.5045 (127.5 = random).
Monte Carlo value for Pi is 3.141442857 (error 0.00 percent).
Serial correlation coefficient is -0.000018 (totally uncorrelated = 0.0).

2.
Entropy = 7.999993 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 97.79, and randomly would exceed this value 99.99 percent of the times.
Arithmetic mean value of data bytes is 127.5001 (127.5 = random).
Monte Carlo value for Pi is 3.141452457 (error 0.00 percent).
Serial correlation coefficient is 0.000059 (totally uncorrelated = 0.0).

3.
Entropy = 7.999993 bits per byte.
Optimum compression would reduce the size of this 10000000 byte file by 0 percent.
Chi square distribution for 10000000 samples is 98.80, and randomly would exceed this value 99.99 percent of the times.
Arithmetic mean value of data bytes is 127.4982 (127.5 = random).
Monte Carlo value for Pi is 3.141766857 (error 0.01 percent).
Serial correlation coefficient is -0.000085 (totally uncorrelated = 0.0).

So... not a fluke.

> It would appear that in this case, the chickens were very much alive.
> Incidentally, when you reported a 'bug' in java.util.Random a few years
> back, regarding 'dimensional stability' in reusing an RNG vector element
> three times to generate three bytes, did you run statistical tests or
> did you rely on some combination of intuition and poultry of uncertain
> freshness? :)

I think way back then I relied on a quote from NRC. They were respectable back then. And I think they're probably still correct right now, given java.util.Random's poor performance.

Sean
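[Editor's note: for anyone wanting to replicate the chi-square part of these runs without installing ent, here is a minimal sketch of the byte-level chi-square statistic that ent reports, computed over a buffer of generated bytes. The class name EntChiSquare and the use of java.util.Random are this sketch's own choices, not Sean's setup; ECJ's ec.util.MersenneTwisterFast exposes the same interface, so it can be swapped in directly. Because the exact byte-generation scheme used for the files above is not specified, the numbers this prints will not match the reported runs exactly.]

```java
import java.util.Random;

public class EntChiSquare {
    // Chi-square statistic of the byte histogram against a uniform
    // distribution over 256 values (255 degrees of freedom), which is
    // the quantity ent reports as "Chi square distribution".
    public static double chiSquare(byte[] data) {
        long[] counts = new long[256];
        for (byte b : data) counts[b & 0xFF]++;
        double expected = data.length / 256.0;  // expected count per bin
        double chi = 0.0;
        for (long c : counts) {
            double d = c - expected;
            chi += d * d / expected;
        }
        return chi;
    }

    public static void main(String[] args) {
        // 10,000,000 bytes, matching the file size in the runs above.
        byte[] buf = new byte[10_000_000];
        new Random(1000L).nextBytes(buf);  // or: new MersenneTwisterFast(1000L)
        System.out.printf("Chi square distribution for %d samples is %.2f%n",
                          buf.length, chiSquare(buf));
    }
}
```

With 255 degrees of freedom, a statistic near 255 is what a truly random stream should produce; values far below (like java.util.Random's ~100 above) are "too uniform" and get flagged by ent just as values far above would be.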