Sean Luke wrote:
> On May 5, 2007, at 12:36 AM, Matthew Walker wrote:
>
>> I'm *very* new to ECJ, so my apologies if I have done something 
>> stupid.  I would, however, very much appreciate someone telling me 
>> where I've gone wrong!
>>
>> I have run the parity examples provided with ECJ and I'm confused 
>> with the results.  I have tried to reproduce some of the experiments 
>> in Koza's second book.  My expectation was that I could use the 
>> parity examples provided with ECJ to get similar results to chapter 6 
>> of GPII which generally concludes that the use of ADFs is a good 
>> thing for this domain.
>>
>> ...
>>
>> So, to summarize, Standard GP scored 32.4% while GP with ADFs scored 
>> 0.6%.  From Koza's second book on GP (page 181), this was not what I 
>> expected to get.  I expected GP with ADFs to outperform standard GP 
>> on this problem domain.  I sat around scratching my head trying to 
>> work out what I had done wrong, however nothing but hair came out ;o)
>
>
> Hi Matt.  This is the first time that this bug [if it is one] has been 
> reported and no, you're not necessarily doing something stupid.  I'm 
> not sure what the problem is, but there are several possibilities:
>
>     - A bug in the ADF code (possible)
>     - A bug in the parity problem example (less likely but possible)
>     - Errors in Koza's text
>
> There have been some significant errors in Koza's text on certain 
> problem domains, so it's a definite possibility.  What we need is a 
> third implementation to verify if it's ECJ doing this or not.  lil-gp 
> anyone?  Or maybe open beagle?
>
Ahh.  Well... actually... I've come from OpenBeagle.  My initial efforts 
were with that system and I've tried quite hard to get it to produce 
performance that's similar to published results. (See 
http://sourceforge.net/mailarchive/forum.php?thread_name=46358035.1090806%40massey.ac.nz&forum_name=beagle-developers). 


Because I was fairly unsuccessful, I came up with the same list you did: 
either it was a subtle bug in Beagle, or an issue with Koza's text.  I 
did not consider the parity code to be a possibility, though perhaps I 
should have.  So my next step was to try a "third implementation", and 
ECJ was my choice.  I was surprised to get results with ECJ that did not 
follow Koza's.

Given that I had now failed on two software systems, I went looking for 
independent replications of Koza's work.  First, there is David 
Jackson's paper at EuroGP last month.  On the even-4-parity problem with 
a population of 500, he achieved 14% success without ADFs and 43% 
success with ADFs (100 runs each).  This result is statistically 
significant.  In 1998, Naemura, Hashiyama and Okuma also worked on 
even-4-parity, but with a population size of 200.  They found no 
solutions after 300 generations without ADFs but had about 55% success 
with ADFs (30 runs each).  I don't yet have any other examples, but even 
from those two I think it's fair to say Koza's results have been 
reproduced.  Jackson rolled his own system (which is not publicly 
available), and I don't know what software the Japanese group used.
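
As a rough check of the significance claim for Jackson's figures, a 
pooled two-proportion z-test can be sketched as below.  This is only an 
illustration: the class and method names are made up for this example, 
and it assumes the reported percentages correspond to exactly 14 and 43 
successful runs out of 100.

```java
public class ProportionTest {
    // Pooled two-proportion z statistic for s1 successes in n1 runs
    // versus s2 successes in n2 runs (hypothetical helper, not part of ECJ).
    static double zStatistic(int s1, int n1, int s2, int n2) {
        double p1 = (double) s1 / n1;
        double p2 = (double) s2 / n2;
        // Success rate under the null hypothesis that both setups are equal.
        double pooled = (double) (s1 + s2) / (n1 + n2);
        double stdErr = Math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2));
        return (p2 - p1) / stdErr;
    }

    public static void main(String[] args) {
        // Assumed counts: 14/100 without ADFs, 43/100 with ADFs.
        double z = zStatistic(14, 100, 43, 100);
        System.out.printf("z = %.2f%n", z);
        // 1.96 is the two-sided 5% critical value of the standard normal.
        System.out.println(Math.abs(z) > 1.96
                ? "difference is significant at the 5% level"
                : "difference is not significant at the 5% level");
    }
}
```

With these counts the statistic comes out well above 1.96, which 
supports the claim.  For small run counts such as 30 per setup, an exact 
test would be a safer choice than this normal approximation.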

Since then I've looked at the lawnmower problem that's included with 
ECJ.  For this problem I obtained results in the same vein as Koza's.  
With ADFs, 500 of 500 runs found a solution by generation 4, whereas 
without ADFs it took 23 generations to reach the same level of success 
(over 241 runs).  In fact, the first solution without ADFs didn't turn 
up until generation 12, so the difference is clear-cut.  I thought the 
problem defaulted to a 64-square lawn (as per 
http://cs.gmu.edu/~eclab/projects/ecj/docs/), but my results are more 
consistent with Koza's results on a 32-square lawn.  The point, however, 
is that ECJ's ADF code seemed to work fine on this problem domain.

I've obtained the Lisp code that Koza published in his second book.  So 
far I've executed a few runs of even-5-parity with a population size of 
16,000 and the results seem in line with the graphs in GPII.  I plan to 
let the system go overnight to obtain a few more runs.

Given all this, I'm left with the feeling that there could be some minor 
parameter that is having a major effect on the results.  Do you have any 
other possible explanations?  Any recommendations on what I might do next?

> As to memory: 16,000 is a big number for ECJ, which is fairly memory 
> hungry.  What -Xmx and -Xms settings did you set on your VM, however?
>
I don't know.  I ran the command

    java ec.Evolve -file ../../../../ec/app/parity/parity.params -p eval.problem.even=true &> console.txt

From "top" it looks like the machine has 2GB of memory.  I've just been 
told that Java defaults to using only 25% of the machine's RAM for the 
heap, so that was probably my problem.
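
One way to see what heap limit a VM is actually running with is to ask 
the runtime directly.  The sketch below is a standalone illustration, 
not part of ECJ; the class name HeapCheck is made up, and the only 
library call it relies on is the standard Runtime.maxMemory().

```java
public class HeapCheck {
    // Maximum heap this JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

Running it once with no options and once with an explicit setting (for 
example "java -Xmx1g HeapCheck", where 1g is just an illustrative value) 
shows whether the default is the limiting factor before committing to a 
long run with a population of 16,000.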


Thank you again for your help.

Regards,

Matthew