ECJ-INTEREST-L Archives

February 2006

ECJ-INTEREST-L@LISTSERV.GMU.EDU

Options: Use Monospaced Font
Show HTML Part by Default
Condense Mail Headers

Message: [<< First] [< Prev] [Next >] [Last >>]
Topic: [<< First] [< Prev] [Next >] [Last >>]
Author: [<< First] [< Prev] [Next >] [Last >>]

Print Reply
Sender:
ECJ Evolutionary Computation Toolkit <[log in to unmask]>
Date:
Wed, 1 Feb 2006 11:34:06 -0500
MIME-version:
1.0 (Apple Message framework v746.2)
Reply-To:
ECJ Evolutionary Computation Toolkit <[log in to unmask]>
Content-type:
text/plain; charset=US-ASCII; delsp=yes; format=flowed
Subject:
From:
Sean Luke <[log in to unmask]>
In-Reply-To:
Content-transfer-encoding:
7bit
Comments:
To: ECJ Evolutionary Computation Toolkit <[log in to unmask]>
Parts/Attachments:
text/plain (125 lines)
Thanks, Chris, for reposting in ECJ-INTEREST for everyone to see.

ECJ's biggest performance hindrance is indeed its breeding procedure,  
mostly because we allocate lots of GPNodes and GPNode[] arrays.   
Though I'm surprised at your results, because I list below what I get  
with java -Xprof ec.Evolve -file noerc.params

Note that 38% of the time is spent in GC; but allocation isn't that  
bad at all, at least with Xprof.  new array millicode is only 1.2%   
The big kahuna in allocation is clone(), which clocks in at 21.7% (up  
from 11.4% in Java 1.3.1).  This *should* basically boil down to a  
pointer increment and a memcpy.  (As opposed to new object(), which  
is a pointer increment and a bzero).  What I think is happening is  
that the cost of other elements are being optimized out in successive  
Java versions, so allocation is becoming more costly relative to them.

But profilers are notorious for giving misleading info, and I more  
than believe that new array allocation is expensive.  Nice to see  
that the aggressive heap option

That being said, this _is_ the worst-case scenario for ECJ: symbolic  
regression combines a complex breeding procedure with a fast  
evaluation.  We tend to use ECJ on much longer evaluations than  
this.  But I'm always interested in how we might be able to speed  
ECJ's breeding up in the code.  Ideas?

One I have is to eliminate protoCloneSimple and move everything to  
just protoClone.  But that's probably not going to do anything much  
because one or the other is likely optimized out anyway.

Sean


Flat profile of 28.16 secs (1446 total ticks): main

   Interpreted + native   Method
   1.8%     0  +    26    java.lang.ClassLoader.findBootstrapClass
   1.7%     0  +    24    java.lang.ClassLoader.defineClass0
   0.7%     0  +    10    java.lang.Class.forName0
   0.6%     0  +     8    java.util.zip.ZipFile.open
   0.5%     0  +     7    java.io.FileOutputStream.writeBytes
   0.2%     0  +     3    sun.misc.JarIndex.read
   0.2%     0  +     3    ec.Evolve.initialize
   0.1%     0  +     2    java.util.zip.ZipFile.initIDs
   0.1%     0  +     2    java.util.zip.ZipEntry.initIDs
   0.1%     0  +     2    java.util.zip.Inflater.inflateBytes
   0.1%     2  +     0    java.net.URLClassLoader.findClass
   0.1%     2  +     0    ec.gp.GPTree.deepClone
   0.1%     2  +     0    java.nio.Buffer.<init>
   0.1%     0  +     2    ec.Evolve.loadParameterDatabase
   0.1%     2  +     0    ec.util.RandomChoice.pickFromDistribution
   0.1%     0  +     2    ec.gp.koza.CrossoverPipeline.produce
   0.1%     1  +     0    sun.nio.cs.SingleByteEncoder.encodeLoop
   0.1%     0  +     1    ec.gp.GPIndividual.deepClone
   0.1%     1  +     0    java.lang.Package.isSealed
   0.1%     0  +     1    java.lang.Float.valueOf
   0.1%     0  +     1    ec.gp.GPInitializer.setupNodeConstraints
   0.1%     1  +     0    ec.gp.GPNode.depth
   0.1%     0  +     1    ec.simple.SimpleEvaluator.evaluatePopulation
   0.1%     1  +     0    java.lang.String.<init>
   0.1%     0  +     1    java.io.OutputStreamWriter.flush
10.7%    28  +   127    Total interpreted (including elided)

      Compiled + native   Method
   6.9%   100  +     0    ec.app.regression.func.Mul.eval
   5.7%    83  +     0    ec.app.regression.func.Sin.eval
   5.6%    81  +     0    vtable chunks
   4.8%    70  +     0    ec.gp.GPNode.numNodes
   4.5%    65  +     0    ec.app.regression.func.X.eval
   2.6%    38  +     0    ec.gp.GPNode.cloneReplacing
   2.6%    37  +     0    ec.app.regression.func.Add.eval
   2.1%    31  +     0    ec.app.regression.func.Sub.eval
   1.4%    20  +     0    ec.app.regression.Regression.evaluate
   1.3%    19  +     0    ec.app.regression.func.Exp.eval
   1.1%    16  +     0    ec.gp.GPNode.cloneReplacing
   0.8%    12  +     0    ec.gp.GPNode.nodeInPosition
   0.6%     8  +     0    ec.select.TournamentSelection.produce
   0.4%     6  +     0    ec.app.regression.func.Div.eval
   0.3%     5  +     0    ec.util.MersenneTwisterFast.nextInt
   0.3%     4  +     0    ec.gp.koza.CrossoverPipeline.produce
   0.2%     3  +     0    ec.gp.GPNode.swapCompatibleWith
   0.1%     2  +     0    java.io.PrintWriter.flush
   0.1%     2  +     0    ec.app.regression.func.Cos.eval
   0.1%     1  +     1    ec.app.regression.func.Log.eval
   0.1%     1  +     0    sun.nio.cs.StreamEncoder.flush
   0.1%     1  +     0    ec.gp.GPNode.atDepth
   0.1%     1  +     0     
ec.gp.koza.KozaStatistics.postEvaluationStatistics
   0.1%     1  +     0    java.nio.Buffer.position
   0.1%     0  +     1    ec.gp.GPNode.rootedTreeHashCode
42.7%   615  +     2    Total compiled (including elided)

          Stub + native   Method
21.7%    26  +   288    java.lang.Object.clone
   3.6%     0  +    52    java.io.FileOutputStream.writeBytes
   3.5%    17  +    34    java.lang.StrictMath.exp
   0.1%     0  +     2    java.lang.StrictMath.log
29.0%    43  +   376    Total stub

   Runtime stub + native  Method
   1.2%    18  +     0    new_object_array_millicode Runtime1 stub
   0.5%     7  +     0    slow_subtype_check Runtime1 stub
   0.1%     1  +     0    array_copy_2 Runtime1 stub
   1.8%    26  +     0    Total runtime stubs

   Thread-local ticks:
   0.1%     2             Blocked (of total)
   0.3%     5             Class loader
   0.1%     1             Interpreter
   2.5%    36             Compilation
12.0%   174             Unknown: running frame
   0.2%     3             Unknown: no last frame
   0.6%     8             Unknown: thread_state


Global summary of 28.16 seconds:
100.0%  2537             Received ticks
42.9%  1089             Received GC ticks
   1.9%    47             Compilation
   0.1%     2             Other VM operations
   0.2%     5             Class loader
   0.0%     1             Interpreter
   7.3%   185             Unknown code
geepers[~/ecj/ec/app/regression]>

ATOM RSS1 RSS2