Yes, it seems to be due to item (2): memory access with a poor memory
controller. Timings vary widely between identical multi-threaded runs
on my machine: evaluation can take anywhere from 19 s to 33 s, while it
stays fairly constant at around 30 s when using only one thread. Thus,
running two threads on a Core2 CPU 6600 @ 2.40GHz can even be slower
than using one thread...

Denis

Sean Luke wrote:
> There are no dependencies between breedthreads and evalthreads. It's
> actually quite simple if you're doing plain generational evolution,
> it's roughly:
>
> Eval:
>     For each thread
>         Create a Problem for that thread
>         Fork a thread to do PopSize/NumThreads evals
> Breed:
>     For each thread
>         Create a BreedingPipeline for that thread
>         Fork a thread to do PopSize/NumThreads new individuals
>
> There's no locking in ECJ's basic eval or breed facilities (which is
> why we need multiple RNGs). Most performance failures in the
> threading are due to (1) GC and (2) memory access with a poor memory
> controller, if not (3) a synchronization you put in there but forgot
> about :-). My guess is #2 -- it bites us in MASON too. Basically
> although the cores can go full-blast, if you're doing lots of fetches
> from cold memory (as ECJ is doing constantly -- it does scans across
> populations), there's only *one* memory and cache controller on the
> machine and that becomes the bottleneck.
>
> That being said, ECJ will get about 40% improvement on a two-core
> Intel chip. For example, when I run ecsuite with 1000 individuals,
> here are some rough wall-clock times I get on my Macbook Pro:
>
>     1 breed 1 eval    24 secs
>     2 breed 1 eval    21 secs
>     1 breed 2 eval    19 secs
>     2 breed 2 eval    17 secs
>
> Note that eval gives you a bigger boost than breed in this example.
>
> You might try fooling with the GC parameters (-Xmx and -Xms for
> setting, -verbose:gc for testing), though it probably won't be a big
> deal for you.
>
> Sean
>
> On Jun 2, 2008, at 10:16 AM, Denis Robilliard wrote:
>
>> Hi,
>>
>> I just ran some experiments with the "breedthreads" and
>> "evalthreads" parameters on the tutorial regression problem. On a
>> dual core machine, I observed that performance improves
>> (computing time roughly divided by 2) when breedthreads = 2 and
>> evalthreads = 1, but there is no gain when breedthreads = 1 and
>> evalthreads = 2. However, the stat file shows that most of the running
>> time is spent on evaluation (as expected). Are there any
>> dependencies between the breedthreads and evalthreads values?
>>
>> --
>> Denis Robilliard
>> L.I.L.
>> Université du Littoral
>> 50 rue F. Buisson
>> 62100 Calais
>> France
>

-- 
Denis Robilliard
L.I.L.
Université du Littoral
50 rue F. Buisson
62100 Calais
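
For reference, the Eval scheme outlined in the quoted message (one worker
per thread, each handling roughly PopSize/NumThreads individuals, with no
locking) could be sketched in plain Java roughly as below. This is only an
illustration under stated assumptions: ChunkedEval, evaluate and
evaluateAll are hypothetical names, the fitness function is a toy, and
none of this is ECJ's actual code or API.

```java
public class ChunkedEval {
    // Hypothetical toy fitness (not ECJ's): negated square of the genome.
    // It is deterministic; a stochastic evaluation would need one RNG per thread.
    static double evaluate(double genome) {
        return -(genome * genome);
    }

    // Split the population into numThreads contiguous chunks and evaluate
    // each chunk in its own thread. Every worker writes only to its own
    // slice of the fitness array, so no locking is needed.
    static double[] evaluateAll(double[] population, int numThreads) {
        final double[] fitness = new double[population.length];
        Thread[] workers = new Thread[numThreads];
        int chunk = (population.length + numThreads - 1) / numThreads; // ceiling division
        for (int t = 0; t < numThreads; t++) {
            final int from = Math.min(t * chunk, population.length);
            final int to = Math.min(from + chunk, population.length);
            workers[t] = new Thread(() -> {
                for (int i = from; i < to; i++) {
                    fitness[i] = evaluate(population[i]);
                }
            });
            workers[t].start();
        }
        try {
            // join() guarantees the workers' writes are visible to the caller.
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return fitness;
    }

    public static void main(String[] args) {
        double[] pop = new double[1000];
        for (int i = 0; i < pop.length; i++) {
            pop[i] = i / 1000.0;
        }
        // Rough wall-clock comparison of one versus two evaluation threads.
        for (int threads = 1; threads <= 2; threads++) {
            long start = System.nanoTime();
            evaluateAll(pop, threads);
            long micros = (System.nanoTime() - start) / 1000;
            System.out.println(threads + " thread(s): " + micros + " us");
        }
    }
}
```

With a fitness function this cheap, thread start-up cost will likely dominate,
so the timings printed by main only show how one might measure the effect; they
say nothing about ECJ itself.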