LISTSERV - ECJ-INTEREST-L Archives

June 2008

ECJ-INTEREST-L@LISTSERV.GMU.EDU

Sender:	ECJ Evolutionary Computation Toolkit <[log in to unmask]>
Date:	Tue, 3 Jun 2008 15:39:11 +0200
Reply-To:	ECJ Evolutionary Computation Toolkit <[log in to unmask]>
Content-Transfer-Encoding:	8bit
Subject:	Re: question about multi-threading performance
From:	Denis Robilliard <[log in to unmask]>
Content-Type:	text/plain; charset=ISO-8859-1; format=flowed
In-Reply-To:	<[log in to unmask]>
MIME-Version:	1.0
Comments:	To: ECJ Evolutionary Computation Toolkit <[log in to unmask]>
Parts/Attachments:	text/plain (79 lines)

Yes, it seems to be due to item (2): memory access with a poor memory
controller. Timings vary widely between identical multi-threaded runs
on my machine: evaluation can take anywhere from 19 s to 33 s, while
it stays fairly constant at around 30 s when using only one thread.
So running two threads on a Core2 CPU 6600 @ 2.40GHz can even be
slower than running one...

Denis

Sean Luke wrote:
> There are no dependencies between breedthreads and evalthreads.  It's 
> actually quite simple if you're doing plain generational evolution, 
> it's roughly:
>
> Eval:
>     For each thread
>         Create a Problem for that thread
>         Fork a thread to do PopSize/NumThreads evals
> Breed:
>     For each thread
>         Create a BreedingPipeline for that thread
>         Fork a thread to breed PopSize/NumThreads new individuals
>
> There's no locking in ECJ's basic eval or breed facilities (which is 
> why we need multiple RNGs).  Most performance failures in the 
> threading are due to (1) GC and (2) memory access with a poor memory 
> controller, if not (3) a synchronization you put in there but forgot 
> about :-).  My guess is #2 -- it bites us in MASON too.  Basically 
> although the cores can go full-blast, if you're doing lots of fetches 
> from cold memory (as ECJ is doing constantly -- it does scans across 
> populations), there's only *one* memory and cache controller on the 
> machine and that becomes the bottleneck.
>
> That being said, ECJ will get about a 40% improvement on a two-core
> Intel chip. For example, when I run ecsuite with 1000 individuals,
> here are some rough wall-clock times I get on my MacBook Pro:
>
>     1 breed   1 eval    24 secs
>     2 breed   1 eval    21 secs
>     1 breed   2 eval    19 secs
>     2 breed   2 eval    17 secs
>
> Note that eval gives you a bigger boost than breed in this example.
>
> You might try fooling with the GC parameters (-Xmx and -Xms for 
> setting, -verbose:gc for testing), though it probably won't be a big 
> deal for you.
>
> Sean
>
> On Jun 2, 2008, at 10:16 AM, Denis Robilliard wrote:
>
>> Hi,
>>
>> I just ran some experiments with the "breedthreads" and
>> "evalthreads" parameters on the tutorial regression problem. On a
>> dual-core machine, I observed that performance improves (computing
>> time roughly halved) with breedthreads=2 and evalthreads=1, but
>> there is no gain with breedthreads=1 and evalthreads=2. However, the
>> stat file shows that most of the running time is spent on
>> evaluation (as expected). Are there any dependencies between the
>> breedthreads and evalthreads values?
>>
>> -- 
>> Denis Robilliard
>> L.I.L.
>> Université du Littoral
>> 50 rue F. Buisson
>> 62100 Calais
>> France
>

-- 
Denis Robilliard
L.I.L.
Université du Littoral
50 rue F. Buisson
62100 Calais
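
For readers of the archive, the "Eval" half of the scheme Sean outlines
in the quoted message (one worker per thread, each handling roughly
PopSize/NumThreads individuals, no locking, one RNG per thread) might
be sketched in plain Java as follows. This is not ECJ code: the names
ChunkedEval, Worker and evaluate, the toy fitness function, and the
seeding rule baseSeed + t are all assumptions made for illustration.

```java
import java.util.Random;

// Illustrative sketch only; none of these names come from ECJ.
final class ChunkedEval {

    // Hypothetical stand-in for a per-thread Problem. Each worker owns its
    // own RNG and writes to a disjoint slice, so no locking is needed.
    static final class Worker implements Runnable {
        private final double[] genomes;  // shared, read-only
        private final double[] fitness;  // shared, but slices never overlap
        private final int from, to;      // half-open range [from, to)
        private final Random rng;        // private to this thread

        Worker(double[] genomes, double[] fitness, int from, int to, long seed) {
            this.genomes = genomes;
            this.fitness = fitness;
            this.from = from;
            this.to = to;
            this.rng = new Random(seed);
        }

        @Override
        public void run() {
            for (int i = from; i < to; i++) {
                double x = genomes[i];
                // Toy fitness: x^2 plus tiny noise drawn from this thread's RNG.
                fitness[i] = x * x + 1e-9 * rng.nextDouble();
            }
        }
    }

    // Splits the population into numThreads contiguous chunks and evaluates
    // each chunk in its own thread.
    static double[] evaluate(double[] genomes, int numThreads, long baseSeed) {
        double[] fitness = new double[genomes.length];
        Thread[] threads = new Thread[numThreads];
        int chunk = (genomes.length + numThreads - 1) / numThreads; // ceiling
        for (int t = 0; t < numThreads; t++) {
            int from = Math.min(t * chunk, genomes.length);
            int to = Math.min(from + chunk, genomes.length);
            threads[t] = new Thread(new Worker(genomes, fitness, from, to, baseSeed + t));
            threads[t].start();
        }
        try {
            for (Thread th : threads) {
                th.join(); // join() also makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while evaluating", e);
        }
        return fitness;
    }

    public static void main(String[] args) {
        double[] genomes = new double[1000];
        for (int i = 0; i < genomes.length; i++) {
            genomes[i] = i / 1000.0;
        }
        // Crude wall-clock comparison; repeat it several times to see the
        // kind of run-to-run variation discussed in this thread.
        for (int n = 1; n <= 2; n++) {
            long start = System.nanoTime();
            evaluate(genomes, n, 42L);
            System.out.printf("%d thread(s): %.3f ms%n", n, (System.nanoTime() - start) / 1e6);
        }
    }
}
```

Because the workers only read the shared genomes array and write to
non-overlapping slices of the fitness array, the sketch needs no
synchronization beyond start() and join(). A toy fitness like this one
is far too cheap to show a real speedup; it only illustrates the
partitioning.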
