I am sorry, I did not understand your problem well enough in the  
beginning....

If you start a single-generation run, give the slaves enough time to
connect to the server.  If the run finishes before all of the slaves
have connected, you may have to start it several times.

But I was also serious about the script: I have one for our cluster
that lets me kill all of them.  Assuming I do not have any other java
programs running on the cluster (and given that I cannot kill other
people's processes), I type:

    > forall killall java

where forall is a script that looks like this:

    #!/bin/sh
    # Run the given command on every node, one node at a time.
    # Ctrl-C sets a flag so that the loop exits instead of moving on.
    trap_int()
    {
      thatsallfolks=1
    }

    trap trap_int INT
    for a in `seq 1 30`
    do
      echo "Node: $a"
      ssh node$a "$@"
      if [ -n "$thatsallfolks" ]
      then
        exit
      fi
    done

Our cluster has nodes node1 to node30, which may be different in your  
case.
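
If your node names differ, a variant along the following lines might be easier to adapt.  This is only a minimal sketch, not something I have run on a real cluster: the NODES and REMOTE variables are names made up for this example, and it assumes a seq that supports the -f option (GNU and BSD seq both do).  It wraps the same loop as the forall script above in a shell function and takes the host list from NODES instead of hard-coding node1 to node30.

```shell
#!/bin/sh
# Sketch of a configurable variant of the forall script.
# NODES and REMOTE are hypothetical names introduced for this example.

# Whitespace-separated list of hosts; defaults to node1 .. node30.
NODES=${NODES:-$(seq -f 'node%g' 1 30)}
# Command used to reach a host; defaults to ssh.
REMOTE=${REMOTE:-ssh}

forall() {
    stop=
    # On Ctrl-C, set a flag so the loop stops instead of moving on
    # to the next node.
    trap 'stop=1' INT
    for node in $NODES
    do
        echo "Node: $node"
        $REMOTE "$node" "$@"
        if [ -n "$stop" ]
        then
            break
        fi
    done
    # Restore the default Ctrl-C behaviour.
    trap - INT
}
```

After sourcing the file, setting NODES="node3 node7" and then running "forall killall java" would visit only those two hosts.  Setting REMOTE=echo prints the commands instead of running them over ssh, which is a cheap way to check the host list first.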

Best regards,

Liviu.

On Apr 21, 2006, at 3:53 PM, I Jonyer wrote:

> Yes, that is what I've been doing. But NOW they are running without  
> that code. I think I'll just make a run with a single generation  
> and hope they will exit.
>
> Istvan
>
>
> From:  Liviu Panait <[log in to unmask]>
> Reply-To:  ECJ Evolutionary Computation Toolkit <ECJ-INTEREST- 
> [log in to unmask]>
> To:  [log in to unmask]
> Subject:  Re: Master/Slave
> Date:  Fri, 21 Apr 2006 15:36:59 -0400
> >If you kill the master explicitly, then it does not get to send the
> >V_SHUTDOWN message.  What you can do instead is modify the Slave
> >class such that it exits once an IOException is generated because of
> >  a failed socket.  Look into the ec.eval.Slave class and insert
> >such a  condition when an exception occurs.
> >
> >Hope it helps.
> >
> >Best regards,
> >
> >Liviu.
> >
> >On Apr 21, 2006, at 2:39 PM, I Jonyer wrote:
> >
> >>I use ecj 14.
> >>
> >>I usually kill the master with Ctrl-C, and make the clients exit
> >>when the connection is lost.
> >>
> >>Istvan
> >>
> >>
> >>From:  Liviu Panait <[log in to unmask]>
> >>Reply-To:  ECJ Evolutionary Computation Toolkit <ECJ-INTEREST-
> >>[log in to unmask]>
> >>To:  [log in to unmask]
> >>Subject:  Re: Master/Slave
> >>Date:  Fri, 21 Apr 2006 12:24:47 -0400
> >> >Dear Istvan,
> >> >
> >> >>Is there any way to kill all the slave processes after the
> >>master
> >> >>goes down? I modified the previous version so that the slaves
> >>would
> >> >>  exit, but after upgrading I forgot to do this, and now I have
> >> >>slaves running on my entire cluster and they would not exit. Any
> >> >>way that I would not have to log into all nodes and kill them
> >>all
> >> >>one-by-one?
> >> >Of course there is a way, you can always write a script.... ;-)
> >> >
> >> >On a more serious note, which version are you using?  I tried
> >> >version  15, and the slaves seem to exit when the master shuts
> >>down.
> >> >  The way  this is implemented is that all slaves are sent a
> >> >V_SHUTDOWN message  when the master is about to exit.  When the
> >> >slaves receive this  message, they close their sockets, and then
> >> >they exit (this is  implemented via a return call from the main
> >> >function).  Is it  possible that the return call is commented out
> >>in
> >> >your version for  some reason?
> >> >
> >> >Best regards,
> >> >
> >> >Liviu.
> >> >
> >> >>From:  Sean Luke <[log in to unmask]>
> >> >>Reply-To:  ECJ Evolutionary Computation Toolkit <ECJ-INTEREST-
> >> >>[log in to unmask]>
> >> >>To:  [log in to unmask]
> >> >>Subject:  ECJ 14/15 and MASON 11 released
> >> >>Date:  Tue, 4 Apr 2006 01:10:38 -0400
> >> >> >The George Mason University Evolutionary Computation
> >>Laboratory
> >> >>and
> >> >> >Center for Social Complexity announce a new release of the ECJ
> >> >> >evolutionary computation library and MASON multiagent
> >>simulation
> >> >> >toolkit.  Both systems have seen major improvements and
> >>revisions
> >> >> >since the last release approximately eight months ago.  The
> >>two
> >> >> >systems are also being re-licensed under the Academic Free
> >> >>License
> >> >> >version 3.0.
> >> >> >
> >> >> >ECJ is being released in two versions: a backward-compatible
> >> >>version
> >> >> >  (14) and a non-backward-compatible version (15) with
> >> >>significant
> >> >> >framework revisions.  The dual release will (hopefully) give
> >> >>people
> >> >> >some extra time to convert to the new version.  ECJ 14/15 also
> >> >>has
> >> >> >numerous bug-fixes, speed improvements, and a new package
> >> >>(spatial
> >> >> >embedding).
> >> >> >
> >> >> >ECJ can be found here:
> >> >> > http://cs.gmu.edu/~eclab/projects/ecj/
> >> >> >
> >> >> >ECJ CVS access is also available at SourceForge, but
> >> >>sourceforge.net
> >> >> >  has experienced a major hardware failure this past week and
> >>CVS
> >> >> >access is not expected for several days at the earliest.
> >> >> >
> >> >> >
> >> >> >MASON 11 is a major revision of our multiagent simulator.  It
> >> >>sports
> >> >> >  a new charting and tracking facility, several new problem
> >> >>domains,
> >> >> >  and a very large number of bug fixes and improvements.
> >> >> >
> >> >> >MASON can be found here:
> >> >> > http://cs.gmu.edu/~eclab/projects/mason/
> >> >> >
> >> >> >Sean Luke