Thanks for the reply. I now have a distributed Master-Slave set-up
implemented, and it produces replicable runs regardless of the order
in which the slaves are brought online. Indeed, you can shut slaves down
or bring others up throughout the run and still maintain replicability.
In this case, I create the test cases once at the start of each
generation (using a stats class as you suggested in a previous post!).
Subsequent fitness evaluations do not rely on the random number
generator. I create the test cases in the Master, then place a
reference to them in each individual. The individuals are then
dispatched to the slaves, which use the test cases stored within each
individual for evaluation.
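A minimal sketch of this arrangement, under stated assumptions: the class and method names are hypothetical (this is not ECJ's API), the "genome" is a single number, and java.util.Random stands in for the Master's generator. The point is only that the generator is touched once, in the Master, and that evaluation reads nothing but the individual.

```java
import java.util.Random;

/** Hypothetical individual carrying a reference to the generation's test cases. */
final class TestedIndividual {
    final double coefficient;   // stand-in genome: a single number
    double[] testCases;         // set by the master, read by the slave
    double fitness;

    TestedIndividual(double coefficient) { this.coefficient = coefficient; }
}

final class SharedTestCasesSketch {
    /** Master side: the only place the random number generator is used. */
    static double[] createTestCases(Random masterRng, int count) {
        double[] cases = new double[count];
        for (int i = 0; i < count; i++) {
            cases[i] = masterRng.nextDouble();
        }
        return cases;
    }

    /** Master side: every individual gets a reference to the same array. */
    static void attach(TestedIndividual[] pop, double[] cases) {
        for (TestedIndividual ind : pop) {
            ind.testCases = cases;
        }
    }

    /** Slave side: deterministic given the individual; no generator involved. */
    static void evaluate(TestedIndividual ind) {
        double error = 0.0;
        for (double x : ind.testCases) {
            // Toy target: how far is coefficient * x from 2 * x?
            error += Math.abs(ind.coefficient * x - 2.0 * x);
        }
        ind.fitness = -error;
    }

    public static void main(String[] args) {
        double[] cases = createTestCases(new Random(42L), 5);
        TestedIndividual[] pop = { new TestedIndividual(1.5), new TestedIndividual(2.5) };
        attach(pop, cases);
        // Evaluate in reverse order; the fitnesses do not depend on the order.
        for (int i = pop.length - 1; i >= 0; i--) {
            evaluate(pop[i]);
        }
        for (TestedIndividual ind : pop) {
            System.out.println(ind.coefficient + " -> " + ind.fitness);
        }
    }
}
```

Because evaluate() depends only on the data stored in the individual, it does not matter which slave runs it or when.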
The same method can be applied more generally, if the random number
generator is needed for evaluation. Either a set of random numbers can
be attached to each individual at the start of a generation, or a single
seed can be attached and used to seed a Mersenne Twister in the slave.
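The single-seed variant might look roughly like the sketch below. The names are hypothetical, and java.util.Random is used as a stand-in for a Mersenne Twister so the example stays self-contained; the structure is the same with any seedable generator.

```java
import java.util.Random;

/** Hypothetical individual that carries its own evaluation seed. */
final class SeededIndividual {
    long evalSeed;    // assigned by the master once per generation
    double fitness;
}

final class SeedPerIndividualSketch {
    /** Master side: draw one seed per individual, in population order. */
    static void assignSeeds(SeededIndividual[] pop, Random masterRng) {
        for (SeededIndividual ind : pop) {
            ind.evalSeed = masterRng.nextLong();
        }
    }

    /** Slave side: build a fresh generator from the seed carried by the individual. */
    static void evaluate(SeededIndividual ind) {
        Random rng = new Random(ind.evalSeed);
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += rng.nextDouble();   // stand-in for a stochastic fitness function
        }
        ind.fitness = sum;
    }

    static SeededIndividual[] newPopulation(int size) {
        SeededIndividual[] pop = new SeededIndividual[size];
        for (int i = 0; i < size; i++) {
            pop[i] = new SeededIndividual();
        }
        return pop;
    }

    public static void main(String[] args) {
        SeededIndividual[] forward = newPopulation(4);
        SeededIndividual[] backward = newPopulation(4);
        assignSeeds(forward, new Random(1234L));
        assignSeeds(backward, new Random(1234L));

        for (int i = 0; i < forward.length; i++) evaluate(forward[i]);
        for (int i = backward.length - 1; i >= 0; i--) evaluate(backward[i]);

        boolean same = true;
        for (int i = 0; i < forward.length; i++) {
            same &= forward[i].fitness == backward[i].fitness;
        }
        System.out.println("Same fitnesses regardless of evaluation order: " + same);
    }
}
```

The slave's own generator state is never consulted, so the seed the slave was started with no longer affects the result.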
This does create some overhead in writing the data to the
DataOutputStream, but it is pretty small (at least for me!).
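To give a rough feel for that overhead, here is a sketch of writing a set of test cases to a DataOutputStream and reading them back. It assumes the test cases are plain doubles and uses hypothetical method names; it is not how ECJ itself serializes individuals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TestCaseWireSketch {
    /** Write the count followed by each test case. */
    static void writeTestCases(double[] cases, DataOutputStream out) throws IOException {
        out.writeInt(cases.length);          // 4 bytes
        for (double c : cases) {
            out.writeDouble(c);              // 8 bytes each
        }
    }

    /** Read back what writeTestCases wrote. */
    static double[] readTestCases(DataInputStream in) throws IOException {
        int n = in.readInt();
        double[] cases = new double[n];
        for (int i = 0; i < n; i++) {
            cases[i] = in.readDouble();
        }
        return cases;
    }

    /** Convenience wrapper: serialize to an in-memory byte array. */
    static byte[] toBytes(double[] cases) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeTestCases(cases, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Convenience wrapper: deserialize from an in-memory byte array. */
    static double[] fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return readTestCases(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] cases = { 0.25, -1.5, 3.0 };
        byte[] wire = toBytes(cases);
        System.out.println("Bytes on the wire: " + wire.length);
        System.out.println("Test cases recovered: " + fromBytes(wire).length);
    }
}
```

With this layout the extra cost per individual is 4 bytes plus 8 bytes per test case, whereas sending a single seed would cost 8 bytes.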
Sean Luke wrote:
> Sure, the slaves can be hacked to ignore the seed given them and
> instead use a seed loaded from a parameter file. But it's not as
> useful as you'd think. ECJ's master-slave facility is not
> synchronous, meaning that any slave can be used to evaluate any
> individual, in any order, and take as long as it needs. This allows
> the facility to be much more efficient, but in turn, it is going to
> make replicability nearly impossible because who knows what slave will
> be asked to evaluate individual X next time around...
> If you farm out the evaluations as described below, AND were using
> generational evolution, AND you guaranteed that the slaves registered
> themselves in the same order initially (perhaps by having one
> register every 5 seconds -- I made that up just now), yes, I *think*
> you could guarantee replicability, but I'm not positive.
> On Aug 12, 2008, at 11:23 AM, David Robert White wrote:
>> As I understand, when using Master-Slave evaluation the Master
>> generates a seed from the wallclock and sends it to the slaves. As a
>> result, the runs are not repeatable regardless of the timings of
>> evaluation by the slaves. Can I change this so that the slaves are
>> given seeds from their parameter files, or from the Mersenne Twister
>> in the Master? Please could you let me know the rationale behind the
>> decision to use the wallclock, I know you will have one!
>> If we choose to farm out evaluations with max_jobs_per_slave set to
>> M/N (M = pop size, N = slaves), and we had control over the seeds of
>> the slaves, wouldn't we be able to perfectly replicate an individual
>> run? For me, this would be more important than efficiency
>> (especially considering my slaves are almost identical).
>> David R White
>> Research Student
>> Department of Computer Science
>> University of York
>> York YO10 5DD
>> United Kingdom
David R White
Department of Computer Science
University of York
York YO10 5DD