Jim,

Tis an interesting problem you raise.  It's true that the "hack" ECJ's generational algorithms use for re-evaluation doesn't make any sense in a steady-state model.

I don't know how people have handled multiple tests in parallel steady-state EAs before.  I don't recall seeing any discussion of it in the literature.

It seems to me that there are two options, though, that could save you the trouble of implementing a distributed multiple-testing scheme:
  1. If your evaluation function doesn't exhaust all of a node's resources, run multiple tests in parallel on the same node.  This is easy to do inside your implementation of the Problem class.

    Your heavy-duty simulations probably eat up all your nodes' processors, though, so this might not help your application.

  2. Ramp up the population size.  In some cases, given the same computational resources, using a large population can be just as effective at washing out the effects of noise as multiple testing.

    You can see if your application falls into this category by using a fixed budget of fitness evaluations and seeing if it makes more progress with a big population, or with multiple testing.  If the latter truly works much better, then that's a sign that it could be worth your effort to modify ECJ's steady-state master-slave model to support distributed multiple testing.
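
For option 1, a minimal sketch of running several noisy tests of one individual in parallel on a single node might look like the following. It is plain Java with no ECJ dependency: `ParallelTests`, `NoisyTest`, and `meanFitness` are hypothetical names, and the toy test in `main` only stands in for a real simulation. How this would be called from your own Problem implementation is left open here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper (not part of ECJ): averages several noisy tests run in parallel on one node. */
class ParallelTests {

    /** Stand-in for one expensive, noisy test of a candidate (hypothetical interface). */
    interface NoisyTest {
        double run(double[] genome, long seed);
    }

    /** Runs numTests independent tests on a local thread pool and returns their mean score. */
    static double meanFitness(double[] genome, NoisyTest test, int numTests, int threads, long baseSeed)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> results = new ArrayList<>();
            for (int i = 0; i < numTests; i++) {
                final long seed = baseSeed + i;  // give each test its own random stream
                Callable<Double> task = () -> test.run(genome, seed);
                results.add(pool.submit(task));
            }
            double sum = 0.0;
            for (Future<Double> f : results) {
                sum += f.get();                  // blocks until that test has finished
            }
            return sum / numTests;
        } finally {
            pool.shutdown();                     // always release the worker threads
        }
    }

    public static void main(String[] args) throws Exception {
        // Toy test: the "true" quality is the sum of the genes, plus Gaussian noise.
        NoisyTest toy = (g, seed) -> {
            double s = 0.0;
            for (double x : g) s += x;
            return s + new Random(seed).nextGaussian();
        };
        double[] genome = {1.0, 2.0, 3.0};
        System.out.println("mean of 5 parallel tests: " + meanFitness(genome, toy, 5, 5, 42L));
    }
}
```

This only pays off when a single test leaves cores idle; with a test that already saturates the node, the thread pool just makes the tests take turns.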
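
For option 2, the fixed-budget comparison can be sketched on a toy problem. Everything below is an assumption made for illustration: a noisy sphere function replaces the real simulation, a simple generational truncation-selection EA replaces the steady-state algorithm, and the names `BudgetComparison`, `run`, `NOISE_SD`, and the parameter values are hypothetical. It is not ECJ code.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

/** Toy experiment (not ECJ code): spend a fixed evaluation budget in two different ways. */
class BudgetComparison {
    static final int DIM = 10;           // genome length (arbitrary)
    static final double NOISE_SD = 1.0;  // standard deviation of the evaluation noise (arbitrary)

    /** Noise-free objective to minimize (sphere function). */
    static double trueCost(double[] x) {
        double s = 0.0;
        for (double v : x) s += v * v;
        return s;
    }

    /** Mean of k noisy tests of x. */
    static double noisyCost(double[] x, int k, Random rng) {
        double s = 0.0;
        for (int i = 0; i < k; i++) s += trueCost(x) + NOISE_SD * rng.nextGaussian();
        return s / k;
    }

    /** Runs the toy EA; returns {true cost of the apparent best individual, evaluations used}. */
    static double[] run(int popSize, int testsPerIndividual, int budget, long seed) {
        Random rng = new Random(seed);
        int generations = budget / (popSize * testsPerIndividual);
        double[][] pop = new double[popSize][DIM];
        for (double[] ind : pop)
            for (int d = 0; d < DIM; d++) ind[d] = rng.nextDouble() * 10.0 - 5.0;

        int used = 0;
        double[] best = pop[0];
        for (int g = 0; g < generations; g++) {
            double[] est = new double[popSize];
            Integer[] order = new Integer[popSize];
            for (int i = 0; i < popSize; i++) {
                est[i] = noisyCost(pop[i], testsPerIndividual, rng);
                used += testsPerIndividual;
                order[i] = i;
            }
            // Rank individuals by their noisy estimates, lowest cost first.
            Arrays.sort(order, Comparator.comparingDouble(i -> est[i]));
            best = pop[order[0]];
            // Parents come from the apparent top half; each child is a mutated copy.
            double[][] next = new double[popSize][];
            for (int i = 0; i < popSize; i++) {
                double[] child = pop[order[rng.nextInt(Math.max(1, popSize / 2))]].clone();
                for (int d = 0; d < DIM; d++) child[d] += 0.1 * rng.nextGaussian();
                next[i] = child;
            }
            pop = next;
        }
        return new double[] {trueCost(best), used};
    }

    public static void main(String[] args) {
        int budget = 10_000;
        double[] big = run(100, 1, budget, 1L);   // large population, one test each
        double[] multi = run(20, 5, budget, 1L);  // small population, five tests each
        System.out.printf("big population:   true cost %.3f after %d evaluations%n", big[0], (int) big[1]);
        System.out.printf("multiple testing: true cost %.3f after %d evaluations%n", multi[0], (int) multi[1]);
    }
}
```

A single seed says little, so any real comparison would repeat both configurations over many seeds and compare the distributions of final true cost.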

Just my two cents.  Sean et al. will be more familiar with what it might take to implement the feature itself.

Siggy

On Tue, Oct 25, 2016 at 1:34 PM, Jim Rutt wrote:
I've been evaluating ECJ for possible use in a large-scale, cloud-computing-based evolutionary computation project for the optimization of AIs in highly complex wargames.

What makes this a hard problem is that:

1.  The evaluations are expensive: a mean of 400 seconds per evaluation on a single-core 3.5 GHz processor.
2.  The evaluations are noisy: a better AI can still lose to a worse AI, and often does.
3.  The evaluation run times also have a large variance, from approximately 80 seconds up to 1000 seconds.

As for evolutionary approaches, I'm leaning toward steady-state EDA-type algorithms as a seemingly good fit for the problem domain.

All was looking good in the evaluation of ECJ until I hit what seems like a fatal problem in the last sentence of section 6.1.6, Noisy Distributed Problems, in the ECJ Owner's Manual:

"There’s no equivalent to this hack in Asynchronous Evolution: you’ll just have to ask a machine to test the individual 5 times."

Unfortunately, that would seem to significantly reduce the ability to fan out evaluations to reduce elapsed clock time per evaluation, which would significantly increase "time travel", i.e., where evaluated individuals re-enter a population as candidates for inclusion at a much later time than they were created for evaluation.

Is another hack possible to spread out evaluations where one needs to run multiple tests to get a good-enough estimator of an individual?  I might even be willing to do the hacking.




--
Jim Rutt
JPR Ventures



--

Ph.D student in Computer Science, George Mason University
CFO and Web Director, Journal of Mason Graduate Research
http://mason.gmu.edu/~escott8/