Jim,

'Tis an interesting problem you raise. It's true that the "hack" ECJ's generational algorithms use for re-evaluation doesn't make any sense in a steady-state model.

I don't know how people have handled multiple tests in parallel steady-state EAs before; I don't recall seeing any discussion of it in the literature.

It seems to me, though, that there are two options that could save you the trouble of implementing a distributed multiple-testing scheme:
- If your evaluation function doesn't exhaust all of a node's resources, run multiple tests in parallel on the same node. This is easy to do inside your implementation of the Problem class.
Your heavy-duty simulations probably eat up all your nodes' processors, though, so this might not help your application.
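To make the first option concrete, here is a minimal, self-contained sketch of averaging several noisy trials run concurrently on one node. This is not ECJ's actual Problem API — `runSimulation` is a hypothetical stand-in for one expensive wargame run, and the class names are made up — but the same pattern (submit N trials to a thread pool, average the results) would live inside your `Problem` implementation's evaluate method:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Runs several independent, noisy evaluations of the same candidate
 *  in parallel on one node and averages them.  runSimulation() is a
 *  stub standing in for a real (e.g. 400-second) wargame simulation. */
public class ParallelTrials {

    static final Random RNG = new Random(42);

    /** Hypothetical stand-in for one expensive, noisy simulation. */
    static double runSimulation(double[] genome) {
        double base = 0.0;
        for (double g : genome) base += g;   // "true" underlying quality
        return base + RNG.nextGaussian();    // plus per-run evaluation noise
    }

    /** Mean fitness over `trials` simulations executed concurrently. */
    static double evaluateWithTrials(final double[] genome, int trials, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Double>> results = new ArrayList<>();
        for (int i = 0; i < trials; i++)
            results.add(pool.submit(new Callable<Double>() {
                public Double call() { return runSimulation(genome); }
            }));
        double sum = 0.0;
        for (Future<Double> f : results) sum += f.get();  // block until done
        pool.shutdown();
        return sum / trials;
    }

    public static void main(String[] args) throws Exception {
        double fitness = evaluateWithTrials(new double[]{1.0, 2.0, 3.0}, 5, 5);
        System.out.println("mean fitness over 5 trials: " + fitness);
    }
}
```

Since the trials are independent, wall-clock time per evaluation stays roughly that of a single simulation as long as the node has free cores.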
- Ramp up the population size. In some cases, given the same computational resources, using a large population can be just as effective at washing out the effects of noise as multiple testing.
You can see whether your application falls into this category by using a fixed budget of fitness evaluations and checking whether it makes more progress with a big population or with multiple testing. If the latter truly works much better, that's a sign it could be worth your effort to modify ECJ's steady-state master-slave model to support distributed multiple testing.

Just my two cents. Sean et al. will be more familiar with what it might take to implement the feature itself.

Siggy

--

On Tue, Oct 25, 2016 at 1:34 PM, Jim Rutt <[log in to unmask]> wrote:

I've been evaluating ECJ for possible use in a large-scale, cloud-based evolutionary computation project for the optimization of AIs in highly complex wargames. What makes this a hard problem is that:

1. The evaluations are expensive - a mean of 400 seconds per evaluation on a one-core 3.5 GHz processor.

2. The evaluations are noisy - a better AI can still lose to a worse AI, and often does.

3. The evaluation run times also have a large variance - from approximately 80 seconds up to 1000 seconds.
As for evolutionary approaches, I'm leaning toward steady-state EDA-type algorithms as a seemingly good fit for the problem domain.
All was looking good in my evaluation of ECJ until what seems like a fatal problem in the last sentence of section 6.1.6, "Noisy Distributed Problems," in the ECJ Owner's Manual:

"There's no equivalent to this hack in Asynchronous Evolution: you'll just have to ask a machine to test the individual 5 times."

Unfortunately, that would seem to significantly reduce the ability to fan out evaluations to cut elapsed clock time per evaluation, which would significantly increase "time travel" - i.e., where evaluated individuals re-enter a population as candidates for inclusion at a much later time than when they were sent out for evaluation.

Is another hack possible to spread out evaluations where one needs to run multiple tests to get a good-enough estimator of an individual? I might even be willing to do the hacking.

--

Jim Rutt
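The fixed-budget comparison suggested earlier in the thread (big population with one trial per individual, versus small population with several trials) could be set up roughly as follows. This is only a sketch: the noisy objective, the crude truncation-selection loop, and all names here are stand-ins invented for illustration, not ECJ code or the real wargame evaluator. The point is the experimental protocol - both configurations draw from the same budget of simulation runs:

```java
import java.util.Arrays;
import java.util.Random;

/** Fixed-budget comparison: spend the same number of simulation runs on
 *  (a) a large population with 1 trial per individual, versus
 *  (b) a small population with 5 trials per individual,
 *  and report the best *true* fitness each configuration reaches.
 *  The noisy objective is a stand-in for the real, expensive simulation. */
public class BudgetComparison {

    static final Random RNG = new Random(1);
    static final int DIM = 20;

    /** Noise-free quality, used only for reporting results. */
    static double trueFitness(double[] x) {
        double s = 0.0; for (double v : x) s += v; return s;
    }

    /** One noisy evaluation (stands in for a single wargame run). */
    static double noisyEval(double[] x) {
        return trueFitness(x) + 3.0 * RNG.nextGaussian();
    }

    /** Crude truncation-selection EA under a fixed evaluation budget. */
    static double run(int popSize, int trials, int budget) {
        double[][] pop = new double[popSize][DIM];
        int spent = 0;
        double bestTrue = Double.NEGATIVE_INFINITY;
        while (spent + popSize * trials <= budget) {
            final double[] fit = new double[popSize];
            for (int i = 0; i < popSize; i++) {
                for (int t = 0; t < trials; t++) fit[i] += noisyEval(pop[i]);
                fit[i] /= trials;                 // averaged noisy fitness
                spent += trials;
            }
            Integer[] idx = new Integer[popSize];
            for (int i = 0; i < popSize; i++) idx[i] = i;
            Arrays.sort(idx, (a, b) -> Double.compare(fit[b], fit[a]));
            bestTrue = Math.max(bestTrue, trueFitness(pop[idx[0]]));
            // keep the best half; fill the rest with mutated copies
            double[][] next = new double[popSize][];
            for (int i = 0; i < popSize; i++) {
                double[] child = pop[idx[i % (popSize / 2)]].clone();
                if (i >= popSize / 2)
                    child[RNG.nextInt(DIM)] += RNG.nextGaussian();
                next[i] = child;
            }
            pop = next;
        }
        return bestTrue;
    }

    public static void main(String[] args) {
        int budget = 5000;  // total simulation runs allowed per configuration
        System.out.println("big pop, 1 trial per ind. : " + run(100, 1, budget));
        System.out.println("small pop, 5 trials per ind.: " + run(20, 5, budget));
    }
}
```

Which configuration wins depends on the problem's noise level and fitness landscape, so the numbers printed here say nothing about the wargame application - the sketch only shows how to hold the budget constant while varying the population-size/trials trade-off.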