Thanks all for the advice -- I ended up extending the NSGA2Breeder to keep a record of the individuals bred so far and to keep producing individuals until they were all unique. Indeed, this doesn't work with more than one breed thread, but that is a reasonable limitation, as breeding is not a significant component of the computational time for my workload anyway.

Ngā mihi,
Andrew Lensen


On Fri, 30 Nov 2018 at 15:58, Sean Luke wrote:
> On Nov 29, 2018, at 6:58 PM, Andrew Lensen wrote:
>
> What is the most ECJ-esque way of going about ensuring that any individuals duplicated in a population by the breeder are removed? (using NSGA-II, if important). I see ECJ has a UniquePipeline, but this seems to stop the new population containing any of the individuals from the old population, not within the *same* population. 

Not exactly.  When asked to produce N individuals, UniquePipeline repeatedly generates individuals from its source until it has created N unique individuals (using a HashSet).
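
As a rough illustration of that idea (this is not ECJ's actual code), here is a minimal sketch of "keep drawing until you have N unique items" with a HashSet. The class name UniqueFill, the method produceUnique, and the maxDraws cap are all hypothetical, made up for this example. It also assumes the items implement equals() and hashCode() sensibly, since that is what a HashSet uses to decide whether two items are duplicates.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of ECJ.
final class UniqueFill {
    // Draws from `source` until `n` distinct items have been collected,
    // or until `maxDraws` draws have been made (a guard against a source
    // that cannot supply enough distinct items).
    static <T> List<T> produceUnique(Supplier<T> source, int n, int maxDraws) {
        Set<T> seen = new HashSet<>();      // everything accepted so far
        List<T> out = new ArrayList<>(n);   // accepted items, in draw order
        for (int draws = 0; out.size() < n && draws < maxDraws; draws++) {
            T candidate = source.get();
            // Set.add returns false if an equal item is already present.
            if (seen.add(candidate)) {
                out.add(candidate);
            }
        }
        return out;
    }
}
```

The cap means the result can hold fewer than n items if the source keeps repeating itself, so a caller would need to decide what to do in that case.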

So if you had a pipeline like this:

        Mutate <- Crossover <- TournamentSelect

You could modify it to look like this:

        UniquePipeline <- Mutate <- Crossover <- TournamentSelect

... and it'd do what you want if you have only one breeding thread.  The reason is that SimpleBreeder will ask the top-level breeding pipeline to create the whole subpopulation's worth at one time.  Let's say that's N individuals.  So UniquePipeline will produce N unique individuals by repeatedly requesting and testing individuals from the Mutate pipeline.

If you have multiple threads, however, all you'll be able to guarantee is that *each thread*'s set of individuals will contain unique individuals, not that they're unique across all sets.
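
To make the multi-thread caveat concrete, here is a small sketch (again not ECJ code; the class PerThreadUniqueness and both method names are hypothetical). It treats each thread's output as a separate chunk, removes duplicates within each chunk independently, and then counts how many duplicates remain once the chunks are merged.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of ECJ.
final class PerThreadUniqueness {
    // Removes duplicates within one chunk, preserving first-seen order.
    // This stands in for what each breeding thread could do on its own.
    static <T> List<T> uniqueWithin(List<T> chunk) {
        return new ArrayList<>(new LinkedHashSet<>(chunk));
    }

    // Counts items, across all chunks taken together, that repeat an
    // item already seen earlier in the merged sequence.
    static <T> int duplicatesAcross(List<List<T>> chunks) {
        Set<T> seen = new HashSet<>();
        int duplicates = 0;
        for (List<T> chunk : chunks) {
            for (T item : chunk) {
                if (!seen.add(item)) {
                    duplicates++;
                }
            }
        }
        return duplicates;
    }
}
```

For example, if one chunk reduces to [1, 2, 3] and another to [3, 4, 5], each is unique on its own, yet the merged population still contains 3 twice.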

I think UniquePipeline will likely do what you want, since NSGA-II's breeder uses the same core breeding mechanism as SimpleBreeder.

Sean