Thanks all for the advice -- I ended up extending NSGA2Breeder to keep a record of the individuals bred so far and to keep producing individuals until they were all unique. As noted, this doesn't work with more than one breeding thread, but that is a reasonable limitation, as breeding is not a significant part of the computation time for my workload anyway.
> On Nov 29, 2018, at 6:58 PM, Andrew Lensen wrote:
> What is the most ECJ-esque way of going about ensuring that any individuals duplicated in a population by the breeder are removed? (using NSGA-II, if important). I see ECJ has a UniquePipeline, but this seems to stop the new population containing any of the individuals from the old population, not within the *same* population.
Not exactly. When asked to produce N individuals, UniquePipeline repeatedly generates individuals from its source until it has created N unique individuals (using a HashSet).
So if you had a pipeline like this:
Mutate <- Crossover <- TournamentSelect
You could modify it to look like this:
UniquePipeline <- Mutate <- Crossover <- TournamentSelect
... and it'd do what you want if you have only one breeding thread. The reason is that SimpleBreeder will ask the top-level breeding pipeline to create the whole subpopulation's worth at one time. Let's say that's N individuals. So UniquePipeline will produce N unique individuals by repeatedly requesting and testing individuals from the Mutate pipeline.
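To make the "repeatedly requesting and testing" step concrete, here is a minimal standalone sketch of that loop. It is not ECJ's actual UniquePipeline code: the class UniqueSketch, the method produceUnique, and the maxAttempts cap are all hypothetical names made up for illustration, and the Supplier stands in for the Mutate pipeline. It assumes the items have sensible equals() and hashCode() methods, since that is what a HashSet relies on.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical illustration only; this is not part of ECJ.
final class UniqueSketch {

    /**
     * Draws from source until n distinct items have been collected,
     * or until maxAttempts draws have been made (whichever comes first).
     * The cap is an assumption added here so the loop cannot spin forever
     * when the source cannot supply n distinct items.
     */
    static <T> List<T> produceUnique(Supplier<T> source, int n, int maxAttempts) {
        Set<T> seen = new HashSet<>();          // everything accepted so far
        List<T> result = new ArrayList<>(n);    // accepted items, in draw order
        int attempts = 0;

        while (result.size() < n && attempts < maxAttempts) {
            T candidate = source.get();         // "request" from the source
            attempts++;
            if (seen.add(candidate)) {          // "test": add() is false for a duplicate
                result.add(candidate);
            }
        }
        return result;
    }
}
```

If the source yields 1, 1, 2, 2, 3, ... and you ask for three items, the two repeats are discarded and you get back 1, 2, 3. Note that if the cap is hit first, the returned list is shorter than n.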
If you have multiple threads, however, all you'll be able to guarantee is that *each thread*'s set of individuals is unique within itself, not that the individuals are unique across all the sets.
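A small sketch of why that happens, under the assumption that each thread de-duplicates against its own private set. Again this is not ECJ code: PerThreadUniqueSketch, uniqueChunk, and countDuplicates are hypothetical names, and the "threads" are simulated by calling uniqueChunk once per chunk rather than by starting real threads.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not part of ECJ.
final class PerThreadUniqueSketch {

    /**
     * Builds one thread's share of the population: up to n items that are
     * distinct from each other, checked against a set private to this call.
     */
    static <T> List<T> uniqueChunk(Iterator<T> source, int n) {
        Set<T> seen = new HashSet<>();      // local to this chunk, not shared
        List<T> chunk = new ArrayList<>(n);
        while (chunk.size() < n && source.hasNext()) {
            T candidate = source.next();
            if (seen.add(candidate)) {
                chunk.add(candidate);
            }
        }
        return chunk;
    }

    /** Number of items in the population that repeat an earlier item. */
    static <T> int countDuplicates(List<T> population) {
        return population.size() - new HashSet<>(population).size();
    }
}
```

Suppose one chunk comes out as 1, 2, 3 and another as 3, 4, 1. Each is unique on its own, but the merged population contains 1 and 3 twice, because neither chunk could see what the other had already accepted.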
I think UniquePipeline will likely do what you want, since NSGA-II's breeder uses the same core breeding mechanism as SimpleBreeder.