Subpopulation's 'duplicate-retries' parameter only applies during the initialization of the population, not during breeding, so it doesn't seem to be what Andrew is looking for.

It seems to me that preventing duplicates would be easy enough to do by writing your own BreedingPipeline class (e.g., I might start by copy-pasting CheckingPipeline and then modifying its logic), but the question is how to do it efficiently.  When Subpopulation and UniquePipeline avoid duplicates, they do so by building a temporary HashSet that can be used to check for duplicates as individuals are generated.
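
To make the HashSet idea concrete, here is a rough, generic sketch of the pattern. It is not ECJ's actual code: the class name UniqueFill, the fill method, and its parameters are hypothetical, and it assumes the element type implements equals and hashCode consistently. In this sketch a duplicate is accepted once the retries run out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper, not part of ECJ: fills a list with distinct items,
// retrying a bounded number of times when the generator yields a duplicate.
final class UniqueFill {
    static <T> List<T> fill(int size, int maxRetries, Supplier<T> generator) {
        Set<T> seen = new HashSet<>();            // temporary; discarded when we return
        List<T> result = new ArrayList<>(size);
        while (result.size() < size) {
            T candidate = generator.get();
            int retries = 0;
            // Set.add returns false when an equal element is already present.
            while (!seen.add(candidate) && retries < maxRetries) {
                candidate = generator.get();      // throw the duplicate away and try again
                retries++;
            }
            // Either the candidate is new, or the retries are exhausted
            // and this sketch accepts the duplicate.
            result.add(candidate);
        }
        return result;
    }
}
```

Each membership check is an expected constant-time hash lookup, so the extra cost is roughly one hash computation per generated individual plus the memory for the set.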

Off the top of my head, it's not obvious to me where we would put this hashing logic when trying to prevent duplicates in offspring populations.  This is because breeding in ECJ happens in chunks of various sizes that might be farmed off to different threads.
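
One possible direction, offered only as an untested sketch and not as anything ECJ provides, is a single concurrent set shared by all breeding threads. The class SharedDuplicateFilter and its methods claim and breedChunk are hypothetical names, and the Supplier stands in for whatever produces an offspring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch, not ECJ code: one registry shared by every breeding thread.
final class SharedDuplicateFilter<T> {
    // Thread-safe set view backed by a ConcurrentHashMap.
    private final Set<T> seen = ConcurrentHashMap.newKeySet();

    // Atomic: returns true only for the first caller to offer a given value.
    boolean claim(T candidate) {
        return seen.add(candidate);
    }

    // Fills one chunk; may be called concurrently from several threads.
    List<T> breedChunk(int chunkSize, int maxRetries, Supplier<T> breeder) {
        List<T> chunk = new ArrayList<>(chunkSize);
        while (chunk.size() < chunkSize) {
            T candidate = breeder.get();
            int retries = 0;
            while (!claim(candidate) && retries < maxRetries) {
                candidate = breeder.get();        // another chunk already holds an equal value
                retries++;
            }
            chunk.add(candidate);                 // new, or accepted after exhausting retries
        }
        return chunk;
    }
}
```

A caveat with this design: which thread wins a contested individual depends on timing, so results would no longer be reproducible from the random seed alone when more than one thread is used.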

Sorry if that's not much help.  There's probably a better way that I haven't thought of.


On Thu, Nov 29, 2018 at 8:27 PM Stephen J. Kozakoff wrote:
This is the way I would approach the problem, but, there may be other ways.

The simplest way would be to set the parameter "pop.subpop.X.duplicate-retries" to a number so large that it is statistically improbable that the population would contain a duplicate.

A slightly more difficult way would be to implement your own subpopulation class. With this option, you can absolutely guarantee there are no duplicates.


On Thu, Nov 29, 2018 at 7:09 PM Andrew Lensen wrote:
What is the most ECJ-esque way of ensuring that any individuals duplicated in a population by the breeder are removed? (I'm using NSGA-II, if that matters.) I see ECJ has a UniquePipeline, but this seems to stop the new population from containing any of the individuals from the old population, rather than preventing duplicates within the *same* population.


Doctoral Candidate, George Mason University