Subpopulation's 'duplicate-retries' parameter only applies during the
initialization of the population, not during breeding, so it doesn't seem
to me to be what Andrew is looking for.

It seems to me that preventing duplicates would be easy enough to do by
writing your own BreedingPipeline class (e.g., I might start by copy-pasting
CheckingPipeline and then modifying its logic), but the question is how to
do it efficiently.  When Subpopulation and UniquePipeline avoid duplicates,
they do so by building a temporary HashSet that can be used to check for
duplicates as individuals are generated.
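
To make the HashSet idea concrete, here is a minimal stand-alone sketch in
plain Java.  It is not ECJ's actual code: the class UniqueFill, the method
fill, and the use of a Supplier as the source of new individuals are all
hypothetical names chosen for illustration, and keeping a duplicate once the
retries run out is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper, not part of ECJ: fills a list while trying to avoid
// duplicates, using a temporary HashSet to remember what has been produced.
class UniqueFill {

    // Draws `size` items from `source`. Each slot gets up to `retries`
    // extra draws if the candidate has already been seen. If every retry
    // also yields a duplicate, the duplicate is kept (an assumption of
    // this sketch).
    static <T> List<T> fill(int size, int retries, Supplier<T> source) {
        Set<T> seen = new HashSet<>();   // relies on equals() and hashCode()
        List<T> out = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            T candidate = source.get();
            int attempts = 0;
            // Set.add returns false when the element is already present.
            while (!seen.add(candidate) && attempts < retries) {
                candidate = source.get();
                attempts++;
            }
            out.add(candidate);
        }
        return out;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // Integers stand in for individuals here.
        List<Integer> pop = fill(10, 1000, () -> rng.nextInt(20));
        System.out.println(pop.size() + " individuals, "
                + new HashSet<>(pop).size() + " distinct");
    }
}
```

Each membership check is expected constant time, so the overhead is small
as long as the individuals have a reasonable hashCode().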

Off the top of my head, it's not obvious to me where we would put this
hashing logic when trying to prevent duplicates in offspring populations.
This is because breeding in ECJ happens in chunks of various sizes that
might be farmed out to different threads.
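
One possibility, sketched here in plain Java rather than against ECJ's
breeder, would be a single thread-safe set shared by all of the chunks.
Everything in this sketch is an assumption for illustration: the class
SharedSeenSet, the methods breedChunk and breedInParallel, and the Supplier
standing in for a breeding pipeline are hypothetical names, and I have not
checked how well this would fit ECJ's own threading.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

// Hypothetical sketch, not ECJ code: several threads each fill one chunk
// of offspring while consulting one shared, thread-safe "seen" set.
class SharedSeenSet {

    // Fills one chunk. `seen` must be a concurrent set so that add() is
    // atomic: exactly one thread can claim a given individual.
    static <T> List<T> breedChunk(int chunkSize, int retries,
                                  Set<T> seen, Supplier<T> source) {
        List<T> chunk = new ArrayList<>(chunkSize);
        for (int i = 0; i < chunkSize; i++) {
            T candidate = source.get();
            int attempts = 0;
            while (!seen.add(candidate) && attempts < retries) {
                candidate = source.get();
                attempts++;
            }
            chunk.add(candidate);  // may be a duplicate if retries ran out
        }
        return chunk;
    }

    // Runs `threads` chunks in parallel. `source` must be thread-safe.
    static <T> List<T> breedInParallel(int threads, int chunkSize,
                                       int retries, Supplier<T> source) {
        Set<T> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<T>>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(
                        () -> breedChunk(chunkSize, retries, seen, source)));
            }
            List<T> all = new ArrayList<>();
            for (Future<List<T>> f : futures) {
                all.addAll(f.get());
            }
            return all;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> all = breedInParallel(4, 25, 100,
                () -> ThreadLocalRandom.current().nextInt(1_000_000));
        System.out.println(all.size() + " offspring, "
                + new HashSet<>(all).size() + " distinct");
    }
}
```

The trade-off is that every thread now touches the same set, so some
contention is traded for the uniqueness check, and which thread ends up
keeping a given individual depends on scheduling.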

Sorry if that's not much help.  There's probably a better way that I
haven't thought of.


On Thu, Nov 29, 2018 at 8:27 PM Stephen J. Kozakoff wrote:

> This is the way I would approach the problem, but there may be other ways.
> The simplest way would be to set the parameter
> "pop.subpop.X.duplicate-retries" to a number so large that a duplicate in
> the population is statistically improbable.
> A slightly more difficult way would be to implement your own subpopulation
> class. If you choose this option, you can absolutely guarantee there are no
> duplicates.
> -Steve
> On Thu, Nov 29, 2018 at 7:09 PM Andrew Lensen wrote:
>> What is the most ECJ-esque way of going about ensuring that any
>> individuals duplicated in a population by the breeder are removed? (using
>> NSGA-II, if important). I see ECJ has a UniquePipeline, but this seems to
>> stop the new population containing any of the individuals from the old
>> population, not within the *same* population.


Doctoral Candidate, George Mason University