I'm working on a real-world problem and trying to use as much standard ECJ
code as possible. My useful gene values range from 0 up to a few dozen or
maybe into the low hundreds, but most are going to be small. The genes are
related in 18-gene chunks, and many of them will have a total chunk value of
just 1 to 5. So my gene values often flirt with zero.
I started with integer genes, but the only mutation option there is a random
number between min-gene and max-gene, which was wrecking chunks
that had already evolved well. I switched to double genes so I could use
Gaussian Mutation, so the mutations would not stray far in each iteration. For
expression in the phenotype, rounding to integers is fine.
But that allowed gene values to drop below zero, which is invalid in the
real-world problem, so I introduced a non-negativity check into my fitness
function (the only code I really want to customize). Simply disallowing any
negative gene values was too harsh, and eventually killed off most individuals
- remember many chunks need to total up very close to zero, so the likelihood
of mutating some member of the chunk to a negative value is high.
So my negative check is careful: if a gene value is less than -1, the fitness is
zero because the individual has mutated too far and probably won't come
back. If it is between -1 and 0, it survives but with a very low fitness value so
it will only survive as a last resort. If it is >= 0, it survives and other
fitness functions are applied. The idea is that if an individual is already well-
adapted, but mutates a gene or two into the negative, we want to rescue it
and let it mutate further. I don't think it's possible to "fix" negative genes in
the fitness function - the individuals given to the fitness function are read-
only, aren't they?
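
To make the three tiers concrete, here is a minimal sketch of the check as described above. The class, method, and constant names are made up for illustration and are not ECJ API, and the last-resort value is only a placeholder for "very low".

```java
/** Illustrative helper only; not part of ECJ. */
final class NegativeGenePenalty {
    static final double DEAD = 0.0;          // some gene is below -1
    static final double LAST_RESORT = 1e-6;  // placeholder "very low" fitness

    /**
     * Returns DEAD if any gene is below -1, LAST_RESORT if any gene is
     * in [-1, 0), and otherwise the fitness computed by the other checks.
     */
    static double score(double[] genome, double baseFitness) {
        boolean slightlyNegative = false;
        for (double gene : genome) {
            if (gene < -1.0) {
                return DEAD;              // mutated too far to rescue
            }
            if (gene < 0.0) {
                slightlyNegative = true;  // keep scanning for a worse gene
            }
        }
        return slightlyNegative ? LAST_RESORT : baseFitness;
    }
}
```

Note that this only reads the genome, so it fits the constraint that the fitness function does not repair genes.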
I'm getting many individuals which stop mutating and get stuck with slightly
negative values. If too many get in this state, evolution slows to a crawl: no
progress until someone mutates multiple genes in multiple chunks back up
into the positive area.
I'm wondering if there is an issue in the Gaussian noise function. This could be
in a couple of forms (I'm not a mathematician, so I don't know the theoretical
basis of the function very well):
1. Will random noise be added if the gene value is truly zero? Or does a zero
gene result in zero noise, so it can never mutate away from zero?
2. Is there any unintentional skew in the Gaussian noise function that would
tend to cause negative values to favor further negative noise?
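
Both questions can be probed outside ECJ with a small experiment. The sketch below assumes the mutation is additive, i.e. it adds zero-mean Gaussian noise scaled by a fixed standard deviation to the current gene value. That is an assumption about how the operator works, not something confirmed from the ECJ source, and the class and method names are hypothetical.

```java
import java.util.Random;

/** Stand-alone experiment; assumes additive Gaussian mutation. Not ECJ code. */
final class AdditiveGaussianCheck {

    /** Adds noise that is drawn independently of the gene's current value. */
    static double mutate(double gene, double stdev, Random rng) {
        return gene + stdev * rng.nextGaussian();
    }

    /** Fraction of mutations of `start` that land below `start`. */
    static double fractionBelow(double start, double stdev, int trials, long seed) {
        Random rng = new Random(seed);
        int below = 0;
        for (int i = 0; i < trials; i++) {
            if (mutate(start, stdev, rng) < start) {
                below++;
            }
        }
        return (double) below / trials;
    }

    public static void main(String[] args) {
        // If the noise is additive and unbiased, both fractions should be near 0.5.
        System.out.println(fractionBelow(0.0, 1.0, 100_000, 1L));
        System.out.println(fractionBelow(-0.5, 1.0, 100_000, 1L));
    }
}
```

Under that assumption a gene at exactly zero still receives noise, and a negative gene is as likely to move up as down. If the real operator behaves differently, the same counting approach applied to its output should show it.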
Short of extending the mutation function with a custom version that
rejects/retries negative results, is there any way to influence Gaussian
Mutation to have a floor of zero? (I'm new to Java and would like to avoid
customizing anything but the fitness functions.) min-gene and max-gene only
seem to apply to new individuals, not to mutation.
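
For reference, the reject/retry behaviour I have in mind would look roughly like the sketch below. It is a stand-alone function rather than an ECJ subclass; the names, the retry limit, and the fallback of keeping the old value are all my own assumptions.

```java
import java.util.Random;

/** Stand-alone sketch of a floored Gaussian mutation. Not ECJ code. */
final class FlooredGaussianMutation {

    /**
     * Tries up to maxRetries + 1 Gaussian draws and returns the first
     * candidate that is at or above the floor. If every draw falls below
     * the floor, the gene is left unmutated (clamped to the floor in case
     * it was already below it).
     */
    static double mutate(double gene, double stdev, double floor,
                         int maxRetries, Random rng) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            double candidate = gene + stdev * rng.nextGaussian();
            if (candidate >= floor) {
                return candidate;
            }
        }
        return Math.max(gene, floor);
    }
}
```

One side effect worth noting: rejecting low draws pushes genes near the floor upward on average, which may or may not be desirable when many chunks need totals close to zero.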