I'm working on a real-world problem and trying to use as much standard ECJ 
code as possible. My useful genome values range from 0 up to a few dozen or 
maybe into the low hundreds, but most are going to be small. The genes are 
related in 18-gene chunks, and many of them will have a total chunk value of 
just 1 to 5. So my gene values often flirt with zero.

I started with integer genes, but the only mutation option there is a random 
number between min-gene and max-gene, which was wrecking chunks that had 
already evolved well. I switched to double genes so I could use Gaussian 
Mutation, which keeps each mutation close to the current value. For 
expression in the phenotype, rounding to integers is fine.

But that allowed gene values to drop below zero, which is invalid in the real 
world problem, so I introduced a greater-than-zero check into my fitness 
function (the only code I really want to customize). Simply disallowing any 
negative gene values was too harsh, and eventually killed off most individuals 
- remember many chunks need to total up very close to zero, so the likelihood 
of mutating some member of the chunk to a negative value is high.

So my negative check is tiered: if a gene value is less than -1, the fitness is 
zero because the individual has mutated too far and probably won't come 
back. If it is between -1 and 0, it survives but with a very low fitness value, 
so it will only be selected as a last resort. If it is >= 0, it survives and 
the other fitness functions are applied. The idea is that if an individual is 
already well-adapted but mutates a gene or two into the negative, we want to 
rescue it and let it mutate further. I don't think it's possible to "fix" 
negative genes in the fitness function - the individuals given to the fitness 
function are read-only, aren't they?
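
To make the tiers concrete, here is a minimal sketch of the check in plain Java. It does not use any ECJ classes; the class name, the Tier enum and the HARD_FLOOR constant are made up for illustration, and the genome is assumed to be available as a double[].

```java
// Hypothetical helper, not part of ECJ: classifies a genome by its negative genes.
final class NegativeGeneCheck {
    /** DEAD: some gene < -1. RESCUE: some gene in [-1, 0). VALID: all genes >= 0. */
    enum Tier { DEAD, RESCUE, VALID }

    // Assumed threshold taken from the description above.
    static final double HARD_FLOOR = -1.0;

    /** Reads the genome without modifying it and returns the worst tier found. */
    static Tier classify(double[] genome) {
        Tier tier = Tier.VALID;
        for (double g : genome) {
            if (g < HARD_FLOOR) {
                return Tier.DEAD;      // too far gone: fitness would be zero
            }
            if (g < 0.0) {
                tier = Tier.RESCUE;    // slightly negative: very low fitness
            }
        }
        return tier;
    }
}
```

Only VALID individuals would go on to the other fitness functions; RESCUE individuals get the very low fitness value and DEAD ones get zero.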

I'm getting many individuals which stop mutating and get stuck with slightly 
negative values. If too many get in this state, evolution slows to a crawl: no 
progress until someone mutates multiple genes in multiple chunks back up 
into the positive area. 

I'm wondering if there is an issue in the Gaussian noise function. This could be 
in a couple of forms (I'm not a mathematician, so I don't know the theoretical 
basis of the function very well):
1. Will random noise be added if the gene value is truly zero? Or does a zero 
gene result in zero noise, so it can never mutate away from zero?
2. Is there any unintentional skew in the Gaussian noise function that would 
tend to cause negative values to favor further negative noise?
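
For reference, this is what I understand additive Gaussian noise to mean, written as a small plain-Java sketch. It assumes the mutation is simply gene + stdev * N(0,1); I have not confirmed that ECJ's Gaussian Mutation does exactly this, and the class and method names are made up. If that assumption holds, the noise does not depend on the gene's current value, so a zero gene still moves and the average shift should be close to zero wherever the gene starts.

```java
import java.util.Random;

// Hypothetical demo, not ECJ code: additive Gaussian noise on a single gene.
final class GaussianNoiseDemo {
    /** Adds zero-mean Gaussian noise scaled by stdev; independent of the gene's value. */
    static double mutate(double gene, double stdev, Random rng) {
        return gene + stdev * rng.nextGaussian();
    }

    /** Average change produced by many independent mutations of the same starting gene. */
    static double meanShift(double gene, double stdev, int draws, long seed) {
        Random rng = new Random(seed);
        double sum = 0.0;
        for (int i = 0; i < draws; i++) {
            sum += mutate(gene, stdev, rng) - gene;
        }
        return sum / draws;
    }
}
```

Comparing meanShift(0.0, ...) with meanShift(-0.5, ...) over a large number of draws is one way to check for skew in a noise source like this.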

Short of extending the mutation function with a custom version that 
rejects/retries negative results, is there any way to influence Gaussian 
Mutation to have a floor of zero? (I'm new to Java and would like to avoid 
customizing anything but the fitness functions.) min-gene and max-gene only 
seem to apply to new individuals, not to mutation.
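
For clarity, the reject/retry behaviour I would like to avoid writing myself looks roughly like the sketch below. This is plain Java, not ECJ's API; the class name, method name and maxTries parameter are made up for illustration.

```java
import java.util.Random;

// Hypothetical sketch, not ECJ code: Gaussian mutation with a lower bound.
final class FlooredGaussian {
    /**
     * Draws gene + stdev * N(0,1) until the result is >= floor, up to maxTries times.
     * If every draw falls below the floor, falls back to the original gene,
     * raised to the floor if the gene itself was below it.
     */
    static double mutateWithFloor(double gene, double stdev, double floor,
                                  int maxTries, Random rng) {
        for (int i = 0; i < maxTries; i++) {
            double candidate = gene + stdev * rng.nextGaussian();
            if (candidate >= floor) {
                return candidate;
            }
        }
        return Math.max(gene, floor);
    }
}
```

Note that rejecting draws below the floor pushes the average mutation upward for genes near zero, which may or may not be acceptable for chunks that need to total close to zero.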