## ECJ-INTEREST-L@LISTSERV.GMU.EDU


```
Ah ha!  Read 5.2.12 Parsimony Pressure.  Thank you.  You're constraining
the system, possibly limiting good regression as well.  This brings up the
notion of an engineering trade-off between a number of variables (often
just two but possibly more) with a weak or strong relation to some
constant.  This is the math approach to saying you can't have
everything.  It's a common relationship in many dynamic systems from biology
to physics (my original training).  It just occurred to me that
evolutionary algorithms might be yet another application of Arrow's
impossibility theorem outside its original context.  Like duct tape, it has
many uses.  Evolution can be thought of as a social group interaction.
Now that I think on it, ECJ, or any EP system, would have a lot in
common with scheduling algorithms, computer or not, to which Arrow's
theorem also applies.  Apologies, what passes for my mind wanders like this.

Tournament selection, and likely other selection methods too, can result in
similar regression expressions, for some definition of similar. Eureqa has an
interesting way of showing this with a Pareto graph.  You get to see
leaps, if you will, in progress towards the fitness goal.  Knowing where
those leaps are, you can go back and look in a log of expressions, each
generation or whatever, around the leap and see what happened.

Does ECJ have a general way of showing this kind of thing?  Maybe I
haven't found it yet?

> Chris,
>
> You're describing the problem of "overfitting" to the data.
>
> I'm not a GP guy, but I know that it's normal to use a training, test,
> and validation set when doing symbolic regression (just like you would
> with other regression/classification methods).
>
> If you find that you are over-fitting, you might want to add some kind
> of parsimony pressure to your GP method (effectively limiting the size
> of your polynomial or expression).  p. 185 of the manual
> <https://cs.gmu.edu/%7Eeclab/projects/ecj/docs/manual/manual.pdf>
> talks about parsimony pressure, FWIW.
>
> Siggy
>

--