ECJ-INTEREST-L Archives

May 2017

ECJ-INTEREST-L@LISTSERV.GMU.EDU

Subject:
From:
Chris Johnson <[log in to unmask]>
Reply To:
ECJ Evolutionary Computation Toolkit <[log in to unmask]>
Date:
Mon, 8 May 2017 10:21:45 -0400
Ah ha!  Read section 5.2.12, Parsimony Pressure.  Thank you.  With
parsimony pressure you're constraining the system, which possibly limits
good regressions as well.  This brings up the notion of an engineering
trade-off: a number of variables (often just two, but possibly more)
tied, weakly or strongly, to some constant.  That is the mathematical
way of saying you can't have everything, and it is a common relationship
in many dynamic systems, from biology to physics (my original training).
It just occurred to me that evolutionary algorithms might be yet another
application of Arrow's impossibility theorem far from its original
setting.  Like duct tape, it has many uses.  Evolution can be thought of
as a social group interaction.  Now that I think about it, ECJ, or any
EP system, would have a lot in common with scheduling algorithms,
computerized or not, to which Arrow's theorem also applies.  Apologies,
what passes for my mind wanders like this.
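
To make the trade-off concrete, here is a rough sketch of one simple
form of parsimony pressure: add a size penalty to the raw error.  It is
in Python rather than ECJ's Java, the weight c and the three candidate
expressions are made up for illustration, and it is not necessarily any
of the methods the manual describes.

```python
def penalized_error(error, size, c):
    """Raw error plus a linear size penalty; lower is better.

    c is a hypothetical trade-off weight: 0 ignores size entirely,
    larger values favour smaller expressions.
    """
    return error + c * size


# Made-up candidates: (expression, raw error on the data, node count).
candidates = [
    ("x*x", 0.30, 3),
    ("x*x + 0.1*x", 0.25, 7),
    ("a 41-node expression", 0.24, 41),
]


def best(candidates, c):
    """Return the expression with the lowest penalized error."""
    winner = min(candidates, key=lambda t: penalized_error(t[1], t[2], c))
    return winner[0]


for c in (0.0, 0.01, 0.05):
    print(c, best(candidates, c))
```

With c = 0 the 41-node expression wins on raw error alone; as c grows
the winner shifts to the smaller expressions, which is the "can't have
everything" trade-off in miniature.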

Tournament selection, and likely other selection methods too, can
produce similar regression expressions, for some definition of
"similar".  Eureqa has an interesting way of showing this with a Pareto
graph.  You get to see leaps, if you will, in progress towards the
fitness goal.  Knowing where those leaps are, you can go back and look
in a log of expressions, each generation or whatever, around the leap
and see what happened.
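
A rough sketch of what I mean, in Python and with made-up numbers.  The
(size, error) pairs and the per-generation best-error list are assumed
log formats, not something ECJ or Eureqa is known to write, and
min_drop is an arbitrary threshold.

```python
def pareto_front(points):
    """Return the (size, error) points not dominated by any other.

    A point is dominated if another point is no larger and no worse,
    and strictly better on at least one of the two axes.
    """
    front = []
    best_error = float("inf")
    # Sorted by size, then error: keep a point only if it beats every
    # point of equal or smaller size seen so far.
    for size, error in sorted(set(points)):
        if error < best_error:
            front.append((size, error))
            best_error = error
    return front


def leaps(best_error_by_generation, min_drop=0.25):
    """Generations where the best error fell by at least min_drop,
    measured relative to the previous generation's best error."""
    found = []
    for gen in range(1, len(best_error_by_generation)):
        prev = best_error_by_generation[gen - 1]
        cur = best_error_by_generation[gen]
        if prev > 0 and (prev - cur) / prev >= min_drop:
            found.append(gen)
    return found


# Hypothetical log data.
points = [(3, 0.30), (7, 0.25), (9, 0.27), (41, 0.24), (5, 0.30)]
history = [1.0, 0.98, 0.97, 0.50, 0.49, 0.20]

print(pareto_front(points))
print(leaps(history))
```

The generations returned by leaps() are the ones whose logged
expressions would be worth reading.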

Does ECJ have a general way of showing this kind of thing?  Maybe I 
haven't found it yet?

> Chris,
>
> You're describing the problem of "overfitting" to the data.
>
> I'm not a GP guy, but I know that it's normal to use a training, test, 
> and validation set when doing symbolic regression (just like you would 
> with other regression/classification methods).
>
> If you find that you are over-fitting, you might want to add some kind 
> of parsimony pressure to your GP method (effectively limiting the size 
> of your polynomial or expression).  p. 185 of the manual 
> <https://cs.gmu.edu/%7Eeclab/projects/ecj/docs/manual/manual.pdf> 
> talks about parsimony pressure, FWIW.
>
> Siggy
>

-- 

Chris Johnson 	[log in to unmask]
Ex SysAdmin, now, writer 	/If sex is a pain in the ass,
then you’re doing it wrong…/
(Rodney Dangerfield)


