I do have a question. If this question belongs elsewhere, please point me there and I will ask it there.

I had to wait until tutorial4 because I'm getting into symbolic regression modeling. You have a bunch of data, these days quite often petabytes of it or more. You want a mathematical model that describes the data and, hopefully, makes successful predictions.

Here's my issue. If I remember correctly, given n data points with distinct x values, it is always possible to come up with a polynomial of degree at most n-1 that passes exactly through every point in the data set. However, the odds of such a polynomial saying anything true about the data, let alone having any predictive capability, are pretty small as a rule.
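To make that concrete, here is a minimal sketch in plain Python. The data are made up for illustration (a straight line y = 2x + 1 with alternating +/-0.5 "noise"), and the helper names `lagrange_eval` and `linear_fit` are mine, not from ECJ or any library. It builds the degree n-1 interpolant through 8 points and compares it with an ordinary least-squares line one step outside the data.

```python
# Made-up illustrative data: the line y = 2x + 1 plus alternating +/-0.5 noise.
xs = [float(i) for i in range(8)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

def lagrange_eval(xs, ys, x):
    """Evaluate the unique polynomial of degree <= n-1 through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Error on the fitted points: the interpolant reproduces every one of them.
train_error = max(abs(lagrange_eval(xs, ys, x) - y) for x, y in zip(xs, ys))

# One step outside the data, at x = 8, where the underlying line gives 17.
x_new = 8.0
true_value = 2.0 * x_new + 1.0
slope, intercept = linear_fit(xs, ys)
poly_prediction = lagrange_eval(xs, ys, x_new)
line_prediction = slope * x_new + intercept

print(f"training error of interpolant: {train_error:.2e}")
print(f"at x = 8: interpolant {poly_prediction:.1f}, "
      f"line {line_prediction:.2f}, underlying {true_value:.1f}")
```

On this toy data the interpolant has zero error on the 8 fitted points, yet at x = 8 it lands near -110.5 while the underlying line gives 17 and the fitted line stays within about 0.25 of that. A perfect fit and a useful model are clearly not the same thing here.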

What you want is probably something more in the way of a spline function, at the least, with the wonderful piecewise, continuously differentiable hoo-ha yada yada they taught back in the Precambrian era when I studied math.

I googled Koza fitness tests. I've seen similar ones for symbolic regression. Many look a lot like a statistical variance. Maybe I'm missing something here; I probably am. It looks to me like my aforementioned n-1 degree polynomial would fit like the proverbial glove, with a fitness measure of 0. What's to prevent such a symbolic regression system, ECJ or other, from simply coming up with a useless polynomial?
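Here is a rough sketch of the worry, assuming the fitness is a sum of absolute errors over the fitness cases with 0 as a perfect score, which is how I understand the Koza-style measure. The data points, the held-out case, and the names `error_sum_fitness`, `line`, and `quadratic` are all made up for illustration; none of this is ECJ code.

```python
# Made-up fitness cases (x, y) that the search would be scored on.
train_cases = [(0.0, 1.0), (1.0, 4.0), (2.0, 5.0)]
# A hypothetical extra observation that the search never saw.
held_out_cases = [(3.0, 7.0)]

def error_sum_fitness(model, cases):
    """Sum of absolute errors over the cases; 0 means a perfect fit."""
    return sum(abs(model(x) - y) for x, y in cases)

def line(x):
    # A simple candidate model: y = 2x + 1.
    return 2.0 * x + 1.0

def quadratic(x):
    # The degree n-1 = 2 polynomial through all three training cases.
    return -x * x + 4.0 * x + 1.0

for name, model in [("line", line), ("quadratic", quadratic)]:
    print(name,
          "training fitness:", error_sum_fitness(model, train_cases),
          "held-out fitness:", error_sum_fitness(model, held_out_cases))
```

Scored only on the three training cases, the quadratic gets the perfect 0 and the line gets 1, so a system that looks at nothing else would prefer the quadratic. On the one held-out case the ranking flips: the line scores 0 and the quadratic scores 3.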

Thanks.

--

Chris Johnson
Ex-SysAdmin, now writer

"A bargain is something you don't need at a price you can't resist." (Franklin Jones)