On May 17, 2004, at 2:52 PM, Anthony Loeppert wrote:
> a) Looking at the class doc for GPFunctionSet, it looks like it's
> really designed to be set up once at initialization via parameter files,
> given the methods. Would it be a tricky endeavor to manipulate a
> GPFunctionSet programmatically during simulation? Is my best option to
> subclass?
It's tricky but doable.
GPFunctionSet's big complication is that it uses lots of arrays to
speed up tree generation. The easiest way to handle this is to just
modify the hash table and then re-build the arrays with
GPFunctionSet.postProcessFunctionSet(). You'll want to make sure you
have enough GPNodeConstraints to cover all of your constraint
possibilities (or hack in new ones at runtime).
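Roughly, something along these lines (just a sketch from memory: I'm
assuming the hash table is GPFunctionSet's public nodesByName Hashtable
mapping a node name to a GPNode[] array, and FunctionSetTweaker is just
a name I'm using here, so check the source before trusting it):

    import ec.gp.GPFunctionSet;
    import ec.gp.GPNode;

    public class FunctionSetTweaker
        {
        // Add a prototypical node to a function set at runtime.  Assumes
        // the node's GPNodeConstraints are already registered, and that
        // name is the key the function set's hash table uses for it.
        public static void addNode(GPFunctionSet fs, String name, GPNode node)
            {
            GPNode[] old = (GPNode[])(fs.nodesByName.get(name));
            GPNode[] now;
            if (old == null) now = new GPNode[] { node };
            else
                {
                now = new GPNode[old.length + 1];
                System.arraycopy(old, 0, now, 0, old.length);
                now[old.length] = node;
                }
            fs.nodesByName.put(name, now);
            // re-build the fast node/terminal/nonterminal lookup arrays
            fs.postProcessFunctionSet();
            }
        }

You'd call something like this between generations (say, from a custom
Statistics hook), never in the middle of breeding, for the reasons
below.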
You can manipulate the GPFunctionSet, but you have to be careful: tree
builders and some mutation operators rely on it to pick new nodes, so
you'll want to make your changes outside of any mutation or
tree-building operation. You'll also find it a lot easier to *add*
nodes to the function set than to delete them, since deleting also
means making sure the deleted nodes no longer appear anywhere in the
population's trees.
> b) Given the listed differences between GLiB functions and ADF, would
> subclassing or reusing much of the ADF code result in a mess?
> Basically, would I be better off starting from scratch as to how a GLiB
> function is executed? If starting from scratch, I was thinking of using
> BCEL (http://jakarta.apache.org/bcel/) to create GPNodes on the fly,
> but I've never used BCEL before so I don't know how feasible this is.
I don't know GLiB well. I can say that ADFs are a complex monster in a
language like Java that has no built-in evaluator: you have to
maintain two stacks of contextual information during evaluation. I'd
rely on the ADF code mostly for hints about how to pull off such a
beast.
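To give a flavor of what that bookkeeping looks like, here's a
much-simplified, hypothetical sketch of the idea (not ECJ's actual
ADFStack/ADFContext classes): each call pushes a context holding its
evaluated arguments, the function body runs against that context, and
the context is popped afterwards.

    import java.util.ArrayDeque;
    import java.util.Deque;

    class AdfSketch
        {
        static class Context
            {
            final double[] args;          // evaluated argument values
            Context(double[] args) { this.args = args; }
            }

        interface Body { double eval(AdfSketch stack); }

        final Deque<Context> main = new ArrayDeque<Context>();
        // Roughly, the second stack is there for macro-style (lazily
        // evaluated) arguments, which must run in the caller's context
        // while the callee's context is temporarily set aside.
        final Deque<Context> reserve = new ArrayDeque<Context>();

        // Call a defined function whose arguments are already evaluated.
        double call(Body body, double[] evaluatedArgs)
            {
            main.push(new Context(evaluatedArgs));
            try { return body.eval(this); }
            finally { main.pop(); }
            }

        // What an ARG terminal does: read slot i of the current context.
        double arg(int i) { return main.peek().args[i]; }
        }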
> My completely naive thought was to simply run all this from an external
> program which analyzes the population (ecj output), outputs java files
> and param files (with funcset and node constraints), invokes javac, and
> then ecj (for a couple of generations, then output the population), then
> goes through the whole cycle again until termination criteria are
> achieved. That would avoid digging too deeply into the ECJ internals,
> but it also strikes me as extremely expensive.
javac is indeed extremely expensive. I also think there's a better
way: emit your own VM bytecode. That's actually surprisingly easy: you
write the bytes to an array, then load that array with a custom
ClassLoader. The class file format is completely standard and easy to
understand. And it'd be a billion times faster than emitting Java
source, recompiling it, and then loading it.
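The loading half is only a few lines; the emitting half you can do by
hand or with a tool like BCEL. (ByteArrayClassLoader and the class name
in the usage comment are just placeholders.)

    // Minimal sketch of the loading side: a byte[] containing a standard
    // .class file (hand-emitted or produced by BCEL) turned into a Class.
    public class ByteArrayClassLoader extends ClassLoader
        {
        // defineClass() is protected in ClassLoader, hence the subclass
        public Class defineFromBytes(String className, byte[] bytes)
            {
            return defineClass(className, bytes, 0, bytes.length);
            }
        }

    // usage (names are made up):
    //   byte[] bytes = ...emit a class that subclasses GPNode...;
    //   Class c = new ByteArrayClassLoader()
    //                 .defineFromBytes("GeneratedNode", bytes);
    //   GPNode node = (GPNode)(c.newInstance());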
Sean