Print

Print


Hi,
   I have a question about the use of the PTC1 builder and functionset. 
Before I outline my question, a caveat - I have moved the location of 
the probability of selection from the NodeConstraint object to the 
GPNode object, and when the PTC tries to compute its probabilities it is 
using values from each GPNode, not each NodeConstraint that has been 
configured. That should not change the interpretation of my question, I 
don't think, because at the moment I am still leaving all probabilities 
at the default, 1.0.

I have defined a constraint on the sole tree in each individual, so that 
there must be a node returning a type I have labelled "TopLevelOutput." 
This is in recognition of the fact that for my particular problem, 
solutions must conform to a certain schema rooted at the top of the 
tree. My params file defines only one nonterminal whose node constraint 
has a return type of TopLevelOutput, and zero terminals returning that 
type. It seems to me perfectly legitimate in this case not to define any 
terminal with the TopLevelOutput return type. I could readily create a 
dummy terminal of that return type, and I think that that would mitigate 
the acute issue I am encountering, but I would like to understand some 
of the design decisions in ECJ, because they tend to reflect the results 
of many experiments performed in the past by people more knowledgeable 
that myself.

In the ptc1() method of PTC1.java, an intial test is performed that 
determines whether to try to add a terminal or a nonterminal. The 
builder will attempt to retrieve a terminal from the GPFuncInfo array if 
the current depth in the tree is >= maxDepth, or if this method call 
returns false:

state.random[thread].nextBoolean( nonterminalSelectionProbabilities[t] )

When looking up the selection probability for type 0, which is the index 
for TopLevelOutput, I see that it is 0.15, which means that the vast 
majority of the time, when trying to decide whether to lookup a terminal 
or a nonterminal of type 0, a terminal will be chosen.

I read the code in PTCFunctionSet, and I see how the 
computeNonterminalSelectionProbailities( int expectedTreeSize ) computes 
a value for a node that returns a given type, in this case type 0. I do 
not understand exactly why the probability is computed the way it is, 
perhaps I should reread the paper that describes the PTC builder 
strategy again. Suffice it to say that the sole operator that returns 
type 0 has 5 child nodes, and the expectedTreeSize var is 4, so the 
resulting probabiliy of selection of the sole nonterminal node is 0.15. 
If the strategy for computing the selection probability is not in any 
other published research, perhaps you could explain the motivation.

I would think that in this special case, where there is only one node 
that satisfies the strong typing constraint of returning TopLevelOutput, 
the scheme for picking a terminal or nonterminal node in PTC1.ptc1() is 
not the best one. I am thinking about modifying or extending PTC1 to try 
to deal with this case, but first I thought I would ask Sean Luke or 
anyone else on the list if you think that my approach for constraining 
the contents of the tree is consistent with the conventional wisdom 
about the composition of fucntionsets.


thx
George Coles