Hi,
I have a question about the use of the PTC1 builder and functionset.
Before I outline my question, a caveat  I have moved the location of
the probability of selection from the NodeConstraint object to the
GPNode object, and when the PTC tries to compute its probabilities it is
using values from each GPNode, not each NodeConstraint that has been
configured. That should not change the interpretation of my question, I
don't think, because at the moment I am still leaving all probabilities
at the default, 1.0.
I have defined a constraint on the sole tree in each individual, so that
there must be a node returning a type I have labelled "TopLevelOutput."
This is in recognition of the fact that for my particular problem,
solutions must conform to a certain schema rooted at the top of the
tree. My params file defines only one nonterminal whose node constraint
has a return type of TopLevelOutput, and zero terminals returning that
type. It seems to me perfectly legitimate in this case not to define any
terminal with the TopLevelOutput return type. I could readily create a
dummy terminal of that return type, and I think that that would mitigate
the acute issue I am encountering, but I would like to understand some
of the design decisions in ECJ, because they tend to reflect the results
of many experiments performed in the past by people more knowledgeable
that myself.
In the ptc1() method of PTC1.java, an intial test is performed that
determines whether to try to add a terminal or a nonterminal. The
builder will attempt to retrieve a terminal from the GPFuncInfo array if
the current depth in the tree is >= maxDepth, or if this method call
returns false:
state.random[thread].nextBoolean( nonterminalSelectionProbabilities[t] )
When looking up the selection probability for type 0, which is the index
for TopLevelOutput, I see that it is 0.15, which means that the vast
majority of the time, when trying to decide whether to lookup a terminal
or a nonterminal of type 0, a terminal will be chosen.
I read the code in PTCFunctionSet, and I see how the
computeNonterminalSelectionProbailities( int expectedTreeSize ) computes
a value for a node that returns a given type, in this case type 0. I do
not understand exactly why the probability is computed the way it is,
perhaps I should reread the paper that describes the PTC builder
strategy again. Suffice it to say that the sole operator that returns
type 0 has 5 child nodes, and the expectedTreeSize var is 4, so the
resulting probabiliy of selection of the sole nonterminal node is 0.15.
If the strategy for computing the selection probability is not in any
other published research, perhaps you could explain the motivation.
I would think that in this special case, where there is only one node
that satisfies the strong typing constraint of returning TopLevelOutput,
the scheme for picking a terminal or nonterminal node in PTC1.ptc1() is
not the best one. I am thinking about modifying or extending PTC1 to try
to deal with this case, but first I thought I would ask Sean Luke or
anyone else on the list if you think that my approach for constraining
the contents of the tree is consistent with the conventional wisdom
about the composition of fucntionsets.
thx
George Coles
