package ec.gp.build;
import ec.gp.*;
import java.util.*;
import java.math.*;
import ec.util.*;
import ec.*;

/*
 * Uniform.java
 *
 * Created Fri Jan 26 14:02:08 EST 2001
 * By: Sean Luke
 */

/**
Uniform implements the algorithm described in

Bohm, Walter and Andreas Geyer-Schulz. 1996. "Exact Uniform Initialization for Genetic Programming". In Foundations of Genetic Algorithms IV, Richard Belew and Michael Vose, eds. Morgan Kaufmann. 379-407. (ISBN 1-55860-460-X)

The requested tree size is either given directly to the Uniform algorithm or, if the size is NOSIZEGIVEN, picked at random by Uniform from the GPNodeBuilder probability distribution system (using either max-size and min-size, or using num-sizes).

Further, if the user sets the true-dist parameter, Uniform will ignore the user's specified probability distribution and instead pick from a distribution between the minimum size and the maximum size the user specified, where each size is weighted by the actual number of trees that can be created with that size. Since many more trees of size 10 than size 3 can be created, for example, size 10 will be picked that much more often.
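
A minimal sketch of how such a size distribution could be derived, assuming the number of trees of each size is already known. The names TrueDistSketch, sizeDistribution, and sampleSize are hypothetical and are not part of Uniform, which has its own tables and uses RandomChoice for picking:

```java
import java.math.BigInteger;
import java.util.Random;

class TrueDistSketch {
    // counts[s] = number of distinct trees with exactly s nodes (index 0 unused).
    // Returns p, where p[s] is proportional to counts[s] for minSize <= s <= maxSize
    // and 0 elsewhere.
    static double[] sizeDistribution(BigInteger[] counts, int minSize, int maxSize) {
        BigInteger total = BigInteger.ZERO;
        for (int s = minSize; s <= maxSize; s++) total = total.add(counts[s]);
        if (total.signum() == 0)
            throw new IllegalArgumentException("no trees exist in the requested size range");
        double[] p = new double[counts.length];
        // Assumption: the counts fit in a double; very large counts would need rescaling.
        for (int s = minSize; s <= maxSize; s++)
            p[s] = counts[s].doubleValue() / total.doubleValue();
        return p;
    }

    // Picks a size by walking the cumulative distribution.
    static int sampleSize(double[] p, Random rng) {
        double r = rng.nextDouble();
        double cumulative = 0.0;
        int lastValid = -1;
        for (int s = 0; s < p.length; s++) {
            if (p[s] <= 0.0) continue;
            lastValid = s;
            cumulative += p[s];
            if (r < cumulative) return s;
        }
        return lastValid; // only reached through floating-point rounding
    }
}
```

With counts of 2, 0, 4, 0, and 16 trees for sizes 1 through 5 (what an untyped set with two terminals and one binary function yields), size 5 gets probability 16/22, so it would be picked roughly 73% of the time.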

Uniform also prints out the actual number of trees that exist for a given size, return type, and function set. As if this were useful to you. :-)

The algorithm, which is quite complex, is described in pseudocode below. Basically what the algorithm does is this:

1. For each function set and return type, determine the number of trees of each size which exist for that function set and tree type. Also determine all the permutations of tree sizes among children of a given node. All this can be done with dynamic programming. Do this just once offline, after the function sets are loaded.
2. Using these tables, construct distributions of choices of tree size, child tree size permutations, etc.
3. When you need to create a tree, pick a size, then use the distributions to recursively create the tree (top-down).
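
Step 1 can be sketched as a small bottom-up dynamic program. This is a simplified illustration, not Uniform's own code: it assumes an untyped function set in which a node is described only by its arity, and the names TreeCountSketch, countTrees, and forests are hypothetical.

```java
import java.math.BigInteger;

class TreeCountSketch {
    // arities[i] = number of children of node kind i (0 means terminal).
    // Returns trees, where trees[s] = number of distinct trees with exactly s nodes.
    static BigInteger[] countTrees(int[] arities, int maxSize) {
        int maxArity = 0;
        for (int a : arities) maxArity = Math.max(maxArity, a);
        BigInteger[] trees = new BigInteger[maxSize + 1];
        // forests[k][n] = number of ways to build k child subtrees using n nodes in total
        BigInteger[][] forests = new BigInteger[maxArity + 1][maxSize + 1];
        trees[0] = BigInteger.ZERO;
        for (int s = 1; s <= maxSize; s++) {
            int n = s - 1; // nodes left over for the children once the root is placed
            forests[0][n] = (n == 0) ? BigInteger.ONE : BigInteger.ZERO;
            for (int k = 1; k <= maxArity; k++) {
                BigInteger sum = BigInteger.ZERO;
                // the first child takes "first" nodes, the other k-1 children share the rest
                for (int first = 1; first <= n; first++)
                    sum = sum.add(trees[first].multiply(forests[k - 1][n - first]));
                forests[k][n] = sum;
            }
            BigInteger total = BigInteger.ZERO;
            for (int a : arities) total = total.add(forests[a][n]);
            trees[s] = total;
        }
        return trees;
    }
}
```

For a hypothetical set with three terminals, two binary functions, and one ternary function, countTrees(new int[] {0, 0, 0, 2, 2, 3}, 4) gives 3, 0, 18, and 27 trees for sizes 1 through 4. Size 2 is empty because no node takes exactly one child.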

### Dealing with Zero Distributions

Some domains have NO tree of a certain size. For example, Artificial Ant's function set can make NO trees of size 2. What happens when we're asked to make a tree of (invalid) size 2 in Artificial Ant then? Uniform presently handles it as follows:

1. If the system specifically requests a given size that's invalid, Uniform will look for the next larger size which is valid. If it can't find any, it will then look for the next smaller size which is valid.
2. If a random choice yields a given size that's invalid, Uniform will pick again.
3. If there is *no* valid size for a given return type, which probably indicates an error, Uniform will halt and complain.
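
Rules 1 and 3 can be sketched as follows, assuming a boolean table that marks which sizes have no trees. SizeFallbackSketch and resolveRequestedSize are hypothetical names, and the exception stands in for however the caller chooses to halt and complain:

```java
class SizeFallbackSketch {
    // zero[s] is true when no tree of size s exists (index 0 is unused).
    // Returns the requested size if valid, else the next larger valid size,
    // else the next smaller valid size.
    static int resolveRequestedSize(boolean[] zero, int requested) {
        if (requested < 1 || requested >= zero.length)
            throw new IllegalArgumentException("requested size out of range: " + requested);
        if (!zero[requested]) return requested;
        for (int s = requested + 1; s < zero.length; s++)
            if (!zero[s]) return s;
        for (int s = requested - 1; s >= 1; s--)
            if (!zero[s]) return s;
        throw new IllegalStateException("no valid tree size exists");
    }
}
```

Note that the inner tests check the candidate size s, not the originally requested size, which is the detail that makes the search actually advance.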

### Pseudocode:

```
Func NumTreesOfType(type,size)
    If NUMTREESOFTYPE[type,size] not defined,       // memoize
        N[type] = all nodes compatible with type
        NUMTREESOFTYPE[type,size] = Sum(n in N[type], NumTreesRootedByNode(n,size))
    return NUMTREESOFTYPE[type,size]

Func NumTreesRootedByNode(node,size)
    If NUMTREESROOTEDBYNODE[node,size] not defined,   // memoize
        count = 0
        left = size - 1
        If node.children.length = 0 and left = 0  // a valid terminal
            count = 1
        Else if node.children.length <= left  // a valid nonterminal
            For s is 1 to left inclusive  // yeah, that allows some illegal stuff, it gets set to 0
                count += NumChildPermutations(node,s,left,0)
        NUMTREESROOTEDBYNODE[node,size] = count
    return NUMTREESROOTEDBYNODE[node,size]


Func NumChildPermutations(parent,size,outof,pickchild)
// parent is our parent node
// size is the size of pickchild's tree that we're considering
// pickchild is the child we're considering
// outof is the total number of remaining nodes (including size) yet to fill
    If NUMCHILDPERMUTATIONS[parent,size,outof,pickchild] is not defined,        // memoize
        count = 0
        If pickchild = parent.children.length - 1 and outof = size        // our last child, outof must be size
            count = NumTreesOfType(parent.children[pickchild].type,size)
        Else if pickchild < parent.children.length - 1 and
                outof-size >= (parent.children.length - pickchild-1)    // maybe we can fill with terminals
            cval = NumTreesOfType(parent.children[pickchild].type,size)
            tot = 0
            For s is 1 to outof-size // some illegal stuff, it gets set to 0
                tot += NumChildPermutations(parent,s,outof-size,pickchild+1)
            count = cval * tot
        NUMCHILDPERMUTATIONS[parent,size,outof,pickchild] = count
    return NUMCHILDPERMUTATIONS[parent,size,outof,pickchild]


For each type type, size size
    ROOT_D[type,size] = probability distribution of nodes of type and size, derived from
                        NUMTREESOFTYPE[type,size], our node list, and NUMTREESROOTEDBYNODE[node,size]

For each parent,outof,pickchild
    CHILD_D[parent,outof,pickchild] = probability distribution of tree sizes, derived from
                        NUMCHILDPERMUTATIONS[parent,size,outof,pickchild]

Func FillNodeWithChildren(parent,pickchild,outof)
    If pickchild = parent.children.length - 1               // last child
        Fill parent.children[pickchild] with CreateTreeOfType(parent.children[pickchild].type,outof)
    Else choose size from CHILD_D[parent,outof,pickchild]
        Fill parent.children[pickchild] with CreateTreeOfType(parent.children[pickchild].type,size)
        FillNodeWithChildren(parent,pickchild+1,outof-size)
    return

Func CreateTreeOfType(type,size)
    Choose node from ROOT_D[type,size]
    If size > 1
        FillNodeWithChildren(node,0,size-1)
    return node
```
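
The top-down generation step can be sketched in runnable form. This is an illustration under simplifying assumptions, not the class's actual implementation: it ignores types (a node is just an arity), it samples with exact BigInteger weights instead of precomputed double-valued ROOT_D and CHILD_D tables, and it returns the tree as a prefix-notation string. All names (UniformSamplerSketch, createTree, fillChildren, below) are hypothetical.

```java
import java.math.BigInteger;
import java.util.Random;

class UniformSamplerSketch {
    final int[] arities;          // arities[i] = number of children of node kind i
    final BigInteger[] trees;     // trees[s] = number of trees with exactly s nodes
    final BigInteger[][] forests; // forests[k][n] = ways to build k subtrees from n nodes
    final Random rng;

    UniformSamplerSketch(int[] arities, int maxSize, Random rng) {
        this.arities = arities;
        this.rng = rng;
        int maxArity = 0;
        for (int a : arities) maxArity = Math.max(maxArity, a);
        trees = new BigInteger[maxSize + 1];
        forests = new BigInteger[maxArity + 1][maxSize + 1];
        trees[0] = BigInteger.ZERO;
        for (int s = 1; s <= maxSize; s++) {   // counting tables, built once
            int n = s - 1;
            forests[0][n] = (n == 0) ? BigInteger.ONE : BigInteger.ZERO;
            for (int k = 1; k <= maxArity; k++) {
                BigInteger sum = BigInteger.ZERO;
                for (int first = 1; first <= n; first++)
                    sum = sum.add(trees[first].multiply(forests[k - 1][n - first]));
                forests[k][n] = sum;
            }
            BigInteger total = BigInteger.ZERO;
            for (int a : arities) total = total.add(forests[a][n]);
            trees[s] = total;
        }
    }

    // Uniform random integer in [0, bound), bound > 0, by rejection sampling.
    BigInteger below(BigInteger bound) {
        BigInteger r;
        do { r = new BigInteger(bound.bitLength(), rng); } while (r.compareTo(bound) >= 0);
        return r;
    }

    // Counterpart of CreateTreeOfType: pick the root in proportion to the number
    // of trees of this size rooted by each node kind, then fill its children.
    String createTree(int size) {
        if (size < 1 || size >= trees.length || trees[size].signum() == 0)
            throw new IllegalArgumentException("no tree of size " + size);
        BigInteger r = below(trees[size]);
        for (int i = 0; i < arities.length; i++) {
            BigInteger w = forests[arities[i]][size - 1];
            if (r.compareTo(w) < 0) {
                StringBuilder sb = new StringBuilder("(n" + i);
                fillChildren(arities[i], size - 1, sb);
                return sb.append(")").toString();
            }
            r = r.subtract(w);
        }
        throw new AssertionError("weights did not sum to the tree count");
    }

    // Counterpart of FillNodeWithChildren: k children remain, n nodes remain.
    void fillChildren(int k, int n, StringBuilder sb) {
        if (k == 0) return;
        if (k == 1) { sb.append(' ').append(createTree(n)); return; } // last child takes the rest
        BigInteger r = below(forests[k][n]);
        for (int s = 1; s <= n; s++) {
            BigInteger w = trees[s].multiply(forests[k - 1][n - s]);
            if (r.compareTo(w) < 0) {
                sb.append(' ').append(createTree(s));
                fillChildren(k - 1, n - s, sb);
                return;
            }
            r = r.subtract(w);
        }
        throw new AssertionError("weights did not sum to the forest count");
    }
}
```

For example, new UniformSamplerSketch(new int[] {0, 0, 2}, 7, new Random(1)).createTree(5) returns one of the 16 five-node trees over two terminals and one binary function, each with equal probability, while createTree(2) throws because no such tree exists.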

### Parameters

*base*.true-dist
bool = true or false (default)
(should we use the true numbers of trees for each size as the distribution for picking trees, as opposed to the user-specified distribution?)
*/

public class Uniform extends GPNodeBuilder {
    public static final String P_UNIFORM = "uniform";
    public static final String P_TRUEDISTRIBUTION = "true-dist";

    public Parameter defaultBase() {
        return GPBuildDefaults.base().push(P_UNIFORM);
    }

    // the checkboundary we hand to RandomChoice
    public final static int CHECKBOUNDARY = 8;

    // Mapping of integers to function sets
    public GPFunctionSet[] functionsets;

    // Mapping of function sets to Integers
    public Hashtable _functionsets;

    // Mapping of GPNodes to Integers (thus to ints)
    public Hashtable funcnodes;

    // number of nodes
    public int numfuncnodes;

    // max arity of any node
    public int maxarity;

    // maximum size of nodes computed
    public int maxtreesize;

    // true size distributions
    public BigInteger[/*functionset*/][/*type*/][/*size*/] _truesizes;
    public double[/*functionset*/][/*type*/][/*size*/] truesizes;

    // do we use the true distributions to pick tree sizes?
    public boolean useTrueDistribution;

    // Sun in its infinite wisdom (what idiots) decided to make
    // BigInteger IMMUTABLE. There is a MutableBigInteger, but it's not
    // public! And Sun only caches the first 16 positive and 16 negative
    // integer constants, not exactly that useful for us. As a result, we'll
    // be making a dang lot of BigIntegers here. Garbage-collection hell. :-(
    // ...well, it's not all that slow really.
    public BigInteger NUMTREESOFTYPE[/*FunctionSet*/][/*type*/][/*size*/];
    public BigInteger NUMTREESROOTEDBYNODE[/*FunctionSet*/][/*nodenum*/][/*size*/];
    public BigInteger NUMCHILDPERMUTATIONS[/*FunctionSet*/][/*parentnodenum*/][/*size*/][/*outof*/][/*pickchild*/];

    // tables derived from the previous ones through some massaging
    public UniformGPNodeStorage ROOT_D[/*FunctionSet*/][/*type*/][/*size*/][/*the nodes*/];
    public boolean ROOT_D_ZERO[/*FunctionSet*/][/*type*/][/*size*/]; // is ROOT_D all zero for these values?
    public double CHILD_D[/*FunctionSet*/][/*parentnodenum*/][/*outof*/][/*pickchild*/][/*size*/];

    public void setup(final EvolutionState state, final Parameter base) {
        super.setup(state,base);

        Parameter def = defaultBase();

        // use true distributions? false is default
        useTrueDistribution = state.parameters.getBoolean(
            base.push(P_TRUEDISTRIBUTION), def.push(P_TRUEDISTRIBUTION),false);

        if (minSize>0) // we're using maxSize and minSize
            maxtreesize=maxSize;
        else if (sizeDistribution != null)
            maxtreesize = sizeDistribution.length;
        else state.output.fatal("Uniform is used for the GP node builder, but no distribution was specified." +
            " You must specify either a min/max size, or a full size distribution.",
            base.push(P_MINSIZE), def.push(P_MINSIZE));

        // preprocess offline
        preprocess(state,maxtreesize);
    }

    public int pickSize(final EvolutionState state, final int thread,
        final int functionset, final int type) {
        if (useTrueDistribution)
            return RandomChoice.pickFromDistribution(
                truesizes[functionset][type],state.random[thread].nextDouble(),CHECKBOUNDARY);
        else return super.pickSize(state,thread);
    }

    public void preprocess(final EvolutionState state, final int _maxtreesize) {
        state.output.message("Determining Tree Sizes");

        maxtreesize = _maxtreesize;

        Hashtable functionSetRepository = ((GPInitializer)state.initializer).functionSetRepository;

        // Put each function set into the arrays
        functionsets = new GPFunctionSet[functionSetRepository.size()];
        _functionsets = new Hashtable();
        Enumeration e = functionSetRepository.elements();
        int count=0;
        while(e.hasMoreElements()) {
            GPFunctionSet set = (GPFunctionSet)(e.nextElement());
            _functionsets.put(set,new Integer(count));
            functionsets[count++] = set;
        }

        // For each function set, assign each GPNode to a unique integer
        // so we can keep track of it (ick, this will be inefficient!)
        funcnodes = new Hashtable();
        Hashtable t_nodes = new Hashtable();
        count = 0;
        maxarity=0;
        GPNode n;
        for(int x=0;x= (parent.children.length - pickchild-1)) {
                BigInteger cval = numTreesOfType(initializer,functionset,parent.constraints(initializer).childtypes[pickchild].type,size);
                BigInteger tot = BigInteger.valueOf(0);
                for (int s=1; s<=outof-size; s++)
                    tot = tot.add(numChildPermutations(initializer,functionset,parent,s,outof-size,pickchild+1));
                count = cval.multiply(tot);
            }
            // System.out.println("Parent: " + parent + " Size: " + size + " OutOf: " + outof +
            //     " PickChild: " + pickchild + " Count: " +count);
            NUMCHILDPERMUTATIONS[functionset][intForNode(parent)][size][outof][pickchild] = count;
        }
        return NUMCHILDPERMUTATIONS[functionset][intForNode(parent)][size][outof][pickchild];
    }

    private final double getProb(final BigInteger i) {
        if (i==null) return 0.0f;
        else return i.doubleValue();
    }

    public void computePercentages() {
        // load ROOT_D
        for(int f = 0;f 1) // nonterminal
            fillNodeWithChildren(initializer,functionset,node,ROOT_D[functionset][type][size][choice].node,0,size-1,mt);
        return node;
    }

    void fillNodeWithChildren(final GPInitializer initializer,
        final int functionset, final GPNode parent, final GPNode parentc,
        final int pickchild, final int outof, final MersenneTwisterFast mt)
        throws CloneNotSupportedException {
        if (pickchild == parent.children.length - 1) {
            parent.children[pickchild] =
                createTreeOfType(initializer,functionset,parent.constraints(initializer).childtypes[pickchild].type,outof, mt);
        }
        else {
            int size = RandomChoice.pickFromDistribution(
                CHILD_D[functionset][intForNode(parentc)][outof][pickchild],
                mt.nextDouble(),CHECKBOUNDARY);
            parent.children[pickchild] =
                createTreeOfType(initializer,functionset,parent.constraints(initializer).childtypes[pickchild].type,size,mt);
            fillNodeWithChildren(initializer,functionset,parent,parentc,pickchild+1,outof-size,mt);
        }
        parent.children[pickchild].parent = parent;
        parent.children[pickchild].argposition =
            (byte)pickchild;
    }

    public GPNode newRootedTree(final EvolutionState state,
        final GPType type,
        final int thread,
        final GPNodeParent parent,
        final GPFunctionSet set,
        final int argposition,
        final int requestedSize) throws CloneNotSupportedException {
        GPInitializer initializer = ((GPInitializer)state.initializer);

        if (requestedSize == NOSIZEGIVEN) { // pick from the distribution
            final int BOUNDARY = 20; // if we try 20 times and fail, check to see if it's possible to succeed
            int bound=0;

            int fset = ((Integer)(_functionsets.get(set))).intValue();
            int siz = pickSize(state,thread,fset,type.type);
            int typ = type.type;

            while(ROOT_D_ZERO[fset][typ][siz]) {
                if (++bound == BOUNDARY) {
                    // do the check
                    for(int x=0;x=0;x--)
                    if (ROOT_D_ZERO[fset][typ][siz]) { siz=x; break; }
                // issue an error
                state.output.fatal("ec.gp.build.Uniform was asked to build a tree with functionset " + set +
                    " rooted with type " + type +
                    ", but cannot because for some reason there are no trees of any valid size (within the specified size range) which exist for this function set and type.");
            }
            GPNode n = createTreeOfType(initializer,fset,typ,siz,state.random[thread]);
            n.parent = parent;
            n.argposition = (byte)argposition;
            return n;
        }
    }
}

class UniformGPNodeStorage implements RandomChoiceChooserD {
    public GPNode node;
    public double prob;

    public double getProbability(final Object obj) {
        return (((UniformGPNodeStorage)obj).prob);
    }

    public void setProbability(final Object obj, final double _prob) {
        ((UniformGPNodeStorage)obj).prob = _prob;
    }
}