On Apr 9, 2008, at 10:00 AM, Lawrence Tsang wrote:

> Hi Tibi and Yow Tzu Lim,
>
>      Thanks for all the answers. I've got the idea.

Too late!  Now you've got me responding too.  :-)

It's important to distinguish between "types" and "data".  Formally a  
type is just a compatibility constraint.  If function FOO has a  
"return type" of bar, and function BAZ takes one argument which is of  
type "bar", then FOO is permitted to fill the argument slot of BAZ.

This doesn't actually have anything to do with data per se; but the  
#1 reason why you'd want to state that functions have certain types  
is because if FOO sends out an integer and BAZ expects a String, then  
you wouldn't want FOO to be allowed to hook up with BAZ, so you'd
stipulate that they have different types.
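
To make the "compatibility constraint" idea concrete, here is a minimal sketch in Python. It is not ECJ code and does not use ECJ's API: the Func class, the can_fill helper, and the QUX function are all made up for illustration, and BAZ's own return type is an arbitrary choice since it doesn't matter here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Func:
    name: str
    return_type: str   # the type this function presents to its parent
    arg_types: tuple   # one required type per argument slot

def can_fill(child: Func, parent: Func, slot: int) -> bool:
    """True if child's return type matches the type of the parent's slot."""
    return child.return_type == parent.arg_types[slot]

foo = Func("FOO", return_type="bar", arg_types=())
baz = Func("BAZ", return_type="bar", arg_types=("bar",))
# Hypothetical function that returns a different type.
qux = Func("QUX", return_type="string", arg_types=())

print(can_fill(foo, baz, 0))  # True: FOO returns "bar", BAZ's slot wants "bar"
print(can_fill(qux, baz, 0))  # False: "string" is not "bar"
```

Nothing in the check looks at actual data; it only compares type names.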

Because formal language types are so closely associated with "data  
types" -- two functions are typed the same way because they return  
the same data or whatever -- people think they're the same thing.   
But they're not.  There are other reasons you might not want two  
functions to hook up.  For example, you might want to be guaranteeing  
that your tree structure has functions of some form A in its top row,  
functions of some form B in its second row, and functions of some  
form C in its third row.  This kind of thing shows up a lot when the  
tree structures aren't "programs" per se, but are used for some other  
functionality.  To do this, you could say that the root only accepts  
functions whose return type is "A", and "A" functions only accept  
child arguments of type "B", and so on.
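
The row-by-row constraint above can be sketched as follows. Again this is only an illustration in Python under made-up names (FUNCTIONS, ROOT_TYPE, valid, and the function names "top", "middle", "leaf" are all assumptions), not how ECJ represents trees.

```python
# Each function maps to (return type, list of argument types).
FUNCTIONS = {
    "top":    ("A", ["B", "B"]),  # lives in the first row
    "middle": ("B", ["C"]),       # lives in the second row
    "leaf":   ("C", []),          # lives in the third row
}
ROOT_TYPE = "A"  # the root only accepts functions returning "A"

def valid(tree, expected=ROOT_TYPE):
    """tree is (function name, [child trees]); check every parent/child hookup."""
    name, children = tree
    return_type, arg_types = FUNCTIONS[name]
    if return_type != expected or len(children) != len(arg_types):
        return False
    return all(valid(child, t) for child, t in zip(children, arg_types))

good = ("top", [("middle", [("leaf", [])]), ("middle", [("leaf", [])])])
bad = ("top", [("leaf", []), ("middle", [("leaf", [])])])  # leaf in row two

print(valid(good))  # True
print(valid(bad))   # False
```

The types here say nothing about data at all; they only pin each kind of function to a row.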

What's nil?

That's easy.  By default, most ECJ examples are "untyped", meaning  
that there are no constraints on who is able to hook up with whom.   
This isn't zero types: it's one type -- everyone shares the same type  
so everyone is compatible with everyone.  I had to have some name for  
that type, so I picked nil.  It's just a name.  You can change it to  
something else like gobbledygook and that'd work fine.

One further note on typing: ECJ's typing is a fairly rudimentary
scheme which I call "set typing".  Types are sets of objects.  Two
things are compatible if their types have a nonempty intersection --
that is, their two sets share at least one object.  You can do a lot with
set typing, not the least of which is generic functions and  
replicating the functionality of polymorphism.  In practice, people
typically need an even simpler typing notion, which I call "atomic
typing" -- here types are just symbols, and two functions are  
compatible if their types are the same.  Atomic typing is really just  
a degenerate form of set typing: instead of an atomic type FOO, you  
could just have a set type which contains a unique single symbol  
inside it, which is special just to that set type:  FOO2 = { FOO }.   
So you can easily replicate atomic typing with set typing -- I  
include it just for convenience.  Also, you can mix atomic types and  
set types.  An atomic type is "compatible" with a set type if the  
atomic type is found in the set type's set.
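
Here is a small Python sketch of those compatibility rules. It models an atomic type as a string and a set type as a set of strings; the names (as_set, compatible, NUMBER, PRINTABLE) are assumptions for illustration and this is not ECJ's implementation.

```python
def as_set(t):
    """Treat an atomic type (a string) as the one-element set containing it."""
    return frozenset([t]) if isinstance(t, str) else frozenset(t)

def compatible(t1, t2):
    """Two types are compatible if their sets have a nonempty intersection."""
    return bool(as_set(t1) & as_set(t2))

# Atomic types are just symbols.
INT, FLOAT, STRING = "int", "float", "string"
# Set types are sets of those symbols (hypothetical examples).
NUMBER = {INT, FLOAT}
PRINTABLE = {FLOAT, STRING}

print(compatible(INT, INT))           # True: same atomic type
print(compatible(INT, FLOAT))         # False: different atomic types
print(compatible(INT, NUMBER))        # True: "int" is in NUMBER's set
print(compatible(NUMBER, PRINTABLE))  # True: both sets contain "float"
print(compatible(STRING, NUMBER))     # False: no shared member
```

An argument slot typed NUMBER accepts children returning either "int" or "float", which is the sense in which set typing gives you something like generic functions.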

Note that this is strictly NOT as powerful as the various "polymorphic
typing" approaches people have taken -- there's been a fair bit of
work by people on functions whose return types (say) change
based on the types of the children that wind up plugged
into them.  This is very powerful stuff, but it's a nightmare to write
general crossover and mutation operators for, and it's
fairly rarely needed.  Here's an example of where it'd be nice to have.
Let's say the data structure your functions are passing around is a  
matrix.  You want to make a MATRIX_MULTIPLY function which takes two  
matrices, multiplies them and returns the resulting matrix.  In a  
sophisticated polymorphic typing mechanism you could say that IF you  
have an N x M matrix plugged into your first child, AND you have an M
x P matrix plugged into your second child, then your return type  
would be "N x P".  Note that this implies that you have a potentially  
infinite number of types and so this clearly can't be done with set  
typing.  Maarten Keijzer (http://www.cs.vu.nl/~mkeijzer/) did work  
exactly like this, and wound up building his own extension of EO (I  
believe) to handle polymorphic typing.  Tina Yu
(http://www.cs.mun.ca/~tinayu/) did polymorphic typing for her thesis, I think.
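
To show what the MATRIX_MULTIPLY rule would look like, here is a minimal Python sketch. It is not taken from Keijzer's or Yu's systems, and it is not ECJ code; Shape and matmul_return_type are hypothetical names, and a (rows, cols) pair stands in for the type "rows x cols".

```python
from typing import Optional, Tuple

Shape = Tuple[int, int]  # (rows, cols), standing in for the type "rows x cols"

def matmul_return_type(left: Shape, right: Shape) -> Optional[Shape]:
    """Return type of MATRIX_MULTIPLY given its two children's types.

    An N x M child and an M x P child give an N x P result; any other
    pairing is incompatible, signalled here by None.
    """
    n, m = left
    m2, p = right
    if m != m2:
        return None
    return (n, p)

print(matmul_return_type((2, 3), (3, 4)))  # (2, 4)
print(matmul_return_type((2, 3), (2, 3)))  # None: inner dimensions differ
```

The return type is computed from the children rather than fixed in advance, and since N, M and P can be any positive integers there's no finite list of types to put in a set.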

Sean