On Sat, 22 Oct 2005, Colbert Philippe wrote:

> Good comment!  I tend to agree with your statement.  Some people still
> code like they want to save micro-seconds.  It's not an issue anymore.
> It's better to have cleaner and simpler code than to have heavy
> complicated code that saves only micro-seconds.  I should know because I
> have extensive experience in program profiling.  Today's Intel or AMD's
> 64-bit processor is multi-core (two processors in the same box) and is
> 64-bit wide (rather than older 32-bit wide).  The cache is often 512 Meg
> or 1 Meg.  So it is extremely fast.  It makes small code optimization
> totally unnecessary.

As time goes by, I have been working to de-optimize ECJ where it wasn't
all that necessary.  But let me assure you as someone who's done a lot of
coding in this regard, your last statement is not at all true.

I'm a stickler for efficiency in ECJ (and MASON) because of what I and my
lab do with them: genetic programming involving a very large number of
expensive evaluations.  By large I mean: we have an upcoming paper
involving 25,000 runs, each of which produces 100,000 individuals, many
of which evaluate on average 1,000 nodes.  If most individuals hit that
average, that is on the order of 10^12 node evaluations.

There are certain places where we painted ourselves into a corner in ECJ
by not heeding Knuth's Warning, but generally I think it's been pretty
good.

The big place where ECJ made cushiness sacrifices in the name of
efficiency is in its widespread use of arrays rather than ArrayLists (or
as they would have been in 1997, Vectors).  Let me give you an idea of the
gain there, however.  In MASON I created an object called Bag: it's more
or less an ArrayList with a few special methods, but importantly it lets
you have direct access to the underlying array.  This is a pretty violent
affront to good encapsulation, but:

        - Bag's carefully-written get/set methods are twice the speed
          of ArrayList's
        - Accessing Bag's array directly gets you 3.5 times the speed

...and that still requires you to cast from the array to your native
object type.  If you have a dedicated, typed array, you get about 5 times
the speed of using ArrayLists.

That's not a few milliseconds.
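
To make the four access styles concrete, here is a minimal Java sketch.
SimpleBag and Item are hypothetical stand-ins written for this
illustration; SimpleBag is not MASON's actual Bag, only an array-backed
container that exposes its backing array in the way described above.  The
sketch shows where the method calls, bounds checks and casts sit in each
style; it does not measure anything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AccessStyles {
    /** Hypothetical element type standing in for an individual or agent. */
    static final class Item {
        final int value;
        Item(int value) { this.value = value; }
    }

    /** Simplified, hypothetical Bag-like container (not MASON's real class). */
    static final class SimpleBag {
        Object[] objs = new Object[16]; // backing array, deliberately exposed
        int numObjs = 0;                // number of valid slots in objs

        void add(Object o) {
            // Grow by doubling when the backing array is full.
            if (numObjs == objs.length) objs = Arrays.copyOf(objs, objs.length * 2);
            objs[numObjs++] = o;
        }

        Object get(int i) {
            if (i < 0 || i >= numObjs) throw new IndexOutOfBoundsException("index " + i);
            return objs[i];
        }
    }

    // 1. ArrayList: one bounds-checked method call per element.
    static long sumList(List<Item> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) sum += list.get(i).value;
        return sum;
    }

    // 2. Bag through its accessor: a method call plus an explicit cast.
    static long sumBagGet(SimpleBag bag) {
        long sum = 0;
        for (int i = 0; i < bag.numObjs; i++) sum += ((Item) bag.get(i)).value;
        return sum;
    }

    // 3. Bag's array directly: no method call, but the cast remains.
    static long sumBagArray(SimpleBag bag) {
        Object[] objs = bag.objs;
        int n = bag.numObjs; // only the first numObjs slots are valid
        long sum = 0;
        for (int i = 0; i < n; i++) sum += ((Item) objs[i]).value;
        return sum;
    }

    // 4. Dedicated typed array: no method call and no cast.
    static long sumTyped(Item[] items) {
        long sum = 0;
        for (int i = 0; i < items.length; i++) sum += items[i].value;
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000;
        List<Item> list = new ArrayList<>();
        SimpleBag bag = new SimpleBag();
        Item[] items = new Item[n];
        for (int i = 0; i < n; i++) {
            Item it = new Item(i);
            list.add(it);
            bag.add(it);
            items[i] = it;
        }
        // All four styles visit the same elements and agree on the total.
        System.out.println(sumList(list) + " " + sumBagGet(bag) + " "
                + sumBagArray(bag) + " " + sumTyped(items));
    }
}
```

Reproducing the speed ratios quoted above would take a proper benchmark
harness with warm-up (JMH, for example); a naive timing loop around these
methods is easily distorted by the JIT compiler.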

There are a couple of others: the big one being the unsynchronized,
heavily inlined MersenneTwisterFast, which is approximately twice the
speed of MersenneTwister.  Again, that's a big difference.
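
The synchronized/unsynchronized distinction can be sketched as follows.
This is an illustration only: it uses a toy 64-bit linear congruential
generator, not the Mersenne Twister, and SyncLcg and FastLcg are
hypothetical names that do not appear in ECJ.  The two classes produce the
same stream; the only difference is that one takes a lock on every call
and the other leaves thread confinement to the caller.

```java
class RngSketch {
    /** Toy LCG, thread-safe: every call acquires the object's monitor. */
    static class SyncLcg {
        private long state;
        SyncLcg(long seed) { state = seed; }

        synchronized int nextInt() {
            state = state * 6364136223846793005L + 1442695040888963407L;
            return (int) (state >>> 32); // return the high 32 bits
        }
    }

    /** Same generator without locking; use each instance from one thread only. */
    static final class FastLcg {
        private long state;
        FastLcg(long seed) { state = seed; }

        int nextInt() {
            state = state * 6364136223846793005L + 1442695040888963407L;
            return (int) (state >>> 32); // return the high 32 bits
        }
    }

    public static void main(String[] args) {
        SyncLcg a = new SyncLcg(42L);
        FastLcg b = new FastLcg(42L);
        boolean same = true;
        for (int i = 0; i < 1000; i++) same &= (a.nextInt() == b.nextInt());
        System.out.println("identical streams: " + same);
    }
}
```

The trade-off is that the unsynchronized version is only safe when each
thread owns its own generator instance.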

There are definitely trade-offs involved in doing efficient code versus
cushy code.  But I think industrial-strength EC *definitely* falls on the
efficiency side of the spectrum.  My goal has been to write uncushy code
that *I* maintain, but to keep others from having to do so in their
applications.

Sean