February 2012


Sean Luke <[log in to unmask]>
Reply To:
MASON Multiagent Simulation Toolkit <[log in to unmask]>
Wed, 1 Feb 2012 12:57:36 -0500
On Feb 1, 2012, at 11:21 AM, Richard O. Legendi wrote:

> The code is slow with and without using generics (i.e. ArrayList or ArrayList<Double>). The performance gain here is not because of not using generics, but using native primitive arrays (double[]) to store the elements and saving time on the (automatic) conversion to wrapper types from double to Double.

With all due respect, I think generics have everything to do with this.  Sun specifically introduced autoboxing to deal with the problematic issue of reconciling generics with basic data types (boolean, double, int, etc.)  The problem with autoboxing is that it is exceptionally slow and furthermore naive Java coders (that is, 95% of them) do not realize what is going on.  Before generics it was clear what was happening because you could code it otherwise.

Had Sun properly done non-erasure generics and unified the type system (which was a major proposal at the time), none of this would be an issue.  That's the situation we have in -- I am loath to say this -- C++, which did generics properly.  We also would have the situation where generics actually sped *up* the code rather than doing nothing or slowing it down.

If your response is "well, those people shouldn't be coding Java", my answer to you is: I'm a library writer of a system that needs to be efficient.  I have to do what I can to encourage that.

Let me give you another example.  We recently attached an evolutionary computation framework to a large machine learning library written by an expert machine learning person but a naive Java coder. In this library all of his arrays are written as:

	Double[][] myarray;

Why does he do this?  Because he's also got lots of ArrayList<Double>, so it was only natural to retain the type: because like nearly every Java coder he doesn't understand the consequences.  Then he proceeds to do stuff like:

	myarray[x][y] = myarray[x][y-1] * 2 + myarray[x][y-2];

... or whatever.  This all compiles just fine and everything looks peachy.  And depending on how smart the compiler and runtime are, it's amazingly slow.  We had to rewrite an awful lot.
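
To make the hidden work concrete, here is a minimal sketch of the same recurrence written three ways.  The class and method names are made up for illustration (they are not from the library in question), and the "explicit" version is only roughly what the compiler generates for the boxed one: each read calls doubleValue() and each write calls Double.valueOf(), which typically allocates a new object.

```java
// Illustrative only: BoxedArrayDemo and its methods are hypothetical names.
// All three methods assume n >= 2.
final class BoxedArrayDemo {

    // What the coder writes: it compiles cleanly and the boxing is invisible.
    static double fillBoxed(int n) {
        Double[][] a = new Double[1][n];
        a[0][0] = 1.0;                      // boxed behind the scenes
        a[0][1] = 1.0;
        for (int y = 2; y < n; y++) {
            a[0][y] = a[0][y - 1] * 2 + a[0][y - 2];
        }
        return a[0][n - 1];                 // unboxed behind the scenes
    }

    // Roughly what the compiler turns the loop body into.
    static double fillBoxedExplicit(int n) {
        Double[][] a = new Double[1][n];
        a[0][0] = Double.valueOf(1.0);
        a[0][1] = Double.valueOf(1.0);
        for (int y = 2; y < n; y++) {
            a[0][y] = Double.valueOf(
                a[0][y - 1].doubleValue() * 2 + a[0][y - 2].doubleValue());
        }
        return a[0][n - 1].doubleValue();
    }

    // The primitive version: plain arithmetic on doubles, no wrapper objects.
    static double fillPrimitive(int n) {
        double[][] a = new double[1][n];
        a[0][0] = 1.0;
        a[0][1] = 1.0;
        for (int y = 2; y < n; y++) {
            a[0][y] = a[0][y - 1] * 2 + a[0][y - 2];
        }
        return a[0][n - 1];
    }
}
```

All three compute the same values; only the last one avoids the wrapper objects entirely.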

Another fun one is when we see this -- very common:

	ArrayList<Double> foo = ...
	for(double d : foo)

This awful piece of code looks pretty but in fact is powerfully slow, and Sun has hidden all of the implementation details behind the slowness so it's not obvious.
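
As a sketch of what is hidden in that loop (the class and method names here are hypothetical, and the expansion is approximate rather than the compiler's literal output): the for-each form creates an Iterator, and every element is fetched as a Double object and then unboxed with doubleValue().  The same sum over a double[] involves none of that.

```java
import java.util.Iterator;
import java.util.List;

// Illustrative only: ForEachDemo and its methods are hypothetical names.
final class ForEachDemo {

    // The pretty version.
    static double sumForEach(List<Double> foo) {
        double total = 0;
        for (double d : foo) {
            total += d;
        }
        return total;
    }

    // Approximately what the pretty version expands to.
    static double sumExpanded(List<Double> foo) {
        double total = 0;
        for (Iterator<Double> it = foo.iterator(); it.hasNext(); ) {
            double d = it.next().doubleValue();   // unboxing on every element
            total += d;
        }
        return total;
    }

    // The primitive-array version: an index loop over raw doubles.
    static double sumPrimitive(double[] foo) {
        double total = 0;
        for (int i = 0; i < foo.length; i++) {
            total += foo[i];
        }
        return total;
    }
}
```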

In code I am asked to review from modelers using one toolkit or another, I'd say that inadvertent boxing slowdowns due to mistaken notions of Java typing reinforced by generics and the new for loop probably account for 80% of the inefficiency problems I deal with.

There's a 50-year-old saying about Lisp coders: that they "know the value of everything and the cost of nothing".  That's no longer the case about Lisp.  But Java's picked up the mantle.

> 	 Where I'd like to see some generic is for instance something like Steppable<T extends SimState> to make the casting unnecessary in the step(T) function (ain't sure though if it's a valid concept from my side).

That's *exactly* the kind of thing I'm thinking about adding into MASON: generics where they definitively reduce boilerplate without encouraging laziness.  But we need to consider it all in bulk, so it'll be a bit.
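
For illustration only, the idea in the quoted suggestion might look something like the sketch below.  Every name in it (SketchState, GenericSteppable, MyModel, Grower) is a hypothetical stand-in, not a MASON class, and nothing here is a committed design.  The point is just that step(T) receives the model type directly, so the usual cast inside step() goes away.

```java
// Hypothetical stand-in for a simulation-state base class (not MASON's SimState).
class SketchState {
    long steps = 0;
}

// Hypothetical generic variant of a "steppable" interface.
interface GenericSteppable<T extends SketchState> {
    void step(T state);
}

// A user's model, with a field the agent needs to reach.
class MyModel extends SketchState {
    int population = 10;
}

// An agent bound to MyModel: no cast is needed inside step().
class Grower implements GenericSteppable<MyModel> {
    public void step(MyModel model) {
        model.population++;   // direct access, no (MyModel) cast
        model.steps++;
    }
}
```

Usage would be along the lines of `new Grower().step(model)` with `model` declared as `MyModel`; passing some other state type would be rejected at compile time.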