How to be efficient on top of a decently engineered jvm

Mark S. Miller markm@caplet.com
Thu, 06 May 1999 00:06:21 -0700


				Peano Movers


Right now, E has an unreasonably slow implementation.  It obviously needs to 
be compiled, but to what target platform?  An obvious answer (also used by 
JPython) is JVM bytecodes, resulting in a preservation of E's Java-derived 
platform independence, leverage of the existing Java libraries, and 
potential to ride the java-acceptance wave.

However, there was one requirement for efficient execution I couldn't 
imagine how to do efficiently on even a decent stock jvm. 
Precision-unlimited integers.  (By "stock", I mean one not engineered 
specifically to enable E to run fast.)  Lisp, Smalltalk, Self, and other 
symbolic languages are only possible, in an engineering sense, because of 
the tagged arithmetic trick.

If you don't statically know the type of an object reference -- the typical 
case -- you can only allocate a generic "pointer to object".  If the object 
pointed to is a small integer (in Smalltalk, 29 bits), then tag bits in the 
pointer say that the rest of the bits are the integer.  The implementation 
of sending the "+" message first checks these tag bits, and only does a 
general message pass if they fail.  If the inputs are small and the result 
is small, no heap allocation happens, and the only extra cost is tag 
checking and overflow checking.

Unfortunately, jvm arithmetic is only built to be efficient for scalar types, 
scalars are *NOT* objects in java(!?!?), and java's static type checking is 
inescapable.  It seemed to me that the only reasonable jvm-compatible way to 
express E's integers was to always operate of references to BigIntegers, 
which means always heap allocating.

Then Ping told me how Python's arithmetic is implemented, which turns 
out to be similar to how symbolic languages implement other scalars, such as 
characters:  preallocate an array of the contiguous common ones as objects, 
and normally point at these preallocated objects.  For characters, one might 
decide, for example, to preallocate the ascii or latin-1 characters.

Java's scalar type "byte" and corresponding object type "Byte" are signed 
8-bit numbers ranging from -128 to +127.  Seems like a good preallocation 
candidate to me.  The compiled code for "a + b" might then look like:

	Object a = //...
	Object b = //...
	int temp;
	Object result;
	try {
	    temp = ((Byte)a).byteValue() + ((Byte)b).byteValue();
	    result = ByteCache.TheBytes[128 + temp];

	} catch (ClassCastException cce) {
	    //inputs weren't bytes, so do a generic message pass
	    Object[] args = { b };
	    result = E.call(a, "add", args);

	} catch (IndexOutOfBoundsException ioobe) {
	    //temp doesn't fit into a byte, so heap allocate
	    result = BigInteger.valueOf(temp);
	}

Something that all JVMs get very right (that Python gets slightly wrong) is 
that a try/catch block costs *no* extra execution time when the try block 
completes successfully.  (Thanks to an innovation by Barbara Liskov in CLU.)  
The success case above has three pointer indirection overheads compared to 
the normal tagged arithmetic trick, but is otherwise very similar.  If the 
success case -- byte+byte produces byte -- is sufficiently more common than 
other cases of addition, we win.

Of course, the above may be a distressingly lot of code to put inline for 
every addition.  It may be better to put it into a separate function, 
depending on the standard tradeoffs.  This would assume that future JVMs 
become better are turning methods calls into (in C++ terminology) 
non-virtual calls when possible.


With this problem solved, we can be efficient given a decently implemented 
jvm not customized for E.  Now that there are three competing clean-room 
open-source jvms available, this is likely to happen much faster than we 
could engineer a decent E virtual machine from scratch, and faster than we 
could port the E libraries onto an existing decently engineered platform 
like Squeak.

It's a shame when a successful tidal wave of hype (Java) is a better reason 
to expect a decent platform than is the present existence of a decently 
engineered platform (Squeak), but such is the logic of bandwagons (or 
Schelling points).  The JVM will probably be the x86-architecture-equivalent 
for symbolic-language-virtual-machines, and may remain so till singularity.
 At least it's better that the x86.


	Cheers,
	--MarkM