How to be efficient on top of a decently engineered jvm
Mark S. Miller
markm@caplet.com
Thu, 06 May 1999 00:06:21 -0700
Peano Movers
Right now, E has an unreasonably slow implementation. It obviously needs to
be compiled, but to what target platform? An obvious answer (also used by
JPython) is JVM bytecodes, resulting in a preservation of E's Java-derived
platform independence, leverage of the existing Java libraries, and
potential to ride the java-acceptance wave.
However, there was one requirement for efficient execution I couldn't
imagine how to do efficiently on even a decent stock jvm.
Precision-unlimited integers. (By "stock", I mean one not engineered
specifically to enable E to run fast.) Lisp, Smalltalk, Self, and other
symbolic languages are only possible, in an engineering sense, because of
the tagged arithmetic trick.
If you don't statically know the type of an object reference -- the typical
case -- you can only allocate a generic "pointer to object". If the object
pointed to is a small integer (in Smalltalk, 29 bits), then tag bits in the
pointer say that the rest of the bits are the integer. The implementation
of sending the "+" message first checks these tag bits, and only does a
general message pass if they fail. If the inputs are small and the result
is small, no heap allocation happens, and the only extra cost is tag
checking and overflow checking.
Unfortunately, jvm arithmetic is only built to be efficient for scalar types,
scalars are *NOT* objects in java(!?!?), and java's static type checking is
inescapable. It seemed to me that the only reasonable jvm-compatible way to
express E's integers was to always operate of references to BigIntegers,
which means always heap allocating.
Then Ping told me how Python's arithmetic is implemented, which turns
out to be similar to how symbolic languages implement other scalars, such as
characters: preallocate an array of the contiguous common ones as objects,
and normally point at these preallocated objects. For characters, one might
decide, for example, to preallocate the ascii or latin-1 characters.
Java's scalar type "byte" and corresponding object type "Byte" are signed
8-bit numbers ranging from -128 to +127. Seems like a good preallocation
candidate to me. The compiled code for "a + b" might then look like:
Object a = //...
Object b = //...
int temp;
Object result;
try {
temp = ((Byte)a).byteValue() + ((Byte)b).byteValue();
result = ByteCache.TheBytes[128 + temp];
} catch (ClassCastException cce) {
//inputs weren't bytes, so do a generic message pass
Object[] args = { b };
result = E.call(a, "add", args);
} catch (IndexOutOfBoundsException ioobe) {
//temp doesn't fit into a byte, so heap allocate
result = BigInteger.valueOf(temp);
}
Something that all JVMs get very right (that Python gets slightly wrong) is
that a try/catch block costs *no* extra execution time when the try block
completes successfully. (Thanks to an innovation by Barbara Liskov in CLU.)
The success case above has three pointer indirection overheads compared to
the normal tagged arithmetic trick, but is otherwise very similar. If the
success case -- byte+byte produces byte -- is sufficiently more common than
other cases of addition, we win.
Of course, the above may be a distressingly lot of code to put inline for
every addition. It may be better to put it into a separate function,
depending on the standard tradeoffs. This would assume that future JVMs
become better are turning methods calls into (in C++ terminology)
non-virtual calls when possible.
With this problem solved, we can be efficient given a decently implemented
jvm not customized for E. Now that there are three competing clean-room
open-source jvms available, this is likely to happen much faster than we
could engineer a decent E virtual machine from scratch, and faster than we
could port the E libraries onto an existing decently engineered platform
like Squeak.
It's a shame when a successful tidal wave of hype (Java) is a better reason
to expect a decent platform than is the present existence of a decently
engineered platform (Squeak), but such is the logic of bandwagons (or
Schelling points). The JVM will probably be the x86-architecture-equivalent
for symbolic-language-virtual-machines, and may remain so till singularity.
At least it's better that the x86.
Cheers,
--MarkM