[E-Lang] compiling E

Darius Bacon darius@accesscom.com
Wed, 8 Aug 2001 21:57:51 -0700


Today MarkM, Dean, SteveJ, and I met to talk about compiling E.  Dean
and I have been working on the problem independently of each other,
with mostly complementary results.

Dean has sped up the current interpreter by compiling out noun lookup
and some slot accesses, and changing the representation of an object's
state to be essentially just an array of all its free variable
bindings.  He has some more improvements in mind but I'll leave all
this for him to write about.

There are subtleties in how we choose to group an object's data, in
that different choices make some values be retained longer by the
garbage collector, causing some programs to use asymptotically more
space.  (This is related to the `safe for space' property introduced
in SML/NJ, but not identical -- for one thing, E doesn't have to
optimize tail recursion.)  There's some unresolved controversy in this
regard about facets sharing state -- a topic best left to Mark or
Dean.

Meanwhile I've been working towards a bytecode compiler.  This design
takes Kernel E as input, then:

 0. verifies its static correctness;
 1. reduces it to a simplified sublanguage that maps fairly directly
    onto bytecodes but is still manipulable;
 2. does an optimization pass to clean up the translation and perform
    new transformations like inlining and promise elimination (this can
    be skipped at first);
 3. selects a physical layout for scopes and emits bytecode.

Dean's variable -> index transformation would most naturally go in
pass 3, while deslotification probably belongs earlier where it could
assist other optimizations.  I'll post more details soon, and figure
out how to use Dean's code when I've seen it.  So far I've worked only
on parts 0 and 1 above.

The sublanguage of pass 1 met some criticisms: 

 - It's not at first glance a Kernel E subset.
   This is mostly a documentation problem, but this criticism
   suggested the answer to the next one:

 - It describes only whole programs and not top-level expressions.
   This is best fixed by making it a slightly larger subset of Kernel E.

 - It can express the data layouts Dean used, but not very naturally.
   I'm not sure how to answer this one yet.  It's important because of
   the space-safety stuff mentioned earlier.

Discussing ENative took us to the problem of efficient data
representations in Java.  I may not have heard these ideas right but
basically you could have extra typed fields in your data object for
immediate data like small integers, etc., and this would work
especially well together with something like a SmallIntegerGuard, and
so on.  Since a main efficiency advantage of ENative with JNI was said
to be custom data representations, I wonder if ENative/JNI is still
worth pursuing.

As far as I can remember, those are all the topics that haven't come
up on e-lang before.  Much of the rest of the discussion went over
Kernel E and debugging support.  It was an enjoyable and I think
productive meeting -- thanks!

-Darius