Compiling E: Phases of Transformations
Mark S. Miller
markm@caplet.com
Mon, 07 Aug 2000 23:42:57 -0700
At 11:33 AM 8/7/00, Monty Zukowski wrote:
> > Doesn't the
> > ability to debug make this whole transformation scheme much more
> > complicated?
>
>Usually debugging information is saved with the byte codes or binary executable.
>If these compilation phases are actually implemented as source to source
>translations then I hope E has a statement equivalent to C's #line directive.
Actually, as I mentioned in my reply to Dean, this is where I lied. E does
not currently have an equivalent of a #line directive, but perhaps it
should. Instead, E parse nodes point at an optional source position
description, a SourceSpan:
http://www.erights.org/javadoc/org/erights/e/elib/base/SourceSpan.html
This includes not just a line number, but, imitating Smalltalk and Joule, a
character span. In any language in which multiple expressions can occur on
a line, it is *much* better for the debugger to highlight the specific
expression rather than the entire line.
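To make the line-versus-span difference concrete, here is a minimal Java sketch. The Span class and its fields are hypothetical stand-ins invented for illustration; they are not the real SourceSpan API. With only a line number a debugger could mark the whole line, while a character span lets it mark just the subexpression being evaluated.

```java
// Hypothetical stand-in for a source position; NOT the real SourceSpan API.
final class Span {
    final String url;   // where the source text came from
    final int line;     // 1-based line number
    final int startCol; // 0-based column, inclusive
    final int endCol;   // 0-based column, exclusive

    Span(String url, int line, int startCol, int endCol) {
        this.url = url;
        this.line = line;
        this.startCol = startCol;
        this.endCol = endCol;
    }

    /** Builds a marker line with carets under exactly the spanned columns. */
    String underline() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < startCol; i++) sb.append(' ');
        for (int i = startCol; i < endCol; i++) sb.append('^');
        return sb.toString();
    }
}

final class SpanDemo {
    public static void main(String[] args) {
        String src = "x := f(a) + g(b)";
        // Columns 12..16 cover only the call "g(b)", not the whole line.
        Span call = new Span("demo.e", 1, 12, 16);
        System.out.println(src);
        System.out.println(call.underline());
    }
}
```

Running SpanDemo prints the source line with carets under g(b) only, which is the kind of highlighting a line number alone cannot give.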
My source-to-source transformations are done with a visitor pattern
modified to pass the SourceSpan information by default from the pre- to the
corresponding post-transformed node.
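A rough Java sketch of such a visitor follows. Everything in it (Span, Node, Num, Neg, Add, Sub, Transformer, DesugarSub) is a hypothetical name made up for this example and is not taken from the E implementation. The base Transformer rebuilds each node and copies the old node's span to the new one, so a transformation that overrides one method gets position propagation everywhere else for free.

```java
// Hypothetical stand-ins; these are NOT the real E parse-node classes.
final class Span {
    final String file;
    final int start, end; // character offsets, end exclusive
    Span(String file, int start, int end) { this.file = file; this.start = start; this.end = end; }
    @Override public String toString() { return file + "[" + start + "," + end + ")"; }
}

abstract class Node {
    final Span span; // optional: may be null
    Node(Span span) { this.span = span; }
    abstract Node accept(Transformer t);
}

final class Num extends Node {
    final int value;
    Num(Span span, int value) { super(span); this.value = value; }
    @Override Node accept(Transformer t) { return t.visitNum(this); }
}

final class Neg extends Node {
    final Node operand;
    Neg(Span span, Node operand) { super(span); this.operand = operand; }
    @Override Node accept(Transformer t) { return t.visitNeg(this); }
}

final class Add extends Node {
    final Node left, right;
    Add(Span span, Node left, Node right) { super(span); this.left = left; this.right = right; }
    @Override Node accept(Transformer t) { return t.visitAdd(this); }
}

final class Sub extends Node {
    final Node left, right;
    Sub(Span span, Node left, Node right) { super(span); this.left = left; this.right = right; }
    @Override Node accept(Transformer t) { return t.visitSub(this); }
}

/** Default behavior: rebuild each node, passing the old span to the new node. */
class Transformer {
    Node visitNum(Num n) { return new Num(n.span, n.value); }
    Node visitNeg(Neg n) { return new Neg(n.span, n.operand.accept(this)); }
    Node visitAdd(Add n) { return new Add(n.span, n.left.accept(this), n.right.accept(this)); }
    Node visitSub(Sub n) { return new Sub(n.span, n.left.accept(this), n.right.accept(this)); }
}

/** Rewrites a - b as a + (-b); the replacement Add keeps the span of the Sub. */
final class DesugarSub extends Transformer {
    @Override Node visitSub(Sub n) {
        Node left = n.left.accept(this);
        Node right = n.right.accept(this);
        return new Add(n.span, left, new Neg(n.right.span, right));
    }
}

final class VisitorDemo {
    public static void main(String[] args) {
        // Source text "7 - 2": the whole expression spans [0,5).
        Node in = new Sub(new Span("demo.e", 0, 5),
                          new Num(new Span("demo.e", 0, 1), 7),
                          new Num(new Span("demo.e", 4, 5), 2));
        Node out = in.accept(new DesugarSub());
        System.out.println(out.getClass().getSimpleName() + " at " + out.span);
    }
}
```

In this sketch the transformed Add still points at the characters of the original subtraction, so a debugger stepping over the output tree could highlight the text the programmer actually wrote.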
When Dean & I compiled Joule to Smalltalk, we actually compiled to Smalltalk
parse trees rather than Smalltalk source. We used this modified visitor
pattern to pass source information all the way through from Joule source
text to the Smalltalk virtual machine. We were then able to use the
Smalltalk debugger to watch execution proceed over Joule source text with
hardly any extra work at all. The same would be true on an E machine for
any language compiling to E parse trees rather than to E source text.
But this does create one way in which E parse trees can't be written down
as source text: the source positions they carry have no textual form. A
#line-like directive may still be a good idea.
>Actually something like #line would be very useful for parser generators as
>well. GCC's #line directive allows nesting so you can see the whole stack of
>files which included other files, etc.
>
>Something like:
>
>Syntax Error at someheader.h:12
> included in main.c:8 ...
I don't have anything like textual file inclusion, so I don't think I have
*this kind* of nesting.
>Line directives would be useful for parser generators and other things that
>generate E. This information could be kept externally to the program, too, to
>avoid cluttering up the intermediate source code.
If you keep it external, you need to be able to say what it corresponds to
in the program. This would seem to require giving each parse-node a unique
identity. Since parse-nodes are pass-by-copy (to enable mobile code), this
could be unpleasant.
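The identity problem can be illustrated with a small Java sketch. PNode and its copy() method are hypothetical, with copy() standing in for what pass-by-copy does to a node when it crosses between machines. An external table keyed by object identity finds the position for the original node but not for the copy that arrives at the other end.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical parse node; NOT an E class.
final class PNode {
    final String text;
    PNode(String text) { this.text = text; }

    /** Models pass-by-copy: the receiver gets an equal-looking but distinct object. */
    PNode copy() { return new PNode(text); }
}

final class SideTableDemo {
    public static void main(String[] args) {
        // External position table, keyed by node identity rather than by content.
        Map<PNode, String> positions = new IdentityHashMap<>();

        PNode original = new PNode("x + 1");
        positions.put(original, "grammar.g:12");

        PNode received = original.copy();

        System.out.println(positions.get(original)); // found
        System.out.println(positions.get(received)); // null: the copy is a different object
    }
}
```

Keying the table by node content instead would not obviously help, since two occurrences of the same expression text at different places would then share one entry.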
In the absence of #line-like directives, parser generators should still be
happy to generate parse-trees with source positions in the grammar file.
Likewise, were ANTLR to generate .class files instead of .java files, it
could represent real source positions, and get many Java debuggers
to display positions in the source grammar file with no extra work.
Cheers,
--MarkM