Compiling E: Phases of Transformations

Mark S. Miller markm@caplet.com
Mon, 07 Aug 2000 23:42:57 -0700


At 11:33 AM 8/7/00 , Monty Zukowski wrote:
> > Doesn't the
> > ability to debug make this whole transformation scheme much more
> > complicated?
>
>Usually debugging information is saved with the byte codes or binary executable.
>If these compilation phases are actually implemented as source to source
>translations then I hope E has a statement equivalent to C's #line directive.

Actually, as I mentioned in my reply to Dean, this is where I lied.  E does 
not currently have an equivalent of a #line directive, but perhaps it 
should.  Instead, E parse nodes point at an optional source position 
description 
http://www.erights.org/javadoc/org/erights/e/elib/base/SourceSpan.html .  
This includes not just a line number, but, imitating Smalltalk and Joule, a 
character span.  In any language in which multiple expressions can occur on 
a line, it is *much* better for the debugger to highlight the specific 
expression rather than the entire line.
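
To illustrate the difference, here is a small Java sketch that marks a 
character span within a line, the way a debugger might highlight one 
expression among several.  The class and method names are made up for 
this example and are not part of SourceSpan or any E API.

```java
/** Hypothetical helper, for illustration only. */
class SpanHighlight {
    /**
     * Returns a marker line with '^' under columns [start, end) of the
     * given source line and spaces everywhere else.
     */
    static String underline(String line, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            sb.append(i >= start && i < end ? '^' : ' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = "foo(a); bar(b)";
        System.out.println(line);
        // Highlights only the second call; a bare line number could not
        // distinguish it from the first.
        System.out.println(underline(line, 8, 14));
    }
}
```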

My source-to-source transformations are done with a visitor pattern 
modified to pass the SourceSpan information by default from the pre- to the 
corresponding post-transformed node.
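
A minimal Java sketch of that idea follows.  Everything in it (Span, Node, 
CopyingTransformer, ExpandNeg, and the "neg" to "call" rewrite) is a 
hypothetical stand-in invented for this example, not the actual E 
implementation: the base transformer copies the original node's span onto 
the replacement node unless the rewrite supplied one of its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical source position: a character range within one source. */
final class Span {
    final String url; final int start; final int end;
    Span(String url, int start, int end) {
        this.url = url; this.start = start; this.end = end;
    }
    @Override public String toString() {
        return url + "[" + start + ".." + end + ")";
    }
}

/** Hypothetical parse node; span may be null because it is optional. */
final class Node {
    final String kind; final String text; final List<Node> kids; final Span span;
    Node(String kind, String text, List<Node> kids, Span span) {
        this.kind = kind; this.text = text; this.kids = kids; this.span = span;
    }
}

/** Rebuilds a tree bottom-up, passing spans from old nodes to new ones. */
class CopyingTransformer {
    final Node transform(Node in) {
        List<Node> kids = new ArrayList<>();
        for (Node k : in.kids) kids.add(transform(k));
        Node out = rewrite(in, kids);
        // Default rule: a rewrite that set no span inherits the original's.
        return out.span != null ? out
                                : new Node(out.kind, out.text, out.kids, in.span);
    }
    /** Subclasses override this and may simply leave the span null. */
    protected Node rewrite(Node in, List<Node> kids) {
        return new Node(in.kind, in.text, kids, null);
    }
}

/** Made-up expansion: a "neg" node becomes a "call" node named "negate". */
final class ExpandNeg extends CopyingTransformer {
    @Override protected Node rewrite(Node in, List<Node> kids) {
        if (in.kind.equals("neg")) return new Node("call", "negate", kids, null);
        return super.rewrite(in, kids);
    }
}

class SpanDemo {
    public static void main(String[] args) {
        List<Node> kids = new ArrayList<>();
        kids.add(new Node("noun", "x", new ArrayList<Node>(),
                          new Span("file:demo.e", 5, 6)));
        Node neg = new Node("neg", "-", kids, new Span("file:demo.e", 4, 10));
        Node out = new ExpandNeg().transform(neg);
        System.out.println(out.kind + " " + out.span);  // call file:demo.e[4..10)
    }
}
```

The rewrite in ExpandNeg never mentions positions, yet the node it 
produces still points at the text of the original expression.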

When Dean & I compiled Joule to Smalltalk, we actually compiled to Smalltalk 
parse trees rather than Smalltalk source.  We used this modified visitor 
pattern to pass source information all the way through from Joule source 
text to the Smalltalk virtual machine.  We were then able to use the 
Smalltalk debugger to watch execution proceed over Joule source text with 
hardly any extra work at all.  The same would be true on an E machine for 
any language compiling to E parse trees rather than to E source text.

But this does create one way in which E trees can't be written down: the 
source position on a node has no textual form.  A #line-like directive may 
still be a good idea.


>Actually something like #line would be very useful for parser generators as
>well.  GCC's #line directive allows nesting so you can see the whole stack of
>files which included other files, etc.
>
>Something like:
>
>Syntax Error at someheader.h:12
>     included in main.c:8 ...

I don't have anything like textual file inclusion, so I don't think I have 
*this kind* of nesting.


>Line directives would be useful for parser generators and other things that
>generate E.  This information could be kept externally to the program, too, to
>avoid cluttering up the intermediate source code.

If you keep it external, you need to be able to say what it corresponds to 
in the program.  This would seem to require giving each parse-node a unique 
identity.  Since parse-nodes are pass-by-copy (to enable mobile code), this 
could be unpleasant.
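
The problem can be sketched in Java under some loud assumptions: PNode is 
a hypothetical parse node, copy() stands in for pass-by-copy, and the 
external table is an IdentityHashMap keyed by node identity.  The position 
string "grammar.g:12" is invented for the example.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical parse node that carries no position of its own. */
final class PNode {
    final String text;
    PNode(String text) { this.text = text; }
    /** Stands in for pass-by-copy: the receiver gets a fresh object. */
    PNode copy() { return new PNode(text); }
}

class SideTableDemo {
    /** Looks a node up in the external position table. */
    static String lookup(Map<PNode, String> positions, PNode n) {
        String p = positions.get(n);
        return p == null ? "<unknown>" : p;
    }

    public static void main(String[] args) {
        // External table keyed by object identity, not by node contents.
        Map<PNode, String> positions = new IdentityHashMap<>();
        PNode original = new PNode("x + 1");
        positions.put(original, "grammar.g:12");

        PNode received = original.copy();
        System.out.println(lookup(positions, original));  // grammar.g:12
        System.out.println(lookup(positions, received));  // <unknown>
    }
}
```

Keying the table by node contents instead would not help, since two 
textually identical expressions at different positions would collide.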

In the absence of #line-like directives, parser generators should still be 
happy to generate parse-trees with source positions in the grammar file.  
Likewise, were ANTLR to generate .class files instead of .java files, it 
could represent real source positions, and get many Java debuggers
to display positions in the source grammar file with no extra work.


         Cheers,
         --MarkM