RE: Build Problems Resolved mzukowski@bco.com
Thu, 02 Sep 1999 11:59:07 -0700

[Jonathan, any idea why he might be running into this problem? I just noticed that his posts are missing from the archive as well. --MarkM]

P.S. Could you cross post to elang list? Every time I copy e-lang@eros-os.org it gets rejected because of not finding the host, and I'm not in a position at this big company to figure out what's wrong where.

Thanks,

Monty

> >How exactly to you match trees? Do you first compile the
> match expression
> >into a tree and then compare the results?
>
> That happens to be how I do it, but it doesn't have to be how
> I do it. An
> observably identical quasi-pattern parser could instead produce a
> MatchMaker containing code for a decision tree compiled from
> the original
> string. Think about compiling Prolog clause-heads to Warren Abstract
> Machine instructions. Similarly, a ValueMaker could contain code for
> generating a tree, without itself containing a tree. The
> analogy here
> might be Prolog clause-bodies.
>
> >How do you distinguish between parents and children?
>
> I don't understand.

Without having studied the code, it sounds like you are building parse trees, in which case you have an exact correspondence between your trees and what you parsed and in what order. This is very convienent, especially for your example. Given E's design it seems well suited to that representation, but other less friendly languages can benefit greatly from an AST representation which is not so directly related to the parse. But that also means that I couldn't mix quasi-pattern parsing of text with trees because one representation isn't equivalent to the other. I can't be in a parser rule parsing text and then hit a subtree object and know what to do with it. The only way to get it right would be to unparse the subtree into text and let the parser do its thing. Actually the key information that is missing is which decision the parser made in order to choose which alternative to parse. In some cases there may not have been any alternatives. In other cases it might be possible to cache that information directly in the tree. But then you also have to worry about side effects of parsing such as scope state, etc. I may be worrying about a problem that doesn't need solved though.

> >How do you distinguish between parents and children?
>
> I don't understand.

ANTLR includes a facility for tree-parsing, assuming LISP like parent-sibling trees where each node responds to getChild() and getSibling() messages. To write a tree parser you need to denote the difference. Your approach to trees builds two trees and then matches them, letting the parser build the trees. It's just two different approaches.

Now onto the fun stuff. ANTLR's lexers are LL which makes them easy to reason about since they compile into recursive descent routines instead of DFAs. A simple-text matching quasi-pattern like `a@{x}d@y` would translate into ANTLR syntax as:

AnonymousRule:
options {greedy=false;}
"a" x:(.)* "d" y:(.)* ;

And following the rx quasi-pattern matcher, we could have anltr patterns like:

`format (@from:ID) into (@result ID) using (@fmt FormatString)` which is trivially translated to (assuming rules ID, WS and FormatString are already defined from a lexer-grammar)

"format" (WS)? from:ID (WS)? "into" (WS)? result:ID (WS)? "using" (WS)? fmt:FormatString

Why not `format @{from:ID} into @{result:ID} using @{FormatString}` ? I guess because you need everything inside of @{} to be only E patterns... Which means you need another kind of escape mechanism such as parenthesis for the "other" language like rx or antlr. And this because you want to implicitly quote the strings inbetween the embedded stuff.

When we get to trees with antlr there is no nice text representation and I wonder about the utility of an antlr tree quasi-pattern parser, although I can definately see the advantage of an antlr tree quasi-literal for the building of new trees.

I also wonder about real "parsing" of tokenized text. Seems like maybe overkill for these quasi-parsers.

Along the lines of the ending comments of your Quasi-Literal overview about Hypercard, I've been playing with something related but tangential. For the interesting problem of writing code generators I've been using a combination of noweb and Python. Quasi-literals are cool for small amounts of output text, but noweb is really nice for whole programs. For instance "<<something>>=" defines a chunk named "something", and "<<something>>" references that chunk and causes it to be substituted in:

<<test>>=

@SysInclude { fontdefs	} # font definitions
@SysInclude { langdefs	} # language definitions
@SysInclude { tbl	} # Table definitions
@SysInclude { bsf	} # BasicSetup package
@Use { @BasicSetup }

<<font>>
{lines 1.2fx nohyphen} @Break
{0.0 0.0 0.0 setrgbcolor} @SetColor

10.5i @High 8.0i @Wide
English @Language
90d @Rotate
{

	@CD {	
		<<lout object>>
	}

}

@
A single @ on a line starts a "documentation chunk" which won't make it into "tangled" code but will be typeset in the literate program "woven" for human readers. Following us is another chunk named "font"

<<font>>=
{Times Base 8p} @Font

The nice thing about noweb is that it preserves indentation, so when <<lout object>> is expanded, every line will be prefixed with the indentation in front of the reference to <<lout object>>. And of course I said to myself, why should I limit myself to textual representation? I wrote a noweb parser and "tangler" (in ANTLR of course!) and embedded JPython. If a chunk name is not defined in the noweb document, then it is evaluated in the JPython interpreter's namespace. If a chunk name ends with "exec", then I take that to mean that the chunk is not a text template but rather JPython code to execute.

This has turned into a very fun way to program code generators. It makes it especially easy to create a test piece of code, get it to work, then factor out the parts that need to be code instead of constants, and break the whole generated routine into easily digestable chunks--which is what literate programming was meant for.

See my "Noweb Parser" at http://www.antlr.org/showcase.html for code, but email me if you start using it because I have fixed a few bugs.

Monty