[E-Lang] Kernel-E in Minimal-XML

Mark S. Miller markm@caplet.com
Thu, 05 Oct 2000 14:24:25 -0700


At 03:08 PM 10/2/00, Dan Moniz wrote:
>    Mark> How to represent source-position information? (Since we'll be using
>    Mark> this as our universal parse-tree and quasi-parse-tree
>    Mark> representation.)
>
>Floating attribute perhaps?

Demonstrating my ignorance of XML once again, what's a floating attribute?  
Also, having succeeded so far in painlessly representing Kernel-E within the 
Minimal-XML subset of XML, I would strongly prefer to stay within this 
subset.  (Note that I'm stating this as a strong preference, but not a 
requirement.)  Since Minimal-XML doesn't have attributes, I'd guess it 
doesn't have floating attributes, whatever they are.  


>    Mark> How to turn literal character data (as in a literal String) into the
>    Mark> text between tags so that it is XML-processed back into the original
>    Mark> data? (I'm sure it's a trivially solved problem. We just need to find
>    Mark> the solution.)
>
>Hrm. I was thinking that you could write (to use an IBM buzzterm) a transcoding
>utility to turn literal strings into Unicode strings and use &bla; type
>entities inside the XML (so as not to clash with angle brackets), but that
>would get hella annoying to read. Not that this presents a solution.
>
>Or am I not on the right page here?

I think you are.  I simply need a transcoding(?) algorithm for converting 
each way that preserves as much readability as possible.  Since I'm even 
more ignorant of the subtleties of Unicode than I am of the subtleties of 
XML, I'd rather copy and paste such an open-source algorithm than have to 
figure it out myself from the specs.
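
A minimal sketch of one such round-trip conversion, in Java. It assumes that Minimal-XML, like full XML, accepts the predefined entities (&amp;amp; &amp;lt; &amp;gt;) and numeric character references in text; that should be checked against the Minimal-XML draft. The class name XmlText and both method names are hypothetical. Only the characters that must be escaped are touched, so ordinary text stays readable. A carriage return is written as a character reference because XML parsers normalize line endings and would otherwise not hand it back unchanged.

```java
// Hypothetical helper, not part of E or of any XML library.
final class XmlText {
    private XmlText() {}

    // Turn literal character data into text that is safe between tags.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;"); break;
                case '<':  sb.append("&lt;");  break;
                case '>':  sb.append("&gt;");  break;
                case '\r': sb.append("&#13;"); break; // survives line-end normalization
                default:   sb.append(c);       // everything else is left readable
            }
        }
        return sb.toString();
    }

    // Inverse of escape: resolve predefined entities and numeric references.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c != '&') {
                sb.append(c);
                i++;
                continue;
            }
            int semi = s.indexOf(';', i);
            if (semi < 0) {
                throw new IllegalArgumentException("unterminated reference at " + i);
            }
            String name = s.substring(i + 1, semi);
            if (name.equals("amp")) {
                sb.append('&');
            } else if (name.equals("lt")) {
                sb.append('<');
            } else if (name.equals("gt")) {
                sb.append('>');
            } else if (name.equals("quot")) {
                sb.append('"');
            } else if (name.equals("apos")) {
                sb.append('\'');
            } else if (name.startsWith("#x")) {
                sb.appendCodePoint(Integer.parseInt(name.substring(2), 16));
            } else if (name.startsWith("#")) {
                sb.appendCodePoint(Integer.parseInt(name.substring(1)));
            } else {
                throw new IllegalArgumentException("unknown entity: " + name);
            }
            i = semi + 1;
        }
        return sb.toString();
    }
}
```

The property that matters is that XmlText.unescape(XmlText.escape(s)) equals s for any string made of characters XML allows. Control characters that XML 1.0 forbids outright are not handled here and would need a separate convention.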


>    Mark> What (quasi-)parser to use? Current candidates are to adapt MinML, to
>    Mark> wait for Monty's fixes to ANTLR, or to just write a Minimal-XML
>    Mark> quasi-parser by hand.
>
>I think writing one by hand is a bad idea. Better, in my mind, to wait for
>Monty's fixes or adapt MinML and publish the adaptations.

Monty, any ETA?

Another possibility is to use Xerces-J with SAX, and to write a SAX document 
handler to generate qdom, and to reject input outside Minimal-XML.  Finally, 
we'd need to adapt Xerces to handle quasi-literal input and build 
quasi-literal qdom trees.  I don't know how plausible this direction would 
be, but I don't know how plausible the other alternatives are either.  I 
just read the MinML parser.  It's very elegant in its own peculiar way, but I 
sure don't want to try maintaining it (or a quasi-supporting variant of it).
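
To make the "reject input outside Minimal-XML" half of the SAX idea concrete, here is a rough sketch in Java. It uses the SAX2 DefaultHandler and the JAXP parser factory that ship with current JDKs rather than the Xerces-J classes of the day, and the class name MinimalXmlChecker and the method isMinimal are hypothetical. It rejects attributes, processing instructions, and mixed content. The last two restrictions are my reading of the Minimal-XML draft and should be verified. Comments, CDATA sections, and DTDs would need a LexicalHandler and are not covered, and neither is building qdom or handling quasi-literal input.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical checker: accepts only documents that stay inside the
// subset described above (no attributes, no PIs, no mixed content).
final class MinimalXmlChecker extends DefaultHandler {
    // Per-element bookkeeping used to detect mixed content.
    private static final class Frame {
        boolean hasChild;
        boolean hasText;
    }

    private final Deque<Frame> stack = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (atts.getLength() > 0) {
            throw new SAXException("attributes not allowed on <" + qName + ">");
        }
        Frame parent = stack.peek();
        if (parent != null) {
            if (parent.hasText) {
                throw new SAXException("mixed content before <" + qName + ">");
            }
            parent.hasChild = true;
        }
        stack.push(new Frame());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Whitespace between child elements is treated as layout, not text.
        if (new String(ch, start, length).trim().isEmpty()) {
            return;
        }
        Frame top = stack.peek();
        if (top == null) {
            return;
        }
        if (top.hasChild) {
            throw new SAXException("mixed content: text after a child element");
        }
        top.hasText = true;
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        throw new SAXException("processing instructions not allowed: " + target);
    }

    // Returns false for malformed XML as well as for input outside the subset.
    static boolean isMinimal(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new MinimalXmlChecker());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

A tree-building handler would do its construction in the same three callbacks, so the subset check costs little beyond what the handler already has to track.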


>    Mark> Is Minimal-XML really a downward compatible subset of XML (as they
>    Mark> claim) or (as I suspect) do we need to identify and restrict
>    Mark> ourselves to the intersection of these two standards?
>
>I think that regardless, we should do our due diligence and identify and
>restrict. It won't hurt if we do and the standards are compatible at some level,
>and it will certainly help if they're not. Both are moving targets anyways, so
>what's true today could change tomorrow.

I agree.  Although both are moving targets, I think XML at least is rather 
strongly committed to evolving in only upward compatible ways.  In any case, 
even if we choose to constrain ourselves to the Minimal-XML subset of XML, 
the important constraint, both from a marketing check-off point of view and 
to leverage the world's huge investment in XML tools, is that we always be 
rigorously XML compatible.  We shouldn't expect many others to care whether 
or not we're also Minimal-XML compatible.  We're doing that mainly for our 
own sanity.


>    Mark> Should ERights.org "endorse" Minimal-XML? What would this mean? Would
>    Mark> anyone care? Are there any downsides?
>
>I think using it is plenty. To publicly state endorsement gets us into all
>sorts of battles and potential squabbles we would rather avoid, I think. Do
>you endorse the use of a hammer publicly? I don't, although I may suggest it if
>a friend has a problem to which I have found a hammer is an acceptable solution
>(perhaps one of many such solutions).

Well, that's certainly fine for now.  In any case, we shouldn't endorse 
anything until we've spent some time using it.


>When are you going to be away/unreachable again? I may be in San Fran later
>this week.

I am reachable again starting this Saturday 10/7.  Hope we can meet!