[E-Lang] Kernel-E in Minimal-XML
Mark S. Miller
markm@caplet.com
Thu, 05 Oct 2000 14:24:25 -0700
At 03:08 PM 10/2/00, Dan Moniz wrote:
> Mark> How to represent source-position information? (Since we'll be using
> Mark> this as our universal parse-tree and quasi-parse-tree
> Mark> representation.)
>
>Floating attribute perhaps?
Demonstrating my ignorance of XML once again, what's a floating attribute?
Also, having succeeded so far in painlessly representing Kernel-E within the
Minimal-XML subset of XML, I would strongly prefer to stay within this
subset. (Note that I'm stating this as a strong preference, but not a
requirement.) Since Minimal-XML doesn't have attributes, I'd guess it
doesn't have floating attributes, whatever they are.
> Mark> How to turn literal character data (as in a literal String) into the
> Mark> text between tags so that it is XML-processed back into the original
> Mark> data? (I'm sure it's a trivially solved problem. We just need to find
> Mark> the solution.)
>
>Hrm. I was thinking that you could write (to use an IBM buzzterm) a transcoding
>utility to turn literal strings into Unicode strings and use &bla; type
>entities inside the XML (so as not to clash with angle brackets), but that
>would get hella annoying to read. Not that this presents a solution.
>
>Or am I not on the right page here?
I think you are. I simply need a transcoding(?) algorithm for converting
each way that preserves as much readability as possible. Since I'm even
more ignorant of the subtleties of Unicode than I am of the subtleties of
XML, I'd rather copy and paste such an open-source algorithm than have to
figure it out myself from the specs.
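
For the markup-clash half of the problem, a minimal sketch in Java follows. It
assumes the predefined XML entities are available in the subset we settle on,
which would need checking against the Minimal-XML spec. The class name XmlText
and both method names are made up for illustration. It escapes only the three
characters that could be mistaken for markup, so most strings stay readable,
and it does not attempt any Unicode transcoding.

```java
public final class XmlText {
    private XmlText() {}

    /** Escapes literal character data so it can sit between two tags. */
    public static String escape(String literal) {
        StringBuilder out = new StringBuilder(literal.length() + 16);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);       // everything else is left readable
            }
        }
        return out.toString();
    }

    /** Reverses escape(); also accepts the other two predefined entities. */
    public static String unescape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c != '&') {
                out.append(c);
                i++;
                continue;
            }
            int semi = text.indexOf(';', i);
            if (semi < 0) {
                throw new IllegalArgumentException("unterminated entity at " + i);
            }
            String name = text.substring(i + 1, semi);
            switch (name) {
                case "amp":  out.append('&');  break;
                case "lt":   out.append('<');  break;
                case "gt":   out.append('>');  break;
                case "quot": out.append('"');  break;
                case "apos": out.append('\''); break;
                default:
                    throw new IllegalArgumentException("unknown entity: &" + name + ";");
            }
            i = semi + 1;
        }
        return out.toString();
    }
}
```

One known gap: an XML processor normalizes line endings, so a literal carriage
return would not survive the round trip through this sketch as written.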
> Mark> What (quasi-)parser to use? Current candidates are to adapt MinML, to
> Mark> wait for Monty's fixes to ANTLR, or to just write a Minimal-XML
> Mark> quasi-parser by hand.
>
>I think writing one by hand is a bad idea. Better, in my mind, to wait for
>Monty's fixes or adapt MinML and publish the adaptations.
Monty, any ETA?
Another possibility is to use Xerces-J with SAX, and to write a SAX document
handler to generate qdom, and to reject input outside Minimal-XML. Finally,
we'd need to adapt Xerces to handle quasi-literal input and build
quasi-literal qdom trees. I don't know how plausible this direction would
be, but I don't know how plausible the other alternatives are either. I
just read the MinML parser. It's very elegant in its own peculiar way, but I
sure don't want to try maintaining it (or a quasi-supporting variant of it).
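
To make the SAX direction concrete, here is a rough sketch of such a handler,
written against the standard org.xml.sax interfaces via JAXP rather than
Xerces-J specifically. The Node class is a hypothetical stand-in for a qdom
node, not the real thing. Rejecting attributes follows from the discussion
above; rejecting mixed content and processing instructions is my assumption
about what the subset excludes and should be checked against the Minimal-XML
spec. Comments and CDATA sections are not visible to a plain content handler,
so this sketch cannot reject them, and it does nothing about quasi-literal
input.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MinimalXmlHandler extends DefaultHandler {

    /** Hypothetical stand-in for a qdom node: child elements or text, not both. */
    public static final class Node {
        public final String tag;
        public final List<Node> children = new ArrayList<>();
        public final StringBuilder text = new StringBuilder();
        Node(String tag) { this.tag = tag; }
    }

    private final Deque<Node> stack = new ArrayDeque<>();
    private Node root;

    public Node getRoot() { return root; }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (atts.getLength() > 0) {
            throw new SAXException("attributes are outside the subset: <" + qName + ">");
        }
        Node node = new Node(qName);
        if (stack.isEmpty()) {
            root = node;
        } else {
            Node parent = stack.peek();
            // Non-whitespace text followed by a child element is mixed content.
            if (parent.text.toString().trim().length() > 0) {
                throw new SAXException("mixed content in <" + parent.tag + ">");
            }
            parent.children.add(node);
        }
        stack.push(node);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        Node current = stack.peek();
        String chunk = new String(ch, start, length);
        // A child element followed by non-whitespace text is also mixed content.
        if (!current.children.isEmpty() && chunk.trim().length() > 0) {
            throw new SAXException("mixed content in <" + current.tag + ">");
        }
        current.text.append(chunk);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        Node done = stack.pop();
        if (!done.children.isEmpty()) {
            done.text.setLength(0); // discard layout whitespace between children
        }
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        throw new SAXException("processing instructions are outside the subset");
    }

    /** Convenience entry point: parses a string with the platform's default SAX parser. */
    public static Node parse(String xml) throws Exception {
        MinimalXmlHandler handler = new MinimalXmlHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getRoot();
    }
}
```

Calling MinimalXmlHandler.parse on a document with an attribute fails with a
SAXException, while an attribute-free document yields a small tree of Node
objects.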
> Mark> Is Minimal-XML really a downward compatible subset of XML (as they
> Mark> claim) or (as I suspect) do we need to identify and restrict
> Mark> ourselves to the intersection of these two standards?
>
>I think that regardless, we should do our due diligence and identify and
>restrict. It won't hurt if we do and the standards are compatible at some level,
>and it will certainly help if they're not. Both are moving targets anyways, so
>what's true today could change tomorrow.
I agree. Although both are moving targets, I think XML at least is rather
strongly committed to evolving in only upward compatible ways. In any case,
even if we choose to constrain ourselves to the Minimal-XML subset of XML,
the important constraint, both from a marketing check-off point of view and
to leverage the world's huge investment in XML tools, is that we always be
rigorously XML compatible. We shouldn't expect many others to care whether
or not we're also Minimal-XML compatible. We're doing that mainly for our
own sanity.
> Mark> Should ERights.org "endorse" Minimal-XML? What would this mean? Would
> Mark> anyone care? Are there any downsides?
>
>I think using it is plenty. To publicly state endorsement gets us into all
>sorts of battles and potential squabbles we would rather avoid, I think. Do
>you endorse the use of a hammer publicly? I don't, although I may suggest it if
>a friend has a problem to which I have found a hammer is an acceptable solution
>(perhaps one of many such solutions).
Well, that's certainly fine for now. In any case, we shouldn't endorse
anything until we've spent some time using it.
>When are you going to be away/unreachable again? I may be in San Fran later
>this week.
I am reachable again starting this Saturday 10/7. Hope we can meet!