[E-Lang] Concerning XML docs
Jonathan S. Shapiro
shap@eros-os.org
Tue, 18 Sep 2001 11:20:57 -0400
[CapIDL related, out of context]
The longest portion of my conversation with MarkM last night concerned
the commitment to XML. At one point, he was quite distressed about
committing to XML, I suspect because of issues in the content model
semantics. I would like to understand this issue better, and if somebody
knows of a *brief* summary I can read, I would be very appreciative. I
refer to the issues that have forced canonical XML to diverge, and to
the content model semantic gap between the "official" version and the
XPath version of the content model.
My opinion, in the absence of further information, is that the XML world
has no artistic aesthetic worth mentioning, and that there *are*
semantic problems in the spec, but that the cost of deviation exceeds
the cost of poor semantics in this area.
We cannot -- and should not try to -- redirect the XML community as a
whole, and this means that we should be prepared to parse full XML
wherever XML is generated by (a) humans or (b) tools outside of our own
tool suite. Note that we do not need a validating parser for this, and
that non-validating parsers really aren't that complex.
That said, I am all in favor of using a sensible, restricted subset of
XML for use by our own tools. The most reasonable subset is probably
minml with the following attributes (only):
id so we can encode graphs
idref so we can encode graphs
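
To illustrate why id and idref are enough to encode a graph (rather than
only a tree), here is a minimal sketch in Python. The element names
"graph", "node" and "edge" are hypothetical and chosen only for the
example; only the id and idref attributes come from the proposal above.
It uses the standard library's non-validating ElementTree parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are illustrative only.
# Note the cycle a -> b -> c -> a, which a pure tree cannot express.
DOC = """<graph>
  <node id="a"><edge idref="b"/><edge idref="c"/></node>
  <node id="b"><edge idref="c"/></node>
  <node id="c"><edge idref="a"/></node>
</graph>"""


def decode_graph(text):
    """Return {id: [idrefs found at or beneath the element with that id]}."""
    root = ET.fromstring(text)

    # First pass: index every element that carries an id.
    by_id = {}
    for elem in root.iter():
        ident = elem.get("id")
        if ident is not None:
            if ident in by_id:
                raise ValueError(f"duplicate id: {ident}")
            by_id[ident] = elem

    # Second pass: resolve each idref against the index.
    graph = {}
    for ident, elem in by_id.items():
        targets = []
        for child in elem.iter():
            ref = child.get("idref")
            if ref is not None:
                if ref not in by_id:
                    raise ValueError(f"dangling idref: {ref}")
                targets.append(ref)
        graph[ident] = targets
    return graph


graph = decode_graph(DOC)
```

Because the parser is non-validating, uniqueness of ids and resolution of
idrefs are checked by hand here; a real tool might handle nesting and
errors differently.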
Speaking for myself, I might also argue for:
class so we can apply CSS to decently display the results.
A.href (or something equivalent to href) so that we can express
*outbound* links in minml. The issue here is that the
content restrictions on element bodies (#PCDATA) are not
quite the same as the restrictions for attributes, with the result
that off-the-shelf translators may not be able to reliably
produce HTML <a> elements without use of attributes for this
purpose.
A.name (or equivalent) for the same reasons as above.
Of these, I could contentedly yield on the A's, but the class attribute
might prove valuable for visualization so as to key presentation using
CSS.
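
As a rough sketch of how our own tools might enforce such a subset, the
following Python parses a document with an ordinary non-validating
parser and reports any attribute outside an allowed set. The function
name and the choice of allowed set (id, idref, class, with the A
attributes left out) are assumptions made for illustration, not a
settled design.

```python
import xml.etree.ElementTree as ET

# Assumed whitelist: id and idref, plus class for CSS-driven display.
ALLOWED = {"id", "idref", "class"}


def disallowed_attributes(text, allowed=ALLOWED):
    """Return (element tag, attribute name) pairs that fall outside the subset."""
    root = ET.fromstring(text)
    problems = []
    for elem in root.iter():
        for name in elem.attrib:
            if name not in allowed:
                problems.append((elem.tag, name))
    return problems


# Hypothetical input: the href attribute is rejected under this whitelist.
sample = '<r><n id="x" class="k"/><a href="http://example.org/"/></r>'
problems = disallowed_attributes(sample)
```

Adding "href" and "name" to the allowed set would admit the A attributes
discussed above.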
Jonathan