[E-Lang] Concerning XML docs

Karp, Alan alan_karp@hp.com
Tue, 18 Sep 2001 14:23:56 -0700


Chip Morningstar wrote:
> 
> Is there any actual technical benefit from touching this tarbaby or is
> it being done for marketing/positioning purposes?
> 

That's exactly the question we were faced with before the first release of
e-speak.  At that time, we used a binary format for our communications
protocol.  (The first one was ASCII strings on sockets, which I wanted to
keep.)  We were criticized for being "non-standard" because we were not
using XML.  So, part of the decision to provide an XML version of the
protocol was so we would be considered "standard".  In other words,
marketing was a consideration.

At that time, I argued that what you had to say (the semantic content of the
protocol) was more important than how you said it (XML vs. binary).  After
all, you've still got to convert the message contents to object state.  Does
it really matter whether you parse XML or binary, I asked.  Besides, who
wants to pay the overhead of parsing XML, something on the order of 100 ms?
I now believe I was wrong, for a number of reasons:

o There are generic parsers for XML, meaning that there's less software for
us to maintain and ship.  More significantly, the generic parser, not code
we had to write, deals with malformed messages.

o It is easier to recover from and/or identify errors in XML than in binary.
One bad byte in a binary message can ruin your whole day.

o Versioning is easier with XML.

  - New elements can be introduced without breaking old code; it just
    ignores them.

  - You can also more easily process old messages with new code by
    providing the appropriate default values.

  - Denoting new types is not a problem with XML, whereas defining an
    extensible set of "signal bytes", as we called them, is.

  - Including new elements is easier because you're not disrupting a
    position-dependent message.

o It is possible to use your messages for a completely different purpose
because the semantic meaning of the elements is denoted.  Thus, someone who
understands only a single tag in an XML document can still use that tag's
data as needed.
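
To make the points above concrete, here is a minimal sketch in Python using
the standard library's generic parser (xml.etree.ElementTree).  The message
format is made up for illustration: the element names (request, service,
timeout, priority, color) and the default values are assumptions of mine,
not the actual e-speak protocol.  It shows the parser rejecting a malformed
message, a reader ignoring elements it doesn't know, defaults filling in for
elements an older sender omitted, and a consumer that understands only one
tag.

```python
import xml.etree.ElementTree as ET

# Defaults for elements that older senders never included (hypothetical names).
DEFAULTS = {"timeout": "30", "priority": "normal"}

# The only elements this reader understands; anything else is skipped.
KNOWN = ("service", "timeout", "priority")


def read_request(text):
    """Return a dict of known fields, or None if the message is malformed."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # The generic parser, not our own code, detects the malformed message.
        return None
    fields = dict(DEFAULTS)
    for child in root:
        if child.tag in KNOWN:
            fields[child.tag] = (child.text or "").strip()
        # Unknown elements are ignored, so newer senders do not break us.
    return fields


def service_only(text):
    """A consumer that understands a single tag and nothing else."""
    return ET.fromstring(text).findtext("service")


# An "old" message (no timeout), a "new" one (extra <color> element),
# and a malformed one (mismatched closing tag).
old_msg = "<request><service>printer</service></request>"
new_msg = ("<request><service>printer</service><timeout>5</timeout>"
           "<color>true</color></request>")
bad_msg = "<request><service>printer</request>"

print(read_request(old_msg))
print(read_request(new_msg))
print(read_request(bad_msg))
print(service_only(new_msg))
```

Note that the reader never depends on the position of an element within the
message, only on its name, which is why adding <color> does not disturb it.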

_________________________
Alan Karp
Principal Scientist
Decision Technology Department
Hewlett-Packard Laboratories MS 1U-3
1501 Page Mill Road
Palo Alto, CA 94304
(650) 857-3967, fax (650) 857-6278
https://ecardfile.com/id/Alan_Karp
http://www.hpl.hp.com/personal/Alan_Karp/