[E-Lang] Draft Kernel-E DTD & Sketch of translation to
debuggable Java
Mark S. Miller
markm@caplet.com
Wed, 27 Sep 2000 11:50:10 -0700
At 09:05 AM 9/27/00 , Karp, Alan wrote:
>In the first incarnation, the protocol was ASCII strings on sockets. People
>snickered because it was so simple, but isn't that what XML is? ...
Yes. But it's no longer simple, so the snickering stopped ;)
>Our first hope was that we could define the communication as the Java
>serialization of an object. It's a well-defined format that covers
>everything you'd like to represent. All you need is to have code in your
>language to produce the right bytes. It's not hard. I've written a Perl
>script to produce the format and a Python script to parse it.
Could you please, please, please send me this code?
But please, only under Mozilla-compatible open-source terms. This usually
means any open-source license other than the GPL.
>Hence, RMI
>serialization is as "language independent" as any other binary format.
Indeed!
>Unfortunately, we found this format to be overly verbose. For example,
>sending a single byte resulted in a 380 byte message. That's why e-speak
>Beta 2.2 used its own serialization. This format used a single byte to
>denote the base and e-speak types and an escape mechanism to extend this
>flag to more than one byte for user-defined types.
Yeah, it's got its problems too. However, in order to play by the pure-Java
rules, I feel stuck with it. JOSS internally uses some private native
methods for reaching into an object's instance variables. I've looked
carefully at the hooks JOSS provides for customization, which have grown as
of 1.3. But even with 1.3, the only way to make use of these native methods
for high-speed serialization is to use the JOSS format. What I don't know is
how the time gained by these native methods trades off against the time lost
in generating and parsing a more verbose format. Do you have any data on that?
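
For a rough feel of the verbosity side of that trade-off, here is a minimal
sketch that serializes a single boxed byte with standard Java serialization
and reports the size of the result. The class and method names
(JossSizeSketch, serialize, deserialize) are made up for illustration. It
measures only the bare object stream, not a full RMI message like the
380-byte one mentioned above, and it says nothing about the speed of the
native methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper class, for illustration only.
public class JossSizeSketch {

    /** Serializes one object with standard Java serialization. */
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        }
        return buf.toByteArray();
    }

    /** Reads back a single object written by serialize(). */
    static Object deserialize(byte[] wire)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(wire))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // One byte of payload, boxed so that writeObject() accepts it.
        byte[] wire = serialize(Byte.valueOf((byte) 42));
        System.out.println("payload: 1 byte, serialized: "
                + wire.length + " bytes");
        System.out.println("round trip: " + deserialize(wire));
    }
}
```

The stream header and the class descriptor for java.lang.Byte account for
most of the overhead, so the cost is amortized when many objects of the same
class share one stream.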
>We got beaten up for being "proprietary", whatever that means. People said
>we should use HTTP. In my view, the protocol part of HTTP deals only with
>the transport part of the problem, not the payload part. That criticism
>disappeared when XML came onto the stage. Then we were beaten up for not
>using XML. At least this criticism has some merit. Anyone with a generic
>XML parser can process the document. Of course, they still need application
>specific knowledge to know what to do with the fields.
This experience corroborates my expectations of the politics of rolling out
a protocol these days. Thanks.
>XML is good, not for the human readable part, but because it solves the
>syntax problem. All parties need not produce the same sequence of bytes as
>long as they obey the same schema. That's good. It makes the whole
>communication piece less fragile. Misplace a byte in a binary format, and
>the entire message is unusable; do it in an XML document, and at most one
>field is garbage.
I don't see this as an advantage. Once layered on top of a reliable byte
stream, protocols should be fail-stop. To continue after something you
didn't understand is to risk madness. Or, as I used to say at EC:
Death Before Confusion!
Ironically, it looks like the JOSS format has more error recovery ability
than XML, but Sun wisely doesn't make use of that property.
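
The fail-stop stance above can be sketched as a decoder that gives up on the
first thing it doesn't understand, rather than skipping ahead and hoping. The
wire format here (a one-byte tag followed by an int or a UTF string) and all
the names are assumptions invented for this example. They are not the JOSS
format, the e-speak format, or anything from E.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy protocol, for illustration only.
public class FailStopSketch {
    static final int TAG_INT = 1;     // followed by a 4-byte int
    static final int TAG_STRING = 2;  // followed by a writeUTF() string

    /** Encodes one int and one string as two tagged fields. */
    static byte[] encode(int n, String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(TAG_INT);
        out.writeInt(n);
        out.writeByte(TAG_STRING);
        out.writeUTF(s);
        out.flush();
        return buf.toByteArray();
    }

    /**
     * Decodes tagged fields until end of input. An unknown tag or a
     * truncated field throws, so no partially understood message is
     * ever handed to the caller.
     */
    static List<Object> decode(byte[] wire) throws IOException {
        List<Object> values = new ArrayList<>();
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(wire));
        int tag;
        while ((tag = in.read()) != -1) {
            switch (tag) {
                case TAG_INT:
                    values.add(in.readInt());
                    break;
                case TAG_STRING:
                    values.add(in.readUTF());
                    break;
                default:
                    // Death Before Confusion: stop, don't resynchronize.
                    throw new IOException("unknown tag " + tag);
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encode(7, "hi");
        System.out.println(decode(wire));
        wire[5] = 9;  // corrupt the second field's tag byte
        try {
            decode(wire);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A recovering decoder would instead try to skip the bad field and return the
rest, which is exactly the "at most one field is garbage" behavior being
argued against here.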
>The downside is latency. It takes about 100 ms to parse an XML document of
>modest size. You only save 1/2-3/4 of that time if you pass the DOM tree.
>This overhead would have been unacceptable in e-speak Beta 2.2, but DR 3.0
>is focused on B2B communication, which can tolerate such latencies.
Until I have good reason to believe otherwise, I'm going to proceed assuming
protocol speed is important for E.
Cheers,
--MarkM