[E-Lang] Draft Kernel-E DTD & Sketch of translation to debuggable Java

Karp, Alan alan_karp@hp.com
Wed, 27 Sep 2000 09:05:39 -0700


We faced many of these protocol issues when designing e-speak, since we knew
we had to allow the ends of the communication to be written in different
languages.  The only viable solution when you want to deal with more than
one or two of them is an intermediate language.  

In the first incarnation, the protocol was ASCII strings on sockets.  People
snickered because it was so simple, but isn't that what XML is?  ASCII
strings are very nice for debugging; telnet to the port and start typing.
However, there is substantial overhead in translating between binary and
ASCII.  At that time, we were concerned with latency, so we abandoned that
approach.
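
To make the binary/ASCII translation step concrete, here is a minimal sketch
of the same integer framed both ways.  It is not the e-speak wire format; the
function names and the newline-terminated framing are assumptions made up
for the illustration.

```python
import struct


def encode_text(value: int) -> bytes:
    # Text framing (assumed): decimal digits followed by a newline,
    # so a person at a telnet prompt could type it by hand.
    return str(value).encode("ascii") + b"\n"


def decode_text(line: bytes) -> int:
    # The receiver must convert the digits back to a machine integer.
    return int(line.rstrip(b"\n").decode("ascii"))


def encode_binary(value: int) -> bytes:
    # Binary framing (assumed): a fixed 4-byte big-endian signed integer.
    return struct.pack(">i", value)


def decode_binary(data: bytes) -> int:
    return struct.unpack(">i", data)[0]
```

The text form is readable and easy to debug, but every value goes through a
digit-by-digit conversion at each end; the binary form skips that step at the
cost of being opaque on the wire.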

Our first hope was that we could define the communication as the Java
serialization of an object.  It's a well-defined format that covers
everything you'd like to represent.  All you need is to have code in your
language to produce the right bytes.  It's not hard.  I've written a Perl
script to produce the format and a Python script to parse it.  Hence, RMI
serialization is as "language independent" as any other binary format.
Unfortunately, we found this format to be overly verbose.  For example,
sending a single byte resulted in a 380 byte message.  That's why e-speak
Beta 2.2 used its own serialization.  This format used a single byte to
denote the base and e-speak types and an escape mechanism to extend this
flag to more than one byte for user-defined types.
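
The single-byte type flag with an escape can be sketched roughly as follows.
This is a guess at the general shape only, not the actual Beta 2.2 format:
the escape value 0xFF, the two-byte extension, the TYPE_BYTE id, and all
function names are hypothetical.

```python
from typing import Tuple

ESCAPE = 0xFF     # hypothetical: marks "the type id continues in later bytes"
TYPE_BYTE = 0x01  # hypothetical id for the built-in "byte" type


def encode_tag(type_id: int) -> bytes:
    """Ids 0..254 fit in one byte; larger (user-defined) ids are escaped."""
    if type_id < 0:
        raise ValueError("type id must be non-negative")
    if type_id < ESCAPE:
        return bytes([type_id])
    extended = type_id - ESCAPE
    if extended > 0xFFFF:
        raise ValueError("type id too large for this sketch")
    # Escape byte, then a 2-byte big-endian extension.
    return bytes([ESCAPE]) + extended.to_bytes(2, "big")


def decode_tag(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (type_id, position of the first byte after the tag)."""
    first = buf[pos]
    if first != ESCAPE:
        return first, pos + 1
    extended = int.from_bytes(buf[pos + 1:pos + 3], "big")
    return ESCAPE + extended, pos + 3


def encode_byte_message(value: int) -> bytes:
    # One tag byte plus one payload byte.
    return encode_tag(TYPE_BYTE) + bytes([value])
```

Under these assumptions a single byte costs two bytes on the wire, and only
user-defined types pay for the longer tag.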

We got beaten up for being "proprietary", whatever that means.  People said
we should use HTTP.  In my view, the protocol part of HTTP deals only with
the transport part of the problem, not the payload part.  That criticism
disappeared when XML came onto the stage.  Then we were beaten up for not
using XML.  At least this criticism has some merit.  Anyone with a generic
XML parser can process the document.  Of course, they still need application
specific knowledge to know what to do with the fields.

XML is good, not for the human readable part, but because it solves the
syntax problem.  All parties need not produce the same sequence of bytes as
long as they obey the same schema.  That's good.  It makes the whole
communication piece less fragile.  Misplace a byte in a binary format, and
the entire message is unusable; do it in an XML document, and at most one
field is garbage.  
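
A small sketch of the "same schema, different bytes" point, using Python's
standard xml.etree.ElementTree parser.  The order document and its element
and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two senders produce different byte sequences for the same information:
# different attribute order, quoting, whitespace, and an optional declaration.
DOC_A = b'<order id="17" currency="USD"><item>widget</item><qty>3</qty></order>'
DOC_B = b"""<?xml version="1.0"?>
<order currency='USD' id='17'>
  <item>widget</item>
  <qty>3</qty>
</order>
"""


def extract(doc: bytes) -> dict:
    # A generic parser handles the syntax; knowing which fields matter
    # and what they mean is still application-specific.
    root = ET.fromstring(doc)
    return {
        "id": root.get("id"),
        "currency": root.get("currency"),
        "item": root.findtext("item"),
        "qty": int(root.findtext("qty")),
    }
```

Both documents yield the same dictionary from extract(), even though they
differ byte for byte.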

The downside is latency.  It takes about 100 ms to parse an XML document of
modest size.  You save only one-half to three-quarters of that time if you
pass the DOM tree.
This overhead would have been unacceptable in e-speak Beta 2.2, but DR 3.0
is focused on B2B communication, which can tolerate such latencies.

"If a man is talking in the forest, and there's no woman to hear him, is he
still wrong?"

_________________________
Alan Karp
Decision Technology Department
Hewlett-Packard Laboratories MS 1U-2
1501 Page Mill Road
Palo Alto, CA 94304
(650) 857-3967, fax (650) 857-6278


> -----Original Message-----
> From: Mark S. Miller [mailto:markm@caplet.com]
> Sent: Tuesday, September 26, 2000 9:24 PM
> To: dnm@pobox.com
> Cc: e-lang@eros-os.org
> Subject: Re: [E-Lang] Draft Kernel-E DTD & Sketch of translation to
> debuggable Java
> 
> 
> At 01:49 PM 9/26/00 , Dan Moniz wrote:
> >Perhaps, but JOSS is Java specific at one end, at least. Userland made XML-RPC
> >because they were dealing with a Java-less world powered primarily by C and
> >Perl, if memory serves.
> 
> "Language specific" is a funny thing.  JOSS & RMI make no bones about being
> Java specific.  CORBA and SOAP claim to be language neutral.  What do these
> claims mean?  It is certainly the case that Java programs have an easier
> time with JOSS and RMI than Smalltalk or C++ programs would.  For the
> "language neutral" systems, the relative amounts of pain for clients in
> various languages may be better balanced, giving real meaning to the claim
> of neutrality.  However, this compares relative pain along the wrong
> dimension.  Notice that a "language specific" system designed for an obscure
> language no one uses might appear language neutral by this standard,
> since the pain imposed on the non-obscure languages may be balanced.
> 
> This is essentially the situation with CORBA -- even though there was no
> obscure language, there may as well have been.  CORBA has as much of its own
> idiosyncratic semantics for each language to adapt to as any
> language-specific system would have.  Indeed, I don't see how it could be
> otherwise, if the standard has enough semantics to enable inter-operability
> among objects in different languages.  So, without knowing SOAP or XML-RPC
> well enough to say this with confidence, I'd guess either
> 1) it's true for them as well, or
> 2) they haven't specified enough semantics to enable cross-language
> interoperability, or
> 3) they've discovered a new principle of protocol design that they haven't
> called anyone's attention to.
> 
> Comparing pain the other way, we may say that standard X imposes strictly
> less pain than standard Y across languages A, B, and C, iff for at least one
> of these languages X is less painful than Y and for no languages is Y less
> painful than X.  It may be the case that X is designed around the semantics
> of language A while Y is designed to be language neutral.  However, if X is
> strictly less painful, there's no reason for B and C programmers to prefer Y
> to X.  I believe this to be true for the substitutions:
> 
> X = JOSS/RMI
> Y = CORBA
> A = Java
> B = C++
> C = Smalltalk
> 
> I don't know if this is true when Y = SOAP or XML-RPC, but I suspect so,
> except for the one advantage of these XML systems: a standard human-readable
> textual format.  Once I build an XML <-> JOSS converter, the
> human-readability should no longer be an issue, and we'll be in a better
> position to compare it to these other proposals.
> 
> We should expect an advantage across languages for well designed language
> specific systems which are specific to well designed languages: They already
> have a well thought out semantics to start from, and good language designers
> have in general designed much better semantics than good protocol designers.
> For purposes of these points, Java counts as a good language design.  CORBA
> has a semantics that could have come from being language specific for a bad
> language.
> 
> In any case, I'm not advocating JOSS/RMI of course, but JOSS/Pluribus
> http://www.erights.org/elib/object-pluribus/index.html , which is E-specific
> in the above sense.  My immediate need for the converter is simply to help
> me debug.  But to accommodate current tastes, I may eventually allow the
> connect-time protocol negotiation to negotiate to use the textual protocol.
> Given the converter, it should be painless to support this.
> 
> Strange isn't it? Using the massive MIPS the hardware guys give us to enable
> programs to speak to programs in human-readable notations.  If a readable
> tree calls in the network and no one reads it, ... ?
> 
> 
> >SOAP was a collaboration between DevelopMentor, Userland, and
> >Microsoft (at least in the beginning), and I don't see MS pushing anything Java
> >these days, not even J++ or MSJVM.
> 
> Yup.  By winning their lawsuit, Sun may have come out behind.  Microsoft's
> C# and .NET initiatives also look like interesting challenges to Java.  If
> Microsoft plays their cards right, I can imagine someday working on a
> compiler from E to the C#.NET virtual machine instruction set.  In some ways
> it looks like it could be a better target than the JVM for compiling E.
> 
> Thanks to Ken Kahn for drawing my attention to these.
> 
> 
> >... The presenter
> >quickly backpedaled and said he meant that the ROPE tool didn't support large
> >data sets, although the SOAP standard does.
> 
> Well, at least he did backpedal ;)
> 
> 
>          Cheers,
>          --MarkM
> 