[e-lang] VatTP and Waterken protocols compared

Bill Frantz frantz at pwpconsult.com
Fri Sep 19 19:54:01 CDT 2008


Tyler described some of the Waterken protocol in this Friday's
morning meeting. Here are some notes from his description.

Both the VatTP protocol, part of E on Java
<http://www.erights.org/elib/distrib/vattp/index.html>, and the
Waterken protocol <http://waterken.sourceforge.net/> allow vats to
send messages to other vats. Their implementations differ
significantly, and the differences affect how they behave during
checkpoints and under message flood conditions.

In VatTP, messages are sent as soon as they are generated. The
receiving thread takes the message and places it on the single vat
queue, and then returns to read another message. There is no
specific message acknowledgement.
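
As a rough illustration of the receive path just described, here is a
toy Python model. It is not actual VatTP code; the class and method
names are made up, and the unbounded queue is an assumption of the
sketch.

```python
from collections import deque


class FireAndForgetVat:
    """Toy model of an enqueue-and-return receive path (hypothetical names)."""

    def __init__(self):
        # The single vat queue; unbounded in this sketch (an assumption).
        self.queue = deque()

    def on_network_message(self, msg):
        # The receiving thread only enqueues and goes back to reading.
        # Nothing is returned to the sender, so there is no acknowledgement.
        self.queue.append(msg)

    def run_turn(self):
        # One vat "turn": take a single queued message, if there is one.
        if not self.queue:
            return None
        return self.queue.popleft()


vat = FireAndForgetVat()
for i in range(1000):
    vat.on_network_message(i)  # a fast sender is never slowed down
backlog = len(vat.queue)
first = vat.run_turn()
```

In this model nothing limits how far the backlog can grow ahead of the
turns that drain it.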

In Waterken, each vat keeps a queue of messages to be sent. A
message is only discarded when the remote vat acknowledges having
processed it. Messages that aren't idempotent include a serial
number to prevent double processing. The receiving process receives
a message from TCP, acquires the vat lock, passes the message to the
vat, waits for the response, releases the vat lock, and returns the
response to the sender. Messages that a vat sends to itself are
handled with the same logic.
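
The send side can be sketched the same way. This is a toy Python
model under my own assumptions, not Waterken code: every message
carries a serial number (Waterken only needs one on non-idempotent
messages), delivery is in order, and the names Sender, Receiver and
lose_ack are invented for the example.

```python
from collections import deque


class Receiver:
    def __init__(self):
        self.last_serial = 0  # highest serial number already processed
        self.log = []         # bodies processed, in order

    def deliver(self, serial, body):
        # Process each serial once; a repeat is acknowledged but not re-run.
        if serial > self.last_serial:
            self.log.append(body)
            self.last_serial = serial
        return serial  # the acknowledgement


class Sender:
    def __init__(self):
        self.outbox = deque()  # messages not yet acknowledged
        self.next_serial = 1

    def send(self, body):
        self.outbox.append((self.next_serial, body))
        self.next_serial += 1

    def flush(self, receiver, lose_ack=False):
        # Transmit from the head of the queue; discard only on ack.
        while self.outbox:
            serial, body = self.outbox[0]
            ack = receiver.deliver(serial, body)
            if lose_ack:
                return  # simulate a lost ack: the message stays queued
            if ack == serial:
                self.outbox.popleft()


sender, receiver = Sender(), Receiver()
sender.send("deposit 5")
sender.send("deposit 7")
sender.flush(receiver, lose_ack=True)  # processed, but the ack never arrives
queued_after_loss = len(sender.outbox)
sender.flush(receiver)                 # resend; the duplicate is ignored
```

The first message is transmitted twice but processed once, which is
the job of the serial number.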

Both systems only save persistent state between vat "turns". For
VatTP, it is not clear from the (non)documentation whether the vat
queue is included in what the persistence mechanism saves. (And I
can't remember that detail.) In Waterken, the output queue,
including messages a vat sends to itself, is saved as part of the
persistent state.

These differences give Waterken different failure modes than VatTP:

(1) In Waterken, message sends are back-pressured into the sending
vat by the TCP flow control mechanism. This prevents runaway sending
programs from crashing the receiving vat by running it out of message
buffer space. However, there may be situations where neither vat can
process more messages because each is waiting for space to send
the other vat a message.

(2) In the Waterken system, a message which causes a serious
program failure (e.g. running out of space in the current
implementation) will be repeated, possibly causing that failure
over and over, until someone repairs the offending program.
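
Failure mode (2) can be illustrated with a small Python sketch. It
is a toy model, not Waterken code: the function and handler names are
invented, and MemoryError merely stands in for the "serious program
failure".

```python
def retry_until_acknowledged(outbox, handler, max_attempts):
    """Resend outbox[0] until the handler succeeds or attempts run out.

    Returns the number of failed attempts. Hypothetical helper.
    """
    failures = 0
    for _ in range(max_attempts):
        if not outbox:
            break
        try:
            handler(outbox[0])
        except MemoryError:
            failures += 1  # no ack, so the message stays queued
            continue
        outbox.pop(0)      # acknowledged: only now is it discarded
    return failures


def buggy_handler(msg):
    raise MemoryError("out of space")  # stands in for the serious failure


def repaired_handler(msg):
    return "ok"


outbox = ["big request"]
failures = retry_until_acknowledged(outbox, buggy_handler, max_attempts=3)
still_queued = list(outbox)
retry_until_acknowledged(outbox, repaired_handler, max_attempts=3)
```

The message fails on every attempt and stays queued until the
handler is repaired, at which point it is processed and discarded.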

Cheers - Bill

-----------------------------------------------------------------------
Bill Frantz        | gets() remains as a monument | Periwinkle
(408)356-8506      | to C's continuing support of | 16345 Englewood Ave
www.pwpconsult.com | buffer overruns.             | Los Gatos, CA 95032

