Virtual Atomicity (was: purse deposit and the Cambio)
Mark S. Miller
markm@caplet.com
Thu, 24 Jun 1999 13:42:25 -0700
At 05:29 PM 6/18/99 , Ka-Ping Yee wrote:
>Date: Fri, 18 Jun 1999 03:30:11 -0700 (PDT)
>From: Ka-Ping Yee <ping@lfw.org>
>The deposit method
>provides a little extra atomicity good for some simple cases, but
>without a general way to do locking or reversible transactions
>you can't then compose new compound operations out of the elementary
>ones in ERTP that remain safely atomic.
[+] The Kernel-E contract must provide adequate atomicity constructs.
However, these don't need to resemble "locking" or "reversible
transactions". The Famed IBM 360 "Principles of Operation" (The POO)
introduced the term "unit of operation". Computation proceeds forward one
unit of operation at a time. Each unit of operation transforms a
well-definable computation state into another well-definable computation
state. Units of operation are interleaved, but computation as a whole is
faithfully describable as a serializable sequence of such atomic units. For
the 360, the unit was typically the instruction. But some instructions, like
block move, were a composite of such units.
Norm & Jonathan, please correct me here, as I'm probably about to screw up,
but here goes:
In KeyKOS & EROS, the units of operation are
1) Whatever the user-mode-instruction units of operation are.
2) The message pass (or gate jump). Call, Fork, & Return
3) When the invoked object is a kernel object, the servicing of the
request. (Thereby avoiding the need to include kernel state in
the definition of "computation state".)
At a higher level, KeyKOS's and EROS's monitor-like exclusive entry into a
domain enables domains to service many requests as if the entire servicing
of a single request were a similar atomic unit, even though it involves many
machine instructions that are interleaved in time with machine instructions
elsewhere. Where this fiction is successful, one may imagine a "virtual
unit of operation".
One can similarly define E's semantics in terms a hierarchy of such units of
operation.
* At the lowest level are the individual eval, match, and call steps in
evaluating Kernel-E parse tree. The semantics of the E virtual machine is
defined in terms of Kernel-E parse trees (really ASTs), not instructions.
However, in order to do a POO style semantics, I need to show an "explicit
control evaluator", essentially a register machine (ref Sussman & Abelson,
"Structure and Interpretation of Computer Programs") rather than a mutually
recursive evaluation algorithm, as the latter hides computation state.
Fortunately, an explicit control evaluator is at least as straightforward to
define as an instruction interpreter.
>From the point of view of such a definition, all accessible Java is
primitive, even when it calls back into E. Which brings us to stack frames.
* The next layer is stack-frame to stack-frame, or caller-to-callee. Since
1) E requires strict stacking discipline, 2) there may be at most one stack
executing inside a vat, and 3) (as we'll see in the next item) a vat only
has synchronous access within the vat; therefore we guarantee that the state
synchronously accessible to a caller after return from a callee must be the
synchronously accessible state in which the callee was called + modification
performed *only* by this callee (and his callees, etc..).
Since Java can call back into E, "computation state" must include Java
computation state. I hope this won't be as messy as I fear it will be.
* The unit of operation for concurrent distributed computation is the Turn.
Every time a message is asynchronously sent
recipient <- foo(args)
(read "recipient should eventually foo the args") the delivery of this
message to this recipient will happen in its own Turn in the Vat hosting the
recipient. Each Vat performs one Turn at a time by synchronously delivering
the message to the recipient, thereby spawning an initial stack frame. This
stack frame can then do all the things described by the previous two
granularities, but until this top stack frame exits, it's still its Turn,
and all other Turns for the same Vat are excluded. Once this Turn is over,
then, if there are further available Turns, the Vat must choose one and
proceed.
The order in which a Vat chooses Turns is constrained by the partial
ordering rules of message delivery, but is otherwise arbitrary. (If
necessary, there is an old priority proposal, consistent with the rest of
E's semantics, than could be dusted off. But I dearly don't want to.)
Because a Turn only has synchronous access within its own Vat, one may
faithfully describe overall distributed computation as a globally
serializable sequence of atomic Turns, even though Vats proceed in parallel
with each other, with no inter-Vat mutual exclusion.
This property is a big deal. This reconciliation of Actor-style and
sequential programming is *the* important invention, by Doug Barnes, that
distinguishes E both from the purely concurrent Actor tradition and from
other attempts to add concurrency to a sequential base.
* With the partial ordering constraints on message delivery, Kernel-E
provides further tools to deal with concurrency control issues, but
hopefully that will be beyond the scope needed for the following issues.
Though I'm violating atomicity, more later,
--MarkM