[E-Lang] Rights Amplification: The Next Layer Up

Tyler Close tjclose@yahoo.com
Wed, 6 Sep 2000 16:12:22 +0100


Markm wrote:
>  The biggest thing missing from E is persistence, and both persistence
> and comm have at their core serialization. Objects need to serialize
> themselves very differently to these two streams: objects encapsulating
> precious secrets or authority need to not serialize them to the comm
> stream, of course, as that is being transmitted to untrusted machines.
> OTOH, it is these precious things that most desperately want to be
> serialized to the persistence stream, so they won't disappear when
> Windows crashes.

There's another way of looking at this problem. The other way leads to
a better solution. You can see it in action by playing with the
Droplets environment. (Hmmm... a persistent capability system, wonder
if I'd find any clues there?)

In tl-E, you've only got one kind of local settled reference, the Java
primitive reference. When thinking about serialization, this forces
you to try to solve issues by mucking around with the object. That the
Java APIs also muck around with the object doesn't help your
intuition.

The better way to handle serialization issues is to muck around with
the references. The references are already something that the TCB is
expected to muck around with so we're not spreading the muck. You will
also be forced to muck around here a bit in order to implement
transparent object faulting.

I'll start with the conclusion and then work back from there. You can
stop reading once you get it.

The conclusion: "All implementation-level objects are pass-by-copy."
This means that the serialization subsystem, whether comm or local,
should serialize any Java reference that it happens upon. Happily,
this is also what Java Serialization has done ever since its first
version.

The Aha!:
---------
"The BLOBing of objects for efficient state preservation is shockingly
similar to the BLOBing of objects required for mutual suspicion."

First, the efficient state preservation side of the story. Consider a
simple class:

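import java.util.Set;

// Address is assumed to be defined elsewhere as a simple 'descriptor' class.
class Person
{
	String name;      // bottoms out in raw characters
	Address address;  // itself made up of further Strings
	Set friends;      // the other Person objects this one knows
}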

A naive persistence implementation might try to store each object
individually. This means that when a Person object was stored, it
would be stored as a list of links to other objects in the database
(the name String, the address Address and the friends Set). Each of
these other objects would be stored separately, eventually all
bottoming out in primitive types. For example the String object would
be stored as raw characters, but the Address object would be stored as
a list of links to other String objects.

A better persistence implementation would try to BLOB objects together
so they could be stored, and retrieved, in one step. This means that
instead of each object having its own unique storage, some objects
will share storage. For example the Person object would be stored as
one serialized BLOB, with the name String, address Address and friends
Set all clumped together at one place in the database.

For this BLOBing to make sense, all of the BLOBed objects must be
pass-by-copy. There are many reasons for this, some more complicated
to explain than others. The easiest reason to understand is that more
than one BLOB may have a pointer to a given BLOB component object. For
example, two Person objects may point to the same Address object. If
each Person object stores its own copy of the Address object then, in
effect, there are now two separate, but equal, Address objects. For
the program to continue on without noticing this change, the Address
type must be what E calls "selfless", or pass-by-copy.
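
To make the shared Address case concrete, here is a minimal sketch using
plain Java Serialization. The field names, the BlobStore helper and the
demo class are assumptions made up for illustration, not Droplets code,
and the friends Set is left out to keep it short. Each Person is written
as its own BLOB, so the shared Address comes back as two separate but
equal objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Objects;

// Selfless value: immutable, and compared by content rather than identity.
final class Address implements Serializable {
    private static final long serialVersionUID = 1L;
    final String street;
    final String city;

    Address(String street, String city) {
        this.street = street;
        this.city = city;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Address)) return false;
        Address other = (Address) o;
        return street.equals(other.street) && city.equals(other.city);
    }

    @Override public int hashCode() { return Objects.hash(street, city); }
}

// Simplified Person (hypothetical): only the parts needed for the example.
final class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final Address address;

    Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }
}

// Hypothetical helper: one serialized object graph == one BLOB.
final class BlobStore {
    static byte[] toBlob(Serializable root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object fromBlob(byte[] blob) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class SharedAddressDemo {
    public static void main(String[] args) {
        Address home = new Address("1 Main St", "Springfield");
        Person alice = new Person("Alice", home);
        Person bob = new Person("Bob", home);

        // Store and reload each Person as its own BLOB.
        Person alice2 = (Person) BlobStore.fromBlob(BlobStore.toBlob(alice));
        Person bob2 = (Person) BlobStore.fromBlob(BlobStore.toBlob(bob));

        System.out.println(alice2.address == bob2.address);      // false: identity is gone
        System.out.println(alice2.address.equals(bob2.address)); // true: content survives
    }
}
```

A program that only ever compares Address objects with equals() cannot
tell that the sharing was lost, which is the point of requiring the type
to be selfless.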

(The astute in the crowd will notice at this point why I was arguing so
aggressively for immutable containers. Only immutable containers can
be BLOBed: a mutable container copied into two BLOBs would let the
copies drift apart.)

The size of a BLOB is determined by the size of the graph of reachable
Java references. We delimit this graph by introducing a new reference
type, a reference to a pass-by-reference object (an object with a
notion of self). In Droplets, this reference type is implemented by
the BLOBReference class. In our Person example, objects of type Person
would always be referred to by a BLOBReference, not a normal Java
reference. This means that the 'friends' member would be a Set of
BLOBReference. When the Person object is serialized, the serialized
graph stops at the BLOBReference. The BLOBReference itself is a Java
object, referred to by a Java reference, so it gets serialized into
the BLOB, but the BLOBReference does not contain a Java reference to
the referred-to object, so that's the edge of the graph.
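
A rough sketch of where the graph stops, under some loud assumptions:
BlobRef below is a hypothetical stand-in for the BLOBReference class (the
real Droplets class is not shown in this message), it identifies its
target by a made-up numeric id, and the Map plays the role of the
database.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for BLOBReference: it holds an id, never a Java
// reference to the target, so serialization cannot walk past it.
final class BlobRef implements Serializable {
    private static final long serialVersionUID = 1L;
    final long targetId;

    BlobRef(long targetId) { this.targetId = targetId; }

    @Override public boolean equals(Object o) {
        return o instanceof BlobRef && ((BlobRef) o).targetId == targetId;
    }

    @Override public int hashCode() { return Long.hashCode(targetId); }
}

// Person has a notion of self, so other Persons are reached via BlobRef.
final class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final Set<BlobRef> friends; // immutable copy, so it can live in the BLOB

    Person(String name, Set<BlobRef> friends) {
        this.name = name;
        this.friends = Collections.unmodifiableSet(new HashSet<>(friends));
    }
}

final class GraphEdgeDemo {
    static byte[] toBlob(Serializable root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Crude check: does the given text appear anywhere in the BLOB bytes?
    static boolean mentions(byte[] blob, String text) {
        return new String(blob, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        Map<Long, Person> database = new HashMap<>(); // pretend object store
        database.put(2L, new Person("Bob", Collections.emptySet()));

        Person alice = new Person("Alice", Collections.singleton(new BlobRef(2L)));
        database.put(1L, alice);

        byte[] blob = toBlob(alice);
        System.out.println(mentions(blob, "Alice")); // alice's own state is in her BLOB
        System.out.println(mentions(blob, "Bob"));   // bob's state is not: only his id is
    }
}
```

The BlobRef itself lands in Alice's BLOB, but Bob's state stays in his
own BLOB, to be fetched by id when someone actually uses the reference.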

At runtime, the TCB catches method invocations on BLOBReferences and
directs them to the referred-to object. It's in this handler code that
the TCB can implement transparent object faulting (waiting until an
object actually gets used before reading it from disk).
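
One way to picture that handler is with a java.lang.reflect.Proxy. This
is only a sketch under assumptions: I am not claiming this is how the
Droplets TCB catches invocations, and the Named interface, StoredPerson
class and the Supplier standing in for "read it from disk" are all made
up for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical interface through which callers use the object.
interface Named {
    String name();
}

// Hypothetical object that would normally live on disk.
final class StoredPerson implements Named {
    private final String name;
    StoredPerson(String name) { this.name = name; }
    @Override public String name() { return name; }
}

// Catches every invocation on the reference and loads the target lazily.
final class FaultingHandler implements InvocationHandler {
    private final Supplier<?> loader; // stands in for reading the BLOB from disk
    private Object target;            // null until the first invocation
    int loads = 0;                    // counts faults, for the demo only

    FaultingHandler(Supplier<?> loader) { this.loader = loader; }

    @Override
    public synchronized Object invoke(Object proxy, Method method, Object[] args)
            throws Throwable {
        if (target == null) {
            target = loader.get(); // the object fault happens here
            loads++;
        }
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow what the target itself threw
        }
    }
}

final class FaultingDemo {
    // Builds a reference that looks like a Named but faults on first use.
    static Named lazy(FaultingHandler handler) {
        return (Named) Proxy.newProxyInstance(
            Named.class.getClassLoader(), new Class<?>[] { Named.class }, handler);
    }

    public static void main(String[] args) {
        FaultingHandler handler =
            new FaultingHandler(() -> new StoredPerson("Alice"));
        Named ref = lazy(handler);

        System.out.println(handler.loads); // 0: holding the reference loads nothing
        System.out.println(ref.name());    // Alice: first use triggers the load
        System.out.println(handler.loads); // 1
    }
}
```

The caller never sees the difference between a loaded and an unloaded
object, which is what makes the faulting transparent.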

Now, the mutual suspicion side of the story.

When sending asynchronous method invocations across Vat boundaries,
the parameters need to be serialized and sent to the external Vat.
Some of these parameters should be pass-by-copy, and some should be
pass-by-reference.

If an object is selfless, and does not act as a protection proxy for a
'secret' (i.e. a capability for another object), then the object should
be pass-by-copy.

If an object has identity (a notion of self), or is a protection
proxy, then the object should be pass-by-reference.

This is 'shockingly similar' to the rules for database BLOBing. The
only difference is the additional consideration for protection
proxies.
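
Written down as code, the two rules differ by exactly one term. The
enum and method names here are invented for the sketch; nothing like
them is claimed to exist in E or Droplets.

```java
// How a referenced object should travel or be stored.
enum Passing { BY_COPY, BY_REFERENCE }

final class PassingRule {
    // Database BLOBing: only selfless objects may be copied into a BLOB.
    static Passing forStorage(boolean selfless) {
        return selfless ? Passing.BY_COPY : Passing.BY_REFERENCE;
    }

    // Comm across Vats: same rule, plus the protection proxy consideration.
    static Passing forComm(boolean selfless, boolean protectionProxy) {
        return (selfless && !protectionProxy)
            ? Passing.BY_COPY
            : Passing.BY_REFERENCE;
    }

    public static void main(String[] args) {
        // An Address-like descriptor: selfless, guards no secret.
        System.out.println(forComm(true, false));
        // A selfless object that wraps a capability: keep it home.
        System.out.println(forComm(true, true));
        // Anything with identity, such as a Person.
        System.out.println(forComm(false, false));
    }
}
```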

In Droplets, I have adopted the convention that only selfless,
non-protection-proxy objects are referred to using Java references, and
everything else is referred to using a BLOBReference. In practice, I
have found this convention to be very natural to work with. By
default, I code everything as pass-by-reference. In cases where I know
that the class I am making is a 'descriptor'-like object (e.g. Address),
I use the pass-by-copy coding instead. It's a very
elegant solution to a very complex problem. I highly recommend that E
do the same.

E already has syntax for making selfless objects. This is all the
syntax it needs. This is an implementation issue. Please don't muck up
the E language for something that can be solved at the implementation
level.

Tyler

