Re: Split Capabilities: Making Capabilities Scale
Mark S. Miller (markm@caplet.com)
Fri, 07 Jul 2000 10:59:42 -0700

>We can see that the problem is tied up with how the objects controlled by
>the capabilities are designated. As I understand the way E works, each
>object has a name made up in part by a large, unguessable number chosen to
>guarantee statistical uniqueness.

Essentially yes. Now the quibbles.

First, I like to reserve the term "name" for strings intended to be interpretable at least by humans. In this usage, an E object has no "name", but in some contexts does have a unique identification string, as you say "made up in part by a large, unguessable number chosen to guarantee statistical uniqueness". We refer to this number as the "SwissNumber", as it has the "knowledge is authority" property popularly thought to be associated with Swiss bank account numbers.

Second, the representation used depends on the context. Since assumed overhead issues are a concern in your later messages, I will detail the overheads involved. The representation you refer to above is used in three contexts:

  1. As the encoding on the wire when capabilities are communicated between vats (ignoring an optimization http://www.erights.org/elib/object-pluribus/index.html which saves some space in the typical case).
  2. As the encoding encapsulated within a SturdyRef, which enables a live (typically remote) reference to be recovered after network partition/recovery.
  3. As the off-line representation of a capability to be communicated by means outside of E ( http://www.erights.org/elang/concurrency/introducer.html ).

In all the above cases, the encoding consists of the following three pieces of information:

The opposite context is a live reference held by object A designating object B, when A and B are in the same vat. This is simply a conventional object reference made of normal language-technology pointer material. In a JVM-based implementation, this would be implemented however that JVM implements a pointer to a Java object, and there would be one pointer per facet. In ENative http://www.erights.org/enative/index.html , there are two raw pointers per reference -- one to point at the object's storage and one to point at the script associated with the storage. The latter pointer is like a C++ vtable, but by moving it into the reference, we only have one allocated instance per composite. Each facet of this composite merely pairs that storage with a different script without further allocation.
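To make the two-pointer layout concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how ENative is written: `Ref`, `make_counter_facets`, and the two script dictionaries are hypothetical names, and a dictionary stands in for the object's storage.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

# A "script" plays the role of a vtable: verb name -> behavior over storage.
Script = Dict[str, Callable[..., Any]]


@dataclass(frozen=True)
class Ref:
    """A reference made of two pointers: one to storage, one to a script."""
    storage: Dict[str, Any]
    script: Script

    def send(self, verb: str, *args: Any) -> Any:
        # Dispatch through the script carried in the reference itself.
        return self.script[verb](self.storage, *args)


def _incr(storage: Dict[str, Any]) -> int:
    storage["count"] += 1
    return storage["count"]


def _read(storage: Dict[str, Any]) -> int:
    return storage["count"]


INCR_SCRIPT: Script = {"incr": _incr, "read": _read}
READ_SCRIPT: Script = {"read": _read}  # read-only facet: no "incr" verb


def make_counter_facets() -> Tuple[Ref, Ref]:
    # One allocation for the composite's storage...
    storage = {"count": 0}
    # ...and each facet pairs that same storage with a different script.
    return Ref(storage, INCR_SCRIPT), Ref(storage, READ_SCRIPT)
```

Both facets share one piece of storage; a holder of the read-only facet simply has no script entry through which to increment.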

If a reference to B is never exported from its hosting vat, and if no one makes a sturdy or off-line reference to B, which will be the case for the vast majority of objects, then no further representation overheads are associated with B. SwissNumbers and the other information mentioned earlier only get associated with objects on an as-needed basis. Which brings us to:

A live reference to an object in a remote vat. This is how A designates B when they are in different vats. Likewise, it is what the wire encoding of a capability decodes into. In this case, A holds a conventional language-technology object reference not to B, but to an object in A's vat that stands for B. Let us call the object A directly points to a RemoteRef. All objects in the same vat as A that also have a live reference to B share this RemoteRef. There's a lot of information per RemoteRef, in order to manage the message pipelining of live messages sent over this reference. However, none of this is persistent information. The only remote references that survive a checkpoint/crash/revive are SturdyRefs, covered above. The persistent encoding of a RemoteRef merely need be adequate to restore a broken reference on revival, since a revival implies a network partition, and a network partition also causes RemoteRefs to become broken.
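A rough Python sketch of the lifecycle just described, under stated assumptions: the class and method names (`Vat.remote_ref_for`, `Vat.partition`, `SturdyRef.live_ref`) are hypothetical, a plain string stands in for the encoded capability, and a list stands in for the per-reference pipelining state.

```python
from typing import Any, Dict, List, Tuple


class BrokenReferenceError(Exception):
    """Raised when a message is sent over a broken reference."""


class RemoteRef:
    """Local stand-in for a remote object; all of its state is transient."""

    def __init__(self, encoding: str) -> None:
        self.encoding = encoding
        self.pending: List[Tuple[str, tuple]] = []  # queued messages
        self.broken = False

    def send(self, verb: str, *args: Any) -> None:
        if self.broken:
            raise BrokenReferenceError(self.encoding)
        self.pending.append((verb, args))


class Vat:
    def __init__(self) -> None:
        self._remote_refs: Dict[str, RemoteRef] = {}

    def remote_ref_for(self, encoding: str) -> RemoteRef:
        # Every local holder of a reference to the same remote object
        # shares a single RemoteRef.
        ref = self._remote_refs.get(encoding)
        if ref is None:
            ref = RemoteRef(encoding)
            self._remote_refs[encoding] = ref
        return ref

    def partition(self) -> None:
        # A partition (or a crash/revive) breaks every live remote reference.
        for ref in self._remote_refs.values():
            ref.broken = True
            ref.pending.clear()
        self._remote_refs.clear()


class SturdyRef:
    """Holds only the encoding, so it survives partitions and checkpoints."""

    def __init__(self, encoding: str) -> None:
        self.encoding = encoding

    def live_ref(self, vat: Vat) -> RemoteRef:
        return vat.remote_ref_for(self.encoding)
```

After `partition()`, the old RemoteRef stays broken for good; a holder that also kept a SturdyRef can ask it for a fresh live reference.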

We pay these costs because message pipelining ( http://www.erights.org/elib/concurrency/pipeline.html , which still needs documentation) largely addresses the latency issues you're concerned about.

> (I hate statistical uniqueness unless you
>tell me what happens in the case of a collision. It shouldn't happen in the
>life of the Universe, but nothing says it won't. A more serious problem is
>the various flaws in the system, including human error. But this topic is
>for another discussion.)

For security among mutually suspicious parties on open networks, in the absence of any one universally trusted third party, the only known means of trustable interaction is cryptography. Cryptography depends on the statistical uniqueness of private keys (or other secret information). Any cryptographic proposal must be examined wrt the consequences of key theft. A collision will seem like an undetected case of key theft.

In E, the consequence is a compromise of the interests of the party from whom the key is "stolen", and therefore of those who trust that party to operate in an uncompromised fashion. But nothing further. Those users of E who organize their activities and relationships to avoid trust hot spots will in aggregate suffer limited damage. Likewise, an individual who divides his affairs among many vats in a similar fashion may bound his worst-case damage to the compromise of a single key. On the other hand, this may have complexity costs that aren't worth paying.
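On the collision odds raised in the quoted question, a rough Python sketch can put a number on "statistical uniqueness". It computes the standard union (birthday) bound on the chance that any two of n uniformly random b-bit numbers coincide. The bit width and population in the usage note are illustrative assumptions only; this message does not state how large a SwissNumber is.

```python
from fractions import Fraction


def collision_upper_bound(num_ids: int, bits: int) -> Fraction:
    """Upper bound on P(any two of num_ids random `bits`-bit values collide).

    Union bound: each of the C(n, 2) pairs collides with probability
    2**-bits, so the total is at most C(n, 2) / 2**bits (capped at 1).
    """
    pairs = num_ids * (num_ids - 1) // 2
    return min(Fraction(1), Fraction(pairs, 2 ** bits))
```

For example, `float(collision_upper_bound(10**18, 160))` bounds the chance of any collision among a billion billion hypothetical 160-bit numbers. The bound covers accidental collisions only, and says nothing about theft or human error.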

>I presume that the object handle is encoded in
>this name, so it can be wildcarded, but I don't know what it means to
>wildcard the handle to a general object.

As Jonathan mentioned, E itself has no notion of wildcarding. Further, an E object reference is totally opaque to the E programmer, so there's no ability to aggregate objects based on any internal structure of the reference's representation. The one case of this that would be both possible and meaningful is to aggregate objects based on their hosting vat.

>Clearly, the Vat needs a mapping between the object and its identifier in
>the capability. My understanding is that the mapping is encoded in the
>identifier. However, the designation could also be an entry into a table
>kept in the Vat. This table would differ from the e-speak repository in the
>information associated with the object, and the capabilities would be
>cryptographically secure. However, the principle is the same.

Actually, the association between a SwissNumber and the object it designates is via a table in the vat, and is used to dereference all references that use the SwissNumber representation. Without this table, these reference representations by themselves are insufficient to determine what object is designated. So, in one sense, we are doing what you suggest. However, we make no use of the flexibility this table provides -- we use it only as an effectively immutable mapping between SwissNumbers and designated objects. If these techniques would indeed add value to E, it seems we already have the data structures necessary to support them.
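As a minimal sketch of such a table, assuming details not given here: `SwissTable`, `export`, and `lookup` are hypothetical names, and the 160-bit width is an arbitrary choice for illustration. An object gets a number only when it is first exported, and an entry is never rebound.

```python
import secrets
from typing import Any, Dict


class SwissTable:
    """Per-vat mapping from unguessable numbers to the objects they designate."""

    BITS = 160  # assumed width, for illustration only

    def __init__(self) -> None:
        self._by_swiss: Dict[int, Any] = {}
        self._swiss_by_id: Dict[int, int] = {}  # id(obj) -> its number

    def export(self, obj: Any) -> int:
        """Assign a number on first export; later exports reuse it."""
        key = id(obj)  # stable, since the table keeps obj alive
        swiss = self._swiss_by_id.get(key)
        if swiss is None:
            swiss = secrets.randbits(self.BITS)
            self._by_swiss[swiss] = obj
            self._swiss_by_id[key] = swiss
        return swiss

    def lookup(self, swiss: int) -> Any:
        """Dereference; knowing the number is the only authority needed."""
        return self._by_swiss[swiss]  # KeyError for an unknown number

    def __len__(self) -> int:
        return len(self._by_swiss)
```

Objects that are never exported never appear in the table, so they carry no extra cost. Because the table offers no way to rebind a number, it behaves as the effectively immutable mapping described above.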

         Cheers,
         --MarkM