[cap-talk] Persistence as a cap value (was: Re: ...PLASH discussion)

Jed Donnelley capability at webstart.com
Wed Mar 12 13:23:55 EDT 2008


At 07:58 AM 3/12/2008, Jonathan S. Shapiro wrote:
>On Tue, 2008-03-11 at 23:27 -0700, Jed Donnelley wrote:
> > The fact that persistence issues keep surprising (me anyway,
> > others?) suggests to me that perhaps this major classification
> > category for capability systems may not be given enough emphasis/
> > clarity.  I believe that in my heart I always think of capability
> > systems as manipulating persistent capabilities and keep being
> > surprised when I find that many (most?) aren't.
>
>Well it won't surprise you that (a) I agree,

Glad to hear it.

>and (b) I've put some thought into this.

That is what I hoped to elicit, though as for surprise, I had no
preconceptions.  I notice you didn't include any references
below.  Are there any?

I do recall some discussion of persistence for E capabilities
supported by vats, e.g. as recently as:

http://www.eros-os.org/pipermail/cap-talk/2007-August/008750.html

but I'm afraid I don't remember how E (Joe-E?, others?) manages
persistence (or not) for local objects.  Sorry if I'm just being
lazy here; a brief reference might do the trick.  Heh, perhaps
this reference to a patent by our friends:

Miller, Mark S. (Los Altos, CA)
Hardy, Norman (Portola Valley, CA)
Tribble, Dean E. (Los Altos Hills, CA)
Hibbert, Christopher T. (Mountain View, CA)
Hill, Eric C. (Palo Alto, CA)

at: http://www.freepatentsonline.com/6049838.html

might be relevant to this discussion?  However, there again
the focus seems to be on a distributed system.  Does that
approach assume that local objects are persistent?  If so,
how is that accomplished?

>Terms: explicit persistence vs. orthogonal persistence. The orthogonal
>kind is transparent.
>
>It comes down to two issues: instances and consistency. Here are what
>seem to be the key intuitions that I have come to.
>
>I. NAMING AND PROTECTION
>
>In any system, *some* things are persistent. Any persistent thing can be
>induced to have a durable unique identity, and given a durable unique
>identity we can both name that thing by a capability (it can be a
>target) and we can associate capabilities with that thing (it can act as
>a capability container).

I haven't heard that "container" term before.  The "thing" presumably
is an 'object' with a capability as a reference.  What does the
"container" term contribute?

>In explicitly persistent systems, application images are generally
>persistent, and it is easy to say things like "any instance of
>application A should be granted capabilities X and Y". Process instances
>generally are NOT persistent, so it is not straightforward to associate
>capabilities with process instances in a way that will survive shutdown
>and restart.
>
>The nearest approximation to this that I have found is something like
>the following:
>
>   1. Invent a "process instance ID allocator" that allocates unique IDs
>      for process instances, and associates each such ID with a binary
>      image and a schedule.
>
>   2. Provide means for an application to store associations between an
>      application ID and capabilities. Further, provide a means for
>      voluntary per-process persistence.
>
>   3. On re-start, kick off new copies of those processes under the old
>      IDs, and let them unpickle the old state to the extent that they
>      want to.
>
>The problem is that this really only seems to work out sensibly for
>things like daemons.

I think you have just about described the mechanism that we used
in NLTSS, as I noted in a parallel message.
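
A minimal sketch of the three quoted steps, in Python, may make the
shape of the mechanism clearer.  This is an illustration under
assumptions, not NLTSS code and not Jonathan's design: the names
InstanceRegistry, allocate, grant, checkpoint and restart_all are
hypothetical, capabilities are stood in for by opaque strings, and a
JSON file stands in for whatever durable store the system provides.

```python
import json
import os
import tempfile


class InstanceRegistry:
    """Durable table: instance ID -> image, schedule, capabilities, saved state."""

    def __init__(self, path):
        self.path = path
        self.table = {"next_id": 1, "instances": {}}
        if os.path.exists(path):
            with open(path) as f:
                self.table = json.load(f)

    def _flush(self):
        # Write to a temp file, then rename, so a crash cannot leave a torn table.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.table, f)
        os.replace(tmp, self.path)

    def allocate(self, image, schedule):
        # Step 1: a unique instance ID bound to a binary image and a schedule.
        instance_id = str(self.table["next_id"])
        self.table["next_id"] += 1
        self.table["instances"][instance_id] = {
            "image": image, "schedule": schedule, "caps": [], "state": None,
        }
        self._flush()
        return instance_id

    def grant(self, instance_id, cap):
        # Step 2a: associate a capability with the instance ID.
        self.table["instances"][instance_id]["caps"].append(cap)
        self._flush()

    def checkpoint(self, instance_id, state):
        # Step 2b: voluntary per-process persistence; the process decides what to save.
        self.table["instances"][instance_id]["state"] = state
        self._flush()

    def restart_all(self, launch):
        # Step 3: start new copies under the old IDs and hand back the saved state.
        return {
            iid: launch(iid, rec["image"], rec["caps"], rec["state"])
            for iid, rec in self.table["instances"].items()
        }


def demo():
    path = os.path.join(tempfile.mkdtemp(), "instances.json")
    reg = InstanceRegistry(path)
    iid = reg.allocate("daemon.img", "at-boot")   # hypothetical image and schedule
    reg.grant(iid, "cap:log-writer")              # hypothetical capability name
    reg.checkpoint(iid, {"lines_written": 42})
    # Simulated shutdown: drop the in-memory registry and reload from disk.
    restarted = InstanceRegistry(path)
    return restarted.restart_all(lambda iid, image, caps, state: (image, caps, state))
```

Note what the sketch does not do: it restores each instance's own
capabilities and state, but nothing in it re-establishes agreement
between two restarted instances.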

>You can get the processes restarted okay, but you
>then need to re-build all of the *connections* associated with those
>processes, and that is hard.

Interesting that you would focus on "connections".  We didn't have
"connections" per se.  It could be that at any time a process was
in the middle of sending or receiving a message - e.g. when
saved in a consistent state or when recovered from lost state.
In any case processes must have mechanisms to deal with arbitrary
behavior on the part of receivers, so for us such mechanisms
were adequate to deal with "connections" that were impacted
by state loss.

>You're almost forced into a sessionful
>communication paradigm (as opposed to RPC) so that the various parties
>can re-negotiate consistency.

I guess that's one way of describing the approach we took.

>In the end, I suspect that the complexity of this is fairly large, but I
>haven't actually tried to implement it.

As I noted, we implemented it and didn't find it very complex.
We did have to support a timeout mechanism for such situations.
We had some interesting issues come up with the so-called
"good guy" timeout mechanism that we used (I was heavily involved
in early implementation).  The basic idea was that timeouts
should be as long as practically possible (e.g. when interacting
with another process that might be being debugged).  To achieve
this we only triggered timeouts where limited resources were
needed - in practice almost never.  This worked out quite
effectively in nearly all cases.
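
One possible reading of the "good guy" idea, as a Python sketch.  This
is a guess at the shape of the policy, not the NLTSS implementation:
PendingTable, begin and complete are hypothetical names, and the fixed
pool of slots is an assumed stand-in for whatever limited resource was
actually involved.  A waiting exchange is never timed out because of
elapsed time alone; it is only dropped when a new exchange needs a
slot and none is free.

```python
class PendingTable:
    """Fixed pool of slots for outstanding exchanges (an assumed limited resource)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.pending = {}     # exchange id -> start time, oldest first
        self.timed_out = []   # (exchange id, time waited) for dropped exchanges

    def begin(self, exchange_id, now):
        if len(self.pending) >= self.capacity:
            # Resource pressure is the only trigger: drop the oldest waiter.
            oldest = next(iter(self.pending))
            started = self.pending.pop(oldest)
            self.timed_out.append((oldest, now - started))
        self.pending[exchange_id] = now

    def complete(self, exchange_id):
        # Returns False if the exchange was already timed out (or never began).
        return self.pending.pop(exchange_id, None) is not None


def demo():
    table = PendingTable(capacity=2)
    table.begin("a", now=0)
    table.begin("b", now=1000)      # "a" has waited 1000 ticks; nothing times out
    before = list(table.timed_out)
    table.begin("c", now=1001)      # no free slot, so the oldest waiter goes
    return before, table.timed_out, table.complete("a"), table.complete("b")
```

With spare capacity a slow partner (one being debugged, say) can take
as long as it likes; the sender must still cope with complete()
reporting that its exchange was dropped.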

>In practice, however, it would all seem to depend rather heavily on how
>many times you have instance-specific capabilities that need to survive
>shutdown.

What do you mean by an "instance-specific" capability?

>II. CONSISTENCY
>
>The BIG win of orthogonal persistence -- and also the big loss -- is
>that all processes within a persistence domain come back in mutual
>agreement about their state. This seems to be pretty important if you
>want to do domain-based decomposition, because you don't want to have to
>re-establish consistency *internally* within each application; that
>would add quite a lot of complexity.

As I noted above, for us this didn't seem to add any more complexity
than was already required to deal with arbitrary behavior from
other "cooperating" (but suspicious) processes.

>But once again, it depends on whether process recovery is important for
>non-daemon processes.

For us it was very important, mostly because we were supporting
a large scientific computer center (the Livermore Computer
Center at LLNL) and some of the processes could have state
that had cost many hours or days of expensive computer time.

>And an argument can be made that process persistence is the wrong
>default. To the extent that programs are known to be buggy, it may
>actually be important to restart them periodically purely to clear out
>accumulated muck. Persistence adds a lot of pressure for correctness,
>and human programmers don't seem to be very good at that.

I believe I understand the above issues.

What I still most wonder is where people come down on:

1.  Persistence is good (or not)

and

2.  Persistence may be difficult to implement.

For me persistence (perhaps I need an example of "explicit"
persistence so I can distinguish it from transparent
"orthogonal" persistence) is valuable.  Without it
I don't see how one can really manage access control (e.g.
at the people level) with capabilities.  Without
persistence I don't see how we can hope to use the
capability delegation paradigm for communicating
object references between people, systems, and even
processes/active objects - as one never knows when
an object reference may be invalidated by a system
restart.

--Jed  http://www.webstart.com/jed-signature.html 


