Split Capabilities: Making Capabilities Scale

Karp, Alan alan_karp@hp.com
Fri, 14 Jul 2000 17:21:36 -0700


> -----Original Message-----
> From: Jonathan S. Shapiro [mailto:shap@eros-os.org]
> Sent: Friday, July 14, 2000 3:22 PM
> To: Karp, Alan; e-lang@eros-os.org
> Subject: Re: Split Capabilities: Making Capabilities Scale
> 
> 
> > > > I love a hostile audience...
> > >
> > > Oh dear. I hope not hostile.
> >
> > Of course not, although I've had hostile (mostly inside HP), and I've
> > enjoyed that, too.
> 
> Well, if it will make you feel more comfortable, I suppose I could
> gratuitously insult you once in a while. :-) Ahh. I have it. Your father
> smelt of elderberries! There. Your quota of E-speak hostility for the
> week is now reached. You are now free to send SIGKILL to any further
> hostile parties. :-)

Thanks.  I feel better now.

Funny you should mention SIGKILL.  I use it as an example of the value we
get from not having special cases.  E-speak events are sent as e-speak
messages.  They become events when a client has registered a callback for
that particular event.  The sender of the event must present the SIGKILL
capability, or the recipient will treat it as some incomprehensible message.
Thus, you can't send me SIGKILL (nothing personal), but my wife insists that
I let her do it.  In fact, I'll be getting one if I don't get out of here
soon.
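
A minimal sketch of the SIGKILL point above, in Python rather than the
e-speak API. Every name here (Capability, Client, register, receive) is
hypothetical and chosen only for illustration. It shows a message becoming
an event only when a callback is registered for it and the sender presents
the matching capability; anything else is set aside as an incomprehensible
message.

```python
class Capability:
    """Hypothetical stand-in for a capability; identity is what matters."""
    def __init__(self, label):
        self.label = label


class Client:
    def __init__(self):
        self._callbacks = {}   # event name -> (required capability, callback)
        self.unhandled = []    # messages the client could not interpret

    def register(self, event, required_cap, callback):
        """Register a callback; the event fires only with required_cap."""
        self._callbacks[event] = (required_cap, callback)

    def receive(self, message, presented_caps=()):
        """Deliver a message; it becomes an event only if authorized."""
        entry = self._callbacks.get(message)
        if entry is not None:
            required, callback = entry
            # Compare by identity: a look-alike capability does not count.
            if any(cap is required for cap in presented_caps):
                return callback()
        # No callback, or no matching capability: just an odd message.
        self.unhandled.append(message)
        return None


sigkill = Capability("SIGKILL")
alan = Client()
alan.register("SIGKILL", sigkill, lambda: "terminated")

stranger_result = alan.receive("SIGKILL", [Capability("SIGKILL")])
wife_result = alan.receive("SIGKILL", [sigkill])
```

There is no special case for SIGKILL in this sketch: the same receive path
handles every message, and the capability check alone decides whether the
callback runs.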

> 
> > The process the agent runs in is not persistent, nor is its execution
> > state.  E-speak binds the capabilities to the agent's protection domain,
> > not its executable image.  The agent's protection domain can be
> > persistent, so that its security environment persists across restarts.
> > The first thing an agent does when starting is attach to its
> > protection domain.
> 
> I think that this has added a layer of indirection, and also some
> confusion on my part. I understand how to make the domain persistent
> independent of the process, and I think this is quite sensible. My
> question is this: when the process goes to "attach" to a protection
> domain, how is it determined that this process has the right to use the
> authority captured by this protection domain?

When a client first connects to the engine, its requests are interpreted in
the context of a default protection domain.  This PD has only the barest
privileges, often just enough to connect to a PD with more privileges.
There are a number of ways that the client can get access to its own PD.
One would be to contact an authentication service.  If simple passwords
suffice, we can use e-speak resource discovery because we have "essential"
attributes that must match the request.  Once it has access to its PD, the
client tells the system to interpret all commands in that context.
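
The attach sequence above can be sketched roughly as follows. This is an
illustration in Python, not the e-speak interface: Engine, Session,
ProtectionDomain, and the "attach" and "spend" privilege names are all
assumptions, and a plain password table stands in for the authentication
service or resource-discovery lookup.

```python
import hmac


class ProtectionDomain:
    def __init__(self, name, privileges):
        self.name = name
        self.privileges = frozenset(privileges)


class Engine:
    def __init__(self):
        # Default PD: barely enough privilege to reach a richer PD.
        self.default_pd = ProtectionDomain("default", {"attach"})
        self._domains = {}  # PD name -> (password, ProtectionDomain)

    def create_domain(self, name, password, privileges):
        self._domains[name] = (password, ProtectionDomain(name, privileges))

    def lookup(self, name, password):
        """Assumed password check standing in for an authentication service."""
        entry = self._domains.get(name)
        if entry is None or not hmac.compare_digest(
            entry[0].encode(), password.encode()
        ):
            raise PermissionError("authentication failed")
        return entry[1]


class Session:
    def __init__(self, engine):
        self._engine = engine
        self.pd = engine.default_pd  # every new connection starts here

    def require(self, privilege):
        """Interpret a request in the context of the current PD."""
        if privilege not in self.pd.privileges:
            raise PermissionError(
                f"{privilege!r} not granted in PD {self.pd.name!r}"
            )

    def attach(self, name, password):
        self.require("attach")
        # From now on, all commands are interpreted in the new PD.
        self.pd = self._engine.lookup(name, password)


engine = Engine()
engine.create_domain("alan", "s3cret", {"attach", "spend"})
session = Session(engine)
```

Calling session.require("spend") before session.attach("alan", "s3cret")
raises PermissionError; after a successful attach it passes, because the
request is then checked against the client's own PD.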

> 
> There is an interesting middle ground of persistence, which is to keep a
> registry of (domain, binary) pairs for processes that should be
> restarted, but live with the fact that the programs must reinitialize.
> This is perhaps more useful in your model than in ours.

All e-speak processes must be restarted and rebuild their memory state.
Those with non-persistent protection domains also need to rebuild their
security environments.

> 
> > ... capabilities are definitely not associated with an
> > executable image and not necessarily attached to an instance.  Instead,
> > they are attached to principals, each principal having its own
> > protection domain....
> > Hence, your wallet has a different capability than mine, but I can
> > have two wallets controlled by the same capability.
> 
> Once you fall back on principals for this it's all over as far as
> security is concerned. It's definitely better than what we have in
> commodity deployment today, but not good enough for some of the problems
> I want to solve.

You're right.  I'm not using "principals" the way the term is used in the
security field.  I mean any process that can connect to that protection
domain.

> 
> My wallet can have different capabilities than yours, but can I have two
> wallets, both mine, to two different accounts that hold distinct
> capabilities?

Of course.  Each wallet is a resource with an entry in the repository.  The
repository entry has a field associating a capability with an access right.
Two entries can associate the same capability with the "spend" access right,
or they can associate different capabilities with it.
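
To make the wallet answer concrete, here is a small Python sketch of
repository entries that map capabilities to access rights. RepositoryEntry,
grant, and allows are invented names for illustration, not the e-speak
repository API.

```python
class Capability:
    """Hypothetical capability; instances are distinguished by identity."""
    def __init__(self, label):
        self.label = label


class RepositoryEntry:
    """One resource's entry: which capability unlocks which access right."""
    def __init__(self, resource):
        self.resource = resource
        self._rights = {}  # Capability -> set of access-right names

    def grant(self, cap, right):
        self._rights.setdefault(cap, set()).add(right)

    def allows(self, cap, right):
        return right in self._rights.get(cap, set())


shared = Capability("household")
private = Capability("personal")

# Two wallets controlled by the same capability...
checking = RepositoryEntry("checking wallet")
savings = RepositoryEntry("savings wallet")
checking.grant(shared, "spend")
savings.grant(shared, "spend")

# ...and a third wallet whose "spend" right needs a distinct capability.
travel = RepositoryEntry("travel wallet")
travel.grant(private, "spend")
```

Here checking.allows(shared, "spend") and savings.allows(shared, "spend")
are both true, while travel.allows(shared, "spend") is false because that
entry ties "spend" to a different capability.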

> 
> > Hmmm, strange.  HP is promising 4 9s (99.99%) uptime this year and
> > 5 9s next year.  That's only 5 minutes a year of unscheduled down time.
> 
> When you get to 6.5 9s you are reaching KeyKOS reliability, and I expect
> that EROS may prove to do better. E and E-speak are principally limited
> by the reliability of their underlying OS. This means that their
> reliability sucks but it's not their fault.

OK.  When you're ready, we'll layer e-speak on top of EROS. :-}

> 
> > Reliability is the main reason Amazon gave for replacing its Sun
> > servers with HP machines.  Probably, their environment is more
> > controlled than the one you work in.  I know my HP-UX desktop machine
> > only goes down if there's a power failure, but again, I don't know
> > enough about operating systems to be dangerous.
> 
> Part of our problem at Penn was that we bought into the HP clustering
> technology, which wasn't soup yet. Various people were heard to say that
> we had a new appreciation of the term "cluster fucked". I think the real
> issue was long-term administrative neglect. Clusters are quite hard to
> upgrade consistently in that sort of environment, and I'm not convinced
> that it was done correctly. Independently, HP was unwilling to disclose
> enough to let us build drivers and such for those machines, so we didn't
> have a whole lot of incentive to keep them running well.

I'm familiar with that work.  When it was DUX in HP Labs, it was wonderful,
even better than Apollo Domain.  By the time it got to market, it had been
gutted.  By the way, don't feel bad.  I couldn't get them to disclose much
to me, either.

> 
> We should probably pick up the reliability thread in the eros-arch list
> if it is of interest to you. It's kind of off topic here.
> 

Nah.  I was just being a good HP soldier and defending the home turf.

> 
> shap
> 

_________________________
Alan Karp
Decision Technology Department
Hewlett-Packard Laboratories MS 1U-2
1501 Page Mill Road
Palo Alto, CA 94304
(650) 857-3967, fax (650) 857-6278