Split Capabilities: Making Capabilities Scale
Karp, Alan
alan_karp@hp.com
Thu, 6 Jul 2000 11:11:58 -0700
> -----Original Message-----
> From: Jonathan S. Shapiro [mailto:shap@eros-os.org]
> Sent: Saturday, July 01, 2000 4:42 AM
> To: Karp, Alan; e-lang@eros-os.org
> Subject: Re: Split Capabilities: Making Capabilities Scale
>
>
> I've held off on responding to this thread, partly because I
> wanted to let
> the idea sink in and partly because I'm packing a house at
> the moment, which
> has me a bit distracted. These comments span several of the
> emails that have
> gone back and forth.
>
> First off: Hi, Alan. I'm very glad that you can now talk
> about e-speak. It
> looks like a strong piece of work.
Thanks. It's a pleasure finally to be able to get some feedback from the
outside world.
>
> > The problem is the number of capabilities I need to deal
> with. After all,
> > my PC has over 60,000 files on it. In the most general
> case, I need a
> > conventional capability for each operation, e.g., read,
> write, execute,
> for
> > each file. Some applications, SAP comes to mind, have hundreds or
> thousands
> > of methods, each of which I might want to control separately.
>
> There are really two issues hiding in this statement: total
> capability count
> and manageability.
>
> I think that the total count is not per-se a problem. To
> expand on your
> example, your file system on your PC currently *has* 60,000
> capabilities,
> only they are called inodes or vfat entries. Encrypting these
> entries does
> not inherently alter any of the issues arising from the
> number of them you
> have. Distribution may, and I haven't thought that issue through.
If capabilities are cryptographically secure, they're bigger than encrypted
inodes. For example, SPKI capabilities are on the order of 1 KB at a
minimum, and each delegation adds as much as another KB. So a large number
of capabilities can present a problem. For example, what if I want to search
my PC's file system from a Palm Pilot? In a pure capability system, I'd put
the capabilities in the Palm Pilot, but I can't if they're going to take up
180 MB of space. (That's the worst case: separate read, write, and execute
capabilities for each of 60,000 files, at 1 KB apiece.) I'll have to set up
a proxy scheme so the Palm Pilot needs only a few capabilities. That's just
another piece of code to write and another link in the security chain that
I'd like to avoid.
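
The 180 MB figure is simple arithmetic. A rough sketch, assuming 1 KB per
capability (the SPKI minimum quoted above), three operations per file, and
decimal megabytes; the function and parameter names are illustrative only:

```python
def capability_storage_mb(files, ops_per_file=3, base_kb=1.0,
                          delegations=0, kb_per_delegation=1.0):
    """Estimate storage for one capability per (file, operation) pair."""
    # Each delegation is assumed to add up to kb_per_delegation to a capability.
    per_capability_kb = base_kb + delegations * kb_per_delegation
    total_kb = files * ops_per_file * per_capability_kb
    return total_kb / 1000  # decimal MB (assumption)

undelegated = capability_storage_mb(60_000)
delegated_once = capability_storage_mb(60_000, delegations=1)
print(f"{undelegated:.0f} MB undelegated, "
      f"up to {delegated_once:.0f} MB after one delegation")
```

A single round of delegation could double the total under these assumptions.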
However, the amount of space taken up by the capabilities is a minor matter.
It's managing them that's a potential problem. I certainly don't want to
submit all my capabilities on every request, even if I'm holding fewer than
180,000. Keeping them straight is another piece of code to write. It may
be that I don't even know what capabilities to present. One scheme for
distributing capabilities uses an authenticating agent. I present my
credentials, and get back my access rights in the form of capabilities. I'd
have to parse the capabilities to know which ones go with which objects,
something I might not even be able to do if name aliasing is used.
>
> The manageability argument is more compelling, but if we
> examine your file
> system we soon note that the file system itself doesn't treat
> its objects as
> a flat space. It has internal organization (directories) and
> deals with
> subsets of the larger capability pool. To the extent that
> directories in the
> file system grow too large, they become unmanageable by human
> users. This,
> however, is not a result of using capabilities. It is a
> result of having too
> many things to designate. Whether the designations are capabilities or
> filename paths, there are too many distinct names for the
> human to manage. I
> claim that this issue is fundamental: in any system having a
> large number of
> objects that must be designated individually for operational
> purposes the
> namespace management problem will arise.
That's where wildcards come in, but wildcards require a structured name
space, and they're dangerous. Files provide the structured name space, and
wildcards are used in the security system for the e-speak Virtual File
System. However, any capability that denotes an object with one of these
structured names, particularly if wildcards are used, is subject to a wide
variety of errors. A new file put in a directory implicitly picks up the
permissions specified in capabilities issued for all directories in the path
to the file. Similarly, changing the name of a file or a directory may
inadvertently grant permissions that were issued for a previously existing
file. In fact, these problems are so serious that the best policy is to
make sure that names are never reused. Note that these problems have
nothing to do with directories growing too large; they have to do with using
names that have meaning as the object handle in the capability.
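
Both hazards can be reproduced in a few lines. This is only a sketch: the
WildcardCapability class and the paths are hypothetical, and it says nothing
about how the e-speak Virtual File System actually matches names.

```python
from fnmatch import fnmatchcase

class WildcardCapability:
    """Hypothetical capability whose object handle is a path pattern."""
    def __init__(self, pattern, rights):
        self.pattern = pattern
        self.rights = frozenset(rights)

    def permits(self, path, right):
        return right in self.rights and fnmatchcase(path, self.pattern)

# Capabilities issued some time ago to general users.
public_read = WildcardCapability("/home/alan/shared/*", {"read"})
old_report = WildcardCapability("/home/alan/docs/report.txt", {"read"})

# Hazard 1: a private file dropped into the directory is covered at once.
new_file = "/home/alan/shared/salary-review.txt"
leaked_by_wildcard = public_read.permits(new_file, "read")

# Hazard 2: report.txt was deleted, and an unrelated file later takes its name.
reused_name = "/home/alan/docs/report.txt"
leaked_by_reuse = old_report.permits(reused_name, "read")

print(leaked_by_wildcard, leaked_by_reuse)
```

Nothing in either capability changed; only the name space did.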
An alternative is to allow people to use a structured name, but denote the
file by some arbitrary designator in the capability. All that's needed is a
mapping table somewhere in the system. However, this approach precludes
using wildcards in the capability, forcing me to have a separate capability
for each file I need to access.
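
A minimal sketch of that alternative, assuming a simple in-memory table and
integer designators (NameTable and its methods are made up for illustration):

```python
import itertools

class NameTable:
    """Hypothetical mapping from structured names to arbitrary designators."""
    def __init__(self):
        self._next_id = itertools.count(1)
        self._by_name = {}

    def create(self, name):
        designator = next(self._next_id)  # designators are never reused
        self._by_name[name] = designator
        return designator

    def rename(self, old, new):
        self._by_name[new] = self._by_name.pop(old)

    def delete(self, name):
        del self._by_name[name]

    def lookup(self, name):
        return self._by_name[name]

table = NameTable()
report_id = table.create("/docs/report.txt")
capability = {"object": report_id, "rights": {"read"}}  # names one object only

# The name is reused by an unrelated file.
table.delete("/docs/report.txt")
private_id = table.create("/docs/private.txt")
table.rename("/docs/private.txt", "/docs/report.txt")

# The old capability does not follow the reused name.
still_safe = capability["object"] != table.lookup("/docs/report.txt")
print(still_safe)
```

The cost is visible in the capability itself: it holds a single designator,
so there is nothing left to wildcard on.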
>
> Finally, I think that the "general case of one capability per
> method" claim
> is mistaken, though it's a natural mistake given the way that
> the literature
> has presented capabilities. Contrary to what the historical
> presentation has
> claimed, capabilities do not authorize operations in real
> systems. I think
> that the following is a better way to think about what is going on:
>
> Each object has an interface containing all
> of the methods that exist on that object.
> For each object, there are some number
> of "thinnings" of that interface. For
> example, there may be a thinning for
> operations that can be performed under
> read-only authority.
> Each of these thinnings has a corresponding
> capability.
>
> While shifting to this model doesn't change your claim in
> abstract, it does
> shift attention in a way that makes clearer why the abstract
> claim doesn't
> matter. Databases may have thousands of methods, but if they *were*
> individually controlled the database would have too many
> interfaces to be an
> engineerable software artifact, and the product would soon
> die for reasons
> having nothing to do with capabilities.
Whether to use a unique capability for each method or capabilities that
cover combinations of methods (thinnings) depends on how they are issued.
An object with 10 methods needs only 10 capabilities if they are protected
individually; it needs a very large number (up to 2^10 - 1 = 1,023) if every
possible combination is needed. Since the former approach grows linearly
with the number of methods and the latter grows exponentially, if we don't
know ahead of time, I'd probably plan on one capability per method. On the
other hand, in real life we only issue a few sets of permissions, so it
makes sense to issue them with combinations of access rights. Still, it
wouldn't surprise me if we often needed 3 capabilities per object.
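
The linear versus exponential growth can be tabulated directly. A small
sketch, counting every non-empty subset of methods as a possible thinning
(the function names are illustrative):

```python
def per_method_capabilities(n_methods):
    """One capability per method: grows linearly."""
    return n_methods

def all_combination_capabilities(n_methods):
    """One capability per non-empty subset of methods: grows exponentially."""
    return 2 ** n_methods - 1

for n in (3, 10, 20):
    print(n, per_method_capabilities(n), all_combination_capabilities(n))
```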
There are a lot of things we can't do with databases now that we could do
with a well designed capability system. The real problem isn't just the
number of methods, it's also in the fields and the records. For example, I
should be able to find anyone's phone number, but I should only see the
salaries of people reporting to me, and I can see my salary but not change
it. We can't do that kind of thing very well today, but giving me the
right set of capabilities makes it feasible.

By the way, a relational database is not the best example here; SAP is. It
has a very large number of modules, each with quite a few methods. Most
installations have quite complex LDAP databases set up to control exactly
who can invoke which methods on which of those modules. It's quite a mess.
>
> That is, either the number of capability variants per object
> is very small
> or the system design won't work for other reasons.
I don't think that whether an interface is separately controllable or not
really has anything to do with how buildable an application is. The
designers put in all the interfaces they think are needed, regardless of any
security concerns. I believe that all capability systems separate the
security policy from the application interface, and I know e-speak does. That
simplifies the design. The security policies can be as simple or as complex
as the administrator wants.
>
> > Wildcards are often used to reduce the number of
> capabilities needed, but
> > this approach is dangerous and not general enough. What
> happens if I put
> a
> > private file in a directory that has an outstanding
> wildcard capability
> > granting read access to general users?
>
> By wildcard, I assume that you mean wildcard on user
> identity. This implies
> a hybrid access control design. I'm not inherently opposed to hybrid
> systems, except insofar as bastardizing the protection model
> leads to the
> kinds of problems you identify (along with a host of others).
> Wildcards have
> been used in a few hybrid systems, but not in any capability
> system, because
> there is nothing to wildcard *on*.
See my reply to Norm Hardy. I'm wildcarding on the object handle, something
that's meaningless in most capability systems.
>
> The most thoroughly developed hybrid of this form is probably Karger's
> thesis (SCAP). Paul went to a hybrid model because neither he
> nor Boebert
> could see how to do multilevel security in an "unmodified"
> (Karger's term)
> capability system. It turns out that adding transparent
> indirection is all
> you really need, though a "weak" access right definitely
> helps. See the
> paper by Weber and Shapiro in the 2000 Symposium on Security
> and Privacy.
>
> E is a "pure" capability model, and therefore has no
> wildcarding. So far, it
> does not appear to need any.
Until the number of separate capabilities becomes a problem.
>
> > We could also
> > list all the relevant objects in the capability, but I
> don't think I want
> to
> > pass around a capability listing the 50,000 files on my
> system that you
> can
> > read.
>
> You seem to be assuming that this is frequently necessary. In
> my experience
> it is not. These capabilities can almost invariably be stored in some
> "directory" object, and then a single capability to the
> directory object can
> be generated.
The important point is that we don't want that directory object in the
critical path of every request. It adds latency and becomes a bottleneck if
it's handling a large number of resources. Besides, the directory object
only postpones the problem. How many capabilities do I need to grep from
the root directory?

I certainly hope you're not suggesting that the directory object perform the
operation on behalf of the user. Such an approach represents a security
threat. Since the directory object holds capabilities that the user does
not, it is able to perform an action the user could not. I always ask
myself what could happen if I could get an intermediary, the directory
object in this case, to execute an arbitrary piece of code.
>
> As an aside, I'ld argue that it is *rarely* the case that
> users may wish to
> share objects in bulk, and *never* the case that correctly
> designed programs
> should do so. Up to a point, the fact that capabilities means
> that they
> reinforce good design practices.
Actually, I think the situation is quite common. Many times when I fork a
process, I'd like the child process to have a substantial fraction of my
privileges. Not every time, but many times. For example, a word processor
I spawn will need access to fonts. A build will need access to source and
object files and the required executables. Each of these requires transfer
of a potentially large number of capabilities.
>
> > The repository has an entry containing a number of fields for each
> > registered resource. One of these fields is a list of pairs each
> > consisting of a reference to an e-speak capability and the access
> > rights granted when that capability is presented.
>
> I think this is an interesting design. It's not appropriate
> in an operating
> system because of the number of dynamically allocated access records
> involved, but it has much to recommend it at the language level. A few
> questions:
I don't see why it's not appropriate for an OS. Operating systems already
seem to have a very large number of dynamic data structures.
>
> If you pass an e-speak capability to me, and I pass it to
> Mary, how do you
> revoke mine without revoking hers? Injecting a wildcard
> filter can probably
> handle that, but now what about the case where you want to revoke
> capabilities that I hold except those held on my behalf by a
> particular
> trusted program?
I can't revoke the privileges separately unless there's some way to identify
the requester, and then I have to fall back on an ACL scheme, something
that, personally, I'm not willing to do. Of course, there's nothing to
prevent your application API requiring some unforgeable identity be passed
with the request.

Selective revocation in e-speak requires that I plan for it ahead of time.
I can clone an e-speak capability and give one to you and one to the trusted
program. All I need to do to revoke your privilege is to unregister the
clone you have from the e-speak repository. Of course, someone needs to
keep track of which capability was given to which user, but there's no
escaping it.
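
A toy model of clone-then-unregister, assuming the repository entry is just
a map from capability to access rights as described in the quoted text. The
Repository class and its method names are hypothetical, not the e-speak API.

```python
import itertools

class Repository:
    """Toy repository: each resource maps capabilities to access rights."""
    def __init__(self):
        self._ids = itertools.count(1)
        self._entries = {}  # resource -> {capability id: set of rights}

    def register(self, resource, rights):
        cap = next(self._ids)
        self._entries.setdefault(resource, {})[cap] = set(rights)
        return cap

    def clone(self, cap):
        """Create a new capability granting whatever `cap` grants."""
        new_cap = next(self._ids)
        for grants in self._entries.values():
            if cap in grants:
                grants[new_cap] = set(grants[cap])
        return new_cap

    def unregister(self, cap):
        for grants in self._entries.values():
            grants.pop(cap, None)

    def rights(self, resource, cap):
        return self._entries.get(resource, {}).get(cap, set())

repo = Repository()
original = repo.register("design.doc", {"read", "write"})
for_you = repo.clone(original)
for_trusted_program = repo.clone(original)

# Someone still has to remember who was handed which clone.
issued_to = {for_you: "you", for_trusted_program: "trusted program"}

repo.unregister(for_you)  # revokes your access only
print(repo.rights("design.doc", for_you),
      repo.rights("design.doc", for_trusted_program))
```

If you had passed your clone on to Mary, her access would disappear along
with yours, which is why the split has to be planned before handing anything
out.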
>
> How are access rights protected, given that they do not appear to be
> cryptographically secured? One of the important qualities of
> a vat is that
> if you compromise one vat you cannot compromise the next. I
> can imagine wire
> protocols that would achieve similar protections for e-speak,
> by shipping
> traditional capabilities over the wire, but all of the
> methods that I can
> see for doing so raise hairy consistency challenges in the
> context of object
> caching. In particular, one cannot guarantee that permission downgrade
> occurs promptly in the distributed case.
OK, here we get into the difference between E-speak Beta 2.2 and E-speak
Beta 3.0. The latter uses SPKI certificates as capabilities. They are
cryptographically secure. The only way to compromise an e-speak vat is to
get its private key, just as with E. These capabilities can be passed
around by any means, including publishing them in the newspaper. Each is
issued to a specific principal who can use it only by proving knowledge of
his private key.

E-speak Beta 2.2 uses a different scheme that is intimately connected with
the naming model. The capabilities themselves (we called them keys because
they opened locks) exist only on the machine owning the resource being
protected. Permissions are granted by giving a user a name bound to this
key or by putting the key on a key ring that the user has a name bound to.
(Since names are purely local to the task, no guessing attacks are
possible.) The user names one or more key rings as part of the request, and
the e-speak engine uses the keys on the key rings to open locks associated
with access rights. That's the local case.
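
The local case can be pictured roughly as follows. This is a sketch under
stated assumptions: keys and locks are modeled as plain integers, a key opens
the lock with the same value, and Entry and unlocked_rights are invented
names rather than anything from the e-speak engine.

```python
class Entry:
    """Toy repository entry: each lock guards a set of access rights."""
    def __init__(self, locks):
        self.locks = dict(locks)  # lock value -> rights it guards

def unlocked_rights(entry, key_rings):
    """Rights opened by the keys on the key rings named in a request."""
    presented = set().union(*key_rings)
    rights = set()
    for lock, granted in entry.locks.items():
        if lock in presented:  # a key matching this lock was presented
            rights |= granted
    return rights

entry = Entry({101: {"read"}, 202: {"write"}})
everyday_ring = {101, 555}   # 555 opens nothing on this entry
write_ring = {202}

print(unlocked_rights(entry, [everyday_ring]))
print(unlocked_rights(entry, [everyday_ring, write_ring]))
```

Naming only some of the key rings on a request limits what that request can
do, even though the user holds more keys.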
The remote case is more complicated than I care to go into here, but it does
work. Suffice it to say that the machine owning the resource gives the
user's machine a name for the key. Users on that machine can then unlock
the corresponding permissions. End-to-end security is still possible by
having the user and provider set up an encrypted channel for communication.
Thus, only someone able to properly encrypt the request can make use of the
permissions represented by the e-speak key.

The system structure of e-speak was designed so that compromising one
engine doesn't help you compromise another. All accesses in E-speak Beta
2.2 are through a proxy that appears to the engine to be just a local user.
A remote request passes through the proxy, which repeats it as a local
request. No one, not even a hacker who's penetrated another machine, can get
the proxy to do something that wasn't explicitly allowed.
>
> > In the case of multilevel security, I can cover
> > all cases with only one e-speak capability per
> > security level, and each user gets only one
> > capability.
>
> I think that this illustrates a flaw, and possibly a fatal
> flaw. If there is
> one fundamental advantage to true capabilities, it is that
> they make exactly
> this sort of aggregation hard. The aggregation you propose
> violates the
> principle of least privilege. Worse, it encourages the construction of
> programs that violate this principle routinely.
I agree that conventional capability systems make it hard to give a person
with secret clearance the proper set of access rights. That just makes it
harder to get it right. Such complexity is the enemy of security.

I don't see how aggregation violates the principle of least privilege. The
rules are that someone with security level 2 can read from level 1, write to
level 3, and read and write from level 2. What additional privilege is
there? If you want to limit which documents a level 2 user can see, e-speak
can do that as well.
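
The rule for level 2 can be written as a function of the two levels.
Generalizing it to "read at or below your level, write at or above it" is an
assumption on top of the single example given; mls_rights is an illustrative
name.

```python
def mls_rights(user_level, object_level):
    """Access rights of a user at user_level on an object at object_level."""
    rights = set()
    if object_level <= user_level:
        rights.add("read")   # read at or below the user's own level
    if object_level >= user_level:
        rights.add("write")  # write at or above the user's own level
    return rights

for object_level in (1, 2, 3):
    print(object_level, sorted(mls_rights(2, object_level)))
```

One capability per security level is enough to carry this whole table, since
the rights depend only on the pair of levels and not on the individual
document.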
>
> The reason this may be fatal is that one cannot assume that
> the programs are
> well behaved. Indeed, where security is taken seriously, we
> must assume that
> at some point somebody will slip us a trojan horse. In the design you
> propose, a single trojan horse can destroy everything I have access to
> *unless* I carefully restrict the capabilities that are
> handed to the trojan
> program, at which point we start to move to a design pattern
> in which I have
> to go to a lot of extra work to reconstruct traditional
> capabilities using
> the split capability mechanism.
Yes, if you can get the one capability, you've got all the permissions.
However, E-speak 2.2 provided a mechanism to prevent transfer of resources
using a special flag in the repository entry. A "grant authorized" resource
could only be transferred to another user by a user having the grant
authority. Any other attempted transfer resulted in an exception. We can
assume that the user with grant authority would only transfer the resource
when the proper credentials were presented. If not, then all security is
broken anyway.
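
A sketch of how such a flag might behave. The class, field, and user names
below are hypothetical; only the behavior (transfer without grant authority
raises an exception) comes from the description above.

```python
class TransferDenied(Exception):
    """Raised when a transfer is attempted without grant authority."""

class RepositoryEntry:
    def __init__(self, resource, holders, grant_authorized=False, grantors=()):
        self.resource = resource
        self.holders = set(holders)
        self.grant_authorized = grant_authorized  # the special flag
        self.grantors = set(grantors)             # users with grant authority

    def transfer(self, from_user, to_user):
        if from_user not in self.holders:
            raise TransferDenied(f"{from_user} does not hold {self.resource}")
        if self.grant_authorized and from_user not in self.grantors:
            raise TransferDenied(f"{from_user} lacks grant authority")
        self.holders.add(to_user)

entry = RepositoryEntry("payroll", holders={"alice", "admin"},
                        grant_authorized=True, grantors={"admin"})

try:
    entry.transfer("alice", "trojan")  # e.g. a trojan running as alice
    blocked = False
except TransferDenied:
    blocked = True

entry.transfer("admin", "bob")         # allowed: admin has grant authority
print(blocked, sorted(entry.holders))
```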
>
> > There's still a problem with the approach you describe.
> Namely, over time
> > I'll accumulate a large number of capabilities. Either I'll have to
> present
> > them all on each request, in which case the permission
> checking will take
> a
> > long time, or I'll need a way to track which ones I need
> for each request.
In e-speak Beta 2.2, I present my capabilities by naming one or more key
rings. Each can contain a large number of keys. Each key is tested against
the locks in the repository entry of the object. As I showed in my earlier
note, the scaling is quite favorable, even if I have thousands of keys. For
example, I need only 30 integer comparisons to check 1,024 keys against 3
locks.
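
The figure of 30 matches 3 x log2(1,024), which is what a binary search over
a sorted key ring would give. The earlier note is not reproduced here, so
treat binary search as an assumption. The sketch below computes that estimate
and counts the comparisons made by one simple binary-search variant, which
spends at most a comparison or two more per lock than the estimate.

```python
import math

def estimated_comparisons(n_keys, n_locks):
    """Estimate: one binary search of depth log2(n_keys) per lock."""
    return n_locks * math.ceil(math.log2(n_keys))

def counted_lookup(sorted_keys, lock):
    """Binary search for `lock`; returns (found, integer comparisons made)."""
    lo, hi, comparisons = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_keys[mid] < lock:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sorted_keys):
        comparisons += 1  # final equality test
        return sorted_keys[lo] == lock, comparisons
    return False, comparisons

key_ring = list(range(0, 2048, 2))  # 1,024 sorted integer keys (made up)
locks = [10, 777, 2000]             # 777 has no matching key
results = [counted_lookup(key_ring, lock) for lock in locks]
total = sum(count for _, count in results)

print(estimated_comparisons(len(key_ring), len(locks)), total)
```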
>
> First, let's be clear that "I" is a program, not a user. It
> is very easy for
> programs to track what capability goes to what. When was the
> last time a
> production version of a program you ran got its file
> descriptors confused?
It's still a piece of code I have to write, debug, and maintain. Also, how
does my program determine which object a capability is referring to? If I
get my capabilities from an authentication server instead of the object's
owner, there's no obvious way to make the connection.
>
> Also, I'ld suggest that as a design pattern you want the
> program to present
> specifically the authorizing capability with each operation that it
> performs. This provides a form of scoping around the exercise
> of authority,
> and this scoping appears to be a very useful thing in reducing
> security-related errors.
I agree, but it's a trade-off between simplicity and limiting the scope.
Either I present all my capabilities on each request, or I track which ones
enable what. One reason for allowing multiple key rings to be specified is
to simplify the tracking. The example we always used was the test program
that should not have write access even though I hold the capability.
>
> > Besides, we can't really use that approach [the directory
> of capabilities
> > approach] in e-speak because of the
> > additional latency. The two step process you describe doubles the
> latency.
> > That's because in e-speak the holder of the strong capability is a
> service,
> > not an object.
>
> In KeyKOS/EROS it's also a service, though it helps that the
> systems are not
> distributed.
It's not so much that it's distributed, although that magnifies the problem.
It's the context switches that kill us. Since everybody runs in a separate
address space, each round trip costs us 4 context switches. On NT, that's
over 1 ms.
>
>
> Jonathan Shapiro
>
_________________________
Alan Karp
Decision Technology Department
Hewlett-Packard Laboratories MS 1U-2
1501 Page Mill Road
Palo Alto, CA 94304
(650) 857-3967, fax (650) 857-6278