Re: Split Capabilities: Making Capabilities Scale Jonathan S. Shapiro (shap@eros-os.org)
Sat, 8 Jul 2000 21:29:03 -0400

Just a reminder, since I write in haste and sometimes have an excessively blunt writing style: e-Speak looks like a really good piece of work, and my purpose in raising these issues and asking these questions is to achieve better understanding of a different design.

> If capabilities are cryptographically secure, they're bigger than encrypted
> inodes. For example, SPKI capabilities are of the order of 1 KB at a
> minimum.

You have referred to SPKI capabilities a couple of times, and it is worth asking if, given their size, they are the right things to use. It is only necessary to use encrypted capabilities at all when crossing machine boundaries, and only a small fraction of capabilities do that. Internal capabilities can be protected by partitioning. Whether the engineering cost of this is acceptable is something I hope you can shed some light on.

But my immediate question is: given the size of SPKI capabilities, are they the right thing to use?

> I certainly don't want to submit all my capabilities on every request,
> even if I'm holding fewer than 180,000.

You made this point before, but if you responded to my comments about programs not misplacing file descriptors I lost it in the reply indentation... Oh. And then I found it below.

> > It is very easy for programs to track what capability goes to what.
> > When was the last time a production version of a program you ran got
> > its file descriptors confused?
>
> It's still a piece of code I have to write, debug, and maintain. Also, how
> does my program determine which object a capability is referring to

If a program cannot keep track of what it is referencing as it makes each request, we have far, far bigger problems than figuring out how many capabilities to present on a given request. You appear to be starting from the design assumption that denotation isn't feasible. I have to be missing something that seems obvious to you.

I think I'm making the assumption that at worst the relevant capability can be used similarly to the way file descriptors are used, and there must be something about your architecture that makes this approach seem unnatural to you. Can you articulate what this might be?
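As a rough illustration of the file-descriptor analogy (a sketch only; `CapTable`, `install`, and `invoke` are hypothetical names, not part of EROS or e-speak), a program can hold its capabilities in a small indexed table and denote exactly one of them per request:

```python
# Hypothetical sketch: a per-process capability table indexed like file
# descriptors. Capabilities are modeled as plain callables for brevity.
class CapTable:
    def __init__(self):
        self._slots = []

    def install(self, cap):
        """Store a capability and return its small-integer index."""
        self._slots.append(cap)
        return len(self._slots) - 1

    def invoke(self, index, *args):
        """Invoke only the capability the caller denotes by index."""
        return self._slots[index](*args)


table = CapTable()
log_idx = table.install(lambda msg: f"log: {msg}")
sum_idx = table.install(lambda a, b: a + b)

print(table.invoke(log_idx, "hello"))
print(table.invoke(sum_idx, 2, 3))
```

The point of the sketch is that the program never has to search or parse its capabilities; it remembers an index, just as it remembers a file descriptor.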

> One scheme for distributing capabilities uses an authenticating agent. I
> present my credentials, and get back my access rights in the form of
> capabilities. I'd have to parse the capabilities to know which ones go with
> which objects, something I might not even be able to do if name aliasing is
> used.

If you are going to do this, it seems to me that it can be tied to a namespace. The caller presents the (human) object name and the credentials for that name, and is returned a singleton capability. The *last* thing you want, given the proliferation of buffer overruns, is for any program to have broad authority.

[My comments about file systems elided]

> In fact, these problems are so serious, that the best policy is to
> make sure that names are never reused. Note that these problems have
> nothing to do with directories growing too large; they have to do with using
> names that have meaning as the object handle in the capability.

This is a good point, and one that we need to make more clearly in the various capability writings that are springing up. We talk about a capability as "name + permissions", and we need to be clearer about the layering of namespaces and the opacity of the namespace used by the capabilities. It's obvious to all of us who are intimately involved in the area, and a source of confusion (or so I have found) to people learning the subject.

There is a level at which names must be reused, which is the level at which the system names physical resources. In EROS/KeyKOS, name reuse is prevented by ensuring that every system object (page or node) carries a version number, and that every capability also carries a version number. We call this number an allocation count. If the count in the capability and the count in the object do not match, the capability is void. Modulo the possibility of GC to address rollover, this satisfies your "never reuse names" suggestion. In fact, we've yet to see a rollover happen.
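The allocation-count check can be sketched as follows. This is an illustration under simplifying assumptions, not the EROS implementation: `Store`, `Obj`, `Capability`, and the method names are all hypothetical, and rollover handling is omitted.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    data: str
    alloc_count: int = 0  # bumped every time the physical slot is reused


@dataclass(frozen=True)
class Capability:
    slot: int         # physical resource name (reused over time)
    alloc_count: int  # object version this capability was minted against


class Store:
    def __init__(self, nslots):
        self.objs = [Obj("") for _ in range(nslots)]

    def mint(self, slot):
        """Create a capability for the current incarnation of a slot."""
        return Capability(slot, self.objs[slot].alloc_count)

    def free_and_reuse(self, slot, data):
        """Reuse the slot; all capabilities to the old object become void."""
        obj = self.objs[slot]
        obj.alloc_count += 1
        obj.data = data

    def deref(self, cap):
        """Return the object, or None if the capability is void."""
        obj = self.objs[cap.slot]
        if obj.alloc_count != cap.alloc_count:
            return None
        return obj


store = Store(4)
store.objs[0].data = "old"
old_cap = store.mint(0)
store.free_and_reuse(0, "new")
new_cap = store.mint(0)
```

Here `old_cap` and `new_cap` name the same physical slot, but only `new_cap` still dereferences to an object.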

> An alternative is to allow people to use a structured name, but denote the
> file by some arbitrary designator in the capability. All that's needed is a
> mapping table somewhere in the system. However, this approach precludes
> using wildcards in the capability, forcing me to have a separate capability
> for each file I need to access.

It also creates the problem that a mapping table entry is an accountable resource, and some discipline must be inserted to control allocation. If it can be guaranteed that there is one entry per object then the accounting is pretty easy, but if the relationship between mapping entries and objects is many to one you get into accounting troubles.

One of the really serious design errors in VMS that is only partially remedied in NT is the large number of object types and the consequent need for a large number of quotas whose interactions even the experts do not understand. One source of beauty in KeyKOS/EROS is the small number of object types and the consequent simplifications around quota issues. In fact, I'm concerned that in introducing a notion of physical resource pools and management thereof I need to be careful not to mess up this property.

I'd add that having only one conceptual level in the storage hierarchy significantly reduces the number of visible objects in the system, and that the advantages of this are easy to overlook. It's going to make the resource management logic much simpler not to have to project everything across two layers of mismatched storage hierarchy.

> Since the former approach grows linearly with the number of
> methods, and the latter grows exponentially, if we don't know ahead of time,
> I'd probably plan on one capability per method. On the other hand, in real
> life we only issue a few sets of permissions, so it makes sense to issue
> them with combinations of access rights. Still, it wouldn't surprise me if
> we often needed 3 capabilities per object.

To the extent that we are interested in protections for purposes of security, I have observed two "design patterns" here. One is the read-write/read-only distinction (two capabilities); the other is the controller/client distinction. In practice, the controller tends to convey authority to create read-write capabilities, so a three-layer permission hierarchy rapidly emerges as the common case.
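A minimal sketch of that three-layer pattern, with hypothetical class names chosen only for illustration. Python does not enforce the privacy of `_cell`, so this shows the shape of the hierarchy rather than real protection:

```python
class Cell:
    """The underlying object being protected."""
    def __init__(self, value):
        self.value = value


class ReadOnly:
    """Bottom layer: may only observe the object."""
    def __init__(self, cell):
        self._cell = cell

    def read(self):
        return self._cell.value


class ReadWrite(ReadOnly):
    """Middle layer: may read, write, and hand out a weaker capability."""
    def write(self, value):
        self._cell.value = value

    def read_only(self):
        return ReadOnly(self._cell)


class Controller:
    """Top layer: conveys authority to create read-write capabilities."""
    def __init__(self, cell):
        self._cell = cell

    def make_read_write(self):
        return ReadWrite(self._cell)


ctl = Controller(Cell(1))
rw = ctl.make_read_write()
ro = rw.read_only()
rw.write(7)
print(ro.read())
```

Authority only ever flows downward in this sketch: a controller can produce a read-write capability, and a read-write capability can produce a read-only one, but not the reverse.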

> There are a lot of things we can't do with databases now that we could do
> with a well designed capability system. The real problem isn't just the
> number of methods, it's also in the fields and the records. For example, I
> should be able to find anyone's phone number, but I should only see the
> salaries of people reporting to me, and I can see my salary but not change
> it. We can't do that kind of thing very well today, but giving me the
> right set of capabilities makes it feasible.

I believe that you are engaging in the high-level language machine fallacy. I submit that the recent CISC/RISC wars distracted our attention such that we failed to notice the *other* piece of the HLL machines that bit the dust: segments. Segments were originally conceived as HLL object-grain protection.

From a research perspective, the EROS/KeyKOS thesis is that performance is had by choosing capability types that map directly to what hardware can efficiently support. There are things that should be protected at the system level, and there are things that can only be protected by knowing things about the application semantics (as in your example). The primitive capability mechanism needs to provide a hook that the language runtime can use to bind behavior to a capability, but that is ALL it should do.

> I believe that all capability systems, and e-speak I
> know, separate the security policy from the application interface. That
> simplifies the design. The security policies can be as simple or as complex
> as the administrator wants.

I think that I understand what you are trying to say, but I think you've framed it in a way that is a bit off. The application interface can be thought of as a set of thinned signatures for the application. If you don't address the security of these interfaces, your design is DOA for fine-grain protection. By the time the administrator gets into the picture, all that can be accomplished is to do policy on the interfaces provided by the designers. The real security issues must be addressed at design time if the administrator is to have a hope of success.

This is part of why I argue that having too many interfaces reflects a flawed and unmaintainable design.

> The important point is that we don't want that directory object in the
> critical path of every request. It adds latency and becomes a bottleneck if
> it's handling a large number of resources.

That has not been our experience. Our experience is that programs actively manipulate only a limited number of capabilities at a time, and that directory accesses occur (if at all) only during regime shifts in program behavior, as when resetting a database thread to accept a new connection.

I think we are back to my question about why you feel that programs should ever possess large numbers of capabilities, and I should await your answer to that.

> Besides, the directory object is
> only delaying the time that the problem arises. How many capabilities do I
> need to grep from the root directory?

If your architecture *has* a root directory and a shared mutable namespace, it is in principle unsecurable (see Harrison, Ruzzo, and Ullman's proof) and we have bigger problems to worry about than how many capabilities we are lugging around.

> Since the directory object holds capabilities that the user does
> not, it is able to perform an action the user could not. I always ask
> myself what could happen if I could get an intermediary, the directory
> object in this case, to execute an arbitrary piece of code.

In KeyKOS/EROS, the directory does not hold capabilities that are inaccessible to the user. If two users should have different authorities, they have different directories. That is, there is no universally shared namespace. This seems to be a major difference between the designs. The comments on persistence below may be relevant as well.

> > As an aside, I'ld argue that it is *rarely* the case that users may
> > wish to share objects in bulk, and *never* the case that correctly
> > designed programs should do so.
>
> Actually, I think the situation is quite common. Many times when I fork a
> process, I'd like the child process to have a substantial fraction of my
> privileges. Not every time, but many times. For example, a word processor
> I spawn will need access to fonts. A build will need access to source and
> object files and the required executables. Each of these requires transfer
> of a potentially large number of capabilities.

You are clearly thinking in a mold wherein monolithic applications are acceptable. I tend to view them as something we have to live with for transitional compatibility reasons, but not a mode of application construction that I want to support.

I also think that this design issue is impacted by lack of persistence. In a non-persistent system, it is conceptually difficult to break applications into small pieces because they must continuously be recreated as the system goes up and down. Also, capabilities cannot be bound to the agents that actually use them (the processes) because these resources have no existence across restarts. Unless this can be done the monolithic design is clearly favored. I find this a compelling argument for process persistence. It is, of course, convenient that a good checkpointing design seems likely to yield more efficient I/O across the board, so you don't pay a penalty for it.

> > I think this is an interesting design. It's not appropriate in an
> > operating system because of the number of dynamically allocated access
> > records involved, but it has much to recommend it at the language level.
> > A few questions:
>
> I don't see why it's not appropriate for an OS. They seem to have a very
> large number of dynamic data structures.

Most do. None of those have an MTBF exceeding one month under load, and it took staff-decades to bring them to that level of reliability. KeyKOS MTBF exceeded 3 years almost from the beginning. Much of the reason lies in the absence of dynamic data structures within the kernel.

If these data structures are dynamic, how is their storage billed and accounted for?

> Selective revocation in e-speak requires that I plan for it ahead of time.

That's about what I thought was the case. The boundary cases in your system are a little different than the boundary cases in EROS/KeyKOS, but both require preplanning. I still think that an indirection object is sufficient, and I'll address that further on the indirection object thread of discussion.
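For readers unfamiliar with the term, an indirection object can be sketched roughly as below. This is a generic revocable-forwarder illustration with hypothetical names (`Forwarder`, `make_revocable`, `Revoked`), not the EROS/KeyKOS or e-speak mechanism, and it shows where the preplanning comes in: the grantor must hand out the forwarder rather than the underlying capability.

```python
class Revoked(Exception):
    """Raised when a revoked forwarder is invoked."""


class Forwarder:
    """Indirection object: forwards invocations until it is revoked."""
    def __init__(self, target):
        self._target = target

    def invoke(self, *args):
        if self._target is None:
            raise Revoked("capability has been revoked")
        return self._target(*args)


def make_revocable(target):
    """Return (forwarder, revoke). The grantor keeps `revoke` private."""
    fwd = Forwarder(target)

    def revoke():
        fwd._target = None

    return fwd, revoke


def service(x):
    return x * 2


# Selective revocation: one forwarder per grantee, planned ahead of time.
alice_cap, revoke_alice = make_revocable(service)
bob_cap, revoke_bob = make_revocable(service)

revoke_alice()
print(bob_cap.invoke(21))
```

After `revoke_alice()`, Bob's access is unaffected, while any invocation through `alice_cap` raises `Revoked`.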

> However, E-speak 2.2 provided a mechanism to prevent transfer of resources
> using a special flag in the repository entry. A "grant authorized" resource
> could only be transferred to another user by a user having the grant
> authority. Any other attempted transfer resulted in an exception.

Trojan horses will proxy. Are you aware of Matt Bishop's paper on Theft and Conspiracy in the take/grant model? It's summarized in the diminish-take chapters in my thesis, and you can find a URL on the EROS papers page at www.eros-os.org. The reader's digest summary in this context is that a "do not copy" bit provides no marginal protection.
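The proxying argument can be illustrated with a toy model. Everything here is hypothetical (`Cap`, `transfer`, and the `no_copy` flag are invented for the sketch and do not reflect the actual e-speak repository interface):

```python
class TransferDenied(Exception):
    """Raised when the system refuses to transfer a capability."""


class Cap:
    def __init__(self, target, no_copy=False):
        self.target = target
        self.no_copy = no_copy  # hypothetical "do not copy" bit

    def invoke(self, *args):
        return self.target(*args)


def transfer(cap):
    """System-mediated transfer that honors the no-copy bit."""
    if cap.no_copy:
        raise TransferDenied("capability is marked do-not-copy")
    return cap


def secret():
    return "payroll data"


protected = Cap(secret, no_copy=True)

# A Trojan horse holding `protected` cannot transfer it directly, but it
# can create a fresh, transferable capability that forwards every call.
proxy = Cap(lambda *args: protected.invoke(*args))
leaked = transfer(proxy)
print(leaked.invoke())
```

The recipient of `leaked` gets the same effective authority as the holder of `protected`, which is why the bit adds no marginal protection against a hostile holder.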

Jonathan