Where Capabilities Come From

 

A number of people, after reading What is a Capability, Anyway?, have asked ``But where do capabilities come from?''

That question can mean a lot of different things:

  1. How are primitive objects created in a capability system?

  2. Where does the system as a whole get its capabilities from, and how does it get new ones?

  3. How are higher level objects, including user objects, created in a capability system?

  4. Where does a particular program in a capability system get its capabilities from, and how does it get new ones?

  5. How are capabilities stored?

Each of these questions has a simple answer, and each raises further questions that can be challenging. In this note I'll try to answer as many of them as I can.

Because the answers are much clearer with a concrete example, I rely heavily in this note on examples from EROS to illustrate these points. The ideas behind these examples can be used in other systems. For further information on EROS, you can have a look at the EROS Home Page.

1.  How Are Primitive Objects Created?

First, let me say what I mean when I talk about a ``primitive object.''

In real systems, there is some place where the rubber meets the road. Processes, files, directories, and so forth are abstractions provided by the system, but they are also ``objects'' that the system lets you manipulate. Ultimately, these objects are constructed out of other objects -- mainly disk storage.

Disk sectors, in contrast, are primitive objects. A disk drive manufacturer can tell you what they are made of, but as far as the operating system is concerned a disk sector is a basic building block. An indivisible, basic building block of this sort is a primitive object.

The choice of primitive objects, and the mechanism(s) used to name them, are among the most basic decisions in any operating system design. The following subsections illustrate two very different alternatives.

Some thoughts from Lewis Carroll on the subject of naming, identity, and confusing the two:

 

"The name of the song is called `Haddocks' Eyes.'"

"Oh, that's the name of the song, is it?" Alice said, trying to feel interested.

"No, you don't understand," the Knight said, looking a little vexed. "That's what the name is called. The name really is `The Aged Aged Man.'"

"Then I ought to have said `That's what the song is called'?" Alice corrected herself.

"No, you oughtn't: that's quite another thing! The song is called `Ways And Means': but that's only what it's called you know!"

"Well, what is the song, then?" said Alice, who was by this time completely bewildered.

"I was coming to that," the Knight said. "The song really is `A-sitting On A Gate': and the tune's my own invention."

Through the Looking Glass
Lewis Carroll

 

1.1  Primitive Objects in EROS

In EROS, the primitive objects are pages, nodes, and numbers. Pages and nodes are both low-level disk objects. Numbers represent integers in the range [0..2^96-1]. These objects are not composed out of more basic objects. There are also some objects (really services) that are implemented by the kernel that might be considered primitive. These kernel objects have no state.

These primitive objects need to be named by the system, and the names need to fit within a capability. It is desirable that capabilities be of some fixed size, so the name can only be a fixed number of bits long (in EROS it is 64 bits).

This tends to result in a somewhat odd view of things: all of the objects that are ever going to exist already exist when the system is first installed. Some of them (like pages on a disk that hasn't been attached yet) are going to take a really long time to respond if you try to access them.

Think of disk pages as being like electrons: there is some fixed number (however large) of electrons in the universe. Some of them are near at hand (like a connected disk) and some of them are pretty far away (like a disk we haven't added yet). We don't create new electrons (yet); we just rearrange the ones that we have in new configurations.

In EROS, then, primitive objects are neither created nor destroyed. Your access to them can be invalidated, but the object itself conceptually exists forever.
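One way to picture this view is the minimal Python sketch below. The names Capability, Universe, attach, and fetch are illustrative assumptions and not EROS interfaces; the sketch only shows that every name in a fixed-width OID space is a valid name, while the object behind it may not be reachable yet.

```python
from dataclasses import dataclass

OID_BITS = 64  # the note gives 64 bits as the size of an EROS name


@dataclass(frozen=True)
class Capability:
    """Hypothetical fixed-size capability: a type tag plus a fixed-width name."""
    kind: str  # "page" or "node"
    oid: int   # must fit in OID_BITS bits

    def __post_init__(self):
        if not 0 <= self.oid < 2 ** OID_BITS:
            raise ValueError("OID does not fit in a fixed-size capability")


class Universe:
    """Every OID names an object that conceptually exists; only some are attached."""

    def __init__(self):
        self._attached = {}  # oid -> contents, for storage that is connected

    def attach(self, oid, contents=b""):
        # Attaching storage does not create the object; it makes it reachable.
        self._attached[oid] = contents

    def fetch(self, cap):
        if cap.oid not in self._attached:
            # Stand-in for "takes a really long time to respond".
            raise TimeoutError(f"object {cap.oid:#x} is not on any attached disk")
        return self._attached[cap.oid]
```

Note that nothing in the sketch ever constructs or deletes an object; attach only changes which of the pre-existing names can be reached.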

1.2  Primitive Objects in Other Systems

EROS is persistent, so it is fairly easy to say what the primitive objects are and to account for what happens to them. Other systems do not adopt this approach, and their stories about primitive objects are greatly complicated as a result. While I use Amoeba as an example here, the questions raised should be asked about almost every operating system out there today.

The Amoeba distributed system, for example, is not persistent. When a program wants to create a new process, it performs an exec_process() call, which fabricates a new process from an executable file and returns a capability to the new process.

In Amoeba, the process is manufactured out of whole cloth. This raises a number of questions:

  1. From what storage was the process created, and by what authority was this storage allocated?

    Answers:

    The process was created from swap space, which is a finite resource that is not properly accounted for in most systems.

    No special authority was required to create the process, which means that I can crash the system by allocating too many processes.

    Adding space quotas doesn't really help, as it increases the number of operations in the system that can fail due to lack of space. A better design would require the creating program to present a capability for the space that will be used.

  2. The file presumably came from a file system, which raises some interesting security issues. How do we account for where the file storage came from and why we can trust the supplier? How do we know that the file we intended to run has not been replaced?
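The alternative suggested under the first question, in which the creating program presents a capability for the space to be used, might look roughly like the following Python sketch. SpaceBank, OutOfSpace, and create_process are hypothetical names chosen for illustration; this is not Amoeba's exec_process() and not the EROS space bank interface.

```python
class OutOfSpace(Exception):
    """Raised when a bank cannot cover a requested allocation."""


class SpaceBank:
    """Hypothetical source of storage with an explicit, finite budget."""

    def __init__(self, pages):
        self.free_pages = pages

    def allocate(self, count):
        if count > self.free_pages:
            raise OutOfSpace(f"bank has {self.free_pages} pages, need {count}")
        self.free_pages -= count
        return [bytearray(4096) for _ in range(count)]


def create_process(bank, image_pages):
    """Build a process only from storage the caller is authorized to spend."""
    pages = bank.allocate(image_pages)  # any failure is charged to the caller
    return {"pages": pages}
```

Because the storage comes out of a bank the caller holds, a program that allocates too many processes exhausts only its own bank rather than the whole system.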

In Amoeba, the capability to the process is made up on demand. The saving grace is that capabilities are not stored to the disk, so it suffices for the allocator (in this case, the operating system) to remember which object numbers have been used during the current execution.

2.  Where Does the System get its Capabilities From?

The answer to this question is system-specific, but most systems are pretty similar. There is some storage manager, which may be the kernel, and this storage manager has the authority to create capabilities to any primitive object. It is the responsibility of the storage manager to make sure that it doesn't give the same primitive object out to multiple parties.

The following is a description of how this actually works in EROS.

2.1  Adding a New Drive Under EROS

In EROS, when a new drive is connected to the system, a utility called the disk formatter is used to create EROS partitions and to define EROS ranges within those partitions. A range is a contiguous sequence of pages on the disk, and has a known starting object identifier (OID). Pages within the range are numbered sequentially starting from this object identifier. The disk formatter performs a high-level format on each of these ranges to make it suitable for use by EROS.

The disk formatter imposes some rules about the object identifiers. Either

  • They do not overlap any existing identifiers at all, in which case the new range is considered "new" storage.

or

  • They exactly match some existing range, in which case the new range is a duplex of the old range. Duplexing is a low-level function of the EROS kernel.
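The two rules can be written down as a small check. The following Python sketch is only an illustration: classify_range and the (start, length) representation are assumptions, not the disk formatter's actual code.

```python
def classify_range(existing, start, length):
    """Return "new" or "duplex" for a proposed range, or raise ValueError.

    existing is a list of (start, length) pairs for ranges already defined.
    """
    end = start + length
    # Two half-open intervals overlap when each starts before the other ends.
    overlaps = [(s, n) for (s, n) in existing if s < end and start < s + n]
    if not overlaps:
        return "new"
    if all(r == (start, length) for r in overlaps):
        return "duplex"
    raise ValueError("range partially overlaps existing storage")
```

A partial overlap is rejected outright because it satisfies neither rule.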

Under the covers, an EROS page capability contains an object identifier, a type (page or node) and a version number. The object identifiers come from a predefined space. In EROS, a page OID is a number in the range [0..2^64). When you invoke a page capability, the range table is consulted to find the appropriate range, and then a lookup is done within that range to find the actual page.
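The two-step lookup can be sketched in Python as follows. PageCap, Range, and RangeTable are hypothetical names, and the way the version number is checked here (a mismatch means the capability is stale) is an assumption made for illustration rather than a description of the EROS kernel.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class PageCap:
    oid: int
    kind: str     # "page" or "node"
    version: int


class Range:
    def __init__(self, start, length):
        self.start, self.length = start, length
        # One [version, contents] slot per object in the range.
        self.slots = [[0, b""] for _ in range(length)]


class RangeTable:
    def __init__(self):
        self._starts = []  # sorted starting OIDs
        self._ranges = []  # ranges, kept in the same order

    def add(self, rng):
        i = bisect.bisect_left(self._starts, rng.start)
        self._starts.insert(i, rng.start)
        self._ranges.insert(i, rng)

    def invoke(self, cap):
        # Step 1: find the range with the greatest start that is <= cap.oid.
        i = bisect.bisect_right(self._starts, cap.oid) - 1
        if i < 0 or cap.oid >= self._ranges[i].start + self._ranges[i].length:
            raise LookupError("no range holds this OID")
        rng = self._ranges[i]
        # Step 2: index within the range to find the object itself.
        version, contents = rng.slots[cap.oid - rng.start]
        # Assumed use of the version field: stale capabilities are rejected.
        if version != cap.version:
            raise PermissionError("capability has been invalidated")
        return contents
```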

Once a new range is defined, it is either handed to the master EROS storage manager (the prime space bank) for use as new pages, or a special sub-bank is created and handed back to the caller of the disk formatting utility. The second case is used for things like database programs that want to manage their own storage.

2.2  Fabricating New Pages

The next question to ask is: ``Where do the new page capabilities come from?''

The answer is that the disk formatter doesn't actually create any new page capabilities. The EROS storage manager (the prime space bank) holds a special capability known as a range capability. A range capability specifies a range of OIDs, and allows the holder to manufacture page and node capabilities with any OID in that range. If you like, you can think of the range capability as representing a set of all pages and nodes in the corresponding range.

The prime space bank holds a range capability for the entire range of OIDs. It avoids giving out capabilities to nonexistent pages by remembering which ranges are valid. It knows the list of valid ranges because it trusts the disk formatting utility.
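As a rough illustration of this arrangement, the Python sketch below models a range capability that can mint object capabilities for any OID it covers, and a bank that only mints within ranges it has been told are valid. All class and method names are assumptions, and only page allocation is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectCap:
    kind: str  # "page" or "node"
    oid: int


@dataclass(frozen=True)
class RangeCap:
    """Hypothetical range capability covering OIDs in [start, end)."""
    start: int
    end: int

    def make(self, kind, oid):
        if kind not in ("page", "node"):
            raise ValueError("unknown object kind")
        if not self.start <= oid < self.end:
            raise PermissionError("OID outside this range capability")
        return ObjectCap(kind, oid)


class PrimeSpaceBank:
    def __init__(self):
        self._range_cap = RangeCap(0, 2 ** 64)  # the entire OID space
        self._valid = []          # (start, end) pairs reported by the formatter
        self._handed_out = set()  # OIDs already given to some party

    def note_formatted_range(self, start, end):
        # The bank trusts the disk formatter to report only real storage.
        self._valid.append((start, end))

    def allocate_page(self):
        for start, end in self._valid:
            for oid in range(start, end):
                if oid not in self._handed_out:
                    self._handed_out.add(oid)
                    return self._range_cap.make("page", oid)
        raise MemoryError("no free pages in any valid range")
```

The bank's own bookkeeping, not the range capability, is what keeps it from handing out nonexistent pages or the same page twice.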

In summary, then: The system never gets any new capabilities.

3.  How are Higher Level Objects Created?

A higher level object is simply one that is built out of primitive objects. In EROS, these include processes, address spaces, system services, files, directories, and so forth. Since a higher-level object is built out of primitive objects, there is always some unique primitive object that we can use to identify the higher-level object. Two examples:

3.1  Process Creation in EROS

In EROS, a process is made up of some number of nodes (how many is architecture specific), one of which serves as the process root.

In some sense, you will note, we haven't actually created any objects when we make a process out of nodes. What has really happened is that we have agreed to pretend that these nodes constitute a process for the moment. A natural way to name the process is by taking the object identifier of the process root and creating a capability with a new type that tells the kernel to use the process abstraction instead of the node abstraction.

In EROS, this is exactly how a process capability is built. There is a kernel service called the process tool. Given a node capability, the process tool returns a process capability to that node. This effectively turns the node into a process root.
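In code, the essence of the process tool is a retagging of the same name. The Python sketch below is a simplified model under that reading; Cap and process_tool are illustrative names, not the kernel's actual interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cap:
    kind: str  # "node" or "process"
    oid: int


def process_tool(node_cap):
    """Return a process capability naming the same node as node_cap."""
    if node_cap.kind != "node":
        raise TypeError("the process tool accepts only node capabilities")
    # Same OID, different type: nothing new is allocated or built.
    return replace(node_cap, kind="process")
```

The original node capability is untouched, which is why both kinds of capability to the same node can exist side by side.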

Two observations about this design:

  • The EROS kernel is not harmed if both process capabilities and node capabilities exist at the same time for the same node. It therefore does not care if this is so.

  • A user program, however, might be harmed if a node capability exists to one of its processes. There is a standard system utility, called the process creator, that provides a guarantee against this to programs that need it.

A similar technique is used for address space objects, though in that case the protection of a process tool is unnecessary.

Note, however, that in the case of processes, the higher level object was not ``created'' in the usual sense: nothing is built from scratch when an EROS process comes into the world.

3.2  User-Defined Objects

In most capability systems (and for that matter in most microkernel systems), programs can define and implement new objects, which are often referred to as user-defined objects.

From the operating system's perspective, no new object is defined. A user-defined object is simply a capability that invokes the program that a process obeys. The operating system provides a special capability type, referred to as a start capability (EROS), or an entry port (most others). When this capability is invoked, the passed information is supplied to the program (as distinct from the process).

What the program does with the supplied information is entirely up to the program. Among other things, this allows programs to define new object types. To the holder of the start capability, it appears that they hold a capability to a new object.

In addition to a distinct capability type, a start capability typically includes some extra bits that are supplied to the program by the operating system when it is invoked. These bits can be used to distinguish one object from another, or to distinguish between multiple services or service interface versions implemented by the same program.
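A small Python model may help show how one program can present many objects this way. CounterServer, StartCap, and the request strings are invented for illustration; in a real system the operating system, not the capability object, would deliver the extra bits to the program.

```python
from dataclasses import dataclass


class CounterServer:
    """A single program that presents many counter 'objects'."""

    def __init__(self):
        self._counters = {}
        self._next_id = 0

    def new_counter(self):
        ident = self._next_id
        self._next_id += 1
        self._counters[ident] = 0
        return StartCap(self, ident)

    def handle(self, extra_bits, request):
        # The program alone decides what the extra bits mean.
        if request == "increment":
            self._counters[extra_bits] += 1
        elif request != "read":
            raise ValueError(f"unknown request {request!r}")
        return self._counters[extra_bits]


@dataclass(frozen=True)
class StartCap:
    """Hypothetical start capability: which program, plus extra bits."""
    _program: CounterServer
    extra_bits: int

    def invoke(self, request):
        return self._program.handle(self.extra_bits, request)
```

To a holder, two start capabilities from new_counter() look like two independent counters, even though the same program stands behind both.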

One tricky issue raised by user-defined objects is ``So what is an object?'' EROS takes the view that an object is something that has a well defined interface. While two capabilities can be compared for identity (they are the same thing), there is no means to compare them for equality (they do the same thing).

4.  How Does My Program Get Capabilities?

This question is the one that people generally seem to be asking when they ask where capabilities come from. The answer is simple, but people seem to have a hard time figuring out how it translates into practice.

In a capability system, a new program (we'll call it new) gets its capabilities in two ways:

  1. New receives an initial set of capabilities when it is created. These are supplied by new's creator (which is another program).

  2. If another program receives a capability to new, it can invoke new and pass it additional capabilities if it desires.
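Both routes can be modeled in a few lines of Python. This is a toy sketch: Program, spawn, and the capability names are hypothetical, and ordinary Python objects stand in for capabilities.

```python
class Program:
    def __init__(self, name, initial_caps):
        # Route 1: the creator supplies the initial set.
        self.name = name
        self.caps = dict(initial_caps)

    def invoke(self, message, passed_caps=None):
        # Route 2: an invoker may pass more capabilities with its message.
        self.caps.update(passed_caps or {})
        return f"{self.name} handled {message!r}"


def spawn(creator, name, grant):
    """Create a program holding only capabilities the creator itself holds."""
    missing = [k for k in grant if k not in creator.caps]
    if missing:
        raise PermissionError(f"creator does not hold: {missing}")
    return Program(name, {k: creator.caps[k] for k in grant})
```

In this model there is no third route: a program can only end up with capabilities that some other program already held and chose to hand over.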

4.1  Example: Modeling UNIX Authority

This may not seem like a lot of flexibility, so let's look at how this could be used to implement the authorities that a UNIX program normally holds. Since UNIX has no ability to transfer capabilities (or at least none that is widely used), all of this authority comes from the initial authorities.

A UNIX program has the following authority:

  • Access to a file system root (/), which is like a capability to a directory object.

  • Access to a ``current working directory'', which is also like a capability to a directory object.

  • The right to send signals to its children (a capability set) and in some cases to members of its process group and terminal group (also capability sets). Since a UNIX program does not have other authorities over such programs, these sets must be wrapped by ``signal manager'' objects, and the UNIX program is given a capability to the signal manager.

    Note that the UNIX emulator can add and remove things from these sets to model the creation and destruction of new processes.

  • Access to a source of storage from which page allocations are satisfied and new processes are created (a space bank).

  • Access to the network subsystem.

  • A process holds file descriptors to open files, which again are like capabilities to file objects.
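The ``signal manager'' wrapper from the list above can be sketched as follows. This is an illustrative Python model with invented names (Process, SignalManager, deliver); Python does not enforce the privacy of _members, so the sketch shows only the intended shape of the interface, not real protection.

```python
import signal


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.received = []
        self.memory = bytearray(16)  # authority the signaller must not obtain

    def deliver(self, signum):
        self.received.append(signum)


class SignalManager:
    """Wraps a set of process capabilities and exposes only 'send a signal'."""

    def __init__(self):
        self._members = {}

    # Called by the UNIX emulator as processes are created and destroyed.
    def add(self, proc):
        self._members[proc.pid] = proc

    def remove(self, pid):
        self._members.pop(pid, None)

    # The only operation meant for the holder of the manager capability.
    def kill(self, pid, signum):
        if pid not in self._members:
            raise PermissionError(f"no authority to signal pid {pid}")
        self._members[pid].deliver(signum)
```

The emulated program would hold a capability to the manager only, so it can signal its children without gaining any other authority over them.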

5.  How are Capabilities Stored?

Finally, we get to the last question, which is how capabilities are to be stored. Usually, the real question behind this question is: ``Where do capabilities go when the system is shut down, and how can I arrange to get them back?''

In a persistent system like EROS, processes do not go away until they exit voluntarily. There is no need to take special measures to get capabilities back: they are never lost in the first place.

In a non-persistent system like Amoeba, some arrangement must be made to store capabilities in a permanent store. In non-persistent systems, this is usually accomplished with encrypted capabilities, which can be stored as normal data.
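To give a feel for capabilities that can be stored as normal data, here is a minimal Python sketch. It substitutes a keyed hash (HMAC-SHA256) for whatever protection a given system really uses, so it is not Amoeba's actual scheme; the field layout, mint, validate, and SERVER_SECRET are all assumptions made for illustration.

```python
import hashlib
import hmac
import os
import struct

# Known only to the server that implements the objects. A real server would
# have to keep this secret across restarts for stored capabilities to survive.
SERVER_SECRET = os.urandom(32)

BODY = struct.Struct(">QB")  # 8-byte object number, 1-byte rights mask
TAG_LEN = 16


def mint(object_number, rights):
    """Return a capability as plain bytes: body followed by a check tag."""
    body = BODY.pack(object_number, rights)
    tag = hmac.new(SERVER_SECRET, body, hashlib.sha256).digest()[:TAG_LEN]
    return body + tag


def validate(cap_bytes):
    """Return (object_number, rights), or raise if the bytes were not minted here."""
    body, tag = cap_bytes[:BODY.size], cap_bytes[BODY.size:]
    expected = hmac.new(SERVER_SECRET, body, hashlib.sha256).digest()[:TAG_LEN]
    if len(cap_bytes) != BODY.size + TAG_LEN or not hmac.compare_digest(tag, expected):
        raise PermissionError("forged or corrupted capability")
    return BODY.unpack(body)
```

The resulting bytes can be written to any file and read back later, which is exactly what makes protecting those files the central problem.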

The main problem in non-persistent systems is making sure that the wrong people cannot then read the files and thereby get their hands on the capabilities. In the end, non-persistent systems are forced to resort to some sort of access list based mechanism to protect the file system, which is insecure, for reasons described here. This is why EROS is persistent.

UNIX systems get around this problem by not letting you store capabilities at all. Open file descriptors are simply lost when the system crashes; the corresponding files must be reopened by way of the file system.


Copyright 1999 by Jonathan Shapiro. All rights reserved. For terms of redistribution, see the GNU General Public License