EROS: A Novel Combination
EROS combines an unusual collection of facilities into a single package, hopefully in a novel way. Each of these facilities is, in our view, essential to providing scalable reliability, and all of them have appeared in prior systems. No prior system, however, has integrated this particular combination of features in quite the same way.
Architecturally, the EROS system is most closely descended from an earlier system known as KeyKOS. We maintain a KeyKOS Home Page containing a variety of public KeyKOS documents.
The key features of the EROS system are:
Each of these issues is discussed in greater detail in the sections below.
EROS is a pure capability system. A capability uniquely identifies an object and a set of access rights. Processes holding a capability can perform the operations permitted by those access rights on the named object. Holding a capability is a necessary and sufficient condition for accessing the associated object with the authority granted by that capability. There is no other way to perform operations on an object.
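As a rough illustration of the model just described, the following Python sketch pairs an object reference with a set of rights and makes possession of that pair the only check performed. The names `Page`, `Capability`, and `invoke` are hypothetical and are not part of the EROS interface. Real capabilities are protected by the kernel, which a Python object can only imitate.

```python
from dataclasses import dataclass


class Page:
    """A stand-in for a kernel object (hypothetical; not an EROS type)."""

    def __init__(self):
        self.data = bytearray(16)

    def read(self, offset):
        return self.data[offset]

    def write(self, offset, value):
        self.data[offset] = value


@dataclass(frozen=True)
class Capability:
    """Names one object together with the rights granted on it."""
    target: object
    rights: frozenset


def invoke(cap, operation, *args):
    """Perform an operation if, and only if, the capability permits it.

    No user identity is consulted: holding the capability is the whole check.
    """
    if operation not in cap.rights:
        raise PermissionError(f"capability does not grant {operation!r}")
    return getattr(cap.target, operation)(*args)


page = Page()
read_write = Capability(page, frozenset({"read", "write"}))
read_only = Capability(page, frozenset({"read"}))

invoke(read_write, "write", 0, 42)
value = invoke(read_only, "read", 0)      # allowed: "read" is in the rights
try:
    invoke(read_only, "write", 0, 7)      # refused: "write" is not granted
    write_refused = False
except PermissionError:
    write_refused = True
```

Both capabilities name the same page; they differ only in the authority they carry.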
One advantage to the capability approach is that the EROS kernel does not need to support any notion of user identity. The login agent hands each user their initial authorities, from which they can access whatever objects are (transitively) reachable.
Most capabilities can be rescinded. For example, a process holding access to a terminal port loses its authority on that port each time the system is restarted. This is necessary to ensure that connections are re-established when appropriate.
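The text above does not say how rescission is implemented. One way to picture it, purely as an assumption made for illustration, is a version check: the capability records the port's "epoch" when it is issued, and a restart advances the epoch so that older capabilities stop working. `TerminalPort`, `PortCapability`, and `epoch` are hypothetical names.

```python
class TerminalPort:
    """Hypothetical device object whose connections do not survive a restart."""

    def __init__(self):
        self.epoch = 0          # advanced on every restart (assumed mechanism)
        self.output = []

    def restart(self):
        self.epoch += 1         # every capability issued earlier is now stale
        self.output.clear()


class PortCapability:
    """Authority over a port, valid only for the epoch in which it was issued."""

    def __init__(self, port):
        self._port = port
        self._epoch = port.epoch

    def is_valid(self):
        return self._epoch == self._port.epoch

    def write(self, text):
        if not self.is_valid():
            raise PermissionError("capability was rescinded by a restart")
        self._port.output.append(text)


port = TerminalPort()
old_cap = PortCapability(port)
old_cap.write("hello")

port.restart()                          # the old authority is rescinded here
new_cap = PortCapability(port)          # the connection must be re-established
new_cap.write("hello again")
```

After the restart, `old_cap.write(...)` raises `PermissionError`, which forces the holder to reconnect rather than write to a port whose state has been reset.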
A common confusion about capabilities is that they are incompatible with more conventional protection models. While the EROS kernel knows nothing about user identity, user domains (processes) are free to implement whatever authentication mechanisms they wish. The EROS UNIX emulator, for example, implements the customary UNIX semantics based on user identity.
The basic idea of orthogonal global persistence is quite simple: on a periodic basis, or when requested by an authorized application, a consistent snapshot of the entire system state is taken. This consistent snapshot includes the state of all running programs, the contents of main memory, and any necessary supporting data structures.
Global persistence means that the state of all processes is captured at one instant; in the event that the system is forced to recover from the snapshot, all applications are consistent with respect to each other. Several research systems provide the ability to snapshot a single process. EROS and a few others snapshot the entire system state.
Orthogonal persistence means that applications do not need to take any special action to be a consistent part of the snapshot. The EROS UNIX emulator, for example, runs unmodified UNIX binaries. When EROS performs a checkpoint, these programs are checkpointed without their ever being aware that a snapshot was taken. Perhaps more important, no special action is needed by these applications to recover if the system fails.
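The two properties can be pictured with a toy model. In the Python sketch below, a hypothetical `System` copies the state of every process in a single step (global), and the processes contain no checkpoint code of their own (orthogonal). All names are assumptions made for illustration, and `copy.deepcopy` stands in for whatever mechanism a real kernel would use.

```python
import copy


class Process:
    """Hypothetical process: just a name and some private state."""

    def __init__(self, name, **state):
        self.name = name
        self.state = dict(state)


class System:
    def __init__(self, processes):
        self.processes = {p.name: p for p in processes}
        self._snapshot = None

    def checkpoint(self):
        # Copy every process in one step, so the saved states are mutually
        # consistent. The processes themselves take no action.
        self._snapshot = copy.deepcopy(self.processes)

    def recover(self):
        # Roll the whole system back to the last snapshot.
        if self._snapshot is None:
            raise RuntimeError("no checkpoint has been taken")
        self.processes = copy.deepcopy(self._snapshot)


def transfer(system, amount):
    """Move credit from one process to the other in two separate steps."""
    system.processes["payer"].state["balance"] -= amount
    system.processes["payee"].state["balance"] += amount


def total(system):
    return sum(p.state["balance"] for p in system.processes.values())


system = System([Process("payer", balance=100), Process("payee", balance=0)])
transfer(system, 30)
system.checkpoint()                  # consistent state: 70 + 30
transfer(system, 50)                 # work done after the snapshot
system.recover()                     # simulated failure and restart
```

After recovery the second transfer is lost, but the two processes still agree with each other: the combined balance is unchanged. Snapshotting each process at a different moment could not guarantee that.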
EROS persistence is a kernel service. While this makes the EROS kernel larger than non-persistent kernels such as QNX, non-persistent kernels do not provide a comparable means of transparent recovery.
A (true) story about KeyKOS may provide some sense of the value of orthogonal persistence:
At the 1990 UniForum vendor exhibition, Key Logic, Inc. found that their booth was next to the Novell booth. Novell, it seems, had been bragging in their advertisements about their recovery speed. Being basically neighborly folks, the Key Logic team suggested the following friendly challenge to the Novell exhibitionists: let's both pull the plugs, and see who is up and running first.
Now one thing Novell is not is stupid. They refused.
Somehow, the story of the challenge got around the exhibition floor, and a crowd assembled. Perhaps it was gremlins. Never eager to pass up an opportunity, the KeyKOS staff happily spent the next hour kicking their plug out of the wall. Each time, the system would come back within 30 seconds (15 of which were spent in the BIOS PROM, which was embarrassing, but not really Key Logic's fault). Each time Key Logic did this, more of the audience would give Novell a dubious look.
Eventually, the Novell folks couldn't take it anymore, and gritting their teeth they carefully turned the power off on their machine, hoping that nothing would go wrong. As you might expect, the machine successfully stopped running. Very reliable.
Having successfully stopped their machine, Novell crossed their fingers and turned the machine back on. 40 minutes later, they were still checking their file systems. Not a single useful program had been started.
Figuring they probably had made their point, and not wanting to cause undeserved embarrassment, the KeyKOS folks stopped pulling the plug after five or six recoveries.
In the end, the issue comes down to this.
Suppose you had perfect software and hardware (if you find some, be sure to let us know). Even so, your computer will fail four to five times a year due to random background radiation.
So when your computer fails, do you want to be told that all your files are intact and you can now resume your painstaking work (having lost your latest session), or would you rather have all of your windows (complete with word processor, web browser, and solitaire) reappear with a few minutes' work lost? Take your pick.
The EROS kernel is multithreaded. While this isn't obvious to an application user, it has several useful consequences to kernel and application developers:
The traditional objection to this kind of design has been the cost of context switches. Context switch cost can be divided into three parts:
In spite of the apparent expense, threaded drivers offer a performance advantage over re-entrant drivers. Each time it is entered, a re-entrant driver must check a series of conditions to determine the state of the hardware. This typically requires tens of instructions. By contrast, reloading the register set of the i486 and Pentium processors takes on the order of 10 cycles. While less dramatic, an advantage still exists on processors with larger register sets (e.g., MIPS, SPARC, Alpha).
EROS avoids the cost of switching address spaces by running driver threads "parasitically." Kernel threads run entirely within kernel space, and the EROS kernel is mapped into every process address space at a common virtual address. As a result, a context switch from a user thread to a kernel thread does not require an address space change. Since the common case is that the same user thread is resumed when the kernel thread has completed, this is a crucial optimization.
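The effect of this arrangement can be shown by counting address-space changes over a sequence of context switches. The Python sketch below is a toy model, not EROS code: `Thread`, `Cpu`, and the address-space labels are hypothetical, and a kernel thread is modeled simply as a thread with no address space of its own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thread:
    name: str
    # Kernel threads have no address space of their own in this model,
    # because the kernel is mapped into every process address space.
    address_space: Optional[str] = None


class Cpu:
    """Counts address-space changes over a sequence of context switches."""

    def __init__(self, first):
        self.loaded_space = first.address_space
        self.space_switches = 0

    def switch_to(self, thread):
        if thread.address_space is None:
            return                      # kernel thread: keep the current mapping
        if thread.address_space != self.loaded_space:
            self.loaded_space = thread.address_space
            self.space_switches += 1


editor = Thread("editor", address_space="editor-space")
shell = Thread("shell", address_space="shell-space")
disk_driver = Thread("disk-driver")     # kernel thread

cpu = Cpu(editor)
cpu.switch_to(disk_driver)              # user -> kernel: no change
cpu.switch_to(editor)                   # same user thread resumes: no change
common_case = cpu.space_switches

cpu.switch_to(disk_driver)
cpu.switch_to(shell)                    # a different user thread: one change
```

In the common case (user thread, driver thread, same user thread) the count stays at zero. An address-space change is paid only when a different user thread is resumed.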
Security experts worry greatly about confinement. Confinement is the process of making sure that information does not leak to unauthorized users.
While confinement remains a crucial problem in secure and mission-critical applications, the confinement problem is not currently very important in general computing. The on-going fall in hardware prices, coupled with the shift to client-server technology, has made this problem fairly easy to solve. To build a secure server, buy a dedicated machine, turn off all network services except the one you intend to run, and let that service perform its own authorization.
Given this approach, the remaining security hazards come from three sources:
Hardware bugs present important difficulties that have received inadequate attention. Implementation bugs can usually be located by suitable quality assurance procedures, rapidly isolated, and easily fixed. Misused authority errors are extremely subtle, take a long time to locate, and sometimes require redesign of the software. A clear example of a misused authority problem is presented in The Confused Deputy.
Because EROS domains (processes) are first-class objects, they are able to hold capabilities that are independent of the user's capabilities. EROS has no equivalent of the UNIX superuser authority. Capabilities refer to a specific object, so there is little risk of abuse in a typically-designed domain. The EROS equivalent to sendmail, for example, holds specific authorities for mail boxes and the sendmail configuration files. No matter how badly implemented sendmail may be, it is provably impossible for it to modify any other objects in the system.
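The claim about the mail agent is a statement about reachability: a domain can affect only the objects it can reach by following the capabilities it holds. The Python sketch below computes that set for a hypothetical mailer. The object names are invented for illustration, and access rights are ignored so that only reachability is shown.

```python
class Obj:
    """Hypothetical object that may itself hold capabilities to other objects."""

    def __init__(self, name, holds=()):
        self.name = name
        self.holds = list(holds)


def reachable(domain):
    """Return the names of all objects transitively reachable from a domain."""
    seen = set()
    stack = list(domain.holds)
    while stack:
        obj = stack.pop()
        if obj.name in seen:
            continue
        seen.add(obj.name)
        stack.extend(obj.holds)
    return seen


inbox_alice = Obj("mailbox:alice")
inbox_bob = Obj("mailbox:bob")
mailboxes = Obj("mailbox-directory", holds=[inbox_alice, inbox_bob])
config = Obj("mailer-config")
password_db = Obj("password-db")    # exists, but the mailer holds no capability to it

mailer = Obj("mailer", holds=[mailboxes, config])
mailer_authority = reachable(mailer)
```

However the mailer's code misbehaves, `password-db` is outside `mailer_authority`, because no chain of capabilities leads to it. There is no global namespace through which the mailer could look it up.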
4.1. The Renaissance of Confinement
While confinement is not a terribly active problem in general computing today, we are moving inevitably to a world in which it is a vital concern. Today's example is information services, which are profitable only if they can control what you access.
In the next few years, however, we expect to see a rise in algorithm services. Where an information service provides access to a data set, an algorithm service runs a program on demand, possibly on your data, but more likely on third party data.
As an example, consider an investment advisory program. You (the user) propose a set of investments with buy and sell dates, and the program makes a prediction as to the expected profit from the transaction. There are two important privacy issues in such a transaction:
Neither side is willing to trust the other with their proprietary information, but both sides wish to engage in the transaction. The EROS escrow agent mechanism provides a way to do this that guarantees security to both sides.
Copyright 1998 by Jonathan Shapiro. All rights reserved. For terms of redistribution, see the GNU General Public License