|
UP
EROS Web Developer Documentation |
A Programmer's Introduction to EROSNote: This document is a work in progress. It's current contents are mostly correct, but it remains subject to substantial content revision. It should stabilize by 2/7/96. This document provides an introduction to the EROS system from a programmer's point of view. If you are building programs that will run under an operating environment, such as Dionysix, see the documentation for that operating environment. This note does not try to provide an exhaustive description of the EROS environment. Instead, it tries to provide enough information to get you started, leaving details to be addressed by other documents. 1. ProcesssApplications in EROS are typically constructed out of several independent processes, which we call processs. A process consists of:
Each process implements a specific, well-defined collection of related services. Each process carries exactly the authority that it requires to perform its function and no more. While it is common to build several new processs for each application, the EROS system comes with a toolkit of useful processs. Some of these provide high-level facilities such as sorting and data management. Others implement various supporting policies for such things as page fault handling. In addition to constructing new components, an application designer will typically make use of several such ``off the shelf'' processs in constructing an application. Processs are more than just libraries. In addition to code and data, each process has the ability to remember information from one session to the next, and runs in it's own protected environment. Services implemented using processs can run in parallel with the application they serve. We refer to such protected, independent services as active persistent agents. They are one of the keys to EROS's reliability. Placing a service in a seperate process has several advantages:
The primary disadvantage is that invoking a process is somewhat more expensive than performing a library call. 1.1. PersistenceAll objects in EROS, including processs, are persistent. If the system is shut down, or for some reason crashes, the state of all running programs is preserved by the checkpoint mechanism. The EROS system periodically saves a consistent copy of the entire system state to disk. When the system restarts, it resumes from the most recently saved system image. Typically, no more than 5 minutes work is lost in such a failure. A consequence of persistence is that processs exist forever. Until a process is explicitly dismantled, the services it implements remain available to clients with suitable keys. Persistent processes enable a whole new aproach to application design. Instead of building shared libraries, common services in EROS are placed in seperate, persistent processs. Because processs are persistent, there is no need to recreate them each time an application is started, which would be necessary in a non-persistent system. In addition, fault isolation and security are greatly improved by persistence. For example, EROS programs can hold authorities that are different from the user's authorities (see The Confused Deputy for a discussion of why this is so important). 1.2. Views of ProcesssFrom the programmer's perspective, there are (at least) two useful views of processs:
The first view is the traditional view of application programs: a UNIX application, for example, is a program that invokes operating system and other services to perform some (hopefully useful) task for a user. EROS applications are no different. When the user interface software starts a new application, such as a text editor, it does not wait for that application to return to the user interface program. Application processs may retain a great deal of state, but are typically dismantled when the user is done with them. The vast majority of processs, however, are designed to operate as services, and retain minimal state from one invocation to the next. A service process is typically a loop of the form: forever wait for a request process the request, possibly invoking other processs Service processs remain available for invocation indefinitely. Even shutting down the machine and restarting it will not clear out the state of a service process. This means that process code must be written correctly; simply assuming that the program will exit before you get into trouble will not do. 1.3. Process StatesIn addition to providing services, processs provide a mechanism for synchronization. A process can be occupied by at most one thread. If a client invokes a process that is currently running, the client blocks until the process becomes available. More precisely, processs can be in one of three states:
A more detailed discussion of key invocation and process states can be found below. 1.4. Process KeepersOnce in a while, the best of programs will take exceptions. Some of these, such as misaligned reference faults, can be handled with software emulation. Others indicate a flaw in the process that must be corrected by hand. The EROS kernel does not directly handle such exceptions. Instead, it reports them to the process's process keeper. Most processs have an associated keeper. The basic idea behind keepers is that exceptions should be recoverable. The kernel knows enough to be able to shoot the process in the head, but not enough to be able to fix it and allow it to proceed. Since a distinct keeper can be established for each program, it is possible to construct a more knowledgeable fault handler that might be able to recover the process to a reasonable state. In practice, most faults delivered to the process keeper are address space access faults, which are described below along with the description of address spaces. 2. Keys and ObjectsA key grants the authority to use a particular service or manipulate an object. Keys can be loosely broken into a small number of categories:
Most programs use only object and start keys. A few of the more commonly used objects and their keys are described below. A discussion of address spaces and start keys can be found in subsequent sections. 2.1. Number KeysA number key contains an 96 bit unsigned quantity. Number keys are used to describe offsets into memory address spaces, machine register values, and what have you. The basic operations on a number key are to set its value and retrieve its value. 2.2. PagesA page is the basic unit of data storage in EROS. Every page holds an architecture-defined number of bytes (4096 on most implementations), and is manipulated through a page key. The holder of a page key can install the page into a larger address space, or directly manipulate the content of the page by invoking the page key. Address Spaces are discussed below. Page keys exist in both read-write and read-only forms. A read-write page key conveys the right to read or modify the contents of a page of storage. A read-only page key can read, but not modify, the content of a page. 2.3. NodesJust as a page holds a fixed number of bytes, a nodes holds a fixed number (currently 16 on all implementations) of keys, and is manipulated through a nodes key. As with page keys, node keys exist in read-write and read-only variants.
Unlike pages, the content of a node is not directly
accessable to the programmer. Where a page can be memory
mapped and manipulated directly with
In addition, there is a very closly held key called keybits that allows the bits of a key to be examined directly. This key is primarily used for debugging the system itself, and does not permit the keys to be modified. 2.3.1. Sensory KeysA read-only node key does not permit a node to be modified, but if that node contains a read-write key, the read-write key can be copied into the application's key register and used to modify the object it denotes. This is analogous to extracting a read-write pointer from a read-only structure in a language like C. Sometimes it is necessary to hand another party a key to a node with an assurance that no matter what they do they will be unable to obtain a read-write key through that initial key. To address this need, the EROS kernel provides a key known as a sensory key. Sensory keys denote nodes, and are similar to read-only keys, but any read-write key returned by a sensory key will first be weakened. If it is a read-write node key, a sensory key to that node will be returned instead. Read-write page and address space keys will be weakened to read-only keys. Number keys are copied unchanged.
3. Address SpacesA address space provides a mechanism for assembling pages into a larger, addressable pool. The current EROS implementation supports address space trees of up to 264 bytes, and 64 bit implementations will probably increase this limit (the current address space structure can handle 2128 byte address spaces, but the number key size allows only 96 bits of offset to be requested). 3.1. Address Space TreesAn address space tree is a tree of nodes whose leaves are pages. For example, a two page address space tree consists of a single node containing page keys in slots 0 and 1:
Similarly, a 19 page address space tree is made up of three nodes and 19 pages. This one has a "hole" in it; pages 17 through 30 (inclusive) are not present.
In addition to page, node, and address space keys, a slot in an address space node may contain a number key. A zero number key is used to identify an absent subaddress space, such as the hole in the above diagram. Other number keys serve as window keys, which are described below. 3.1.1. Short-Circuit Address Space TreesEvery key to an address space tree has embedded within it the size of the address space it names. More precisely, the key contains the biased log of the address space size. The biased log is defined as (log16(size) - 1). This isn't as hard to compute as it might seem. If an address space contains 220 bytes, the biased log of the address space size is (20/4 - 1) or 4. The blss value can be used to allow address space trees to skip levels in certain cases. Suppose that you are building the address space tree for an address space having two code pages starting at virtual address 0x1000 and a single page of data starting at 0x80000000. In the absence of short circuiting, this would require an address space tree with 9 nodes:
By taking advantage of the short circuiting feature, nodes whose only valid slot is slot 0 can be removed:
Note that the BLSS values of the keys in the uppermost node are 3! This is what tips off the kernel that the address space has been short-circuited. When an access is performed to a short circuited tree, the offset bits corresponding to the "missing" nodes must be zero. If they are not, the address space's keeper (if any) is invoked to handle the error as though the offset was to an invalid page. 3.2. Access RightsTo access an address space, the program provides an offset into that address space, a byte count, a data buffer, and the operation that is to be performed (read or write). This information is bundled up into a message (described below), and the key to the address space is invoked with the message. To determine if the access is permitted, the kernel treats the offset as a path through the tree, and walks down the tree along that path. As it traverses each node, the kernel uses the next 4 bits out of the offset to choose which slot to follow. Ultimately, the kernel arrives at a valid page key or a zero number key. At this point, it has to determine if the access desired is allowed. When the kernel starts the address space traversal, read-write access is assumed. If any key along the path, including the initial address space tree key, does not permit write access, the access authority for the operation is downgraded to read-only. If a zero number key is found, the offset is invalid, and the address space's keeper is invoked. If a page key is found, the operation is performed if the user has sufficient authority. In either case the program will receive a reply message indicating the result of the operation. 3.3. Types of Address Space KeysAn address space is constructed out of nodes and pages. Once an address space is constructed, there are two sorts of keys that can be handed to clients: node keys and address space keys. Given a node key to an address space tree, one can ask for an address space key for the same tree. A node key supports all of the address space protocol, but also allows a client to examine the internal construction of the address space. Using the node key, the client can traverse the address space "by hand" to examine it's content. If read-write authority to the address space is permitted, the client can also modify the structure of the address space. An address space key, by contrast, allows only address space accesses to the address space tree. The client can request data from a particular offset in the address space, but is unable to traverse or modify the internal structure of the address space. For this reason, the common convention is to construct the address space tree out of nodes, and hand the client an address space key to the root of the tree. 3.4. Kept Address SpacesWe have mentioned that an address space can have a keeper, but so far have given no indication as to how the keeper is identified. Keepers are identified by a special sort of address space tree node known as a kept address space node. A kept address space key is distinguished from an address space key by a bit in it's key data field. As with all other nodes in an address space tree, the key to a kept address space node can be either an address space key or a node key. The contents of a kept address space node are interpreted according to a format key that is found in slot 15. The format key contains three currently defined fields:
If an invalid access to an address space is made (invalid page or access violation), the kernel encapsulates a description of the fault into a message, and sends this information to the keeper. The keeper receives the type of the invalid access (invalid, access violation), a key to the address space in question, and a number key giving the offset into the kept address space at which the access occurred. In addition, the kernel synthesizes a special key called a restart key to the process that took the fault. Using the restart key, the address space keeper can either resume the process as though it had not faulted or cause the fault information to be passed to the process keeper. The restart key does not allow the address space keeper to send a message to the process itself. It is possible for kept address spaces to nest. For example, one might define a region of an address space to have "heap" semantics, in which pages will be appended automatically by the keeper as they are referenced. Such a heap would be a part of the larger data region of the program, and the entire data region might be subject to copy-on-write semantics. Though you might expect that it would be done the other way, access faults are reported to the outermost keeper. The reason for this is simple: it is possible for the keeper to pass the fault on to an inner keeper, but not vice versa. If no address space keeper can be found that keeps the address space in which the access fault occurred, the acccess fault is reported to the process keeper. 3.5. Window KeysAs mentioned above, number keys play multiple roles in address spaces:
In order to be a valid window key, the offset must conform to certain constraints. Specifically, if the window key is found in a node that spans a 2n byte address space, the offset must be a multiple of 2n-4. Conceptually, a node whose BLSS is B is expected to contain keys whose BLSS is B-1, and the window key must describe a window at a suitable offset. Constraining the offset in this way does not in practice induce substantial complexity on keepers, and ensures that it possible to share mapping subtables. Use of number keys other than zero number keys and valid window keys is strongly discouraged, as new sorts of windows may be desirable in the future. One use of background address spaces is to support copy-on-write address spaces. The new address space (i.e. the copy) is crafted as an initially empty kept address space, with a read-only background key to the original address space. Initially, the only content of the new address space is a background window key in slot 0 referencing the original address space. Whenever a write reference occurs, the client will take an access violation, and the address space keeper will expand the kept address space as necessary. It installs background keys to smaller and smaller windows on the original address space, and replaces the written pages with read-write page keys to copies of the original address space's pages. Note: the above description is flawed, because once the background address space is read-only it becomes impossible to tell which pages ought to have been read-write. While it is possible to store a true address space key in a non-initial slot of the kept address space, there must surely be a better way. Inclusion of read-write, read-only, and don't-call attributes for window keys must be decided. 4. Messages and Key InvocationA service is used by invoking a key to that service. Invoking a key is done by sending a message to the service denoted by that key. A message contains:
In contrast to other microkernels, EROS messages are fairly minimal. There is no way to directly send more than a single page of data or more than 4 keys. In practice, this minimal message is more than adequate for the vast majority of invocations. If more data must be sent, the sender can construct an address space and send a key to the address space instead. If more keys need to be sent, a node containing those keys can be transmitted. In addition, there is a commonly implemented convention (which is beyond the scope of this document) for transmitting larger buffers by ping-ponging calls between the sender and the receiver. 4.1. Invocation TypesEROS implements three types of invocations: call, return, and send. 4.1.1. Call InvocationsThe call invocation can send only three keys, because the kernel generates a resume key and places it in the fourth key slot of the message. A unique return key is generated for each call, and the kernel guarantees that every call will receive at most one return. Return keys are, in effect, self-consuming. If any copy of a given return key is invoked all others are effectively invalidated. If the recipient is running or waiting for another process, the caller will block until the recipient becomes available. Along with the caller's message, the recipient recieves a copy of the data byte from the service key, indicating which service was invoked. In addition to specifying the location and size of the keys and data to be sent, the caller specifies where any keys and data returned from the service should be placed. When the service returns, it's returned information will be placed in the buffer and key slots specified by the caller.
4.1.2. Return Invocations
4.1.3. Send Invocations
As an example, the send invocation is used when the user interface wishes to initiate a new application. Note that the send invocation has the effect of creating a new, independent thread of execution. The invoking thread remains in it's original process, and a new thread is allocated for the recipient. 4.2. Not Just Client ServerThe call and return invocations are named for their most common usage, which is to handle client/server interactions between two processs, but this is not the only way in which these invocations can be used. In particular, a call can be performed on a resume key, and a return can be performed on a start key. This has some very useful implications. 4.2.1. Multithreaded DispatchOne example is implementing a user level multiplexor. Suppose you have some service that must be low-latency. Because a process can only service one caller at a time, execution will serialize. This may not meet the needs of the application. To handle this problem, you decide to build a multithreaded service. To do so, you construct several processs sharing a common address space. If a way can be found to dispatch callers to a free process, delay will be reduced. What is needed is a dispatcher process to route calls to available processs. To implement such a dispatcher process, you simply add one more process to the pool, and reserve a common page in memory. This page has N words (one for each server thread). Just before the dispatcher starts one of the service threads it sets the word to zero to indicate that the service thread is in use. Just before returning to the user, the service thread sets it back to a nonzero value to indicate that it is now available. The dispatcher itself will run very quickly, but it would be nice to let the service threads return directly to the caller rather than returning through the dispatcher. With clever use of call, send, and return this is fairly easy to do.
5. Escrow AgentsOne of the important features of the EROS system is the ability to provide controlled and secure access to programs and data. Escrow Agents provide one of the mechanisms for doing this. Escrow agents provide a new and unique solution to the Confinement Problem. Simply stated, the confinement problem is this: Given an untrusted program executing on behalf of a user, ensure that the untrusted program is unable to divulge information to an unauthorized third party. In practice, we implement a transitive version of this requirement. The result is the same, but slightly more useful: Given a subsystem, possibly made up of multiple program, executing on behalf of a user, demonstrate that the subsytem is unable to divulge information to any party not authorized by the user. We assume that the client user is trusted by themselves. An EROS escrow agent is a trusted process. The provider of an untrusted program or object installs its pieces into an escrow agent along with an assembler. When the user wishes to obtain a confined version of the untrusted program, they ask the escrow agent to build one by applying the assembler to the components. In order to show that the resulting program is confined, the escrow agent must show that
Provided that both of these constraints are satisfied, the escrow agent is able to certify that the constructed service is confined. The escrow agent holds sufficient authority to determine whether the pieces of the untrusted object are able to be a path for information disclosure. To do this, it uses some simple rules:
To determine that the assembler is discreet, the escrow agent simply asks (transitively) the escrow agent that built it. The EROS system includes a few trivial assemblers that are trusted. This mechanism allows an untrusted provider to provide an arbitrarily complex service to a user while ensuring that the service is fully confined. The mechanism is a more general variant of the KeyKOS factory mechanism. 6. Some Commonly Used ProcesssAs mentioned at the start of this document, there are some commonly used processs that provide generic services for use by application builders. Detailed documentation of these processs is beyond the scope of this document, but a brief rundown of a random sample may be useful as an indication of what is available.
This is not a complete list of services. Others will be added as time goes on, and an object manual will be written to describe them. Copyright 1998 by Jonathan Shapiro. All rights reserved. For terms of redistribution, see the GNU General Public License |