Re: Architecture of Backing Store Descriptors William S. Frantz (frantz@netcom.com)
Wed, 23 Nov 1994 17:39:57 -0800 (PST)

Now that the blizzard of email has died down for Thanksgiving, maybe I can put my $.02 in before someone else provides these answers.

>As a general design principle, the kernel should not depend on a
>non-kernel task for kernel correctness.
KeyKOS did. We depended on an external migrator to specify migration optimizations. If this domain failed, migration would not complete and no further checkpoints could be taken. This would be a significant failure. We did think about a non-optimized kernel migrator which would take over if the special migrate key remained unused for too long.
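
The fallback was only an idea, so what follows is a minimal sketch of how the "unused for too long" decision might be made, not a description of anything KeyKOS shipped. The class name, the timeout, and the string labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MigrationWatchdog:
    """Hypothetical watchdog over the special migrate key."""

    timeout: float           # assumed limit on how long the key may sit unused
    last_used: float = 0.0   # time of the most recent use of the migrate key

    def note_use(self, now: float) -> None:
        # Called whenever the external migrator invokes the migrate key.
        self.last_used = now

    def choose_migrator(self, now: float) -> str:
        # Fall back to the non-optimized kernel migrator only after the
        # external migrator has been silent for longer than the timeout.
        if now - self.last_used > self.timeout:
            return "kernel-fallback"
        return "external"
```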

>The KeyKOS user-level process model (and, in general,
>the traditional model) is that servers directly "own" all the resources
>they use, even the resources they use on behalf of their clients. For
>example, if a client sends an SQL query message to a database server, it's
>the database server's responsibility to allocate CPU time to service the
>request, to allocate memory to hold the request and the stack for the
>server to run on while processing it, etc. This traditional model fits
>well with the pure message-passing model, but isn't adequate for good
>resource management in the presence of complex client-server relationships.
>The alternate, less conventional model is that of clients "supplying" the
>necessary resources for the servers to do their jobs: the server runs on
>the client's CPU time, charges the client for server memory allocated on
>its behalf, etc. Of course, combinations of these two models are common,
>for example if the server runs on the client's CPU time while servicing
>requests but still maintains its own private memory pool.
>
>The interesting thing about KeyKOS in this regard is that it is
>schizophrenic. It supports the first model for relationships among
>user-level programs, but the KeyKOS kernel itself, considered as a "server"
>whose clients are all the user-level processes in the system, is designed
>very strongly around the second model.
The KeyKOS model allowed the designer a great deal of flexibility about which of these two models would be used. They could even be combined.

The standard way of passing resources was to pass a space bank and a meter. The space bank provided trusted pages and nodes. (Since there could be many space banks in a system, a program could trust some and not others. Some banks might provide replicated pages while others did not, or pages on insecure devices, etc.)

>The server will probably trust certain
>external memory managers, including some that aren't part of the trusted
>computing base, but not others. The KeyKOS model has no support for this,
>and it'll be needed if any kind of accurate, reliable resource management
>system is going to be provided by Mach or NewSys.
The space bank arrangement described above is how we implemented this requirement.

The standard factory call required a space bank the factory trusted, a meter, and another space bank which was not necessarily trusted. Out of these resources the factory built a domain which ran a program determined by the factory builder. This program could be an instance of a query object.

The same factory could provide additional keys to the instances it created. These keys could provide access to a program which ran on the factory builder's resources and provided the instances with critical, must-not-be-blocked services.
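
A minimal sketch of the factory call just described, in Python purely for illustration. Every class, method, and constant here is hypothetical and is not the KeyKOS interface. In particular, trust in a bank is modelled as simple object identity, and the cost of a domain is an invented number.

```python
class SpaceBank:
    """Hypothetical stand-in for a space bank: hands out pages until empty."""

    def __init__(self, pages):
        self.free = pages

    def allocate(self, n):
        if n > self.free:
            raise MemoryError("space bank exhausted")
        self.free -= n
        return n


class Meter:
    """Hypothetical stand-in for a meter: a budget of CPU units."""

    def __init__(self, units):
        self.units = units


class Domain:
    """The instance a factory builds; it runs on the caller's resources."""

    def __init__(self, program, meter, other_bank, extra_keys):
        self.program = program        # chosen by the factory builder
        self.meter = meter            # supplied by the caller
        self.other_bank = other_bank  # caller's bank, not necessarily trusted
        self.extra_keys = extra_keys  # services run on the builder's resources


class Factory:
    DOMAIN_COST = 4  # assumed number of pages/nodes needed to build a domain

    def __init__(self, program, trusted_banks, extra_keys=()):
        self.program = program
        self.trusted_banks = list(trusted_banks)
        self.extra_keys = tuple(extra_keys)

    def create(self, trusted_bank, meter, other_bank):
        # The storage the instance depends on must come from a trusted bank.
        if not any(trusted_bank is b for b in self.trusted_banks):
            raise PermissionError("factory does not trust this space bank")
        trusted_bank.allocate(self.DOMAIN_COST)
        return Domain(self.program, meter, other_bank, self.extra_keys)
```

The point of the sketch is that the caller pays for the instance (bank and meter), while the factory builder decides what code runs and which extra keys it receives.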

>I think there is. The problem with assignment of fault responsibility
>is serious. One wants to be able to make a clear distinction between
>a fault in the object and a fault in the mapping. Segments make that
>distinction impossible. If memory objects nest, the assignment of the
>proper scope to a fault becomes ambiguous very quickly.
I'm (Bill Frantz) not sure I understand the problem. Perhaps a concrete example would help.

>I conclude with some reluctance that there are really three categories
>of page in the system we're discussing:
Perhaps a fourth kind of page is a page under the control of an external manager (e.g. a network remote page), but which has local disk backing storage. The kernel can put it on the local disk when main storage pressure gets too high, but its manager can always get it back to return it through the network to its "real" home.

>>The storage really is qualitatively different, however -- see below.
>>
>> It seems to me that the "node" abstraction really just represents
>> storage used by a trusted entity but allocated by an untrusted
>> entity: or, in other words, storage depended upon by a trusted
>> server but used on behalf of an untrusted client.
>>
>>In KeyKOS, for a variety of reasons, capabilities cannot be stored in
>>user data. Consider, for example, the difficulty of discovering when
>>the last reference to an object goes away. The issue does not arise
>>in systems like Mach, where ports are not persistent.

>How does KeyKOS do this? Is there a reason why traditional techniques
>like reference counting won't work?
All the systems I know of that use reference counts spend a lot of time checking their disks during crash recovery. We also did not implement garbage collection because it makes performance hard to predict. A KeyKOS which garbage collected pages and nodes is quite feasible, but would have different performance characteristics.

>I never suggested that kernel data be made available to untrusted code.
>The point was that the mechanisms used to protect kernel data from
>access by untrusted code, while still allowing untrusted code to transfer
>and exchange kernel storage like a currency, is something often needed
>by servers as well for their own internal state, and it would be better
>if the same abstractions and implementations could serve both needs
>in the overall system design.
There are several ways to do this in the KeyKOS model. Perhaps the most generally useful technique is the "red segment". You can build a red segment which defines no address space, and is a good substitute for an invocable object capability (start key). The slots of the red node are not available to the holder of the segment key, but they are available to the invoked program (the keeper). You can put a whole bunch of data there. One implementation of the space bank used those slots to keep track of which pages and nodes it had allocated. Note that this use of the red segment was discovered after it had been architected to allow a program to manage a part of an address space. Perhaps a separate object type would help people learn the system, but one mechanism for two radically different uses is kind of elegant.
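
A minimal sketch of the idea, in Python purely for illustration: the holder of the key can only invoke it, while the keeper sees the node's slots on every invocation. All names are hypothetical, and a Python closure is only an analogy for the protection the kernel provides, not real isolation.

```python
class SegmentKey:
    """What a client holds: it can invoke the keeper, nothing more."""

    __slots__ = ("_invoke",)

    def __init__(self, keeper, node_slots):
        # The slots live only inside this closure; the key itself carries
        # no attribute through which the holder can reach them.
        def invoke(request):
            return keeper(node_slots, request)

        self._invoke = invoke

    def invoke(self, request):
        return self._invoke(request)


def bank_keeper(slots, request):
    """Hypothetical space-bank keeper that tracks its allocations in the slots."""
    op, page = request
    allocated = slots.setdefault("allocated", set())
    if op == "alloc":
        allocated.add(page)
        return True
    if op == "free":
        if page in allocated:
            allocated.remove(page)
            return True
        return False  # this bank never allocated that page
    if op == "count":
        return len(allocated)
    raise ValueError("unknown request: %r" % (op,))


def make_bank_key():
    # Each key gets its own private node slots.
    return SegmentKey(bank_keeper, {})
```

For example, after `key = make_bank_key()` and `key.invoke(("alloc", 7))`, a later `key.invoke(("count", None))` reports one allocated page, yet the key offers no `slots` attribute to its holder.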

>Similarly, you might object that the stopped thread may be holding locks in
>the server and preventing other clients' threads from making progress; but
>in fact this is just the classic unbounded priority inversion problem, and
>any traditional priority inheritance scheme will solve the problem.
I assume you are saying that I, a client of the server, should give resources to some other flat-out-broke client so his thread can proceed and release locks so I can obtain them.

>For the kernel to entrust privileged state to a user-mode device
>driver that is part of the TCB is fine. I wasn't arguing against
>that. In the example you give, the kernel can still make progress on
>other tasks while the pageout is in progress.
>
>The question is whether it is "safe" for the kernel to *block* on such
>a process. In my opinion this is unsound.
The KeyKOS kernel did not block on I/O. There were no "kernel processes". Instead, it arranged for the process on whose behalf it was running to sleep on the I/O operation. After the I/O completed, the process would re-issue its request (or not, if someone with a domain key changed its mind), and the operation would be reconsidered from the start. This technique allows all kernel locks to be released and eliminates the need to re-validate memory addresses in the process's local storage (stack, registers, etc.). Charlie and I have been trying to get a paper published on this subject.
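
A minimal sketch of this restart style, in Python purely for illustration. The class, the lock counter, and the page-read operation are all hypothetical and show only the control flow: abandon the operation, release everything, put the process to sleep, and let it start over after the I/O completes.

```python
class NeedsIO(Exception):
    """Raised when an operation cannot finish without waiting for I/O."""

    def __init__(self, page):
        super().__init__(page)
        self.page = page


class Kernel:
    def __init__(self):
        self.resident = set()   # pages currently in main storage
        self.sleeping = {}      # page -> processes waiting for it
        self.locks_held = 0     # stand-in for kernel locks

    def _read_page(self, page):
        # A kernel operation either runs to completion or is abandoned.
        self.locks_held += 1
        try:
            if page not in self.resident:
                raise NeedsIO(page)
            return "data:%d" % page
        finally:
            self.locks_held -= 1  # locks are released on either path

    def invoke(self, proc, page):
        try:
            return self._read_page(page)
        except NeedsIO as need:
            # No kernel state survives; the process just sleeps on the I/O.
            self.sleeping.setdefault(need.page, []).append(proc)
            return None

    def io_complete(self, page):
        # Wake the sleepers; each one re-issues its request from the start.
        self.resident.add(page)
        return self.sleeping.pop(page, [])
```

Because the second attempt starts from scratch, nothing cached from the first attempt (addresses, lock state) needs to be re-validated.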


Bill Frantz                   Periwinkle  --  Computer Consulting
(408)356-8506                 16345 Englewood Ave.
frantz@netcom.com             Los Gatos, CA 95032, USA