[cap-talk] Another "core" principle - virtualizing memory

Jed Donnelley capability at webstart.com
Tue Jan 2 12:15:50 CST 2007


At 08:31 AM 1/2/2007, Jonathan S. Shapiro wrote:
>Jed wrote the following privately, but I want to respond publicly:
>
> > ... you seem to feel that an object-capability
> > operating system can't be built with effective virtual memory
> > support that also has fully virtualizable (wrappable) objects.
> > That is, where all (and I assume there are relatively few) the
> > kernel supported objects can be simulated by the extension
> > mechanism.
>
>In case anybody else believes this, this is NOT what I have said.

Good.  Since that's all I need for the Responsible Delegation
paper I'm willing to leave it there.

>What I
>have said is that pulling off virtualization correctly for memory
>objects is much trickier than the DSM-style technique of playing with
>the live mappings to ensure mutual exclusion.
>
>So back to that:

So as not to waste your time in responding I'll try to fill in some
of these details at a high level.

>On Tue, 2007-01-02 at 00:16 -0800, Jed Donnelley wrote:
> > >You have reduced the problem to a previously unsolved problem, which is
> > >the construction of "files, segments, or whatever". As I stated in my
> > >earlier mail, each of these is an example of an address space. So your
> > >answer amounts to "you build an address space by mapping address
> > >spaces".
> >
> > No.  Just because a data segment has an address space (0...n)
> > doesn't mean that it can't map into a separate process address
> > space.
>
>[Late edit: check the OH WAIT at the bottom of the following list; we
>may have another simple disconnect.]
>
>Before going any further, let me state three claims. If you disagree
>with these then we have a much more fundamental disagreement and we may
>want to set the mapping discussion aside temporarily.

I'm for that.  As I say, the above object-capability OS with wrappable
objects is what I was looking for in any case.  While I've implemented
an operating system that dealt with these memory issues, I'm afraid
there is enough of a disconnect in our terminology (not surprising
after 20+ years) that most of your statements have little meaning
to me.

>Claim 1:  A robust kernel must operate from fixed resources, ignoring
>the fact that it may make startup-time decisions about how to partition
>the available memory into individual resource types (what David Wagner
>and I have referred to elsewhere as type-specific heaps, and David
>Hopwood has described as preallocated vectors of objects).

Yes.

>It follows that if a process performs a durable dynamic allocation
>within the kernel, that allocation must be accounted somewhere, and it
>must be subject to limits (resource quotas). The key issue is to
>determine where the quota checks should be implemented.

At any point in time an object creation or expansion request can fail
due to resource exhaustion.  Sad but true.
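
A minimal sketch of the point in Claim 1 and the reply above, assuming
a pool sized once at startup and a per-client quota. The names
FixedPool, Quota and PoolExhausted are hypothetical and are not taken
from KeyKOS, EROS or NLTSS; the sketch only shows that a creation
request can fail either because the preallocated vector is full or
because the requester's quota is used up.

```python
class PoolExhausted(Exception):
    """Raised when a fixed-size pool or a quota has no room left."""


class Quota:
    """Hypothetical per-client limit on durable allocations."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0


class FixedPool:
    """A preallocated vector of slots for one object type; it never grows."""

    def __init__(self, kind, capacity):
        self.kind = kind
        self.slots = [None] * capacity
        # Free list built once at startup; pop() hands out slot 0 first.
        self.free = list(range(capacity - 1, -1, -1))

    def allocate(self, owner, quota):
        # The quota check comes first, so the charge lands on the requester.
        if quota.used >= quota.limit:
            raise PoolExhausted(f"{owner}: quota exhausted for {self.kind}")
        if not self.free:
            raise PoolExhausted(f"{self.kind} pool exhausted")
        index = self.free.pop()
        self.slots[index] = owner
        quota.used += 1
        return index

    def release(self, index, quota):
        # Return the slot to the pool and refund the quota charge.
        self.slots[index] = None
        self.free.append(index)
        quota.used -= 1
```

Where the quota check lives (in the kernel, or in something like a
space bank outside it) is exactly the question left open in the quoted
text; the sketch takes no position on that.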

>Claim 2: If the system in question uses explicit persistence, it is
>feasible to place this allocation and quota checking function in the
>kernel. If the system uses implicit persistence (as in KeyKOS), then
>*all* dynamically allocated kernel state must be non-definitive, in the
>sense that it is a cache of some definitive state which has its real
>home in some object that can be written to disk.

If I'm understanding the above then I believe that the NLTSS system
would be regarded as using implicit persistence.  A typical system
warm start would pick up the state of all running processes from
disk state that was labeled as consistent.  We also had a mechanism
for recovering some state from system memory.  I believe you would
regard this as "disturbing."

>Claim 3: It is unknown how to construct an explicitly persistent pure
>capability system that is either consistent or secure (in the
>mathematical sense), because the state recorded on the store may not
>represent a consistent cut. A non-consistent cut is one that cannot be
>described as a sequence of correctness-preserving operations applied to
>an initially correct capability graph. That is: the state evolution
>induction is lost, and in consequence the argument that the system
>remains in an authorized state is lost. Because of this, the
>requirements of the information flow analysis that underlies the system's
>overall security foundation are not upheld in an explicitly persistent
>system.

In what sense is being a "pure capability" system relevant to the above?

>Of these, I suspect that claims (1,2) are not controversial, but claim 3
>may be. If you know of another way to preserve consistent cuts, I'm
>*very* interested to learn of it!

Sorry.  I doubt any work we did would contribute to this area.

>I predict that any discussion of claim
>3 will devolve to either (a) a very clever, previously unknown system
>bootstrap approach,

While I consider some of the bootstrap work that we did clever, I
don't think expanding on it here would be helpful.

>or (b) an argument that preserving the induction
>isn't important in practice. The first is welcome (to me). The second
>would be very disturbing.

I don't think you should be disturbed, Jonathan.

>OH WAIT
>
>There is a possibility that I have been overlooking. It is possible to
>imagine a system implementation where the functionality of the KeyKOS
>space bank (i.e. the disk storage allocator and quota manager) is
>performed in the kernel. Such a system could satisfy claim (2) and still
>manage the allocations from within the kernel. I suspect that such a
>system would converge rapidly on a fully monolithic kernel
>implementation.
>
>Aside: this pretty much tosses my claims about reifying mapping
>structures out the window.
>
>Jed: Are you assuming such a system?

Hmmm.  I'm afraid I have some difficulty mapping that term "kernel"
to our NLTSS work.  The closest thing we had to a "kernel" would be
the combination of the device drivers (disk, cpu, and network),
the message system (provided for communication between processes),
and some system initialization that was overwritten.  Much of what
is typically considered "kernel" (e.g. process server and file
server) ran as "user" processes that were trusted by the CPU
driver and by the disk driver respectively.  Another process, the
directory server (c-list server if you like) was really an ordinary
user process that was just widely depended on.

> > >So you seem to propose that there is a relatively high-level kernel
> > >operation "map", which accepts as arguments:
> > >
> > >    a process (implicitly: the invoking process),
> > >    an address relative to the process's address space, and
> > >    an address space to be mapped at that address (a "file, segment,
> > >      or whatever") whose construction happens by unspecified means.
> >
> > No.  I propose that there is an operation on a Process capability,
> > "map", that accepts a data object (one that has read and write
> > operations, and as I suggest lock operations).  The operation
> > specifies where the data in the data object should be mapped
> > into the Process's address space.  Of course a process might
> > have access to its own capability, but it also may not.
>
>This is substantially the operation that I described. The key issues on
>which we still are disagreeing:
>
>1. What is the least atomic unit of read/write: byte or page? [This
>    determines whether simulation of load/store is required for
>    complete virtualization.]

I argue that page suffices.

>2. How is storage for the list of such mappings allocated in such a
>    way that it is later persistable? [the discussion raised above]

I don't understand the term "persistable".  I'm not sure I want to.
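
A minimal sketch of the "map" operation on a Process capability
discussed above, at page granularity. It is an illustration under
stated assumptions, not the NLTSS or KeyKOS implementation: PAGE_SIZE,
DataObject, LoggingWrapper and Process are hypothetical names, the
lock operations are omitted, and loads and stores are simulated as
method calls. The data object is invoked only on a page fault (read)
and on write-back (write), so a wrapper that forwards those two
operations is enough to virtualize the mapped object.

```python
PAGE_SIZE = 4096  # assumed page size


class DataObject:
    """Backing store with page-granularity read and write operations."""

    def __init__(self, n_pages):
        self.n_pages = n_pages
        self.pages = [bytes(PAGE_SIZE) for _ in range(n_pages)]

    def read(self, page_no):
        return self.pages[page_no]

    def write(self, page_no, data):
        assert len(data) == PAGE_SIZE
        self.pages[page_no] = bytes(data)


class LoggingWrapper:
    """Wraps any data object; forwards each invocation and records it."""

    def __init__(self, inner):
        self.inner = inner
        self.n_pages = inner.n_pages
        self.log = []

    def read(self, page_no):
        self.log.append(("read", page_no))
        return self.inner.read(page_no)

    def write(self, page_no, data):
        self.log.append(("write", page_no))
        self.inner.write(page_no, data)


class Process:
    def __init__(self):
        self.mappings = []   # list of (base address, data object)
        self.resident = {}   # virtual page number -> [bytearray, dirty flag]

    def map(self, base, obj):
        """Map obj's pages into this address space starting at base."""
        assert base % PAGE_SIZE == 0
        self.mappings.append((base, obj))

    def _locate(self, addr):
        for base, obj in self.mappings:
            if base <= addr < base + obj.n_pages * PAGE_SIZE:
                return obj, (addr - base) // PAGE_SIZE
        raise MemoryError("unmapped address")

    def _page(self, addr):
        vpn = addr // PAGE_SIZE
        if vpn not in self.resident:
            # Page fault: the only point where a load or store
            # turns into an invocation on the data object.
            obj, page_no = self._locate(addr)
            self.resident[vpn] = [bytearray(obj.read(page_no)), False]
        return self.resident[vpn]

    def load(self, addr):
        return self._page(addr)[0][addr % PAGE_SIZE]

    def store(self, addr, value):
        entry = self._page(addr)
        entry[0][addr % PAGE_SIZE] = value
        entry[1] = True  # mark the page dirty

    def flush(self):
        """Write dirty pages back through the data object's write operation."""
        for vpn, entry in self.resident.items():
            if entry[1]:
                obj, page_no = self._locate(vpn * PAGE_SIZE)
                obj.write(page_no, bytes(entry[0]))
                entry[1] = False
```

For example, mapping LoggingWrapper(DataObject(2)) at 0x10000 and
storing to two bytes in the first page results in a single read
invocation on the wrapper, and one write invocation when flush() is
called. The individual stores never reach the wrapper.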

> > Can you define your "accountability requirement"?  In my solution
> > all storage devolves to rotating storage that is accounted for
> > as is any object.  Real memory use for us was tied into "CPU"
> > charges and depended on real memory residency * time.
>
>This seems reasonable, but memory must be persistable if it contains
>definitive state. Address space mappings are clearly definitive.
>Therefore, I consider them to need to live on rotating (or at least,
>durable :-) storage. It is the allocation of this persistent storage at
>mapping time that ultimately concerns me.

As I mentioned, in the NLTSS system process memory was a single
segment, so we really didn't have to deal with any significant
address mapping issues.  The RATS system had some such issues,
but Charlie Landau would be the expert on that system, not me.

> > >...
> > >Can you explain how your DSM-style solution accounts for the behavior of
> > >load and store instructions, which *must* be reifiable as capability
> > >invocations if the system is to remain a pure object-capability system?
> >
> > Load and store instructions act according to the processor architecture
> > on real memory - or trap.  I fail to understand why you seem to
> > argue that each individual load and store instruction must
> > act as a capability invocation.  From my perspective it's perfectly
> > adequate to have the invocations on the storage (rotating storage)
> > objects whose data is mapped into memory appear as capability
> > invocations when rotating storage is read into or written from
> > memory.  This happens at a larger granularity (generally at
> > page faults).  This approach mirrors what's actually going on.
> > What's the problem?
>
>I think that we differ on our assumptions about the atomic unit of I/O.
>
>From a formal system safety perspective, there are no load and store
>instructions and there is no hardware architecture. There are only
>capability invocations. In order to argue that a system is safe, the
>operations of the hardware need to be mapped onto the operations of the
>capability-based system model.

Sure.

>I agree (and I have agreed several times) that for the vast majority of
>real virtualization applications, it is sufficiently faithful to
>virtualize at page granularity and ignore the detail that many
>operations at the load/store granularity are being collapsed at this
>level of virtualization.

That seems like a good idea to me.

>What I am arguing here is that **in the limit** (which is something we
>may never implement, but need to test conceptually) we really DO want to
>be able to put an entry capability into a page table slot and have the
>actual load and store instructions get handled one at a time through
>messaging.

I simply don't see what having the load and store instructions handled one
at a time through messaging, even conceptually, adds to system functionality.

>I can even give a real use case: watchpoints. On real systems, hardware
>watchpoints are very limited and a software mechanism for watchpoints is
>essential.

I believe I understand what you're describing above.  Such watchpoints
can be done without requiring even conceptual capability invocations
at the load/store level.
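
One way such watchpoints might be done at page granularity is sketched
below. This is an assumption for illustration, not a description of
NLTSS or of the /proc implementations mentioned later: the page that
holds a watched address is marked protected, every store to that page
traps, and the trap handler compares the faulting address with the
watch list before completing the store. WatchedMemory and its fields
are hypothetical names, and the hardware trap is simulated by an
ordinary method call.

```python
PAGE_SIZE = 4096  # assumed page size


class WatchedMemory:
    def __init__(self, size):
        self.data = bytearray(size)
        self.watched = set()          # exact addresses under watch
        self.protected_pages = set()  # pages whose stores trap
        self.hits = []                # (address, old value, new value)
        self.traps = 0                # number of simulated protection faults

    def watch(self, addr):
        self.watched.add(addr)
        self.protected_pages.add(addr // PAGE_SIZE)

    def store(self, addr, value):
        if addr // PAGE_SIZE in self.protected_pages:
            self._trap(addr, value)   # slow path: simulated protection fault
        else:
            self.data[addr] = value   # fast path: plain store, no trap

    def _trap(self, addr, value):
        self.traps += 1
        if addr in self.watched:
            self.hits.append((addr, self.data[addr], value))
        # Complete the faulting store. A real handler would lift the
        # protection, single-step the instruction, and re-protect.
        self.data[addr] = value
```

Stores to unwatched pages cost nothing extra. Stores that land on the
same page as a watched address take the trap but are reported only
when the address matches.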

>I have stated that *implementing* virtualization at the load/store
>granularity is irrationally hard on a few badly conceived hardware
>architectures. It is not impossible. This is a flaw in the respective
>architectures, not in the notions of object-capability systems.

I can imagine hardware that would lend itself to a conceptual
capability invocation at the load/store level.  I don't have any
problem with pursuing that approach, as long as it doesn't stand
in the way of wrappability of all objects.  I regard that wrapping
requirement as more important than any byte level conceptual
access to storage by capability invocation.  I certainly believe
that it's important to provide access to all the relevant hardware
features available on a system, but I don't believe that need
necessarily be by capability invocations for load/store operations.

>I'll note (with a broad smile) that you're in pretty deep kim-chee
>arguing on this one, because we actually *implemented* this level of
>virtualization in IRIX (me) and Solaris (Roger Faulkner) in later
>refinements of /proc. We actually saw cases where people would
>remote-mount the /proc file system and debug from a second machine. We
>turned a little green when we learned that, but it was a surprisingly
>useful thing to do. If nothing else, it's a pretty good anecdotal
>confirmation that we successfully virtualized the memory interface.

Delightful.  As long as the wrapping for all objects is supported,
then I'm happy.  Actually as long as we agree that it's possible to
implement effective (e.g. efficient, functional) object-capability
systems with all objects being wrappable then I'm content.

--Jed  http://www.webstart.com/jed-signature.html 



