[cap-talk] kernel object knowledge

Jed Donnelley capability at webstart.com
Tue May 29 13:26:24 EDT 2007


At 08:43 AM 5/29/2007, Karp, Alan H wrote:
> >
> > Isn't there a similar situation that occurs in any
> > capability system?  As soon as you accept that all
> > capabilities need to be able to be emulated by the
> > extension mechanism, then it seems to me natural
> > to actually implement all object service through
> > 'ordinary' user level processes - except perhaps
> > for performance reasons.
> >
>Yes.  That's the point I was trying to make.

We may simply be agreeing in this area, but I still find the
discussion of interest.  Specifically with regard to:

>...
>All operating systems treat certain kinds of things, most commonly the
>file system, as things that they understand.  For example, when you say
>"open w foo" in Unix, the kernel verifies that you have write permission
>on file foo.  That means that the kernel must know the meaning of "open"
>and "w".  It also means that the kernel must be changed if you want to
>introduce a new kind of access right, say "append".   Norm's remark that
>some capability systems have bits related to specific permissions and
>that he couldn't always find meaningful semantics for them means that
>issue is not limited to ACL systems.  Note that Norm was not talking
>about an object capability system, in which there is only the object
>reference, but one in which there are extra control bits in the
>capability.

I believe the way an "open" call was handled in the NLTSS system
is illustrative of the extreme case of little being in the kernel
of the system.

The NLTSS system had only one system call - "communicate":

http://en.wikipedia.org/wiki/NLTSS

This call accepted a linked list of "buffer tables" - each a send or
a receive.  The kernel knew about network addresses, how to begin
and end messages (e.g. across multiple buffers), and how to "listen"
(termed a receive 'any').
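
A minimal sketch of the shape of such a call, in Python.  This is
not the NLTSS interface: the field names, the in-memory "network",
and the matching rule are all assumptions made for illustration, and
multi-buffer messages (begin/end marking) are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BufferTable:
    """One entry in the linked list handed to communicate() (hypothetical layout)."""
    kind: str                              # "send" or "receive"
    local: int                             # this process's network address
    remote: Optional[int] = None           # peer address; None on a receive means 'any'
    data: bytes = b""                      # payload (send) or filled in on delivery
    done: bool = False                     # set by the kernel when the transfer completes
    next: Optional["BufferTable"] = None   # link to the next buffer table


class MessageKernel:
    """Toy message kernel: it knows addresses and delivery, nothing about capabilities."""

    def __init__(self) -> None:
        # Messages sent but not yet received: (source, destination, payload).
        self.in_flight: List[Tuple[int, int, bytes]] = []

    def communicate(self, head: Optional[BufferTable]) -> None:
        """The single system call: walk the linked list of sends and receives."""
        table = head
        while table is not None:
            if table.kind == "send":
                self.in_flight.append((table.local, table.remote, table.data))
                table.done = True
            elif table.kind == "receive":
                self._deliver(table)
            else:
                raise ValueError("unknown buffer table kind: " + table.kind)
            table = table.next

    def _deliver(self, table: BufferTable) -> None:
        """Complete a receive if a matching message is waiting; else leave it pending."""
        for i, (src, dst, data) in enumerate(self.in_flight):
            if dst == table.local and table.remote in (None, src):
                table.remote, table.data, table.done = src, data, True
                del self.in_flight[i]
                return
```

A client would typically chain a send (the request) to a receive
(the reply slot) and pass the head of that chain in one call, while
a server listens with a receive whose remote address is left open.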

That was about it for the "kernel" supported functions.  Note that
NLTSS was a capabilities-as-data system in a rather extreme sense,
in that the kernel of the system knew nothing about capabilities.
I don't recommend this approach, as it does not support confinement,
but I believe little would be needed to add confinement (extend
the network 'address' notion to a capability that would be
interpreted by the communication kernel during communication).

To illustrate how the servers, as user processes, are separated
off, consider how the "open" call worked in NLTSS:

The application process doing an 'open' call accesses the directory
capability (data in memory) that it is going to 'open' against (e.g.
a user's "home" directory).  It builds a message to send to the
directory server specifying the path of the file that it wishes
to 'open'.  What this really amounts to is a fetch of a file
capability that may require doing fetches through multiple
directories.  The 'open' library routine would also build a receive
'buffer table' for the reply from the directory server.

In the NLTSS implementation the 'open' library call was
responsible for accesses that cross multiple directory servers.
Once the request was sent to the initial directory server,
that server would execute multiple fetches internally (as
long as the directories were its own).  If it encountered
a 'directory' capability that was not its own, it returned
to the 'open' library at that point and let the library
call iterate (e.g. through another directory server) until
the 'open' was complete or an error occurred.

In principle the 'open' library call could do all the iterations
through directories, even those through a single server, but we
found that handling iteration through a single server's directories
internally to the server was a significant optimization.
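
The division of labor between the 'open' library and the directory
servers can be sketched as follows.  This is an illustration only:
capabilities are modeled as plain (kind, server, id) tuples, the
servers are called directly rather than by message, and every name
here is hypothetical rather than taken from NLTSS.

```python
class DirectoryServer:
    """Toy directory server: each directory maps entry names to capabilities."""

    def __init__(self, name):
        self.name = name
        self.dirs = {}       # directory id -> {entry name: capability tuple}
        self.requests = 0    # how many requests this server has handled

    def fetch_path(self, dir_id, path):
        """Follow `path` for as long as the directories belong to this server.

        Returns (capability reached, path components still to resolve).
        """
        self.requests += 1
        cap = ("dir", self.name, dir_id)
        while path and cap[0] == "dir" and cap[1] == self.name:
            entries = self.dirs[cap[2]]
            if path[0] not in entries:
                raise FileNotFoundError(path[0])
            cap = entries[path[0]]
            path = path[1:]
        return cap, path


def open_file(servers, dir_cap, path):
    """Library-side 'open': iterate across servers until the path is used up."""
    cap, rest = dir_cap, path.split("/")
    while rest:
        kind, server, ident = cap
        if kind != "dir":
            raise NotADirectoryError("/".join(rest))
        # One request per server visited; the server iterates internally.
        cap, rest = servers[server].fetch_path(ident, rest)
    return cap


# Example: the home directory lives on server A, one subdirectory on server B.
a, b = DirectoryServer("A"), DirectoryServer("B")
a.dirs[1] = {"proj": ("dir", "A", 2)}
a.dirs[2] = {"shared": ("dir", "B", 7)}
b.dirs[7] = {"notes.txt": ("file", "F", 42)}
servers = {"A": a, "B": b}
file_cap = open_file(servers, ("dir", "A", 1), "proj/shared/notes.txt")
```

Here server A resolves two path components in a single request
before handing back the foreign directory capability, which is the
optimization described above; the library then makes one further
request to server B.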

All the directory server did was to map directory capabilities
to files (it managed internal mapping tables) and store
capabilities as data, by name, in those files.  It could manipulate
access permissions such as 'insert' (write), delete, fetch, and
the previously mentioned "free access" permission.

The directory server was an ordinary user process.  Any user
application could perform the same service.  It depended on
the file server for "off line" storage, but that was it.
Each 'directory' was simply a file that stored capabilities
by name.  Most of the work in the directory server went into
caching for optimization.

The file server was a rather similar user process that mapped
file capabilities to blocks on disk.  Most of its work also
went into heavy caching for optimization.  In principle all it did
was to accept read and write requests to files and map them
to areas of the disk, which it could then, for example, read,
replying with the data to the requesting process (e.g. a directory
server).
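
Stripped of caching, that mapping might look like the toy sketch
below.  Everything in it is an assumption for illustration: the
block size, the in-memory "disk", the use of small integers as file
capabilities (which are of course forgeable), and the absence of any
file-length bookkeeping.

```python
BLOCK_SIZE = 16  # deliberately tiny so the example is easy to follow


class FileServer:
    """Toy file server: maps a file capability to a list of disk blocks."""

    def __init__(self, num_blocks):
        self.disk = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.free = list(range(num_blocks))   # unallocated block numbers
        self.files = {}                       # capability -> [block numbers]
        self.next_cap = 1

    def create(self):
        """Create an empty file and return its (stand-in) capability."""
        cap = self.next_cap
        self.next_cap += 1
        self.files[cap] = []
        return cap

    def write(self, cap, offset, data):
        """Write `data` at `offset`, allocating blocks as needed."""
        blocks = self.files[cap]
        end = offset + len(data)
        while len(blocks) * BLOCK_SIZE < end:
            blocks.append(self.free.pop(0))   # IndexError if the disk is full
        for i, byte in enumerate(data):
            pos = offset + i
            block = blocks[pos // BLOCK_SIZE]
            buf = bytearray(self.disk[block])
            buf[pos % BLOCK_SIZE] = byte
            self.disk[block] = bytes(buf)

    def read(self, cap, offset, length):
        """Read up to `length` bytes at `offset` from the allocated blocks."""
        blocks = self.files[cap]
        out = bytearray()
        for pos in range(offset, offset + length):
            if pos // BLOCK_SIZE >= len(blocks):
                break                          # past the last allocated block
            out.append(self.disk[blocks[pos // BLOCK_SIZE]][pos % BLOCK_SIZE])
        return bytes(out)
```

A directory server would hold one such capability per directory and
read or write the stored name-to-capability table through it.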

In the NLTSS implementation of file I/O a read or write request
used three 'buffer tables' - one for the request, one for the
reply, and the last for the data transfer.  As optimized, the
data transfer actually happened directly between the disk
driver and the application process.  We had a mechanism to
lock application memory and to transfer data directly from
disk to application memory.  From the viewpoint of the application
process, however, (e.g. a directory server) all it was doing
was communicating via messages - the request, the reply, and
the data transfer.  The kernel of the system (the "message
system") did know about the "I/O

I go into so much detail in this example to illustrate just
how little 'kernel' involvement one can get away with in implementing
services that in most systems are part of the kernel.

I believe any capability system can do about the same.  As I
noted, since the base extension mechanism (however it works)
can proxy any capability, it can be used by an ordinary
'user' process to support even the lowest level capabilities.

I should note that at one point in the NLTSS development,
for optimization purposes on some systems, we pushed most
of the process and file servers into what amounted to a
single shared memory 'kernel' process.  We did this to avoid
the rather costly context exchange instructions on the
hardware (Cray supercomputers) that we were using.  Even
with this optimization the processes remained logically
separate, though their internal communication was optimized.

>The kernel doesn't need to understand the semantics for either ACLs or
>capabilities.  The kernel could simply be the trusted forwarder.  In an
>ACL system, the kernel could forward to the handler the user's ACL entry
>for the resource and let the handler figure out what to do.  In a
>capability system, the interpretation of the bits in the capability can
>be left to the object it references.  So, for example, bit 3 in the
>capability could be interpreted as RO for a file and something
>completely different for a start key.

When you say the kernel doesn't need to understand the
semantics of capabilities - that was certainly true in
NLTSS.  However, in capabilities-as-descriptors systems
I believe the kernel does need to understand and indeed
manage the format of capabilities - at least to the
extent of interpreting the "invoke" and "reply"
mechanisms and storing the capabilities in c-lists.
Are we on the same page there?
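
To make the distinction concrete, here is a small sketch of a
descriptor-style kernel that manages c-lists and the invoke path but
forwards the capability's bits uninterpreted to the object.  It is a
model of the idea only, not of any particular system: the class
names, the operations, and both meanings given to bit 3 are
hypothetical.

```python
class DescriptorKernel:
    """Holds c-lists and performs invocation; never interprets the bits."""

    def __init__(self):
        self.clists = {}   # process id -> list of (object, bits) descriptors

    def grant(self, process, obj, bits):
        """Place a capability in a process's c-list and return its index."""
        clist = self.clists.setdefault(process, [])
        clist.append((obj, bits))
        return len(clist) - 1

    def invoke(self, process, index, op, arg=None):
        """Look up the descriptor and forward the bits to the object."""
        obj, bits = self.clists[process][index]
        return obj.handle(bits, op, arg)


class FileObject:
    READ_ONLY = 1 << 3     # assumption: this object reads bit 3 as "read only"

    def __init__(self):
        self.data = b""

    def handle(self, bits, op, arg):
        if op == "read":
            return self.data
        if op == "write":
            if bits & self.READ_ONLY:
                raise PermissionError("read-only capability")
            self.data = arg
            return None
        raise ValueError("unknown operation: " + op)


class Counter:
    BIG_STEP = 1 << 3      # assumption: the same bit means something else here

    def __init__(self):
        self.value = 0

    def handle(self, bits, op, arg):
        if op == "bump":
            self.value += 10 if bits & self.BIG_STEP else 1
            return self.value
        raise ValueError("unknown operation: " + op)
```

The process only ever names a c-list index, so the kernel must know
the descriptor format and the invoke mechanism, while the meaning of
each bit stays with the object.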

I included so much detail in the above partly because I
thought it might interest Peter Amstutz who started this
thread.

--Jed  http://www.webstart.com/jed-signature.html 



