[cap-talk] Capability-based Projects - theory vs. practice
Jed Donnelley
JEDonnelley at lbl.gov
Thu Aug 16 07:56:24 EDT 2007
----- Original Message -----
From: "Jonathan S. Shapiro" <shap at eros-os.com>
Date: Wednesday, August 15, 2007 5:13 pm
Subject: Re: [cap-talk] Capability-based Projects - theory vs. practice
To: "General discussions concerning capability systems." <cap-talk at mail.eros-os.org>
> >In the same situation with Keykos the integrity of your application
> >depends only on the way you use the file system. I can corrupt only
> >my own files.
>
> This is fundamentally a complexity argument. An FS built on a
> persistent substrate and absolved of directory (name binding)
> management is MUCH simpler than its Mach counterpart. The storage
> overhead of the KeyKos approach is, on average, greater than 250%
> of average file size. That seemed wasteful to me, and of course
> nothing prevents instantiating one FS per user in EROS, or even per
> file, with no substantially greater overhead than KeyKos, but
> incurred only where warranted.
>
> Charlie writes:
>
> >If I'm not mistaken, in EROS (to the extent it had a file system)
> >there was a single process serving all files. I believe this was done
> >for performance reasons - traversing directory paths can be done
> >without context switches.
>
> This is incorrect. The "nfile" server served multiple files, but
> not directories. No intrinsic ratio of nfile instances to file
> instances was assumed.
>
> The directory traversal overhead issue is serious, however. One
> process per directory is flatly prohibitive on both space and
> performance grounds.
Just to add another data point to this discussion, I'll describe
how the NLTSS file system (http://en.wikipedia.org/wiki/NLTSS)
worked in this regard.
On the NLTSS system there was one file server and one
directory server on each component computer. The directory
server used an underlying file for each directory into which
it stored name/capability pairs. The NLTSS directory structure
was naturally a directed graph (rather than a tree), with no
limits on loops and no internal garbage collection. Lost objects
were instead dealt with at the accounting level, where users
could recover them (e.g. to destroy them if so desired).
Path names in the NLTSS directory structure could of
course cross directory servers, though such crossings
were relatively uncommon. The NLTSS directory structure
supported what might be referred to as a parameter
marshalling and unmarshalling mechanism that permitted
a whole path name to be submitted in what could be
considered a single "fetch" operation. However, a single
directory server would only follow the path of such a
fetch as long as it was the server for all the directory
objects along it. If it ran into a directory object served
by another directory server, it would return that
directory, the point at which its own ability to follow
the path ended.
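To make the partial fetch concrete, here is a rough Python
sketch of a directory server that follows a path only while
it serves the directories involved. This is an illustration,
not NLTSS code: the Capability and DirectoryServer classes,
the server names, and in particular the idea of handing back
the unresolved remainder of the path alongside the foreign
directory are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Hypothetical capability: names an object and the server holding it."""
    server: str
    obj: str
    is_directory: bool = False


class DirectoryServer:
    """Toy directory server: each directory maps names to capabilities."""

    def __init__(self, name):
        self.name = name
        self.dirs = {}  # directory id -> {entry name: Capability}

    def add_dir(self, dir_id):
        self.dirs[dir_id] = {}
        return Capability(self.name, dir_id, is_directory=True)

    def bind(self, dir_id, entry, cap):
        self.dirs[dir_id][entry] = cap

    def fetch(self, dir_id, path):
        """Follow `path` from `dir_id` as far as this server can.

        Returns (capability, remaining_names). The remainder is empty
        when the whole path was resolved here. Returning the remainder
        is an assumption of this sketch.
        """
        current = dir_id
        for i, name in enumerate(path):
            cap = self.dirs[current][name]
            rest = path[i + 1:]
            if not rest:
                return cap, []          # target reached on this server
            if not cap.is_directory:
                raise NotADirectoryError(name)
            if cap.server != self.name:
                return cap, rest        # directory served elsewhere: stop
            current = cap.obj
        # Empty path: the starting directory is itself the target.
        return Capability(self.name, dir_id, is_directory=True), []


# Two servers; "proj" lives on server B but is named from server A.
a, b = DirectoryServer("A"), DirectoryServer("B")
a.add_dir("root")
home = a.add_dir("home")
proj = b.add_dir("proj")
a.bind("root", "home", home)
a.bind("home", "proj", proj)
b.bind("proj", "notes", Capability("fileserver", "file42"))

local_result = a.fetch("root", ["home"])
crossing_result = a.fetch("root", ["home", "proj", "notes"])
```

In this sketch local_result is resolved entirely on server A,
while crossing_result stops at the "proj" directory on server B
with "notes" still unresolved.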
There was an NLTSS library routine that would fetch
an object from a directory path. This library routine
would recursively submit the whole path to each
directory server in sequence. If the whole path was
serviced by a single directory server (by far the most
common case - though sadly because there was
relatively little cross machine sharing with NLTSS),
then such a fetch would be handled entirely within
a single directory server (rather efficiently as directory
contents were heavily cached) and the target
object would be returned directly. The worst
case from a performance viewpoint would be a
path name that crossed directory servers at
every link. I don't recall testing such a path,
but it's pretty easy to guess at its performance,
even if all the path elements were cached in
the memory of the directory servers (also by
far the most common situation).
That performance would be limited by the
performance of the message system, with
on the order of 4-5 microseconds per internal
link on a single computer (say, in the case
that somebody implemented their own second
directory server on a single machine) and
rather more latency for the network case
(NLTSS ran on the Hyperchannel LAN):
J. E. Donnelley and J. W. Yeh, Interaction Between Protocol Levels in a Prioritized CSMA Broadcast Network, Proceedings of the Third Berkeley Workshop on Distributed Data Management and Computer Networks, August 1978, pp. 123-143. Also in Computer Networks 3 (1979) 9-23.
J. E. Donnelley and J. W. Yeh, Simulation Studies of Round Robin Contention in a Prioritized CSMA Broadcast Network, Third University of Minnesota Conference on Local Area Networks, October 1978.
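The guess at worst-case performance above can be written
down as simple arithmetic. The following Python sketch is
only a back-of-the-envelope model, not a measurement: it
assumes one fetch submission per run of consecutive
directories held by the same server, takes 4.5 microseconds
(the midpoint of the 4-5 microsecond figure) as the cost of
each submission on a single machine, and ignores lookup time
and network latency entirely. The function names and server
labels are made up for the example.

```python
from itertools import groupby


def fetch_submissions(owners):
    """Count fetch submissions needed for a path.

    `owners` lists, in path order, the server holding each directory
    along the path. Consecutive directories on the same server are
    handled by one submission (an assumption of this model).
    """
    return sum(1 for _ in groupby(owners))


def estimated_latency_us(owners, per_message_us=4.5):
    """Message-system cost only; 4.5 us is an assumed midpoint of 4-5 us."""
    return fetch_submissions(owners) * per_message_us


common_case = ["A", "A", "A", "A"]   # whole path on one server
worst_case = ["A", "B", "A", "B"]    # path crosses servers at every link

print(fetch_submissions(common_case), estimated_latency_us(common_case))
print(fetch_submissions(worst_case), estimated_latency_us(worst_case))
```

Under these assumptions the common case costs one submission
and the four-link worst case costs four, so the message
system cost grows linearly with the number of crossings.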
We of course spent a great deal of time considering performance issues with the NLTSS file system. Directory overhead was never significant for us, though, as mentioned, cross-machine directory structures were rather uncommon, and as far as I know there was never more than one directory server running on a single machine except when we set that up for testing.
--JED http://www.nersc.gov/~jed/