Re: CAOS - A CApable OS? [was: Process Authentication Groups (PAGs)]
Jonathan S. Shapiro (jsshapiro@earthlink.net)
Sun, 22 Mar 1998 17:42:17 -0500
Oh, boy. This discussion of capabilities has clearly gone wide of the
mark, starting with bad definitions of terms and proceeding downhill.
There are a lot of good ideas within the discussion, and some complex
suggestions for how to do some things that have relatively simple
solutions once the definitions get cleared up.
Three apologies in advance to all involved:
For length: I'm going to try to be precise, and I'm sure this will
lead to further rounds.
For assumptions: I'm clearly coming in to the middle of a larger
discussion.
For confusions: At some points I'm unclear on who the speaker was,
so please bear with me if I misattributed or misunderstood.
Also, I request in the strongest possible terms that you do *not*
confuse the term "capability" by misapplying it in the way CAOS
appears to be doing. I'll talk about why below.
Finally, if somebody will tell me the newsgroup this started in, I'll
post this to the newsgroup as well.
> > Jim Dennis wrote:
> >>> .. current uid dependant authentication mechanisms (and similar)
> >>> could be replaced in whole by a general 'capability access control
> >>> mechanism'...
This could mean two things:
- the UID approach could be replaced by a capability approach,
yielding a more secure system
- the UID approach can be implemented on a capability substrate
Both statements are correct. The second is useful for reasons of
backward compatibility. KeyKOS, for example, has implemented a fairly
complete UNIX binary compatibility system on top of a pure capability
substrate.
DEFINITIONS:
A 'capability' is an unforgeable token that designates (names) an
object and conveys the right to perform a set of actions on the
designated object. Possession of a capability by a process is a
necessary *and sufficient* condition to perform the authorized actions
on the designated object.
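
A minimal sketch of this definition, in Python purely for illustration. The names `Capability`, `Page`, and `invoke` are assumptions made for the example and do not come from KeyKOS, EROS, or any other real system. Python cannot actually make the token unforgeable (reflection can reach the private fields), so this shows only the shape of the idea: one token names both the object and the permitted actions, and possession is the only check.

```python
class Capability:
    """Illustrative capability: names one object and a fixed set of actions."""

    __slots__ = ("_target", "_rights")

    def __init__(self, target, rights):
        self._target = target
        self._rights = frozenset(rights)

    def invoke(self, action, *args):
        # Possession is the only check: no user ID or ACL is consulted.
        if action not in self._rights:
            raise PermissionError(f"capability does not convey {action!r}")
        return getattr(self._target, action)(*args)


class Page:
    """Hypothetical object that capabilities designate."""

    def __init__(self):
        self.data = b""

    def read(self):
        return self.data

    def write(self, data):
        self.data = data


page = Page()
read_write = Capability(page, {"read", "write"})
read_only = Capability(page, {"read"})  # same object, narrower authority

read_write.invoke("write", b"hello")
print(read_only.invoke("read"))
```

Note that there is no free-standing "write" right here: each token pairs an action set with one particular object.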
The question of how capabilities are obtained is a separate issue that
was getting conflated in the discussion. I will address it in a
moment.
Please note that the POSIX "capability" model is completely confused.
They are using the term 'capability' incorrectly to describe the right
to perform an action on *any* object. While restricting the operation
set accessible to a process is a substantial improvement on the
current UNIX model, it isn't a capability system and shouldn't be
called such. Such restrictions are much closer in spirit to the VMS
authorizations bitmap than they are to capabilities.
COMMENTS:
Taking into account correct definitions, the phrase:
"a process ... must be allowed to use the 'open file'
capability ... on the given object"
is nonsensical. "open file" is an action. In order to be a
capability, both the action and the object must be identified.
Obtaining capabilities:
Capabilities can be obtained in two ways:
- A process P1 can transmit a capability C that it holds to a process
P2, assuming that P1 holds a capability to P2.
- [deprecated] A process can make a request of some system agent to
grant it certain capabilities on the basis of the program being
run, a challenge-response protocol, or some such.
The correct way to think about this is that the process had an
intrinsic capability to some other process that holds desired
capabilities and acts as an authentication agent.
Such an authentication agent is necessary only when the system as a
whole is not persistent -- the problem is how to reacquire
capabilities when the system restarts. Authentication processes of
this kind present some serious security design issues and render
security analysis quite difficult.
> >>> For
> >>> example, for the file to be opened, the process must request the 'open
> >>> file' capability from the system, and the system can then evaluate if
> >>> the process meets the cryteria to be allowed to use the
> >>> capability...
It should now be clear that this statement is confused, on several
grounds:
- There can be no such thing as an 'open file capability' -- it
fails the definition.
- A process does not "request" a capability from the system.
There are two problems in this notion:
- In a capability system, a process has no intrinsic
authority to request anything from anyone. It can only
invoke capabilities. Unless the process has some
capability to some agent holding useful authorities, it
has no way to ask for more capabilities.
- It begs the question of how the system might decide if
the process is authorized to request that capability.
A process either possesses a capability or it does not. If a
process P1 holds a capability to a process P2, it may ask P2
for capabilities held by P2. P2 may in fact turn out to be
implemented by the operating system (P1 should not, in
principle, be able to tell this).
Simply possessing the capability to P2 is sufficient proof of
authority to request things of P2. P2 may then choose to
implement additional requirements, ranging from
challenge/response to ACLs to any number of other policies.
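
A rough sketch of the P1/P2 arrangement described above, in Python for illustration only. The `Agent` class, its `request` method, and the shared-secret check are assumptions made for the example; the secret check is just a stand-in for whatever policy P2 chooses to apply. The point is that nothing "asks the system": P1 can make the request only because it already holds a reference to the agent.

```python
import hmac


class Agent:
    """P2: holds capabilities and applies its own policy before handing them out."""

    def __init__(self, held, secrets):
        self._held = dict(held)        # name -> capability held by the agent
        self._secrets = dict(secrets)  # name -> secret (bytes) required to obtain it

    def request(self, name, secret):
        expected = self._secrets.get(name)
        # Policy step (hypothetical): any other check could be substituted here.
        if expected is None or not hmac.compare_digest(expected, secret):
            return None
        return self._held.get(name)


printer_log = []
agent = Agent({"printer": printer_log.append}, {"printer": b"open sesame"})

# P1 holds `agent`; that reference alone is its authority to ask.
granted = agent.request("printer", b"open sesame")
denied = agent.request("printer", b"wrong")
```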
A better solution is to simply make the whole system persistent, at
which point the need for an authority agent essentially goes away.
The best way to keep hold of the capabilities that a process needs is
to grant them from the beginning and ensure that they are never lost.
> >> No! That's the whole problem with ACL/UID based
> >> models -- the system must be omniscient and the
> >> process is "requesting" access on its own behalf.
I hope it's clear that the approach to an authentication agent that I
outlined above, however undesirable, does not require omniscience.
> >> In a capabilities model there is some *other* process
> >> or mechanism that grants access to the system.
While a capability system can certainly include such an authentication
agent, there is nothing in the model that requires one. The most
successful capability system built to date -- KeyKOS -- did not have
such an agent.
> >> This
> >> can be a Kereberos "ticket granting server" or a
> >> "reference monitor daemon" or it can be some "meta
> >> information" in a "resource fork" (filesystem based).
Several systems have been built using these mechanisms. All have
proven to incorporate significantly compromised security. When the
capabilities get put into a file system (for this purpose, the
Kerberos server may be thought of as a remote file system), the
question must then be asked:
"So how did you get the authority to talk to the file system?"
The answer invariably boils down to "by fiat." Universal access to a
shared file system means that all programs have channels to all other
programs and are therefore unsecurable [the file system is itself a
channel]. This is why universal persistence is a better answer. It
also turns out to be a big performance win.
> >> It might even be the parent process.
This is closer. Here are some rules from the capability system
perspective:
(1) The creating process can grant at most those capabilities that
it possesses, and should be able to do so selectively.
(2) Once created, the new process should have no further access to
its parent unless the parent granted it the authority (i.e. a
capability) to do so.
Rule (2) implies that access to the parent is capability controlled,
and parenthood therefore should not be used intrinsically as a way of
determining capability access.
One tricky part about (1): If a process A calls a fabrication agent F
to build a new process B, F is free to hand to B capabilities that are
held by F but not by A.
This sort of collusion is entirely permissible, and actually fairly
important. It provides the means for me (a programmer) to build a
program that can access, say, the password database. I can then give
you access to the password manipulator fabrication agent. Assume that
both of us trust the fabricator agent by virtue of the fabricator
being a system-provided utility, as in the KeyKOS factory or the EROS
constructor.
The EROS web site, by the way, can be found at:
http://www.cis.upenn.edu/~eros
There is a *lot* of documentation there.
In any case, the fabricator gets the capability to the password
database from the developer. Whenever it fabricates a new password
program instance, it copies this capability into the password program.
The client of the password program has access to the password database
only by way of the password program, which mediates things.
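
The fabricator pattern can be sketched roughly as follows. This is Python for illustration only and is not the KeyKOS factory or EROS constructor interface; `PasswordDatabase`, `PasswordProgram`, `Fabricator`, and `fabricate` are hypothetical names, and the unsalted SHA-256 digest is a placeholder rather than a recommended password scheme. Python also does not truly hide the `_database` field, which a real capability system would.

```python
import hashlib
import hmac


def _digest(password):
    # Placeholder hashing for the example only.
    return hashlib.sha256(password.encode()).hexdigest()


class PasswordDatabase:
    """Sensitive object: held by the developer and the fabricator only."""

    def __init__(self):
        self._digests = {}

    def store(self, user, digest):
        self._digests[user] = digest

    def lookup(self, user):
        return self._digests.get(user)


class PasswordProgram:
    """Mediating program: exposes one narrow operation on the database."""

    def __init__(self, database):
        self._database = database

    def change_password(self, user, old, new):
        current = self._database.lookup(user)
        if current is not None and not hmac.compare_digest(current, _digest(old)):
            return False
        self._database.store(user, _digest(new))
        return True


class Fabricator:
    """Trusted builder: copies the database capability into each new instance."""

    def __init__(self, database):
        self._database = database

    def fabricate(self):
        return PasswordProgram(self._database)


# Developer's step: hand the database capability to the fabricator.
fabricator = Fabricator(PasswordDatabase())

# Client's step: holds only the fabricator, and then only the program it builds.
program = fabricator.fabricate()
first_set = program.change_password("alice", "", "s3cret")
rejected = program.change_password("alice", "wrong", "other")
```

The client never receives the database itself; every access goes through `change_password`.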
> >> In order to prevent hostile or subverted code from
> >> requesting access to resources beyond the intentions
> >> of the administrator...
In many situations where we are concerned about security, the
administrator is not cleared to know the authorities held by certain
programs. Appeals to the administrator therefore break down with
dismaying speed.
Consider, for example, a "public access" system. I've contracted for
a certain amount of space and CPU. The administrator should not be
authorized to learn what I do with that. The recent lawsuit against
the Australian service provider could not have succeeded if it were
demonstrably impossible for the service provider to censor their
users.
> > In the model I had in mind, the parent is the one that allows or denies
> > its child a specific capability (the capabilities can be thought of as
> > individual functions)....
With the addition of the fabrication example above, this works fine,
and is essentially how KeyKOS and EROS work.
> > As far as a child is concerned, it cannot count on being able
> > to do _anything_ at all (not even execute a single instruction of its
> > code) and it doesn't inherit _anything_ at all from the parent.
Doing good so far...
> > If it wants to do so, it must explicitly request it and the parent must
> > explicitly grant it.
In a capability system, the child has no authority to ask any such
thing of the parent unless the parent granted the child a capability
to the parent. Assume instead that the parent receives an initial
capability to the child, and can then elect to send it capabilities
(or not). This is simpler than the mechanism you propose, and equally
good.
Note also that once created, the parent/child relationship is NOT the
hierarchical relationship assumed by UNIX. The relationships are
defined entirely by who holds what capabilities, and are therefore a
potentially arbitrary graph.
> > the parent ... may at any time for any reason terminate access to
> > the capability ...
In a capability system it cannot. The child either holds a capability
or it doesn't. There is nothing in the capability that records where
the capability came from. Once transferred, the sender has no control
over how the receiver uses the capability.
Observation: while you could build a modified capability system that
used hierarchically constrained access rights, you would create two
problems in doing so:
- Undesired communication channels inherent in the access control
hierarchy itself.
- Variable-length capabilities.
From an implementation perspective, capabilities *really* want to be
fixed size. Imagine programming a system with variable length
pointers...
> This sound more like the "virtual subkernel" concept that
> I've discussed with others several times.
I agree that this is closer to what the previous author was
describing.
>
> The problem with doing it at a resolution finer than
> the system call level is that the performance starts to
> suffer unacceptably.
It depends entirely on the speed of the invocation mechanism. What
you are really asking is: "How small can the reasonable granularity of
a protected invocation be?" The faster the boundary-crossing logic
is, the finer the operation that a protected invocation can protect
while retaining acceptable performance. Liedtke (L3, L4) has a nice
slide of this. He says:
Suppose you are willing to devote 2% (or X%) of your system
resources to protection boundary crossings. Here is a graph
showing the number of crossings you can do as a function of
crossing costs.
The graph shows lines for several choices of X, several crossing
speeds, and several numbers of crossings.
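
The arithmetic behind such a graph is simple enough to sketch. The clock rate and per-crossing cycle counts below are made-up inputs chosen for illustration; they are not figures from Liedtke's slide.

```python
def crossings_per_second(budget_fraction, clock_hz, cycles_per_crossing):
    """Crossings affordable per second when `budget_fraction` of all
    cycles may be spent on protection boundary crossings."""
    return budget_fraction * clock_hz / cycles_per_crossing


# Assumed inputs: a 2% budget on a 200 MHz processor.
for cycles in (100, 1000, 10000):
    rate = crossings_per_second(0.02, 200e6, cycles)
    print(f"{cycles:>6} cycles/crossing -> {rate:,.0f} crossings/s")
```

Each tenfold reduction in crossing cost buys a tenfold increase in the number of protected invocations that fit within the same budget.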
> Another approach is to have the processes all running
> in "virtual machines" (a la Java) -- and allowing the
> "parent" (or some other specified "nanny" process) arbitrate
> each access (of each type) to each resource.
>
> The principle problem with this approach is that existing
> software would have to be ported to the VM.
Painting the porting problem as black and white is too strong. For
the Java VM you need to port because it provides a different processor
architecture. For KeyKOS and EROS, many UNIX programs will run
unmodified within a UNIX emulator. A few important programs will be
revised to know more about the underlying security model. This is
because the KeyKOS/EROS "virtual" machine incorporates the user-mode
instruction set of the underlying processor architecture.
> I don't see much opportunity in these techniques to
> substantially improve security at the OS level. You
> still have the same problems of "subversion"....
I'm a bit unclear on this. The only "subversion" problem I see in
KeyKOS, EROS, Java, or E is that the security kernel can be modified
by an unscrupulous user who has sufficient authority on the machine in
question. The operative words, however, were "sufficient authority on
the machine in question." If the opponent is in a position to reload
the operating system you're pretty thoroughly screwed no matter what
system you are running.
The OS (i.e. supervisor) level needs to be trusted. The question is:
"What new kinds of useful policies and controls can now be implemented
by application level code?" In a capability system, a surprising
number of policies are entirely exportable from the kernel.
> > As you may have already noticed, I use a very broad definition of a
> > 'capability' and it probably differs from what it means on other
> > 'capability oriented' operating systems. The capability in CAOS may be
> > thought of as any function...
Actually, it's worse than that. Your function-oriented model isn't
really secure. The problem is that functions are far too powerful.
In order to say anything with confidence about what authority is
granted by providing access to a particular function, you need to be
able to say quite a lot about the environment in which that function
runs.
True capabilities can be thought of as functions that have 'closed
over' (in the Scheme sense) their first argument (i.e. the object).
What isn't often stated explicitly is that the choice of actions for
the primitive capabilities (the functions) must be tightly constrained
to result in a securable system. We did a paper that, among other
things, formally described some of the issues in implementing a
particular security policy on a capability system:
http://www.cis.upenn.edu/~shap/EROS/popl98.300dpi.ps
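
The "closed over the object" view can be illustrated with an ordinary closure. This is a Python sketch under stated assumptions: `make_capability` is a hypothetical helper, and a Python closure is not actually unforgeable or tamper-proof. It shows only that the holder receives a function with the object already bound in and a constrained action set, never the object itself.

```python
def make_capability(obj, allowed):
    """Return a function closed over `obj` that permits only `allowed` actions."""
    allowed = frozenset(allowed)

    def invoke(action, *args):
        if action not in allowed:
            raise PermissionError(f"action {action!r} not conveyed")
        return getattr(obj, action)(*args)

    return invoke


log = []
append_only = make_capability(log, {"append"})
append_only("append", "entry")
print(log)
```

Calling `append_only("clear")` raises `PermissionError`, because the closure was built with a deliberately narrow set of actions.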
> We shouldn't try to refer to this sort of thing (active
> process monitoring by "parent" or other processes) as
> ``capabilities'' since that will serve to confuse some
> and irritate others.
Misusing the term will have several undesirable consequences:
- It will lead others to misunderstand what you have done.
- It will lead *you* to misread the existing literature on
capabilities and therefore misunderstand it.
- It will discredit something that works.
By giving something fundamentally impossible to secure (for
reasons described in that paper) the name "capability", you will
lead a lot of ignorant people to ignore the *real* capabilities,
which *do* work. To say that it will "irritate" the people who
work on capability systems is a bit of an understatement.
I really like some of the ideas you are considering. If they are
combined with a true capability substrate I think they can be made
secure.
A personal plea:
I have spent the past 7 years building a secure capability system.
Your function-transfer security approach is completely different.
For this reason it should not be called 'capabilities'. Any
OS-oriented dictionary will make it obvious that it isn't
capabilities. Surely it's better all around to choose a new term
for a different idea?
I ask that you not jeopardize my work and the work of all of the other
people in the capabilities field by misusing the name.
> As I understand it the distinction between a ``capability''
> and an ACE (access control entry) is that a capability is
> "specific and *sufficient*" for each form of access
> (read, write, execute, append, stat, etc) to each
> resource (file, TCP port, "privileged" system call, socket,
> memory block, etc) there is a single ``capability''.
Mostly correct. Actually, there can be multiple capabilities for the
same object. Two capabilities are the same if they designate the same
object and authorize the same operations. Two capabilities can also
authorize distinct operations (e.g. read-only vs. read-write
capabilities to the same page of memory).
Actually, this means that holding the right collection of capabilities
subsumes the function-oriented approach that CAOS is considering. The
"authorized actions" are equivalent to CAOS functions. A process then
needs to have access to capabilities for all of the objects it is
allowed to do those operations on.
> *Any* process with "possession" of that ``capability'' can
> gain that form of access to that resource -- there are no
> other "checks" to be performed. That is the simplicity
> of them.
Correct. Also the source of their high performance. Under the
covers, for example, the UNIX runtime core is in some places a
capability system. A "file descriptor" is actually a capability with
some gunk welded onto the side. The reason that access lists are not
consulted for each read and write is that they are too inefficient.
> Here's also where you can complicate issues a bit.
> If you have capabilities *on* other capabilities you
> can require one capability to "execute" another. This
> allows you to have "revocable" capabilities.
It does, but it's completely unnecessary.
There are two kinds of revocation to consider:
(1) Revoke ALL access to object X (your mechanism won't help).
(2) Revoke a particular capability to object X.
In KeyKOS/EROS, the first operation is called 'rescind.' It requires
fairly primitive level support.
The latter is best accomplished by means of a "forwarding" object
(best because it's simple and because the forwarding object has other
uses while the secondary capability does not). You can then use type
(1) on the indirection object to accomplish type (2). Please note
that type (2) is a rare case, and its utility drops significantly in
a capability system because capabilities can be transferred [which
means you don't know who you are really revoking].
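
A forwarding object of this kind might look roughly like the following Python sketch. `make_revocable`, `forward`, and `rescind` are names invented for the example; this is not the KeyKOS/EROS rescind primitive, only an illustration of revoking one grant by cutting off the indirection object while the underlying object stays intact.

```python
def make_revocable(target):
    """Return (forward, rescind): hand `forward` to the recipient, keep `rescind`."""
    cell = {"target": target}

    def forward(action, *args):
        if cell["target"] is None:
            raise PermissionError("forwarding object has been rescinded")
        return getattr(cell["target"], action)(*args)

    def rescind():
        # Severs everyone who reached the target through this forwarder,
        # including anyone the original recipient passed it on to.
        cell["target"] = None

    return forward, rescind


notes = []
forward, rescind = make_revocable(notes)
forward("append", "first")
rescind()
```

After `rescind()`, any further call through `forward` fails, while other capabilities to `notes` are unaffected.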
Note that requiring a second capability adds no marginal security.
Once a party holds both capabilities they can transmit them to others
as a pair. The security of capabilities lies in their unforgeability.
> Let's make up an example:
>
> I want to have something like 'finger' and
> give it the ``capability'' to read or execute
> a file (analogous to my .plan file).
Presumably you meant "... and ONLY my .plan file".
See my description of password file access above as an example of how
to accomplish this with no additional capability types.
> (thus I'm granted all "public" capabilities
> merely be logging in). It might also be accomplished
> by "binding" the "append" capability to a small program
> (like 'chfn') and "publishing" the "execute"
> capability to that.
The existence of a directory of public capabilities is quite
dangerous, and must be handled with care. Suppose I publish an object
that is the write end of a pipe. I now give you a trojan horse
program that obtains this capability from the public directory.
Unknown to you, it copies your data to me.
A simpler solution to the ".plan" problem is as follows:
Imagine each capability has an "ID" field. This is simply a number
that is passed to the recipient process whenever a capability to that
process is invoked.
There are two capabilities to the finger directory: read and replace.
The read capability allows the operation
fetch(name) -> capability.
The replace capability allows the operation
replace(name, old capability, new capability)
All users hold a copy of the replace capability and a capability to
their current .plan file. The finger daemon only holds the read
capability.
If used carefully, this mechanism provides all of the access control
guarantees that yours does.
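
One way to picture the finger directory is the Python sketch below. It assumes that `replace` succeeds only when the caller presents the capability currently stored under that name, which is one reading of the `replace(name, old capability, new capability)` signature above; `make_finger_directory` and `PlanFile` are hypothetical names, and the "ID" field is not modeled.

```python
def make_finger_directory(initial):
    """Return (fetch, replace): two separate capabilities to one directory."""
    entries = dict(initial)

    def fetch(name):
        # Read capability: all the finger daemon ever holds.
        return entries[name]

    def replace(name, old, new):
        # Replace capability: useless without the current entry for `name`.
        if name not in entries or entries[name] is not old:
            return False
        entries[name] = new
        return True

    return fetch, replace


class PlanFile:
    """Stand-in for a capability to a user's .plan file."""

    def __init__(self, text):
        self.text = text


alice_plan = PlanFile("On vacation")
fetch, replace = make_finger_directory({"alice": alice_plan})

new_plan = PlanFile("Back at work")
forged_attempt = replace("alice", PlanFile("not the real one"), PlanFile("spam"))
owner_attempt = replace("alice", alice_plan, new_plan)
print(forged_attempt, owner_attempt, fetch("alice").text)
```

Everyone can hold `replace`, yet only the holder of the current `.plan` capability can change that entry.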
As an aside, however, I'd suggest that the .plan file really wants to
be considered non-optional. What the user requires is the authority
to modify it (which implies the ability to empty it).
> Any process that I
> entrust with a given capability can use it, or give it
> to agents *other than* by intended recipient. However it
> is possible to provide mechanisms to prevent that.
No, it is not. If you give a capability C to any Turing-complete
collaborator process P1, the collaborator P1 can collude with a third
party P2 to invoke the capability on behalf of P2.
The 'do not copy' bit, or the capability splitting mechanism you
propose, simply do not add any security.
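
The collusion argument can be made concrete with a few lines of Python. `make_reader` and `Collaborator` are names invented for this illustration. P1 never copies or transmits the capability; it simply invokes it whenever P2 asks, so a 'do not copy' restriction is never triggered.

```python
def make_reader(text):
    """Build a capability C that conveys read access to one piece of data."""
    def read():
        return text
    return read


class Collaborator:
    """P1: holds C and was told not to pass it on."""

    def __init__(self, capability):
        self._capability = capability

    def relay(self):
        # C itself never leaves P1; P1 just uses it on the caller's behalf.
        return self._capability()


plan_reader = make_reader("confidential plan")
p1 = Collaborator(plan_reader)

# P2 holds only a capability to P1, yet gets everything C conveys.
leaked = p1.relay()
print(leaked)
```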
> That is capabilities as I think I understand them.
> I still don't know quite how you'd achieve some
> forms of control (such as the 'chinese wall'
> or variations of the "Clark-Wilson triples").
The fabricator provides something akin to a Chinese Wall. The idea
can be extended. If someone will describe "Clark-Wilson Triples" to
me I will describe how to provide them in the pure model.
> EROS and KeyKOS both require a form of process state
> "persistence." This apparently obviates the need
> for "devine" intervention ('root') to solve the
> "chicken and egg" problem posed by "shutdown" and
> "rebooting." I have no first hand experience of
> either of these systems so here my image is *really
> fuzzy*.
Eliminates the need for *divine* intervention too. :-)
I'll be happy to answer questions about either system. I'm actually
close to releasing EROS -- just getting the net stack and the web
server running.
> After e-mail with Jonathan and conversations with
> Hugh I'm convinced that "persistence of process state"
> is required in a "pure capabilities" model.
Not really. The problem is that the reconstruction of the security
model on restart requires special case handling. It is probably
possible to build a secure system without per-process persistence. We
haven't done it, because the prospect of doing the security analysis
involved was too horrendous to contemplate.
We then realized that persistence gives better performance than
conventional file systems. At that point the incentive to build the
hard system pretty much evaporated. Between the analysis problem and
the improved performance, persistence seemed the obvious win.
> If I can "shut the machine down" and "bring it
> up single-user" (create a discontinuity in the
> state of the processes) than I can go in a
> 'steal' (or modify) the state and I'll be 'root'
If you have physical access to the machine, there's all sorts of shit
you can do regardless of model. Effectively, you have a
meta-capability or a meta-ACL.
> (Despite DEC's and MS' protestations to the
> contrary VMS and NT have an omnipotent account.
> It is the "backup" -- actually the "restore" operator!)
>
> I don't have time to think about how a "backup/restore"
> subsystem would work under a pure capabilities system.
Some facility is needed for archival storage. Archival requires that
capabilities and ACLs be able to be serialized and deserialized. Such
software must be able to reconstruct the result, and is therefore
effectively omnipotent and universally trusted. The real issue is not
whether the *program* is trusted, but whether the archival media is
proof from forgery.
I do not know how KeyKOS handled this problem.
The EROS backup utility will maintain a copy of each unique capability
that it writes to tape, and will use a cryptographic signature of some
sort to ensure that the content of the tape is trustworthy when
reading it back in.
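
As a rough illustration of protecting archival media against forgery, the Python sketch below attaches a keyed digest to a serialized set of capability records and verifies it on restore. This is not the EROS backup format: the record layout, `write_tape`, `read_tape`, and the use of HMAC-SHA256 as the "cryptographic signature of some sort" are all assumptions made for the example.

```python
import hashlib
import hmac
import json


def write_tape(records, key):
    """Serialize capability records and attach an authentication tag."""
    payload = json.dumps(records, sort_keys=True)
    tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def read_tape(tape, key):
    """Verify the tag before trusting anything read back from the tape."""
    expected = hmac.new(key, tape["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tape["tag"]):
        raise ValueError("tape contents failed verification")
    return json.loads(tape["payload"])


key = b"example key known only to the backup utility"
records = [{"object": "page-17", "rights": ["read"]}]
tape = write_tape(records, key)
restored = read_tape(tape, key)
```

If the payload is edited on the media (for instance to widen `rights`), `read_tape` raises instead of reconstructing a forged capability.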
> >> These both assume that you can create a specific
> >> list of resources prior to execution of the program.
> >> It would be very important (so far as I'm concerned)
> >> to allow multiple differing sets of capabilities for
> >> a given program.
I think you mean "to allow different *instances* of a program to hold
distinct sets of capabilities?" That is certainly important.
> Note --- failing this (having prior knowlege of
> the precise forms of access required of each resource)
Can you give a case in which prior knowledge of the access required is
not possible?
I think we can achieve what you need in EROS without a secure
attention mechanism of the type that you describe. Secure attention,
in any case, isn't the right name for what you are talking about.
> >> In addition the "pure" capabilities subsystem it should
> >> be possible for users to delegate access to specific
> >> files and programs without undo risk to their other
> >> files.
Umm, that *is* the pure capabilities model.
> I notice that your discussion mentions features to
> restrict access to specific times of day...
>
> Since capabilities only grant access (they don't "deny it"
> which I guess would have to be called a ``disability'')
> it isn't obvious how they can be used to provide the
> desired level of control.
In a pure capability system, you can grant access to a forwarding
object, and then change what the object forwards to. A restriction
agent of the type you describe is such a forwarding object.
The forwarding object does not need to be passive. Instead of giving
me access to the printer, give me access to a process that does the
printer protocol, but only between 9 and 5. If the time check passes,
the process actually passes my requests through to the real printer.
If a capability system is properly designed there should be no way for
me to tell (programmatically) that I am not speaking to the real
printer.
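
The active forwarding object can be sketched as follows, in Python for illustration. `Printer`, `OfficeHoursPrinter`, and the injectable `now` clock are assumptions made for the example. The wrapper offers the same `print_job` operation as the real printer and passes requests through only between 09:00 and 17:00.

```python
from datetime import datetime, time


class Printer:
    """Stand-in for the real printer."""

    def __init__(self):
        self.jobs = []

    def print_job(self, text):
        self.jobs.append(text)
        return len(self.jobs)


class OfficeHoursPrinter:
    """Forwarding object: same interface, but only forwards from 9 to 5."""

    def __init__(self, printer, now=datetime.now):
        self._printer = printer
        self._now = now  # injectable clock, so the policy can be exercised

    def print_job(self, text):
        if not (time(9) <= self._now().time() < time(17)):
            raise PermissionError("printer is available 09:00-17:00 only")
        return self._printer.print_job(text)


real = Printer()
# The client is handed `morning`, never `real`.
morning = OfficeHoursPrinter(real, now=lambda: datetime(1998, 3, 23, 10, 30))
morning.print_job("report")
```

The grantor can later swap or tighten the policy inside the forwarder without touching any capability the client holds.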
Enough for now. I hope that some of this is useful, and I'll be happy
to answer followups.
Jonathan Shapiro