Re: CAOS - A CApable OS? [was: Process Authentication Groups (PAGs)] Jim Dennis (jimd@starshine.org)
Sat, 21 Mar 1998 21:34:32 -0800

>
> Jim Dennis wrote:
>>> Actually, current uid dependent authentication mechanisms (and similar)
>>> could be replaced in whole by a general 'capability access control
>>> mechanism' - for example, to be able (capable) to successfully open a
>>> file, process must be allowed to use the 'open file' capability (=
>>> open()) on the given object (file). This can be controlled by the system
>>> selectively giving each capability as processes request them. For
>>> example, for the file to be opened, the process must request the 'open
>>> file' capability from the system, and the system can then evaluate if
>>> the process meets the criteria to be allowed to use the capability (for
>>
>> No! That's the whole problem with ACL/UID based
>> models -- the system must be omniscient and the
>> process is "requesting" access on its own behalf.
>>
>> In a capabilities model there is some *other* process
>> or mechanism that grants access to the system. This
>> can be a Kerberos "ticket granting server" or a
>> "reference monitor daemon" or it can be some "meta
>> information" in a "resource fork" (filesystem based).
>> It might even be the parent process.
>>
>> In order to prevent hostile or subverted code from
>> requesting access to resources beyond the intentions
>> of the administrator (and even the core code's author
>> in the case of subverted binaries) we want the ability
>> to say *with confidence* that a given program cannot
>> request or gain access to anything that's not on
>> its "list."
>
> In the model I had in mind, the parent is the one that allows or denies
> its child a specific capability (the capabilities can be thought of as
> individual functions). Every layer in the system is a system on its own
> and is operating with the set of capabilities that the parent layer
> granted. As far as a child is concerned, it cannot count on being able
> to do _anything_ at all (not even execute a single instruction of its
> code) and it doesn't inherit _anything_ at all from the parent. If it
> wants to do so, it must explicitly request it and the parent must
> explicitly grant it. The child is allowed to state the capabilities that
> it wants and if _all_ of the capabilities that it requests are granted,
> it is executed and the only functions that it is allowed to call, are
> the capabilities that the parent granted. The process can only request
> capabilities on its own behalf.
>
> The child process knows absolutely nothing about the environment where
> it is being run and is in fact none of its business anyway. As far as
> the parent (the system) is concerned, it may provide the child with the
> capabilities that the child wants, but it may at any time for any reason
> terminate access to the capability, unless the child explicitly requests
> that it had complete control over itself and over parent and the parent
> (and all the layers above) grants it.

	This sounds more like the "virtual subkernel" concept that
	I've discussed with others several times.

	I've discussed it in terms of deferring all *system* calls
	to the "virtual kernel" (parent or other process) -- which 
	is similar to the approach taken by the Janus project -- that
	was a paper written by David Wagner (et al?) in that the
	monitoring process is actively involved in the execution process.

	The problem with doing it at a resolution finer than 
	the system call level is that the performance starts to 
	suffer unacceptably.

	Another approach is to have the processes all running
	in "virtual machines" (a la Java) -- and allowing the 
	"parent" (or some other specified "nanny" process) arbitrate
	each access (of each type) to each resource.  

	The principal problem with this approach is that existing
	software would have to be ported to the VM.  We've already
	got some of that going on with Java -- and I think that
	some of Norm Hardy's ongoing work with 'e' and with future
	enhancements to Java are likely to give us that.

	I don't see much opportunity in these techniques to 
	substantially improve security at the OS level.  You 
	still have the same problems of "subversion" regardless
	of whether the interprocess communication is via 
	sockets, pipes, shared memory, environment variables
	and command lines, or via some sort of PSL (protected
	shared libraries -- another Multics feature that's 
	coming back from the dead in research projects under Unix).

	The ``capabilities'' model allows one 
	
 

> As you may have already noticed, I use a very broad definition of a
> 'capability' and it probably differs from what it means on other
> 'capability oriented' operating systems. The capability in CAOS may be
> thought of as any function, that the child is unable to perform by
> itself, including executing the child's own code. You can think of this
> as if every Linux process had to explicitly request each of the syscalls
> that it intends to call and the system must explicitly grant it. This
> way it would be very easy to stop the stack overflow vulnerability with
> the suid binaries - just deny these binaries the exec() syscall and stop
> worrying (of course there are suid binaries that need exec() but this is
> only an example).

	I was thinking along the same lines for a while.  I've
	come to the conclusion that I was wrong (largely due
	to my conversations with Hugh Daniel).   This might be
	an interesting feature -- but it *isn't* a capabilities
	model.
	
	We shouldn't try to refer to this sort of thing (active
	process monitoring by "parent" or other processes) as
	``capabilities'' since that will serve to confuse some
	and irritate others.  Worse, it is most likely to
	irritate the few people who really understand 
	the existing concepts of the ``capabilities'' model.

> The set of allowed capabilities in CAOS is determined by an access
> policy. The access policy is a set of parameters that must match to
> grant (or perform) a specific capability (an example of an access policy
> would be: to be able to execute this particular file, the file's owner
> and the owner of the process who tries to execute the file must be the
> same and the time of the execution must be within the working hours; if
> any of these don't match, the request for the execution is denied).

	That brings us back to an access control list.

	As I understand it the distinction between a ``capability''
	and an ACE (access control entry) is that a capability is
	"specific and *sufficient*":  for each form of access 
	(read, write, execute, append, stat, etc) to each 
	resource (file, TCP port, "privileged" system call, socket, 
	memory block, etc) there is a single ``capability''.  

	*Any* process with "possession" of that ``capability'' can 
	gain that form of access to that resource -- there are no 
	other "checks" to be performed.  That is the simplicity 
	of them.
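	A minimal sketch of "possession is sufficient", assuming a toy
	in-memory system.  The names Capability and Kernel are
	hypothetical, and unforgeability is faked with Python object
	identity; a real system would rely on kernel-protected storage
	or cryptography instead.

```python
# Hypothetical sketch: one capability per (resource, form of access).
# Holding a genuine capability is the entire authorization check.
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)   # eq=False: compared by identity only
class Capability:
    resource: str   # illustrative resource name, e.g. "plan"
    access: str     # e.g. "read" or "write"


class Kernel:
    """Toy stand-in for the system: stores data and honours caps."""

    def __init__(self):
        self._files = {}
        self._issued = set()   # capabilities this kernel has minted

    def create(self, resource, data):
        self._files[resource] = data
        caps = {a: Capability(resource, a) for a in ("read", "write")}
        self._issued.update(caps.values())
        return caps            # the creator holds the initial caps

    def read(self, cap):
        # No uid lookup and no ACL walk: only "is this a genuine
        # read capability?" is asked.
        if cap not in self._issued or cap.access != "read":
            raise PermissionError("not a read capability")
        return self._files[cap.resource]


kernel = Kernel()
caps = kernel.create("plan", "Gone fishing")
text = kernel.read(caps["read"])
```

	Note that read() never asks *who* is calling.  A look-alike
	object built by the caller is rejected because it was never
	minted by the kernel.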

	Here's also where you can complicate issues a bit.
	If you have capabilities *on* other capabilities you 
	can require one capability to "execute" another.  This
	allows you to have "revocable" capabilities.  
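	One way to picture a revocable capability is a forwarder that
	sits in front of the real one.  This is only a sketch:
	RevocableCap and make_revocable are made-up names, and Python
	cannot really hide the _target attribute from a hostile holder.

```python
class Revoked(Exception):
    """Raised when a revoked capability is invoked."""


class RevocableCap:
    """A capability *on* another capability: forwards until revoked."""

    def __init__(self, target):
        self._target = target   # callable that performs the real access

    def invoke(self, *args):
        if self._target is None:
            raise Revoked("capability has been revoked")
        return self._target(*args)


def make_revocable(target):
    """Return (forwarding cap, revoke function) for the given target."""
    cap = RevocableCap(target)

    def revoke():
        cap._target = None      # cut the link to the real capability

    return cap, revoke
```

	The grantor hands out the forwarder and keeps revoke() private;
	calling it cuts off every holder of the forwarder at once.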

	Let's make up an example:

		I want to have something like 'finger' and 
		give it the ``capability'' to read or execute
		a file (analogous to my .plan file).

		In order for me to give 'finger' that 
		``capability'' I have to create the file or
		script, and I have to have the ``capability''
		to extract the "read" or the "execute"
		``capability'' from my file (or "bind" either
		of these *to* it -- which ever terminology you
		prefer).  I also have to have the ``capability''
		to give capabilities to 'finger' (probably given
		when 'finger' publishes the capability for doing
		this -- by granting it to some process like
		'login'; thus I'm granted all "public" capabilities
		merely by logging in).  It might also be accomplished
		by "binding" the "append" capability to a small program
		(like 'chfn') and "publishing" the "execute"
		capability to that.

		To revoke this 'chfn' capability the 
		sysadmin just removes the program.  (In this sense I 
		suspect that capabilities would be bound to "links" 
		(or their analog) rather than to "inodes" (or their 
		analog).)

	In practice this might work something like the following:

		connection request comes in to the kernel
		this opens socket (creating a set of capabilities)
		kernel hands read, write, and destroy capabilities to
		 an inetd thread/process.
		the inetd process has a number of execute capabilities
		 one of which provides execute access to a 'fingerd'
		the inetd grants all of those socket caps to the
		 'fingerd process' that it's forking (but none of the
		 other execute caps).  Thus fingerd can talk to the
		 socket but it can't necessarily execute ftpd, rlogind,
		 or *anything else* on the system.
		The 'fingerd' does have a set of capabilities, read
		 or execute.  These were generated by each user who
		 wanted to allow access to a .plan file or script.
		 it reads the name of the target (a user to be fingered)
		 and looks for a corresponding cap.  It then "forgets"
		 (destroys, free()'s, whatever) all other capabilities
		 (so this fingerd process can no longer access *anyone
		 else's .plan*) and continues.  It might also "forget"
		 the "read cap" on the socket (since it no longer needs
		 to receive *any* info from there).
		If the .plan is a simple file, to which fingerd only has
		 read access -- it reads the file, writes to the socket,
		 destroys the socket, and exits.
		If the .plan is a script (which might carry other
		 capabilities of its own) then the fingerd executes
		 it.
		(Note that the newly executed program can only write
		 to the socket -- it can't subvert the fingerd's 
		 "privileges" to create a covert channel since we 
		 destroyed the ability to read from this socket and
		 we never had the ability to create any other sockets.
		 Also this child process can't access any other user's
		 .plan files or scripts since those capabilities were
		 also dropped.  The only capability that fingerd gave
		 this hypothetical .plan script was the ability to write
		 to the existing socket, and maybe the ability to 
		 destroy it).
		This .plan script probably needs other capabilities
		 -- those would be *granted to it by its creator*
		 (analogous *but not identical* to a given user).
		 This list of other capabilities might be quite
		 small, such as read/write access to a counter
		 ("You are the 4,960,837th person to finger me") or
		 append only access to a log.  It might also include
		 a small set of other capabilities that allow it to
		 do things like execute a reverse ident/finger process
		 -- so that *that* could be logged.
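	The inetd/fingerd walk-through above can be sketched as a toy
	model.  Everything here is an assumption made for illustration:
	the Process class, the (resource, access) tuples used as
	capabilities, and the user names "alice" and "bob".

```python
class Process:
    """Toy process: it can do only what its capability set names."""

    def __init__(self, name, caps):
        self.name = name
        self.caps = set(caps)

    def spawn(self, program, grant, bound=()):
        # 'bound' stands for caps attached to the program itself
        # (here: the .plan caps users granted to fingerd earlier).
        grant = set(grant)
        if (program, "exec") not in self.caps:
            raise PermissionError("no execute capability for " + program)
        if not grant <= self.caps:
            raise PermissionError("cannot grant a capability not held")
        return Process(program, grant | set(bound))

    def forget(self, keep):
        # Drop every capability that is not explicitly kept.
        self.caps &= set(keep)

    def can(self, cap):
        return cap in self.caps


sock = {("sock", "read"), ("sock", "write"), ("sock", "destroy")}
plans = {("plan:alice", "read"), ("plan:bob", "read")}

# inetd holds the socket caps plus a handful of execute caps.
inetd = Process("inetd", sock | {("fingerd", "exec"), ("ftpd", "exec")})

# fingerd gets the socket caps, but none of inetd's execute caps.
fingerd = inetd.spawn("fingerd", grant=sock, bound=plans)

# After reading the target name ("alice"), fingerd forgets the rest,
# including the read cap on the socket.
fingerd.forget({("plan:alice", "read"), ("sock", "write"), ("sock", "destroy")})
```

	After forget(), a subverted fingerd has nothing left to abuse
	except alice's .plan and the write side of one socket.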

	Note that this whole scenario completely lacks any concept of
	"identity."  The  'fingerd' isn't running *as* root.  The
	.plan file isn't "world" or "group" readable or executable.  
	It doesn't have to be named ".plan" (the 'chfn' like program 
	can convey the file's "name" as part of the ``capability'' 
	granting  process).  The executable .plan script doesn't
	run *as* the user nor *as* root.

	We start to edge towards "identities" (users, accounts) 
	when we think about the 'chfn' program.  It presumably 
	has read/write privilege to the list of our fingerd's
	capabilities.  It might get "append" only access -- but
	this would lead to a problem  with an ever growing 
	file (or directory or meta-file resource, however these 
	capabilities are stored by our hypothetical system).  We'd 
	then have to postulate some other 'skulker' or '.plan remover'
	-- which might be important for some cases but doesn't sound
	useful to this case.

	Now we consider how the 'chfn' would prevent a user or
	account holder from modifying or destroying any other 
	user's .plan entry in the fingerd's cap resource list.
	The first idea that comes to mind is to give each 'finger'
	enabled account/user a capability over one resource
	record (remember, a cap that provides access to another
	cap).  Thus my shell/login process (which holds my 
	"execute" and "shell access" type capabilities for 
	everything that I can "do" on the system (sort of)) can 
	present the execute cap to launch chfn, and leave my
	".plan entry" cap and my ".plan" cap available when it
	forks.  (Note that it doesn't leave any other caps lying
	around, so the chfn binary can't read my mail or do any
	other evil things with my other resources (or the other 
	resources to which other users/processes/accounts have 
	given *me* access).  I'd probably leave some sort of
	write access to my terminal -- to allow this 'chfn' 
	process to provide me with error messages, or I might
	allow read/write to allow it to go into an interactive
	mode with me, or I might create some new terminal like
	channel (like a 'screen' pty or unix domain socket) and
	allow it specific capabilities to *that*.)  

	Since *no access* is given without a specific capability
	then I can be readily assured that 'chfn' can't be 
	subverted to change my .plan file/script (it only has
	read/execute access which it is *supposed* to pass along
	to the 'fingerd').  In addition my .plan script cannot
	be subverted to read my mail or access any of the
	capabilities in "my" login "shell."  (In fact I can
	potentially have several different login shells that can
	be mutually untrusting).  Perhaps I maintain one profile
	that manages the caps for all of my other "roles" and
	it, or another one, that has execute caps for all of them.

	Note that this concept of "roles" is much different from
	"accounts" -- only the sysadmin can create new accounts
	(on most systems) and grant the initial access to resources.
	These "roles" can be created by any user or process -- but
	each can only accumulate capabilities by being *granted* 
	them by some entity that already has them.
	
	This leads to another observation.  Any process that I 
	entrust with a given capability can use it, or give it
	to agents *other than* my intended recipient.  However it
	is possible to provide mechanisms to prevent that.  If
	we think of the capability as a cryptographically secure
	hashed ticket (which is precisely how some of them are
	implemented) we can envision "splitting" it and using 
	*two or more different* agents to convey these parts to
	the intended recipient.  (We can also envision various
	complex public key exchanges to encrypt and pass these
	caps around).
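	A rough sketch of the "hashed ticket" idea and of splitting a
	ticket between two couriers.  The function names, the
	"resource:access" message format and the XOR split are my
	assumptions for illustration, not how any particular system
	does it.

```python
import hashlib
import hmac
import secrets

# Secret known only to the issuing system (generated here for the demo).
SECRET = secrets.token_bytes(32)


def mint(resource: str, access: str) -> bytes:
    """Issue a ticket: an HMAC over the resource and form of access."""
    msg = f"{resource}:{access}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).digest()


def verify(resource: str, access: str, ticket: bytes) -> bool:
    """Check a presented ticket without consulting any identity."""
    return hmac.compare_digest(mint(resource, access), ticket)


def split(ticket: bytes):
    """Split a ticket into two shares; neither alone reveals it."""
    pad = secrets.token_bytes(len(ticket))
    masked = bytes(a ^ b for a, b in zip(ticket, pad))
    return pad, masked


def join(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine two shares into the original ticket."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))


ticket = mint("mailbox", "read")
share_a, share_b = split(ticket)   # hand each share to a different agent
rebuilt = join(share_a, share_b)
```

	Only a recipient who receives both shares can rebuild a ticket
	that verify() accepts; either courier alone holds random-looking
	bytes.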

	In my 'finger/chfn' example it's unlikely to make sense.
	Yes, a 'chfn' implementation might use the .plan cap to 
	read or execute on its own behalf.  It can't do anything
	that fingerd couldn't do (with "my" .plan).  More 
	interestingly the fingerd cannot add to its own capabilities 
	list -- so even a process that "stole" a read cap for my 
	mailbox couldn't subvert fingerd to use it -- they'd have to 
	get execute access to something like chfn.

	Naturally I'm glossing over lots of details -- in particular
	I have no idea how many different caps are needed or
	how many resources they would govern.  In an ideal OS we
	could reduce just about everything to the directory/filesystem 
	abstraction and there would be no penalty for having every
	entry in the analog to  /etc/passwd be a "subdirectory" and
	every "field" (username, passwd, fullname, etc) be a separate
	small file or subdirectory, and in having a full suite of 
	capabilities (to be granted) associated with each of those 
	and having packages of capabilities (to grant to other resources)
	associated with each program/link.

	Another abstraction that might be useful to think of is
	that of the "rdbms" where we have databases of tables 
	which consist of rows and columns.  We can envision a
	dbms system that had capability "portions" for each row 
	and column (both would be required to access a given field
	in a given table, in a given database) and "full" 
	capabilities for each row, column, table and database (where 
	the cap gives access to the whole entity in question --
	all of the fields of a given row (record) or all of the
	rows of a given column (giving the ability to perform various
	sorts of calculation and averaging without the capability to
	leech any other data from the table, for example), et cetera).
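	To make the row/column idea concrete, here is a small sketch
	under my own assumptions: a Table class whose capability
	"portions" and "full" column capabilities are opaque Python
	objects.  The accessor methods that hand them out stand in for
	whatever granting mechanism a real dbms would have.

```python
class Table:
    """Toy table with per-row and per-column capability tokens."""

    def __init__(self, columns, rows):
        self._rows = {key: dict(zip(columns, vals)) for key, vals in rows.items()}
        # Opaque tokens: a "portion" per row and per column, plus a
        # separate "full" capability per column.
        self._row_part = {key: object() for key in self._rows}
        self._col_part = {col: object() for col in columns}
        self._col_full = {col: object() for col in columns}

    # Hypothetical granting interface.
    def row_part(self, key):
        return self._row_part[key]

    def col_part(self, col):
        return self._col_part[col]

    def col_full(self, col):
        return self._col_full[col]

    @staticmethod
    def _find(caps, token):
        for name, cap in caps.items():
            if cap is token:
                return name
        raise PermissionError("capability not recognised")

    def field(self, row_part, col_part):
        """Both portions are required to read a single field."""
        row = self._find(self._row_part, row_part)
        col = self._find(self._col_part, col_part)
        return self._rows[row][col]

    def column_average(self, col_full):
        """A full column cap allows aggregates over that column only."""
        col = self._find(self._col_full, col_full)
        values = [r[col] for r in self._rows.values()]
        return sum(values) / len(values)


staff = Table(["name", "salary"], {"r1": ["alice", 100], "r2": ["bob", 300]})
one_field = staff.field(staff.row_part("r1"), staff.col_part("salary"))
average = staff.column_average(staff.col_full("salary"))
```

	A holder of only the full "salary" column cap can compute the
	average but cannot tie any salary to a name.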

	This needn't be supported directly at the OS layer. 
	However, attempts to implement it solely in the application
	layer leave us with the possibility that any db program might
	subvert the dbms server for further access.

	In my description I'm assuming that we could make multiple
	links to a program, and yet allow each link to have its
	own capabilities granted to it.  I also start to get really
	tangled up (read: confused) when I think about these 
	capabilities that apply to other capabilities sets.  We
	can emulate tables using directory trees -- particularly
	trees with "cross links" (symlinks).  As a practical 
	issue in existing OSes this is "expensive" (in space and
	performance).  However, I'm still speaking on a purely
	theoretical basis -- and without benefit of any formal
	training or mathematical analysis!

	In a sense my "account" (or some of my "roles") "own"
	each of the resources to which they have a set of 
	capabilities that allows me to "destroy" (modify, revoke
	whatever) the capabilities that grant access to it
	*and* which allow me to "create" and "transfer" a 
	new set of other capabilities to it.  I get dizzy 
	trying to guess what the feasible level of granularity
	and recursion would/should be.

	That is capabilities as I think I understand them.
	I still don't know quite how you'd achieve some
	forms of control (such as the 'Chinese wall'
	or variations of the "Clark-Wilson triples").


>> One of these I refer to as a "contract inventory"
>> model. The admin creates an inventory of the
>> resources to which the process requires access --
>> and that becomes a "contract" under which the
>> binary runs. Any attempt to access a resource that's
>> not listed in the "inventory" is a "breach" of the
>> "contract." The "contract" is "meta data" and is
>> stored in some filesystem table that is not accessible
>> to the program itself. For this we might want to
>> create a "resource fork" option in our filesystem --
>> which could store ACL's, capabilities, and have
>> an extensible structure to allow storage of other
>> "meta data resources" as needed.
>
> What I believe you are referring to (please correct me if I'm wrong) is
> an extension to the ACLs. The process is assigned (for example) a list
> of files (data objects, resources) that it may access in a particular
> way - if it tries to access a file (data object, resource) that is not
> on its list or if it tries to access it in a way not allowed, the
> request is denied? Can you please describe what you mean when you use
> the term 'capability'? It seems to me that I really should have read
> some of the documentation that you mentioned as it looks like we're
> talking about different things:)
	I hope the (long!) description I just gave is adequate.
	Warning, it might be *wrong*.  I've copied a couple of
	people with an interest in the topic (Hugh, Jonathan,
	sorry for the long-windedness).  They'll correct me
	(if they have the time and inclination).

> What I propose is similar to what I described in the last paragraph, but
> on a different level. Each and every capability has two access times:
> granting and execution. When the process requests a capability (in my
> definition this is a function), the system may grant it so that the
> process may call it. This is the granting time. If the process is denied
> any of the capabilities it requests at granting time, the process cannot
> execute at all. When the process calls the capability (the function),
> the arguments to the capability are examined to see if it is called in
> an agreed manner and on agreed files (according to the 'contract' in
> your terms). This is the capability execution time. If the capability is
> denied at this time, the process receives an error but may still
> execute.
>
>> Another method is to have a parent process configure
>> all of the capabilities prior to spawning the package.
>
> This is required in CAOS anyway. The child must explicitly request the
> environment that it needs to execute.

	In my description the "child" (any process in fact)
	simply goes about its business attempting the forms of
	access to the resources that it requires.  It doesn't
	"request" access (although it might do a "stat" or
	access() like call to verify its access prior to
	attempting a given access -- this might be to allow
	it to interactively request a capability from its
	client (which might be a user or a process)).  This
	"client" might be the "parent process" or it might
	be a live user at a keyboard (who presumably would 
	have some means to grant/transfer capabilities around
	-- although I'm pretty sketchy on how this works).

	EROS and KeyKOS both require a form of process state
	"persistence."  This apparently obviates the need
	for "divine" intervention ('root') to solve the 
	"chicken and egg" problem posed by "shutdown" and
	"rebooting."  I have no first hand experience of 
	either of these systems so here my image is *really
	fuzzy*.

	After e-mail with Jonathan and conversations with
	Hugh I'm convinced that "persistence of process state"
	is required in a "pure capabilities" model.

	If I can "shut the machine down" and "bring it 
	up single-user" (create a discontinuity in the
	state of the processes) then I can go in and
	'steal' (or modify) the state and I'll be 'root'.

	(Despite DEC's and MS' protestations to the 
	contrary VMS and NT have an omnipotent account.
	It is the "backup" -- actually the "restore" operator!)

	I don't have time to think about how a "backup/restore"
	subsystem would work under a pure capabilities system.
	I suspect that it would involve a set of agents, authorized
	by each "user" on the system to accept capabilities
	for each restored file -- and that there are some 
	sorts of "logging/audit trail" files that these	
	automated agents would refuse to "adopt."


>> These work because the program's resources are all
>> approved before the program is "exposed" to any
>> data (particularly hostile data that may exploit
>> buffer overflows etc). *Any* attempt to access
>> other resources *must* fail or this model doesn't
>> provide any substantially better assurance than the
>> current model.
>>
>> These both assume that you can create a specific
>> list of resources prior to execution of the program.
>> It would be very important (so far as I'm concerned)
>> to allow multiple differing sets of capabilities for
>> a given program.
	Note --- failing this (having prior knowledge of the precise
	forms of access required of each resource) we have to have
	some way for any entity (user or process) to "interactively"
	review and grant such capabilities as they have.  This implies
	some sort of "trusted path" (possibly via some SAK -- secure
	attention key) for the user/process to verify *which* entity
	is requesting which specific form of access to which resource.
>> For this and other reasons I'd set it up so that the
>> "resource fork" or "meta data" was associated with
>> each directory entry (link!) rather than each inode.
>
> The access policy can be setup in a way to check anything at all
> (process' current directory, file inode, file path, file owner, process
> owner, system time, portion of the file being accessed) and that's why I
> propose a database oriented filesystem, where each file (data object) is
> accompanied with access policy - the access policy should be easily
> extensible and possibly different for each data object (file).
>
>> If we implement a good model then we can put the
>> authorization where it belongs -- in the hands of
>> the owners of each resource (users). In addition
>> the "pure" capabilities subsystem it should be
>> possible for users to delegate access to specific
>> files and programs without undue risk to their other
>> files.
>
> Every user controls who and in what ways uses the capabilities that his
> capability modules provide.
	I think we are still using radically different meanings of
	the term here.

	I notice that your discussion mentions features to
	restrict access to specific times of day (and presumably
	you might want to require that the access be through certain
	channels and/or from certain sources -- such as from a 
	"securetty" or from somewhere "behind the firewall" or on
	the *local* area network).

	Since capabilities only grant access (they don't "deny it"
	which I guess would have to be called a ``disability'')
	it isn't obvious how they can be used to provide the
	desired level of control.

	I think it could work like this.  I create a "restriction agent" 
	(a script that starts with the hypothetical equivalent of
	"#!/usr/bin/restrict.access").  This has a set of restrictions
	(time, location, password/capability requirements -- possibly
	*multiple capabilities*).  I grant this script the required
	capability to the target resource.  Now it can act as a proxy
	for the access (and revocation is simply a matter of having
	the agent *stop* proxying).
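	A sketch of such a "restriction agent", assuming the proxied
	capability is just a callable and the only restriction is a
	time-of-day window.  RestrictionAgent is a made-up name, and
	the current time is passed in explicitly so the example stays
	deterministic.

```python
from datetime import time


class RestrictionAgent:
    """Proxy that holds the real capability and applies restrictions."""

    def __init__(self, target, start, end):
        self._target = target   # the capability being proxied (a callable)
        self._start = start     # earliest permitted time of day
        self._end = end         # latest permitted time of day
        self._active = True

    def stop(self):
        # Revocation: the agent simply stops proxying.
        self._active = False

    def invoke(self, now, *args):
        if not self._active:
            raise PermissionError("agent no longer proxies this access")
        if not (self._start <= now <= self._end):
            raise PermissionError("outside permitted hours")
        return self._target(*args)


# The real capability is never handed out; clients only get the agent.
agent = RestrictionAgent(lambda: "payroll data", time(9, 0), time(17, 0))
midday = agent.invoke(time(12, 30))
```

	The underlying capability still only *grants* access; the
	"denial" lives entirely in the agent that stands in front of it.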


>>> As far as I understand ACLs and the Privs project, the capability
>>> oriented system would provide the same (and more) functionality that
>>> these two projects are set to provide, only that with a capability
>>> oriented system there would be only ONE interface.
>>
>> I think so as well. However I think we'll have trouble
>> explaining it to anyone else.
>
> Agreed very much.
> Andrej
	I suspect I can't explain it adequately because I don't 
	understand it sufficiently.  Oh well.  I only heard about
	"capabilities" a couple of years ago -- and there doesn't
	seem to be much written about them.

	Hugh? Jonathan?    Any corrections, comments or flames?

--
Jim Dennis  (800) 938-4078		consulting@starshine.org
Proprietor, Starshine Technical Services:  http://www.starshine.org
        PGP  1024/2ABF03B1 Jim Dennis <jim@starshine.org>
        Key fingerprint =  2524E3FEF0922A84  A27BDEDB38EBB95A