[E-Lang] New in 0.8.9k: Locally Untrusted Code & Confinement! (Part 1)

Mark S. Miller markm@caplet.com
Wed, 29 Nov 2000 13:52:54 -0800


For conciseness, in this message we'll refer to support for "locally 
untrusted code and confinement" as LUCAC.

Originally we expected LUCAC to be one of the features we'd implement late 
on the way to 1.0, as 1) we thought it would require auditors in order to 
check code for confinement, but auditors are not currently supported; and 2) 
that it would require a careful analysis of the Java API to determine what 
subset follows capability discipline, in order to outlaw that Java outside 
that subset, but that would be a lot of work.

Fortunately, with a few changes in the concrete plan for LUCAC which leaves 
the abstract logic intact, we are able to provide LUCAC painlessly in this 
release, though 1) we still haven't implemented auditors, nor 2) carefully 
examined most of the Java API for adherence to capability discipline.  This 
message deals with issue #1.  The next one will deal with #2.

First, the logic in common between the old and new plans, which corresponds 
to stopping at step #2 of the four step derivation 
http://www.erights.org/elib/capability/factory.html of the KeyKOS Factory 
mechanism:

Some definitions:

In our standard confinement scenario 
http://www.erights.org/elib/capability/confinement.html , we speak loosely 
of Bob being "confined", and Alice being able to know that Bob is 
"confined".  Let's say that Bob is an instantiation of code written by 
Mallet, and that this code isn't necessarily visible to Alice.  However, the 
code must be running on a TCB Alice trusts.

In all 4 concrete capability confinement systems I know of (KeyKOS 
Factories, EROS Builders, original E LUCAC plan, and the LUCAC support in 
0.8.9k), the object Alice checks in order to be confident of Bob's 
"confinement" isn't Bob, but an object that makes Bobs.  KeyKOS calls the 
Bob-making object a Factory.  Mallet makes a BobProgFactory and gives it to 
Alice.  Alice checks that BobProgFactory will make Bobs which are 
*initially* safe, and then Alice asks it to make a Bob.

On creation, Bob may (and typically will) have stateful read-write memory, 
but this memory is private to each Bob (each yield of the BobProgFactory).  
On creation, the only authority Bob has is that provided by Alice in the 
creation-request message she sent to the BobProgFactory (as well as, in EROS 
& KeyKOS, authorities she approves of but doesn't necessarily have access 
to).  Once created, Bob has no intrinsic checkable properties that give 
Alice confidence.  Alice's confidence in Bob's confinement comes purely from 
checking the BobProgFactory, and then being careful to use Bob in restricted 
ways.

Therefore, since we wish to eventually have a ":confined" guard that 
actually checks something, we define BobProgFactory as being "checkably 
confined" or just "confined", and we define Bob as being "indirectly 
confined".  I don't love these terms, and better ones are welcome.

By stopping at step #2, the property Alice needs to know about 
BobProgFactory is very simple and natural for a lambda language to provide: 
Is the graph of objects rooted in BobProgFactory transitively immutable and 
transitively without authority to effect the world outside their graph?  
(Actually, we can reduce this to transitive immutability, since we wouldn't 
consider an authority-providing capability to be immutable.)  This is 
actually the property that the ":confined" guard will guarantee, so in the E 
context, this is what we mean by "confined". (Though it is actually a 
stronger property, providing durability 
http://www.cap-lore.com/CapTheory/KK/durability.html as well.  Note: in 
this context, "stronger property" is a disadvantage, because it narrows the 
range of programs Mallet may provide to Alice such that Alice can determine 
whether they are safe as far as she is concerned).

If BobProgFactory is transitively immutable and without authority, both 
guarantees Alice needs about each Bob instance follow from normal capability 
rules:  1) Although the made Bobs may be born stateful, they aren't born 
sharing stateful memory with each other or anything else -- since the 
BobProgFactory cannot hold onto any such state, it cannot arrange for such 
sharing.  2) They only have authorities given to them by Alice (or derived 
by capability rules from authorities given to them by Alice), since an 
object's creator can only endow it with capabilities it has, either as 
instance variables, or as creation-request arguments. BobProgFactory's 
immutability prevents it from retaining these arguments.

So how does an E programmer use this stuff?

The plan was and remains that an E application normally be divided into two 
pieces -- *.e files and *.emaker files.  

A *.e file corresponds to a C or Java "main" function.  It is the entry 
point for running as E program as an application in the host OS.  A *.e file 
should be executable and begin with "#!!/usr/bin/env e", so that it will be 
interpreted by the E interpreter.  Code in a "*.e" file evaluates in a scope 
which includes the magic powers that represent the authority given to the 
process as a whole by the host OS. (For conventional ACL-based OSes, this is 
normally all the authority associated with the user-account running the 
program.) This is called the "privileged scope", and is analogous to the set 
of capabilities provided to the start program of the KeyKOS Big Bang. An 
example is "file__uriGetter", used in the expansion of the "<file:...>" 
syntax, which provides all the process's access to the host file system.  

A well-constructed E program does very little in it's *.e file other than 
subset these authorities, import other objects, and hand these constrained 
authorities to these other objects in order to instantiate the application 
according to the principle of least authority.  Finally, it should set the 
application going if needed.  Only an application's *.e file need be 
examined to determine the extent of it's user's authorities that the 
application may abuse.

By contrast, code in a "*.emaker" file evaluates in a scope which provides 
no authority -- the "universal scope". The only objects in the universal 
scope are clearly harmless ones such as "true", but not "file__uriGetter".  
*.emaker files are typically placed on the CLASSPATH.  E code (whether in 
*.e files or *.emaker files) can import *.emakers according to their 
position on the CLASSPATH.  Ignoring Java classes for the moment, 
<import:foo.bar.Baz> will read and evaluate a file with path 
"foo/bar/Baz.emaker" relative to the CLASSPATH.  The contents of the file 
are taken as a single E expression, which is evaluated in the universal 
scope.  We define an "emaker" to be the value this expression evaluates to.  
The value of the <import:...> expression is this emaker.

In the old LUCAC plan, the emaker would be constrained to be checkably 
confined.  Since would imply that the emaker is transitively immutable, the 
import__uriGetter (the object behind the "<import:...>" sugar) would 
evaluate any file at most once and cache the result.  This would require 
auditors.  This seemed natural, in that emakers and factories both exist to 
make useful objects with known safety properties.

                  "All problems in computer science can be solved with 
                             a few more levels of indirection"
                                              --Alan Perlis

In our new LUCAC plan, we impose no constraint on the expression in the 
*.emaker file other than that it be a valid E expression and that it be 
evaluated in the universal scope.  This means that the emaker it evaluates 
to may be stateful, but, on creation, cannot hold any authority.  Therefore, 
the expected common pattern of use remains that emaker are used to make 
useful objects, since useful object will usually be born holding some 
authorities.  But since the emakers are stateful, they may come to hold 
authorities as well.  Therefore, emaker are no longer checkably confined.

So, didn't we just lose LUCAC?  We would if the import__uriGetter continued 
to cache emakers as it did before.  Instead, we make import__uriGetter be 
the checkably confined object, requiring it not to cache emakers.  (In fact, 
in order to resolve cyclic imports, we do allow it to cache just during a 
single top-level importing event.  Since this violates the formalism, we 
need to look for a security danger here, but so far I haven't been able to 
find one.)  We further require the universal scope as a whole, and 
therefore all values within it, to be checkably confined.

As a short term solution, this is fine, but it has a bad performance cost
we'd like to get rid of in the long run: Since the typical emaker will 
actually be stateless, we'd like to avoid instantiating them per import: 
event.  Once we have auditors, it seems we can still allow import__uriGetter 
to cache emakers that pass the ":confined" guard.  The old story becomes an 
optimized special case within the new story.  This has the strange 
consequence that

    <import:foo.bar.Baz> == <import:foo.bar.Baz>

may be false now but true later (when the emaker can be made checkably 
confined).  This will make the caching non-transparent, but AFAIK not in a 
way that makes the statefulness of the caching a covert channel enabling 
Bobs to violate confinement.  If this speculation is true, we probably need 
a finer grained taxonomy of kinds of statefulness.  I'm reminded of the 
peculiar quantum experiments (EPR) that observe evidence of faster-than-light 
causality, but not in a way that we can use to transmit causal signals 
faster than light.


Next message: But what about all that non-capability oriented Java API

Note: I still need to explain some interesting security issues in deciding 
to place a *.emaker file on the CLASSPATH.