[EROS-Arch] The Environment Problem

Jonathan S. Shapiro shap@eros-os.org
Sun, 23 Sep 2001 15:27:21 -0400


It is becoming apparent that EROS hasn't paid enough attention to what I am
coming to think of as "the environment problem". The questions are:

    What set of standardized facilities should
    the EROS runtime environment assume is
    available?

    How can/should it obtain them?

    What trust issues arise in so doing?

I am not suggesting that every object will receive capabilities for all of
the facilities identified in the convention, nor that every object will
necessarily use them. Rather, I am suggesting that just as a library may
publish certain very low-level global entry points whose presence the
language runtime and compiler rely on (e.g. "__assert()"), there is probably
a small list of interfaces that the runtime and compilers similarly rely on,
and that these interfaces are global in the same sense that the "__assert()"
routine is global. A particular application may "stub out" such an interface
by placing a void key in the appropriate position in the environment, but
the runtime library needs to know where to find such keys so that they can
be used if present.

In the context of EROS, some examples might include:

    [ Things from the constructor: ]
    1. The error logging interface
    2. A process key to the currently executing process
    3. The seppuku interfaces:
            process creator for the current process
            protospace
    4. The currently selected space bank
    5. The metaconstructor key
    6. The space bank verifier key
    7. The constituents node

    [ Things from the user: ]
    8. The window system interface
    9. stdin/stdout streams
    10. A read-only directory of discrete constructors
       used to support dynamic instantiation when
       deserializing.

Many applications will not require stdin, stdout, or window-system, and will
not be given them, but in each case there are elements of the standard
runtime library that will not work without them, and in each case there is
some existing, well-established library standard which dictates that the
location of the capability must be globally known. E.g. printf() knows
internally that it should use stdout, and the "key cache" logic must know
the location of the supernode or node tree that serves as the container for
the cache.
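
To make the "well-known location, possibly void" idea concrete, here is a
minimal sketch. Python is used purely as a modelling language. The slot
numbers, the Slot, VoidKey, StreamKey and Environment names, and
runtime_print() are all assumptions of the sketch, not EROS interfaces or
an agreed layout.

```python
from enum import IntEnum


class Slot(IntEnum):
    """Hypothetical well-known slot numbers, in the order of the list above."""
    ERROR_LOG = 0
    PROCESS_SELF = 1
    PROCESS_CREATOR = 2
    PROTOSPACE = 3
    SPACE_BANK = 4
    METACONSTRUCTOR = 5
    BANK_VERIFIER = 6
    CONSTITUENTS = 7
    WINDOW_SYSTEM = 8
    STDIN = 9
    STDOUT = 10
    CONSTRUCTOR_DIR = 11


class VoidKey:
    """Models a void key: it conveys no authority and accepts nothing."""

    def is_void(self):
        return True

    def write(self, text):
        return False


class StreamKey:
    """Models a key to an output stream (hypothetical stand-in for stdout)."""

    def __init__(self):
        self.buffer = []

    def is_void(self):
        return False

    def write(self, text):
        self.buffer.append(text)
        return True


VOID = VoidKey()


class Environment:
    """A fixed set of slots; every slot starts out holding the void key."""

    def __init__(self, size):
        self.slots = [VOID] * size

    def store(self, slot, key):
        self.slots[slot] = key

    def fetch(self, slot):
        return self.slots[slot]


def runtime_print(env, text):
    """printf-like routine: it knows *where* stdout lives, not whether it exists."""
    key = env.fetch(Slot.STDOUT)
    if key.is_void():
        return False  # interface was stubbed out; drop the output
    return key.write(text)
```

The point of the sketch is that the library routine hard-codes only the slot
position. Whether the slot holds a real key or a void key is decided by
whoever populated the environment.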

This raises four issues:

1. How should such an environment be passed to an application?
2. What are the trust issues from the perspective of the application? [The
app creator can usually arrange to pass a void key if desired.]
3. Might there be other standard components not enumerated above [note: the
list has grown slowly over the years], and how should we allow for
expansion?
4. How should name binding between a standard interface and its name be
handled in such an environment? Sub-issues here are (a) collision avoidance
and (b) attacks on the component by subverting its environment.

The following are my thoughts on these issues, and I invite comment on them.
Please note that the logging interface has a notional design hiding within
it, which I will send out next. We can debate whether it should come from
the constructor or the user, and what its specification should be, but
let's do that in the logging discussion rather than in this one.


PASSING THE ENVIRONMENT

In actuality, there are two distinct environments here. One consists of
items supplied by the constructor. These are trustworthy by virtue of the
system design, and not by virtue of being supplied by the object's creator.
Items 1 through 7 fall into this category. At present, these items are
delivered to the application in well-known capability registers. In the
abstract, this mechanism is adequate, but the capability registers will soon
start to run out, and I have come to feel that they should be passed in a
separate node. This node should be allocated out of the same space bank that
the yield is allocated from.

Aside: to allow for the possibility of future expansion of the list, correct
applications should allow for the possibility that this node will at some
point evolve into a node tree, and should use the previously described
extended fetch/store interface when manipulating this environment "node".
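
A rough model of why going through an extended fetch/store matters: the
same slot index keeps working whether the environment is one node or a
tree of nodes. This is a sketch only. Node, extended_fetch(),
extended_store(), the explicit height argument and the 32-slot node size
are assumptions made for illustration, and it does not reproduce the
interface described earlier on this list. Python is used as a modelling
language, with None standing in for a void key.

```python
NODE_SLOTS = 32  # assumed node size for this sketch


class Node:
    """A node is a fixed array of slots; None models a void key."""

    def __init__(self):
        self.slots = [None] * NODE_SLOTS


def _check(height, index):
    # A tree of the given height addresses NODE_SLOTS ** (height + 1) slots.
    if not 0 <= index < NODE_SLOTS ** (height + 1):
        raise IndexError("slot index out of range for this tree height")


def extended_fetch(root, height, index):
    """Fetch slot `index`; height 0 means `root` is a single flat node."""
    _check(height, index)
    node = root
    for level in range(height, 0, -1):
        span = NODE_SLOTS ** level      # slots covered by each child
        node = node.slots[index // span]
        if node is None:                # unpopulated subtree reads as void
            return None
        index %= span
    return node.slots[index]


def extended_store(root, height, index, key):
    """Store `key` at slot `index`, creating interior nodes as needed."""
    _check(height, index)
    node = root
    for level in range(height, 0, -1):
        span = NODE_SLOTS ** level
        child = node.slots[index // span]
        if child is None:
            child = Node()
            node.slots[index // span] = child
        node = child
        index %= span
    node.slots[index] = key
```

An application written against extended_fetch() with height 0 today would
need only a different height (or a self-describing root) if the environment
later grew past one node; code that indexes the node's slots directly would
break.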

The user-supplied environment can either be passed via a well-known slot in
the constructor-supplied environment or via the already-reserved "arguments"
key slot.

TRUST ISSUES FOR THE NEW OBJECT

One of the things that always bugged me about KeyKOS was the fact that you
had to trust the system administrator to give you a couple of keys: the
space bank verifier and the metaconstructor. The constructor needs both of
these anyway, and you'll note that I've moved them into the
constructor-supplied environment, where the system administrator can't screw
with them in the absence of offline disk forensics.

The logging agent, however, is a potential channel to somewhere. The problem
with having it come from the user environment is that the application may
not trust the user with that information, or may trust the user (client)
only in a conditional way. This follows from the fact that the yield is
encapsulated and may embody proprietary content. I suggest that there is a
class of such capabilities, that the logger is one of this class, and that
this class should probably be supplied by the constructor at construction
time. I'll take up that discussion in my next note (on the logging
mechanism).

EXPANSION

I believe that there will be expansion of both sets as we learn. Expansion
of the constructor-supplied environment can be handled by expanding the node
tree and using the extended fetch/swap interface. In practice, I do not
anticipate any large degree of expansion in this list, as it has been
relatively stable since about 1978.

For the user-supplied environment, I think that the issue is closely related
to the name binding issue.

NAME BINDING

Name binding devolves into three distinct problems:

1. The simple binding of names (strings) to capabilities. This can basically
be handled by a directory.
2. The problem that a component may not wish to trust its creator in
certain cases, but must nonetheless obtain trusted interfaces. I think of
this as the "trusted path" problem. For example, a component may not trust
its caller to supply the source for the "open file" dialog box.
3. The problem of dynamic binding -- how can the executing code know that
the namespace of constructors from which it instantiates things has not been
tampered with (i.e. that the namespace itself, as opposed to the
constructors, is intact)? The concern here is that the instantiating
component would be ill-advised to give outward communication channels to an
arbitrary piece of code.
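
For reference, the first problem (plus the collision-avoidance sub-issue)
is the easy part, and can be modelled in a few lines. This is a sketch
under assumptions: Directory, ReadOnlyDirectory and the example name are
hypothetical, Python is used only as a modelling language, and Python
itself does not enforce capability discipline. The sketch says nothing
about problems 2 and 3: a holder of the read-only facet still cannot tell
who populated the directory.

```python
class Directory:
    """Binds names (strings) to capabilities; refuses to rebind a name."""

    def __init__(self):
        self._entries = {}

    def bind(self, name, capability):
        if name in self._entries:
            raise KeyError("name already bound: " + name)
        self._entries[name] = capability

    def lookup(self, name):
        # Unbound names read as None, which models a void key.
        return self._entries.get(name)

    def read_only(self):
        """Return a facet that can look names up but cannot bind them."""
        return ReadOnlyDirectory(self)


class ReadOnlyDirectory:
    """Lookup-only facet, as handed to code that instantiates by name."""

    def __init__(self, directory):
        self._lookup = directory.lookup

    def lookup(self, name):
        return self._lookup(name)
```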

At the moment, I think that I understand how to do everything except the
name binding part, and I'd really appreciate comments and thoughts on this.


Jonathan