kernels and Objective-C
Mon, 12 Jul 1999 11:14:01 -0400

[Rick: since this is of interest to others, I'm forwarding this specific
response to cap-talk as well as to you. ]

>> Thus, the EROS kernel does not need a per-process stack.  On a uniprocessor,
>> there is a single kernel stack that is used to hold the state of the current
>> invocation.
>It sounds like process state is some other structure, like a tree or a
>list. ?

The *only* per-process state known to the kernel is whether a process is
currently running, or if not, what in-kernel queue it is sleeping on.  While the
kernel does cache process state internally for ease of reference, the definitive
version of the process state is the state that resides in nodes.

>> If an (EROS) process suspends within the kernel, every mutation it has caused
>> is undone and the process is punted all the way out.  On resumption, the
>> resumes at user level by re-invoking the trap instruction.
>What's the cost of "every mutation it has caused is undone" ?

Essentially nothing.

First, I slightly misstated.

All kernel operations require a bounded number of objects to be in memory to
complete.  On every kernel-implemented operation, the operation proceeds in two
phases.    First, any entities (pages, nodes, processes) needed are forced into
memory and "pinned" for the duration of the operation.  If all needed objects
are in memory and successfully pinned, the operation is then "committed" and
runs to completion.  Thus, if some necessary component is missing there haven't
actually been any mutations to undo.

While at OSDI in New Orleans, Jay Lepreau and I had a discussion about the
overhead of re-entering the kernel versus keeping an in-kernel stack.  The short
version is that if you're a little careful about it it's actually better to blow
away the in-kernel stack.  The problem is locks.  The main reason to keep an
in-kernel stack is so that an in-kernel process can hang onto locks across
suspensions.  The catch is that some other process may need to steal these locks
(if it cannot, you can deadlock), at which point you end up needing to
re-acquire the locks when the process resumes.  All of the code to manage and
perform this acquisition and theft and to deal with the consequences of lost
locks costs a lot.  If the operations are small and indivisible, you're better
off just redoing the whole thing.

One exception to this statement is re-crossing the user/supervisor boundary.
There are some tricks to avoid doing this.

There is an ugly exception to the no-undo rule buried in the I/O system, where
initiating an I/O requires several data structures to be successfully allocated
because of the duplexing design (one for the overall request, one for each disk
queue).  It proves that they all happen in rapid sequence, and undoing them
appropriately isn't really so bad.  Actually, this is more of an issue for
duplexed reads than for duplexed writes, and we have a paper design to eliminate

Jonathan S. Shapiro, Ph. D.
IBM T.J. Watson Research Center
Phone: +1 914 784 7085  (Tieline: 863)
Fax: +1 914 784 7595