[Cap-Talk] immutable data
Mark S. Miller
markm@caplet.com
Thu, 12 Jul 2001 15:01:05 -0700
At 12:27 PM Thursday 7/12/01, Richard Uhtenwoldt wrote:
>I've been browsing eros-os.org et al for years but did not become
>enthusiastic about strong capabilities till I read Rees's paper. (Rees's
>paper requires a knowledge of Scheme and is at
>href="http://mumble.net/jar/pubs/secureos/".
It is truly an excellent paper (and thesis). Not only is it a great
explanation of capabilities, but it explains capabilities the "right" way.
AFAIK, there have been two threads of capability thinking in computer
science ( http://www.erights.org/history/overview.html ):
1) that rooted in operating systems and citing Butler Lampson's paper
"Protection" (or papers derived from this paper) either as the correct
definition or as the definition gone bad, or
2) that rooted in the lambda calculus.
The lambda thread yields much better formalisms as well as better ways of
thinking about capabilities. The best capability OSes, KeyKOS and EROS,
though typically described with Lampson-esque models, are actually much
better described in terms of the lambda calculus and Actors (as has been
discussed on the EROS list). Via Algol-68, the lambda calculus was a
significant influence on Norm's early thinking, but not enough on his
descriptions.
Hewitt's original Actors still seem to me like the clearest formal statement
of the capability computation model. But, as far as I remember, there are
no references between his work of the 70s and the capability OS work
happening at the same time. And I don't think Hewitt ever says
"capability". It looks to me like a completely independent discovery. From
Hewitt's writings of the time, he clearly saw the security implications of
the formalism, and may have been the first to realize that it could be a
basis for transparent secure distributed computing without a shared TCB.
Unfortunately, Hewitt is not a great communicator, and, being in the MIT AI
culture, he eventually stopped caring about security.
Scheme, which was inspired by Actors, seemed to only pick up about half of
what was valuable about Actors -- full lexical closures, first class
continuations, control flow as message passing, minimalism. Most modern
languages considered good have learned from Scheme, but have not bothered to
learn from Scheme's ancestor. The half Scheme forgot includes capability
security, transparent distribution, event-loop concurrency, everything is an
object (pervasive message dispatch). However, by sticking faithfully to the
spirit of the lambda calculus, Scheme miraculously remained almost secure
during the decades when no one cared about this.
Rees' paper is a glorious rediscovery of the capability security latent in
Scheme, and in similarly almost pure lambda languages. This time the
connection is superbly well explained. Between this explanation and several
capability secure lambda languages (W7, Joule, E), perhaps, this time, the
connection won't get lost. Now if we can only get the OS folks to think
first in these terms, rather than the broken Lampson ones.
>They are. In Scheme, a variable is always bound to a location. a
>location is always mutable and contains a value.
>
>At least two people on the e-lang list, Rees being one of them, have
>asserted that this is a design flaw of Scheme (I can point you at the
>messages if you like--it was like 6 months ago), and I agree.
>[...]
>ML lets you do that with minimal fuss. in particular, ML's variables are
>by default bound to (immutable) values; to get mutability, you bind the
>variable not to an ordinary value like an (immutable) integer but rather
>to what called a ref (although "loc" would have been a more apt term,
>IMO). refs and mutable arrays are probably the only mutable data types
>in ML.
For comparison, as a result of the discussion on the e-lang list referred to
above, a variable in E is by default immutable. A mutable variable is seen
as being, under the covers, an unnamed immutable variable designating a Slot
object. http://www.erights.org/javadoc/org/erights/e/elib/slot/Slot.html A
Slot is a first class reification of Scheme's notion of location, and
therefore, I take it, to something like ML's "ref"s.
>[...] Specifically, noting that ML is a statically typed language,
>instead of declaring
>
> i : integer
In E:
def i :integer := #..initial value expression
As you mention below, E is a dynamically typed language. If I understand
modern language design terminology correctly, E is a variety of dynamic
typing called "soft typing". The "integer" in the above expression is a
"ValueGuard" expression
http://www.erights.org/javadoc/org/erights/e/elib/slot/ValueGuard.html .
The definition is evaluated by evaluating the ValueGuard expression
("integer") to a ValueGuard, evaluating the initial value expression to an
initial value, and asking the ValueGuard to coerce the initial value. If
this succeeds, "i" is bound to the result. A soft type dynamically fails to
type check when a ValueGuard fails to coerce.
>one would declare
>
> ri : ref integer ("ri is of type ref containing an integer")
In E:
var i :integer := #..initial value expression
Now the expression after the colon is a SlotGuard expression. It is
evaluated to produce a SlotGuard
http://www.erights.org/javadoc/org/erights/e/elib/slot/SlotGuard.html . The
definition's evaluation proceeds much like before, but the SlotGuard's
"makeSlot" method is invoked rather than a "coerce" method. The resulting
first-class Slot object is then used to hold the value of the variable "i".
Type-like objects, such as that bound to "integer" in the initial scope, are
expected to respond to both SlotGuard and ValueGuard protocol, enabling them
to be used in both contexts.
There is nothing fundamental about the "var" declaration. One could
transform it away as follows:
var i :vge := ive => def ri := vge makeSlot(ive, null)
i := newValue => ri setValue(newValue)
... i ... => ... (ri getValue()) ...
... &i ... => ri
An E program in which all "var"s have been transformed away is in "final
normal form" (FNF).
>and then one would write !ri (pronounced "deref ri" I suggest) to fetch
>from the ref and ri:=53 to store into the ref. i:=53 is a type error.
>so is
>
> ri * 3
>
>--one has to write
>
> !ri * 3
In normal E, "i * 3" is fine. The FNF form of this expression would be
"ri getValue() * 3"
>The folks on the e-lang list seem to prefer dynamically typed languages.
Speaking just for myself, I chose to make E dynamically types. Although I'm
committed to that path for the next many years of my life, I frankly don't
know which I prefer. It's one of the hardest tradeoffs a language designer
faces.
>There is no reason why a dynamically typed language could not use its
>type system to distinguish between mutable and immutable data the way
>FPLs do, but I do not know of one with a currently maintained
>implementation.
If I understand you correctly, E qualifies for the reasons explained above,
and I am maintaining its implementation. Perhaps not well, but with the
recent Combex announcement (briefly, we got money), things should improve
rapidly.
> certainly no language used by more than 100 programmers
>at any give time.
Ok, not yet. Watch this space.
>And I know nothing about Smalltalk
It does not. All variables are mutable.
>Prolog might do it but Prolog is unsuitable for cap programming
>for other reasons.
Curiously, Concurrent Prolog and most of its progeny (such as ToonTalk) are
fine capability systems. The invention of Concurrent Prolog probably
represents the third independent invention of capability computation, though
this time in a horn-clause inference context. However, because Udi Shapiro
is an honorable scholar, you wouldn't know it from his papers. He became
aware of Actors before the first paper ("A Subset of Concurrent Prolog and
its Interpreter") was done, realized that Hewitt had already stated the
important properties many years earlier. So he cites it as prior work. I
know a bit more of the history only by conversations with Udi.
Cheers,
--MarkM