[Cap-Talk] immutable data

Mark S. Miller markm@caplet.com
Thu, 12 Jul 2001 15:01:05 -0700


At 12:27 PM Thursday 7/12/01, Richard Uhtenwoldt wrote:
>I've been browsing eros-os.org et al for years but did not become
>enthusiastic about strong capabilities till I read Rees's paper.  (Rees's
>paper requires a knowledge of Scheme and is at
>href="http://mumble.net/jar/pubs/secureos/".

It is truly an excellent paper (and thesis).  Not only is it a great 
explanation of capabilities, but it explains capabilities the "right" way.  
AFAIK, there have been two threads of capability thinking in computer 
science ( http://www.erights.org/history/overview.html ):

1) that rooted in operating systems and citing Butler Lampson's paper 
"Protection" (or papers derived from this paper) either as the correct 
definition or as the definition gone bad, or

2) that rooted in the lambda calculus. 

The lambda thread yields much better formalisms as well as better ways of 
thinking about capabilities.  The best capability OSes, KeyKOS and EROS, 
though typically described with Lampson-esque models, are actually much 
better described in terms of the lambda calculus and Actors (as has been 
discussed on the EROS list).  Via Algol-68, the lambda calculus was a 
significant influence on Norm's early thinking, but not enough on his 
descriptions.

Hewitt's original Actors still seem to me like the clearest formal statement 
of the capability computation model.  But, as far as I remember, there are 
no references between his work of the 70s and the capability OS work 
happening at the same time.  And I don't think Hewitt ever says 
"capability".  It looks to me like a completely independent discovery.  From 
Hewitt's writings of the time, he clearly saw the security implications of 
the formalism, and may have been the first to realize that it could be a 
basis for transparent secure distributed computing without a shared TCB.
Unfortunately, Hewitt is not a great communicator, and, being in the MIT AI 
culture, he eventually stopped caring about security.

Scheme, which was inspired by Actors, seemed to only pick up about half of 
what was valuable about Actors -- full lexical closures, first class 
continuations, control flow as message passing, minimalism.  Most modern 
languages considered good have learned from Scheme, but have not bothered to 
learn from Scheme's ancestor.  The half Scheme forgot includes capability 
security, transparent distribution, event-loop concurrency, everything is an 
object (pervasive message dispatch).  However, by sticking faithfully to the 
spirit of the lambda calculus, Scheme miraculously remained almost secure 
during the decades when no one cared about this.

Rees' paper is a glorious rediscovery of the capability security latent in 
Scheme, and in similarly almost pure lambda languages.  This time the 
connection is superbly well explained.  Between this explanation and several 
capability secure lambda languages (W7, Joule, E), perhaps, this time, the 
connection won't get lost.  Now if we can only get the OS folks to think 
first in these terms, rather than the broken Lampson ones.


>They are.  In Scheme, a variable is always bound to a location.  a
>location is always mutable and contains a value.
>
>At least two people on the e-lang list, Rees being one of them, have
>asserted that this is a design flaw of Scheme (I can point you at the
>messages if you like--it was like 6 months ago), and I agree.
>[...]
>ML lets you do that with minimal fuss.  in particular, ML's variables are
>by default bound to (immutable) values; to get mutability, you bind the
>variable not to an ordinary value like an (immutable) integer but rather
>to what called a ref (although "loc" would have been a more apt term,
>IMO).  refs and mutable arrays are probably the only mutable data types
>in ML.

For comparison, as a result of the discussion on the e-lang list referred to 
above, a variable in E is by default immutable.  A mutable variable is seen 
as being, under the covers, an unnamed immutable variable designating a Slot 
object.  http://www.erights.org/javadoc/org/erights/e/elib/slot/Slot.html  A 
Slot is a first class reification of Scheme's notion of location, and 
therefore, I take it, to something like ML's "ref"s.


>[...] Specifically, noting that ML is a statically typed language,
>instead of declaring
>
>  i : integer

In E:

    def i :integer := #..initial value expression

As you mention below, E is a dynamically typed language.  If I understand 
modern language design terminology correctly, E is a variety of dynamic 
typing called "soft typing".  The "integer" in the above expression is a 
"ValueGuard" expression 
http://www.erights.org/javadoc/org/erights/e/elib/slot/ValueGuard.html .  
The definition is evaluated by evaluating the ValueGuard expression 
("integer") to a ValueGuard, evaluating the initial value expression to an 
initial value, and asking the ValueGuard to coerce the initial value.  If 
this succeeds, "i" is bound to the result.  A soft type dynamically fails to 
type check when a ValueGuard fails to coerce.


>one would declare
>
>  ri : ref integer         ("ri is of type ref containing an integer")

In E:

    var i :integer := #..initial value expression

Now the expression after the colon is a SlotGuard expression.  It is 
evaluated to produce a SlotGuard 
http://www.erights.org/javadoc/org/erights/e/elib/slot/SlotGuard.html .  The 
definition's evaluation proceeds much like before, but the SlotGuard's 
"makeSlot" method is invoked rather than a "coerce" method.  The resulting 
first-class Slot object is then used to hold the value of the variable "i".  
Type-like objects, such as that bound to "integer" in the initial scope, are 
expected to respond to both SlotGuard and ValueGuard protocol, enabling them 
to be used in both contexts.

There is nothing fundamental about the "var" declaration.  One could 
transform it away as follows:

    var i :vge := ive      =>      def ri := vge makeSlot(ive, null)
    i := newValue       =>       ri setValue(newValue)
    ... i ...                  =>       ... (ri getValue()) ...
    ... &i ...                =>       ri

An E program in which all "var"s have been transformed away is in "final 
normal form" (FNF).

>and then one would write !ri (pronounced "deref ri" I suggest) to fetch
>from the ref and ri:=53 to store into the ref.  i:=53 is a type error.
>so is
>
>  ri * 3 
>
>--one has to write
>
>  !ri * 3

In normal E, "i * 3" is fine.  The FNF form of this expression would be
"ri getValue() * 3"


>The folks on the e-lang list seem to prefer dynamically typed languages.

Speaking just for myself, I chose to make E dynamically types.  Although I'm 
committed to that path for the next many years of my life, I frankly don't 
know which I prefer.  It's one of the hardest tradeoffs a language designer 
faces.


>There is no reason why a dynamically typed language could not use its
>type system to distinguish between mutable and immutable data the way
>FPLs do, but I do not know of one with a currently maintained
>implementation.

If I understand you correctly, E qualifies for the reasons explained above, 
and I am maintaining its implementation.  Perhaps not well, but with the 
recent Combex announcement (briefly, we got money), things should improve 
rapidly.

>  certainly no language used by more than 100 programmers
>at any give time.  

Ok, not yet.  Watch this space.


>And I know nothing about Smalltalk

It does not.  All variables are mutable.

>Prolog might do it but Prolog is unsuitable for cap programming
>for other reasons.

Curiously, Concurrent Prolog and most of its progeny (such as ToonTalk) are 
fine capability systems.  The invention of Concurrent Prolog probably 
represents the third independent invention of capability computation, though 
this time in a horn-clause inference context.  However, because Udi Shapiro 
is an honorable scholar, you wouldn't know it from his papers.  He became 
aware of Actors before the first paper ("A Subset of Concurrent Prolog and 
its Interpreter") was done, realized that Hewitt had already stated the 
important properties many years earlier.  So he cites it as prior work.  I 
know a bit more of the history only by conversations with Udi.


        Cheers,
        --MarkM