The story of E, part 2 (fwd)
Ka-Ping Yee
ping@lfw.org
Fri, 9 Oct 1998 04:33:56 -0700 (PDT)
The following is forwarded correspondence between Mark and me
for everyone's perusal and criticism/discussion...
Ping                 Talk back to the Web!
<ping@lfw.org>       http://crit.org/
---------- Forwarded message ----------
Date: Wed, 7 Oct 1998 00:47:06 -0700 (PDT)
From: Ka-Ping Yee <ping@lfw.org>
To: Mark Miller <markm@caplet.com>
Subject: The story of E, part 2 (fwd)
Hi, Mark.
A friend of mine decided to ask me, "What's E?"
And so i replied, and found myself blathering on for quite
a while. I thought i would forward some of this to you, so
that you could peruse it for correctness and suitability of
tone -- think of it as a dress rehearsal for the sections i
might eventually like to write.
Context is for someone familiar with C++ and currently
learning Python -- which might be a decent trial run since
many of your readers will probably be starting from somewhere
similar, at least in the C++ camp.
I'd be grateful if you let me know what you think. (The
ultimate review, of course, has yet to come from the recipient
of this message, when we find out how easy it was to grasp the
ideas from my endless verbiage).
Ping
...
Phew. So that was all background. The point is, in order to
achieve "real" security, they used a paradigm known as
"capability-based security" which has been around since the
70s but has largely been ignored by the computing community
in favour of the more common but fatally flawed system known
as the access control list. ACLs are used almost everywhere
(including Unix) and lead to all sorts of horrible security
holes that make it impossible or extremely difficult to
guarantee anything. Some of the people at EC were part of that
original group of capability-based security thinkers, or
closely connected with their work.
Capability-based security is actually a very simple concept.
(Much simpler and cleaner than ACLs, which involve a list of
permitted users on each object, like the file permissions in
Unix.) In a very small nutshell, capability-based security
is just pure object-oriented programming. That means, you
get to call methods on an object *only* if you have a pointer
to the object, and you can get that pointer to the object
*only* if either (a) you created it or (b) someone passed it
to you. Once you make *everything* work that way, it becomes
very easy to predict who cannot possibly do what.
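
To make that concrete, here's a minimal sketch in Python. The names
(Account, make_deposit_only, tip_jar_client) are made up for
illustration. Python doesn't actually enforce the rule, since you can
always poke around inside its objects, so treat this as a picture of
the discipline rather than real security:

```python
class Account:
    """A hypothetical object holding some state worth protecting."""

    def __init__(self, balance):
        self._balance = balance

    def deposit(self, amount):
        self._balance += amount

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    def balance(self):
        return self._balance


def make_deposit_only(account):
    """Return a function that can deposit into the account and nothing else."""
    def deposit(amount):
        account.deposit(amount)
    return deposit


def tip_jar_client(deposit):
    # This code was handed only the deposit capability. It was never
    # given a reference to the account, so it has nothing to call
    # withdraw() or balance() on.
    deposit(5)


mine = Account(100)                      # (a) created here, so held here
tip_jar_client(make_deposit_only(mine))  # (b) a narrower reference is passed on
print(mine.balance())                    # 105
```

The client can only do what its one reference allows, which is the
"who cannot possibly do what" reasoning in miniature.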
And that's all. Apply that to classic security problems,
build a distributed object system that transfers objects
over encrypted channels, and you have the core of what EC
built. EC needed a language to build this in, and they
decided that C++ was too hairy -- they wanted Java, but
Java had lots of security holes introduced by the people
at Sun, who -- although they got fairly close to a pure
object model -- still were shortsighted in a few ways. So
the EC people made some modifications to Java to fix the
holes and provide themselves some conveniences for talking
to remote objects, and called the resulting language E.
...
Okay. More about E.
By the way, this is turning into a much longer missive than i
had expected. Sorry for the deluge. But please ask about
anything if you're curious. I'm glossing over some stuff in
the attempt to keep this concise. It took me a while to really
get it all, but they're very cool ideas.
So E the language is in some ways similar to Python, in the
sense that it's object-oriented and much less annoying to write
than C++. It also has an interpreter you can talk directly to,
like Python's:
? 3 + 5
# result: 8
?
The "3 + 5" is me typing; the rest is what E would say.
It's also dynamically typed like Python, and you can create and
change functions and objects during runtime, like Python. One
main difference in the object system is that you don't have a
separate type of thing for a "function" or a "class". They're
all objects.
In E, the equivalent of a "function" is just an object with only
one method. By convention this method is called "run()", and it
is the default method that gets called when you don't specify one.
So calling such a "function" looks the way you're used to.
You could define an object called "sum" that can be run as a
function in the following way:
define sum {
    to run(a, b) {
        a + b
    }
}
Then if i ask for "sum(3, 5)" i get 8.
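
If you're coming from Python, the closest analogue is probably an
object with a __call__ method, which plays roughly the role that
run() plays in E. This is only a comparison sketch in Python, not E:

```python
class Sum:
    """An object whose only job is to be called, like E's "sum" above."""

    def __call__(self, a, b):
        return a + b


sum_ = Sum()   # trailing underscore avoids shadowing Python's built-in sum()
print(sum_(3, 5))           # 8
print(sum_.__call__(3, 5))  # 8, the same call spelled out
```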
Another difference, and one that sets E apart from C++, is that it's an
"expression language": all code consists of expressions that
evaluate to values, and the value of any sequence of expressions
is the value of the last one. That's why i didn't say "return a + b"
in the function above. There are no "statements" that just do
things without returning a value, like, for example, "if" in C++.
(In C++, something like "x = if (...) {..." doesn't make sense.
But in E, it can.)
In E, an "if" expression just evaluates to the thing in the part
that got executed. So, for example, i could write:
define abs {
    to run(x) {
        if (x > 0) { x } else { -x }
    }
}
There is also a short-hand for writing functions that may look
more familiar:
define abs(x) {
    if (x > 0) { x } else { -x }
}
Here, the parentheses after "abs" tip the interpreter off that
we want a function-like thing, so it expands the above into
exactly the previous definition for you.
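
For comparison, Python is mostly statement-based, but current versions
do have one expression form of "if" (the conditional expression) that
gives a taste of this. A small sketch, with my_abs as a made-up name:

```python
def my_abs(x):
    # The conditional expression evaluates to one branch or the other,
    # but unlike E, Python still needs an explicit "return".
    return x if x > 0 else -x


# Because it is an expression, it can sit on the right of an assignment.
sign = "positive" if my_abs(-3) > 0 else "zero"
print(my_abs(-3), my_abs(4), sign)   # 3 4 positive
```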
A second ago i said that everything evaluates to a value,
so you might wonder what a "define" evaluates to. The obvious
answer is that it gives you back the object you just defined.
But what good is this?
Well, i also said there are no special "class" things in E.
The equivalent of a "class" in E is just an object that makes
you more objects of a particular kind. (As compared to C++,
Python does somewhat blur the distinction between a "class"
and an object-generating function; E erases this distinction
completely.) With that said, let's try another example:
define VectorMaker(x, y) {
    define vector {
        to toString { "<Vector " + x + ", " + y + ">" }
        to getX { x }
        to getY { y }
        to getLength { sqrt(x*x + y*y) }
    }
}
Assuming that we have a sqrt() function available, this is a
very simple vector "class". Because the "define vector..."
evaluates to the newly-defined object, that makes VectorMaker
a function that generates and returns new vector objects.
Each time VectorMaker (actually its implied "run" method) is
called, the "define vector" code is executed to produce the
new object:
? v := VectorMaker(1.5, 4)
# result: <Vector 1.5, 4>
Notice that when printing the vector out, its "toString"
method is called to produce the displayed representation, and
in that method the values of "x" and "y" are automatically
converted to strings by the "+" operator. Our new vector
object is both assigned to "v" and returned as the result of
the assignment operation, just as in C++.
? v getX()
# result: 1.5
? v getY()
# result: 4
? v getLength()
# result: 4.27200187265877
Here finally we call some methods on our vector object. Because
Mark wanted E to be easy to type as a command language as well
as something you could edit and compile large programs in, he
chose the space as the method-call operator. Admittedly, this
is pretty unusual, but it seems to work pretty well once you
get used to it. This reduces punctuation clutter, especially
since calling methods becomes about the most common thing you do.
So now you might be wondering "where did x and y get stored?".
Unless you are quite familiar with Scheme or similar languages,
you'll wonder (as i did) where the data members are, since in
C++ you would normally declare structure members to store the
data in. Well, you can still do this in E if you want:
define AnotherVectorMaker(initx, inity) {
    define vector {
        define x := initx
        define y := inity
        to toString { "<Vector " + x + ", " + y + ">" }
        to getX { x }
        to getY { y }
        to getLength { sqrt(x*x + y*y) }
    }
}
... but you don't need to. Because E is a properly "lexically-
scoped" language, the passed-in values of "x" and "y" in the
first definition remain available to all of the things defined
inside the VectorMaker's scope. Each pair of curly-braces after
a "define" keyword creates a new level of scope, with its own
namespace for variables, and each scope can access variables in
all higher (outer) scopes, as long as those variables are not
hidden by variables of the same name in an inner scope.
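
Python's nested scopes behave similarly, so a rough analogue of the
first VectorMaker can be sketched there too. The names vector_maker,
get_x and so on are made up for this comparison, and Python needs an
explicit str() where E's "+" converts for you:

```python
import math


def vector_maker(x, y):
    """Rough Python analogue of VectorMaker: x and y live only in this scope."""

    class Vector:
        def __repr__(self):
            return "<Vector " + str(x) + ", " + str(y) + ">"

        def get_x(self):
            return x

        def get_y(self):
            return y

        def get_length(self):
            return math.sqrt(x * x + y * y)

    # Each call builds a fresh object whose methods see this call's x and y.
    return Vector()


v = vector_maker(1.5, 4)
print(v)                 # <Vector 1.5, 4>
print(v.get_x())         # 1.5
print(hasattr(v, "x"))   # False: the object has no "x" data member
```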
There's also none of that "public" or "private" stuff in E.
Since there is no way to directly get at the contents of an
object's scope, the only way to provide the values of "x" and "y"
to the outside is to provide methods that hand them out, as we
did in the vector object. In C++ or Python you could look inside
the vector with something like "v.x" or "v.y", but in E you can
only call methods and that is all.
This is actually pretty important to writing predictably secure
programs: at one glance, just by looking at the definition of a
class like the one above, you can see all the ways that any of
the object's local data could ever get modified or exposed to
the outside world. The local scope ensures that the contents
cannot be accessed from anywhere else -- you can be confident
that any secrets you want to keep inside the object are safe.
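
Here's a small Python sketch of that "one glance" property, using a
made-up counter_maker. Python is leakier than E here: a determined
caller can still dig the value out through introspection (for
instance via a function's __closure__ attribute), so this shows the
shape of the idea only:

```python
def counter_maker():
    """Hypothetical example: a counter whose state lives only in this scope."""
    count = 0

    def increment():
        nonlocal count      # rebind the variable in the enclosing scope
        count += 1
        return count

    def get_count():
        return count

    # These two functions are the complete list of ways to touch "count".
    return increment, get_count


increment, get_count = counter_maker()
increment()
increment()
print(get_count())   # 2
```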
Like Python, E has built-in list and dictionary types. You can
create lists using the square brackets, and dictionaries using
the curly-braces:
? a := [1, 5, 4]
# result: [1, 5, 4]
? a[0]
# result: 1
? a[2]
# result: 4
? a[3]
# problem: java.lang.ArrayIndexOutOfBoundsException
? n := {"a" => 440, "b" => 493, "c" => 523 }
# result: {"a" => 440, "b" => 493, "c" => 523 }
? n["a"]
# result: 440
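
For anyone following along in Python, the same session looks roughly
like this there (Python uses a colon rather than "=>" in dictionaries,
and reports the bad index as an IndexError):

```python
a = [1, 5, 4]
print(a[0], a[2])     # 1 4

try:
    a[3]              # one past the end of the list
except IndexError as problem:
    print("problem:", type(problem).__name__)   # problem: IndexError

n = {"a": 440, "b": 493, "c": 523}
print(n["a"])         # 440
```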
To round out the language definition, let me just briefly describe
the other control constructs. You already know "if"; "while"
and "for" are also pretty much the usual:
while (condition) {
    ...
    break;
    ...
    continue;
    ...
}
for item in list {
    ...
    break;
    ...
    continue;
    ...
}
Just for completeness, i'll add that these loops are built out
of more basic constructs called "loop" and "escape". "escape"
may feel a bit strange but allows you to get out of loops pretty
much any way you want. "escape foo { ... }" creates a temporary
function called "foo" in the new scope which, when called, jumps
out to just after the end of the block. So,
while (condition) {
    ...
}
is internally expanded to
escape break {
    loop {
        if (! condition) { break }
        escape continue {
            ...
        }
    }
}
"loop" by itself will just loop forever. Because the "break"
block encloses the entire loop, executing "break" will exit the
whole loop, whereas "continue" will just skip to the end of the
loop block and start over. In the end, the effect is pretty
much what you have in C++ or Python.
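
To see how "escape" could work, here's a sketch that imitates it in
Python using exceptions. This is an assumption about one way to model
the behaviour, not how E implements it, and the names escape,
while_loop, break_ and continue_ are invented for the example:

```python
class _Escape(Exception):
    """Private exception used to unwind to the matching escape block."""


def escape(body):
    """Call body(ejector); calling ejector(value) jumps out of body at once."""
    marker = _Escape()

    def ejector(value=None):
        marker.value = value
        raise marker

    try:
        return body(ejector)
    except _Escape as caught:
        if caught is not marker:
            raise          # belongs to some outer escape block
        return caught.value


def while_loop(condition, body):
    """while (condition) { body }, spelled with escape and an endless loop."""
    def outer(break_):
        while True:                      # plays the role of E's "loop"
            if not condition():
                break_()
            escape(lambda continue_: body(break_, continue_))
    escape(outer)


# Usage: collect odd numbers, skipping evens and stopping once past 7.
seen = []
state = {"i": 0}

def body(break_, continue_):
    state["i"] += 1
    if state["i"] % 2 == 0:
        continue_()        # skip the rest of this iteration
    if state["i"] > 7:
        break_()           # leave the whole loop
    seen.append(state["i"])

while_loop(lambda: state["i"] < 10, body)
print(seen)   # [1, 3, 5, 7]
```

Calling break_ unwinds past the inner block to the outer one, while
continue_ only unwinds the inner block, so the loop goes around again.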
Like Python, E also has exceptions, which you can throw with
the "throw" keyword:
try {
    ...
    throw "help!"
    ...
}
catch message {
    ...
}
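
The Python counterpart, for comparison, uses "raise" and "except".
One difference worth noting: current Python wants an exception object
rather than a bare string, so this sketch defines a made-up Help
class to carry the message:

```python
class Help(Exception):
    """Hypothetical exception type carrying the message."""


def risky():
    raise Help("help!")


caught_text = None
try:
    risky()
    print("not reached")
except Help as message:
    caught_text = str(message)   # keep the text; "message" is gone after this block

print("caught:", caught_text)    # caught: help!
```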
I think that about covers the nuts and bolts of how the language
works. For the really neat distributed-object and security stuff,
i guess we'll have to wait until i feel ready to write another
long message. :)
Ping