Announcing E 0.8.4: The Birthday Release

Mark S. Miller markm@caplet.com
Thu, 27 May 1999 13:18:55 -0700


Today is my birthday.  I am now the ultimate answer to the ultimate question 
of Life, the Universe, and Everything.  Getting back more copies of the 
E Survey would make this a very happy birthday. ;)

-------------------------------------------------------------------------


		The definition of Equality is a sign of E Quality


Up till this release, the implementation of E's "==" (sameness) was an 
incoherent placeholder awaiting an implementation of the real semantics.  
Similarly, E's hashtables (EMap and its subtypes ConstMap, FlexMap, and 
ReadOnlyMap) were using Java's (equally incoherent) .equals() and 
.hashCode() while awaiting the real thing.  An 0.8.3 example:

	? [2] == [2]
	# value: false

It was not acceptable for something so basic to remain broken.

The real "==" has now mostly arrived.  E's "==" (like Java's, PPSmalltalk's, 
and Squeak's "==", Python's "is", and Lisp's "EQ"), is a primitive that 
can't be overridden by user code.  It does not work by sending any messages. 
When it says "x == y", this really means x and y are indistinguishable in 
their effects on future computation.  The Grant Matcher Puzzle 
http://www.erights.org/elib/capability/grant-matcher/index.html explains why 
such a primitive is necessary to combine trust from separately 
partially-trusted sources.

The semantics of the "==" operator in these other languages distinguishes 
between two kinds of objects: scalars and objects with creation-identity.  
Objects with creation identity are == when they have the same 
creation-identity.  The act of object allocation/creation/initialization in 
these languages (eg, Java's "new") endows the new object with a unique 
creation identity.  Scalars, on the other hand, are compared by value.

E's treatment of "==" is very similar, and has all the above formal 
properties (computational equivalence & trust combination) but E broadens 
the compare-by-value category from scalars PassByCopy objects, which (like 
scalars) are immutable and transparent.  It's just too weird for 3 == 3 to 
always be true, but for two points or complex numbers to compare based on 
creation-identity.

The name "PassByCopy" gives away the bigger motivation.  Only by broadening 
the category of objects that have no creation identity, and are defined only 
by their immutable values, can we broaden the category of objects that
can be cheaply passed-by-copy over the network, in an acceptably transparent 
fashion.  Besides the scalars, experience shows that the passed-by-copy 
category needs to include both primitive and user defined aggregate data 
types.  As of this 0.8.4 release, the PassByCopy category does include E's 
scalars and primitive aggregate types (ConstList and ConstMap), but not yet 
objects defined in the E language.

	? [2] == [2]
	# value: true

	? ["foo" => 3, "bar" => 4] == ["foo" => 3, "bar" => 4]
	# value: true

The lookup of keys in E's hashtables (EMap etc..) is now correctly based on 
"==" as well.  If x == y, then something stored under x will be retrieved by y:

	? def x := [[2] => "foo"]
	# [[2] => foo]

	? x[[2]]
	# value: foo

-------------------------------------------------------------------------


			Shaving the Barber with Occam's Razor

Of course, the definition of "==" is recursive in that way loved by computer 
scientists.  For example, two ConstLists are == when they are the same size 
and their corresponding elements are ==.

	? x == [[2] => "foo"]
	# value: true

Actually, if you examine the above statement, you'll see it's actually the 
recursion loved by all computer scientists that love that sort of thing; ie, 
it does not require the recursion to terminate.  Given the distributed 
nature of E, and the delayed fashion in which a promise for a value 
eventually resolves to a value, it would not be possible for E to prohibit 
cyclic data.  Instead (inspired by Colmerauer's Prolog III), all Kernel-E 
algorithms are required to work, finitely and correctly, with cyclic data.  
This includes == and hash table lookup:

	? define tight := [1, tight, "x"]
	# value: [1, ***CYCLE***, x]

	? define loose := [1, [1, loose, "x"], "x"]
	# value: [1, ***CYCLE***, x]

	? tight == loose
	# value: true

	? def map := [tight => "foo"]
	# value: [[1, ***CYCLE***, x] => foo]

	? map[loose]
	# value: foo

Internally, tight and loose have very different representations.  However, 
when both cycles are unwound, they represent the same infinite tree.  One 
could say that tight's representation of this tree is more tightly wound 
than loose's representation.  However, this difference is only in the 
implementation, not in the semantics.  The value of tight and loose is only 
the infinite tree they represent.  If these trees are the same, then tight 
and loose are ==.

Notice that loose prints out according to the tightest winding of the tree 
it represents, not according to the cycle by which it represents this tree.  
Only the tightest winding is finite and canonical.

All this cycle handling may seem complicated when you try to think about it, 
because it is unfamiliar (except PrologIII programmers).  This would 
seem to contradict E's goal of easy & quick familiarity to C-tradition 
programmers.  However, C-tradition programmers would expect cyclic 
structures to cause stack overflows.  Rather than blow up, we just remove 
restrictions and do the right thing, with no dangerous surprises.  It is 
safe for an E programmer not to think about cycles if they wish.


-------------------------------------------------------------------------



				Stable vs Monotonic

Another property of "==" in these other languages: it is total and stable.  
By "total", I mean that for any two variables x and y, if "x == y" compiles 
then it runs and successfully produces a value, rather than throwing an 
exception.  By "stable", I mean that x == y always produces the same answer 
for the same x and y, no matter when and how often we ask.

Unfortunately, promises make it impossible for us to be both total and 
stable.  Consider:

	? define [p1,r1] := PromiseMaker()
	# value: [<Deferred ref>, <Unresolved Resolver>]

	? define [p2,r2] := PromiseMaker()
	# value: [<Deferred ref>, <Unresolved Resolver>]

	? r1 resolve(3)
	? p1
	# value: 3

	? p2
	# value: <Deferred ref>

	? r2 resolve(3)
	? p2
	# value: 3

	? p1 == p2
	# value: true

Once both promises are resolved, they happen to be resolved to == values.  
Since a resolved promise is indistinguishable from the value it resolved to, 
once resolved these promises are themselves ==.  However, what if we had 
asked before they had gotten resolved?  Without being able to predict the 
future, there's no way to give the same answer we will give once they are 
resolved.

If == is asked before there is enough information to give a stable answer, 
rather than give an unstable answer, == throws and exception.  Since it does 
not always give an answer, it is "partial" rather than total.  Since the 
cases for which it does give an answer only grows over time, and since the 
answers it does give are stable, it is "monotonic".  (This resembles 
read-only-unification, or ask-unification, in the Concurrent Prolog family 
of languages.)

	? define [p3,r3] := PromiseMaker()
	# value: [<Deferred ref>, <Unresolved Resolver>]
	
	? def loose := [1, [1, loose, p3], "x"]
	# value: [1, [1, [1, ***CYCLE***, x], <Deferred ref>], x]
	
	? tight == loose
	# problem: <NotSettledException: both must be LOCAL>
	
	? loose == loose[1]
	# problem: <NotSettledException: both must be LOCAL>
	
	? E isSettled(loose)
	# value: false
	
	? r3 resolve("x")
	? loose
	# value: [1, ***CYCLE***, x]
	
	? E isSettled(loose)
	# value: true

	? tight == loose
	# value: true
	
	? loose == loose[1]
	# value: true

As you can see, we've also introduced a new predicate, "E isSettled(foo)", 
that says whether a value is settled.  An object is settled if it's local 
and either 1) not PassByCopy, or 2) PassByCopy, and all of its contents are 
settled.  Once again, this recursive definition doesn't assume the tree is 
finite.  Only settled objects can be used as keys in EMaps.  I suspect that 
after talking to Arturo, we'll also want to say that only settled data can 
be made persistent.


-------------------------------------------------------------------------


			Order-Preserving Tables

Earlier I sent out a first message about the uses of determinism in 
computation.  I haven't yet gotten around to documenting the most important 
use for E: Contract Verification.  An individual E Vat, so it would seem, 
has only the non-determinism of the arrival order of messages sent by the 
external world (other Vats and I/O devices), as its non-deterministic 
inputs.  In the absence of spontaneous Errors by the implementation (like 
java.lang.OutOfMemoryError -- thanks Dan for noticing this issue) it seemed 
likely that one could deterministically replay a Vat's computation from its 
initial state and these inputs.

This allows a partially-trusted third party intermediary to run the code of 
a contract, and provide to external auditing agents a description of the 
inputs and outputs.  The auditing agent could then verify that this 
contract, given the reported inputs, produce the reported outputs.  Such 
auditors are then left with the difficult (but not necessarily impossible!) 
task of comparing notes with the participants and verifying that the outputs 
and inputs they report corresponds to the inputs and outputs reported by 
the intermediary.

Notice that, because of arrival-order non-determinism, the auditor needs 
more information from the intermediary than the other participants could 
provide, since the participant cannot predict how their messages are 
interleaved. This is spontaneous non-determinism generated at the 
intermediary.  Every such spontaneously generated non-deterministic input 
to computation must be identified and logged to enable deterministic replay.

A long time ago, I had noticed the threat to determinism posed by 
Object.hashCode() or any equivalent, and these are not accessible in E.  As 
a result, E's hashtables are necessarily primitive -- one cannot implement a 
new hashtable in E. This seems a small price.  However, not until after the 
0.8.3 release did I notice that hashtable enumeration order is just as bad.  
Besides preventing replay, it can also serve as a covert channel *in*, by 
which unconfined Mallet can send bits to the confined Bob.  This was the 
direction in which E's constrained non-determinism was supposed to prevent 
covert channels. (The other direction is *hard*.)

There's no good answer to this problem, but the least bad answer seems to be 
to make a hashtable's enumeration order be a deterministic consequence of 
the sequence of operations that caused the hashtable to contain what it 
contains.  
http://www.erights.org/javadoc/org/erights/e/elib/tables/EMap.html and 
subclasses has the full spec, but the most important properties are

	1) The hashtable resulting from a hashtable expression has the same
 	order as stated in the expression.

	2) In the absence of other operations, insertion order is preserved.

With these two properties, hashtable order becomes useful for other things, 
and not just something we have to preserve for the sake of determinism.  
Unfortunately, it means that two table with different orders, but otherwise 
the same, are not ==.

	? define a := ["foo" => 3, "bar" => 4]
	# value: [foo => 3, bar => 4]
	
	? define b := ["bar" => 4, "foo" => 3]
	# value: [bar => 4, foo => 3]
	
	? a == b
	# value: false

Btw, notice that tables print in order.


-------------------------------------------------------------------------



			Proxy Comm System stuff


* replaceObject() in ProxyOutputStream has been substantially simplified.

* RedirectorResolver had a bunch of bugs that were preventing whenBroken() 
from working in some circumstances.

* In anticipation of Una, PassByCopy is now a subtype of PassByPresence.  
PassByCopy is the distinction used by ==.  PassByPresence is the distinction 
used by ProxyOutputStream.replaceObject() to decide whether to serialize the 
object itself or not.  An Unum would be PassByPresence but not PassByCopy.


-------------------------------------------------------------------------



			Other Random Stuff


* Because of a bug in how I was walking the Java inheritance graph, E's 
Miranda methods were overriding methods declared in interfaces, rather than 
the other way around.  (Did you know that .getSuperclass() on a Class object 
representing an interface returns null, even though Java interfaces inherit 
from 'class Object'?)


* Inspired yet again by something Ping said about Python, I painlessly made a 
fully upwards compatible change to the grammar: you may now optionally break 
a line after (to the right of) any infix operator.

	? 2 +
	> 3
	# value: 5


* I made a non-upwards-compatible change in the expansion that shouldn't yet 
effect anyone but me: I changed the expansion of the URI and quasi-literal 
syntaxes to use mangling instead of sub-scopes.  Whereas 'foo:/x' used to 
expand to uriGetters["foo"]["/x"], in now expands to foo__uriGetters["/x"].  
Similarly, e`2+3` used to expand to 

	quasiParsers["e"] valueMaker("2+3") substitute([])

but now expands to

	e__quasiParser valueMaker("2+3") substitute([])

This change makes it much easier to define a new uri-getter or quasi-parser 
in a local scope.


* For some unfathomable reason, Javasoft defined Swing not to default to the 
platform look & feel, but instead provides boilerplate example code for how 
to initialize your application to do so.  I can't imagine why this might not 
be the right default, so the E interpreter now does this silently when it 
starts up.  As a result, MarcS' echat now runs with the platform look & feel.


* I've started to write a compiler in E to compile E to Java.  Along the 
way, I made all the E parsenodes participate in the Visitor design pattern, 
so that it would be easy to walk E parse trees.  It turns out there are 29 
fundamental types of node to walk, as documented in 
http://www.erights.org/javadoc/org/erights/e/elang/evm/ETreeVisitor.html 
29 is large compared to scheme, but it's still pleasingly small and 
scheme-like.  Any suggestions for reducing some to the others would be most 
welcome.


* Next are Updoc in E, an E-to-Java compiler in E, more bug fixes, and 
documentation.



-------------------------------------------------------------------------


	Is a semantics of == among capabilities a theory of equal rights?
	--MarkM