The story of E, part 2 (fwd)

Mark S. Miller markm@caplet.com
Mon, 12 Oct 1998 12:53:12 -0700


At 03:27 AM 10/11/98 -0700, Ka-Ping Yee wrote:
>Hi again.
>
>I apologize in advance for sounding whiny and nitpicky about
>small details.  All these are probably much less important than
>the security and implementation issues surrounding the language,
>but i hope that my suggestions will help to produce something
>likable and popular.

Harumph!  The difference between a good idea for a language and a good
language is attention to nitpicky details.  I do not accept your apology!


>On Fri, 9 Oct 1998, Mark S. Miller wrote:
>> 
>> Thanks.  Except for the color, these will be fixed with the next website
>> update.  Care to suggest an alternate color?
>
>I think you're fine without any colour, actually, but if you
>want one, a subdued orange or green or burgundy might do.

Cool.  How do I say that in html?  In any case, check out the newest
(0.6.2) look of the website.  Only the controversial cosmetics of the first
page have changed.

 
>> Wrt casting in general, I'm inclined against the "toFoo" approach because
>> it requires a named method per type you can cast to.
>
>Oh, on the general issue, i'd agree with you on that.
>
>However, it does make me wonder how you implement casting behaviour...
>if you associate it with the target type, how does the target type get
>the information it wants out of the object without a prearranged
>interface?

There cannot be any generic answer.  Any coercions must be specific to the
semantics of the types converted from and to, so it's ok if a coercion's
implementation is specific to the protocol of the type being converted from.

Btw, if we wanted to put the shoe on the other foot, we could define a
generic "as(type)" message, where a Java class object is acceptable as a
type.  In this case, we write it as:

	(x*x + y*y) as(double) sqrt

I'm currently not inclined to do this.

BtwBtw, a related issue, implicit coercions, is the most terrifying in
programming language design.  Languages which are otherwise wonderful, eg
Algol68, have lost it on this issue.  Several languages which are terrible,
eg C++ and PL/1, are worst on this issue.  Java doesn't fall into this
trap, but Java's ugliest part is the nearby type-based overload resolution
issue.  E hopes to mostly avoid this hairball.


>(I only picked "to"
>due to "toString".  If i had the choice i would prefer "asString" too...)

Likewise.


>> Jeez I wish they hadn't defined "float" to mean "single precision IEEE
>> floating point"!  I know of no remaining short word that simply means
>> "floating point" without specifying a precision.
>
>Well... i actually used "toFloat" in my original answer because
>"double" sounds even more precision-y than "float".  If you don't
>want to specify a precision, i think "float" sounds more general,
>even though it's used by C.  <shrug> Both are used by C, anyway...

Uh uh, "float" is firmly established by Java (and corroborated by modern
Cs) as meaning *single precision* IEEE floating point number.  I can't
redefine this for the same reason I can't redefine a byte to be 9 bits.
C++ caught on because C programmers thought "I already know most of this
language", and Java caught on because C++ programmers thought the same
thing.  In areas where E doesn't need to teach new lessons I'd like to be
not gratuitously incompatible with the C-like tradition, and especially
with Java.

But I do hate their choice of terms.


>Where would the type names actually appear in E programs?

I need to figure out what the protocol of Type objects in E is, but I know
this much:  All Java Class objects will be made to act like E Type objects,
and all E Type objects will act like SlotMakers.  (ie, the protocol defined
for SlotMaker is a subset of the protocol defined for Type, ie, Type is a
subtype of SlotMaker.)  A SlotMaker is an object that responds to a one-arg
"makeSlot" message by returning an object presumed to act as a Slot.  A
Slot is an object that responds to a zero-argument "getValue" message.  By
definition, the value this returns is the current value of the Slot.  (A
slot that also responds to a one-arg "setValue" supports assignment.)

As documented in http://erights.org/doc/elang/elangmanual.pdf and to be
implemented in 0.7 (the current version out there is 0.6.2), wherever you
can have a defining occurrence of a variable name, you can instead have

	Identifier : expr

in which case, on matching this pattern against a specimen, expr is first
evaluated to a value presumed to act as a SlotMaker.  This value is then
asked to "makeSlot(specimen)", and the resulting object is bound to
Identifier as the Slot holding Identifier's value in the resulting scope.
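
A minimal sketch of the Slot / SlotMaker protocol described above, written
in Python rather than E.  Only the message names "makeSlot", "getValue" and
"setValue" come from the text; the class names, the "bind" helper, and the
type check inside makeSlot are hypothetical illustrations, not E's
implementation.

```python
class Slot:
    """Holds the current value of one variable."""

    def __init__(self, value):
        self._value = value

    def getValue(self):
        # Zero-argument message: return the Slot's current value.
        return self._value

    def setValue(self, value):
        # One-argument message: a Slot that has this supports assignment.
        self._value = value


class TypeSlotMaker:
    """Hypothetical SlotMaker that accepts only specimens of one Python type."""

    def __init__(self, py_type):
        self._py_type = py_type

    def makeSlot(self, specimen):
        # One-argument message: return an object that acts as a Slot.
        if not isinstance(specimen, self._py_type):
            raise TypeError("specimen %r is not a %s"
                            % (specimen, self._py_type.__name__))
        return Slot(specimen)


def bind(scope, identifier, slot_maker, specimen):
    """Model of matching the pattern `identifier : expr` against a specimen.

    `slot_maker` stands for the value that `expr` evaluated to.
    """
    scope[identifier] = slot_maker.makeSlot(specimen)
    return scope


scope = {}
bind(scope, "x", TypeSlotMaker(int), 3)
scope["x"].setValue(4)
print(scope["x"].getValue())
```

In this sketch the scope maps each name to its Slot, not to the value, which
is the point of the paragraph above.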

Btw, this is the only form of defining occurrence of a variable in the
kernel language, which is both simpler and more powerful than the kernel I
presented to you and your friends at the Foresight offices.  This
simplification is directly a result of fretting about why you folks were
uncomfortable with some parts of what I presented that day.  Thanks!


>(Leading to, "Do i really hafta type out 'Character'?  Could
>we just make the type name [and coercing function] 'char'?")

Yup.  Another unpleasant Java issue.  Java has way too many ways to talk
about the scalar types.  Just for characters, there's the type and class
"Character" vs the type "char" and the class "Character.TYPE".  I think my
only route to sanity is to insist that, from E, there is only one character
type, which I'll call "char".  Altogether

	char, Character		collapse to	char

	boolean, Boolean		collapse to	boolean

	byte, short, int, long,
	Byte, Short, Integer,
	Long, BigInteger		collapse to	integer

	float, double,
	Float, Double			collapse to	double

	BigDecimal			banished


As with the horrible issue of "float" being historically used up by the
wrong meaning, we have the corresponding issue with "integer".  However,
the history of "floating point number" as a concept with a term is only
computer-science-long, so one can concede that computer science has ruined
it.  "integer" though has a long and proud history before precision limited
registers, or worse, integers in a modular field.  On this, I'll go with
Pythagoras and Peano over Kernighan, Ritchie, and Gosling.

To successfully shield the E programmer from the above distinctions, I'll
be constantly at war with the Java libraries.  It'll be a lot of work, but
on this one I'm hopeful I can win.


>Other coercion issues:
>
>    ? char("abcd")           # should this be allowed?
>    # result: 'a'
>

Bletch!!  I can't imagine why anyone might think this is a good idea.

Btw, you are again saying "# result:" rather than "# value:".  Which is
better?  (This would still be an easy change.)


>Possibly convenient, though i don't think it would be necessary
>as long as you can "abcd"[0].  I hope you consider strings to be
>sequences of characters with all the usual sequence methods:
>this is extremely convenient in Python (and *not* doing this is
>an almost unbelievable oversight in Perl).

I was about to say "yes, of course", but I just checked
org.erights.e.meta.java.lang.StringSugar, and I haven't even implemented
"get" (which is what square-bracket indexing expands to).  Strings *will*
be fixed to act in all ways like Tuples of chars.


>    ? char(97)               # should this be allowed?
>    # result: 'a'
>
>    ? int('a')               # should this be allowed?
>    # result: 97
>
>This seems nice.  I like it, but if you think it is too easy to
>confuse int('4') with int("4") i could relent.  I assume
>string('a') yields "a".
>
>    ? 'a' ord
>    # result: 97
>
>Another way to get the ordinal value of a character?

Good suggestions.  I don't yet have a stance on this, except that I don't
like int("4") or string('a').  


>Can you easily generate number and character ranges?
>Python has the built-in "range" function, which can only
>generate ranges of integers (allowing a specified starting
>and ending point and positive or negative step).  Perl has
>the ".."  operator, such that 1..4 yields (1,2,3,4) and
>'s'..'u' yields ('s', 't', 'u').

I've got the same ".." operator, but I hadn't thought of extending it to
characters.  A good suggestion.  Btw, whereas E's "x..y" expands to "x
thru(y)" and means "from x inclusive thru y inclusive", E's "x..!y" expands
to "x till(y)" and means the much more useful "from x inclusive till y
exclusive".  Closed-open intervals are almost always best.
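
A small Python sketch of the two interval forms, for comparison.  The
function names mirror the "thru" and "till" messages named above; "char_thru"
is a hypothetical helper showing one way the character extension could
behave, not a statement of what E does.

```python
def thru(x, y):
    """From x inclusive thru y inclusive, like x..y above."""
    return list(range(x, y + 1))


def till(x, y):
    """From x inclusive till y exclusive, like x..!y above."""
    return list(range(x, y))


def char_thru(a, b):
    """Hypothetical character range: inclusive on both ends."""
    return [chr(code) for code in thru(ord(a), ord(b))]


print(thru(1, 4))
print(till(1, 4))
print(char_thru('s', 'u'))
```

One reason closed-open intervals are convenient: till(x, y) has exactly
y - x elements, and till(0, 3) + till(3, 6) covers the same elements as
till(0, 6) with no overlap and no gap.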


>Is it true that "abc" + 'd' == "abcd" and 'b' + 'c' == "bc"?
>What about "-" * 8 == "--------" (useful formatting idiom in Python)?

	Yes,	"abc" + 'd' == "abcd"
	No,	'b' + 'c' throws an exception
	but	"" + 'b' + 'c' == "bc"
	Yes,	"-" * 8 == "--------"


>Actually, in general in Python, any sequence works: [3,5,6] * 3 ==
>[3,5,6,3,5,6,3,5,6].

	Yes as well.


>... If you
>want to reserve '*' so that vectors can [3,5,6] * 3 == [9,15,18], ...  In
>Python ... multiplies only homogeneous vectors ...

E's Tuples and Vectors are a storage & collection abstraction, not a linear
algebra abstraction.  An E programmer could define a "matrix" such that

	matrix([3,5,6]) * 3 == matrix([9,15,18])

and even

	matrix([3,5]) * matrix([4,7]) == 3*4 + 5*7 == 47

I'm not suggesting this, but it isn't up to me.  APLers, go wild.



>What are the "standard tuple" and "standard mapping" interfaces?
>I'm guessing get(key), put(key), del(key), len(), add(tuple/map)?
>Anything else?

Close.  

For collections in general, how about the methods of
http://erights.org/doc/javadoc/org.erights.e.elib.tables.Table.html but
leaving out isIdentity, keyType, and valueType?  

For mutable collections, we add "put(key, value)" (as in
http://erights.org/doc/javadoc/org.erights.e.elib.tables.TableEditor.html#put
and
http://erights.org/doc/javadoc/org.erights.e.meta.java.util.VectorSugar.html#put)
and "clear" (as in
http://erights.org/doc/javadoc/org.erights.e.elib.tables.TableEditor.html#clear
and OOPS!, neither Vector nor VectorSugar has a "clear" method.  This
will be fixed.).

Sequences (I do like that word) in addition have both slice(start) and
slice(start, bound), and (for Java folk) define "length" as a synonym for
"size".

"add" (or "+") is only used to combine Tuples, whereas "|" is only used to
combine Mappings.  Why?  As discussed earlier, the Tuple

	["foo", "bar"]

is like the Mapping

	[0 => "foo", 1 => "bar"]

but notice that

	["baz"] + ["foo", "bar"]

yields

	["baz", "foo", "bar"]

which is like the Mapping

	[0 => "baz", 1 => "foo", 2 => "bar"]

whereas

	["baz"] asMapping | ["foo", "bar"] asMapping

yields

	[0 => "baz", 1 => "bar"]

since the binding of zero to "foo" is occluded.
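
The same contrast can be sketched in Python, with lists standing in for
Tuples and dicts for Mappings.  The helper names "as_mapping" and
"mapping_or" are hypothetical, and the rule that the left operand wins on a
key collision is taken from the example above, where the binding of zero to
"foo" is the one that is hidden.

```python
def as_mapping(seq):
    """View a sequence as a mapping from index to element."""
    return dict(enumerate(seq))


def mapping_or(left, right):
    """Combine two mappings; on a key collision the left binding wins."""
    merged = dict(right)
    merged.update(left)
    return merged


# Concatenation shifts the indices of the right-hand elements ...
concatenated = ["baz"] + ["foo", "bar"]

# ... whereas combining the mappings keeps every key where it was.
combined = mapping_or(as_mapping(["baz"]), as_mapping(["foo", "bar"]))

print(as_mapping(concatenated))
print(combined)
```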


>Any way to get an escaped copy of a string?  In Python there
>is repr(), which produces:
>
>    >>> s = "foo\tbar\n"
>
>    >>> s
>    'foo\011bar\012'
>
>    >>> print s
>    foo     bar
>
>    >>> repr(s)
>    "'foo\\011bar\\012'"
>
>Notice that repr(), not str(), is used for printing out results
>in the interactive loop.  This means that you can directly cut
>and paste simple objects back into the editor, which can be very
>useful (numbers and strings, as well as lists, tuples, and
>dictionaries to any depth as long as they contain only numbers
>and strings).  In Python all objects support both str() and repr().

I like this a lot.  Smalltalk also had something like this, and I've also
become increasingly irritated using the Java toString behavior as the print
part of my read-eval-print loop.  I agree in principle, but this will be a
lot of work.


>This is really good for strings, and provides a sometimes useful
>distinction between a shorter display of an object and a complete
>representation of its state allowing reconstruction.  (You can
>define your own __str__ and __repr__ methods for Python to call.)
>It can sometimes be a bother when it is impossible or undesirable
>to display complete state, in which case you end up defining
>__str__ and __repr__ to just do the same thing.

I don't understand this point??


>What is the default string conversion when toString is not given?

If the object is defined in E, then it's `<$behaviorName>`, where
behaviorName is the name between the define and the methods:

	? define PointMaker(x, y) {
	>     define point {
	>         to getX {x}
	>         to getY {y}
	>     }
	> }
	# value: <PointMaker>
	
	? define pt := PointMaker(3, 5)
	# value: <point>

Given that we switch from using toString to asRepr or something, and once
upgrade-for-prototyping is working, an interactive session could proceed as
follows:

(In Elmer, we could correct our mistake by editing in place, and then
hitting return again after the final close curly.)


	? define PointMaker(x, y) {
	>     define point {
	>         to asRepr {`PointMaker($x, $y)`}
	>         to getX {x}
	>         to getY {y}
	>     }
	> }
	# value: <PointMaker>

	? pt
	# value: PointMaker(3, 5)

E's syntax discourages anonymous closures so we can enhance the behavior of
already instantiated objects, as above.


>Oh, and while we're on the subject of basic operations...
>
>    ? 5 % 3
>    # result: 2
>
>    ? -1 % 3
>    # result: 2        # what i want (Python & Perl & Tcl agree)
>
>    # result: -1       # arrgghh!  what is WRONG with these C people?!?!
>
>Please please please!  (Possibly related: what is the difference
>between % and %%?)

This is exactly the difference.  C actually doesn't specify which of the
answers you get, but Java specifies -1.  On the principle of avoiding
unnecessary surprises, E's % gives -1 (and so corresponds to E's
truncDivide), but E's wonderful %% gives 2 (and so corresponds to E's
wonderful _/ or floorDivide).
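
To make the pairing concrete, here is a Python sketch of the two
division/remainder pairs.  The function names are hypothetical stand-ins for
truncDivide with %, and floorDivide with %%; the code only illustrates the
arithmetic and says nothing about how E implements it.

```python
def trunc_divide(a, b):
    """Integer quotient rounded toward zero (the Java convention)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def trunc_rem(a, b):
    """Remainder paired with trunc_divide; takes the sign of a."""
    return a - b * trunc_divide(a, b)


def floor_divide(a, b):
    """Integer quotient rounded toward negative infinity."""
    return a // b


def floor_rem(a, b):
    """Remainder paired with floor_divide; takes the sign of b."""
    return a - b * floor_divide(a, b)


print(trunc_rem(5, 3), floor_rem(5, 3))
print(trunc_rem(-1, 3), floor_rem(-1, 3))
```

In both pairs the identity a == b * quotient + remainder holds; the two
conventions differ only when the operands have different signs.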


>And what is _% ?  I saw it in http://erights.org/doc/e/satan/index.html
>under the "Begin" section.

Historical.  Will be fixed in the 0.6.3 version of the website.  We thought
that _% would be suggestive of its pairing with _/, but later decided
pairing %% with ** for modular exponentiation was better.  Originally, _/
was //, but this seemed too much like begin-comment to the target audience.
E is stuck with "//" as a synonym for "#".


>Is there a really good motivation for providing the aliases
>"-" -> "subtract" or "negate", "!" -> "not", "~" -> "complement",
>"[]" -> "get" etc. as seen in the "call" and "method" sections of 
>http://erights.org/doc/e/e-grammar.html?  I am afraid this will
>just lead to confusion, especially subtractions which could look
>like "object<--(arg)", and only enlarges the grammar.

I don't get it.  What's the alternative?


>> In the kernel language, yes, only the curlies of the object expression
>> enclose a behavior definition.  In the full syntax, braces either enclose
>> a scoped expression, or they enclose a behavior definition, to be learned
>> on a case by case basis.  Sorry, but nothing else seemed simpler.  (Feel
>> free to take this as a challenge.)
>
>Hmmm... all the ones i saw seemed to create new scopes.  What
>are the exceptions?

They all create new scopes.  The difference is in what they enclose.  The
object expression encloses a set of methods followed by a sequence of
matchers.  The switch expression encloses a sequence of matchers.  Once we
switch to the new Mapping syntax, all current remaining forms enclose an
expression.  However, the reserved macro syntax provides for the creation
of new forms of either type.


>I notice that "if" and "for" etc. all seem to *require* braces
>after them, according to the grammar page (you can't just have a
>single expr on its own like in C).  I think this is good.  This
>does imply, though, that the parentheses around the condition
>are unnecessary -- we could get rid of the extra punctuation in
>"if", "for", "while", "switch" to match "escape", "match", etc.

When the thing between the keyword and the open-curly is an expression,
it's enclosed in parens.  When it's a pattern, it isn't.  This rule is
enforced on macros, so that the E programmer (and tools) can look at code
using an unknown macro and still do scope analysis!  (Sort of a principle
of least authority for syntactic extensions.)  By making the built-in syntax
work according to the same rules, we set up the right expectations.


>(Though, by the way, i don't understand how the "verb" in "for"
>is supposed to work.)

The grammar document is out-of-date.  The verb is gone.


>I also don't understand the expansions for "||" and "&&" on the
>grammar page.  I would have expected something like
>
>    left || right
>
>->
>
>    define __temp__ := left
>    if (__temp__) { __temp__ } else { right }
>
>and
>
>    left && right
>
>->
>
>    define __temp__ := left
>    if (__temp__) { right } else { __temp__ }
>

Remember that if

	expr1 =~ [x, y] && expr2 =~ [z]

succeeds, the resulting scope needs to have x, y, and z bound
appropriately.  Your expansion doesn't do that.
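
A rough Python model of the requirement, assuming a pattern match is
represented as a function that returns a dict of bindings on success and
None on failure.  The names "match_list" and "match_and" are hypothetical
and this is not the expansion E actually uses; it only shows that "&&" has
to carry bindings from both sides, not just a truth value.

```python
def match_list(names):
    """Build a matcher for a list pattern such as [x, y]."""
    def matcher(specimen):
        if not isinstance(specimen, (list, tuple)):
            return None
        if len(specimen) != len(names):
            return None
        return dict(zip(names, specimen))
    return matcher


def match_and(left, right):
    """Short-circuit conjunction of two delayed matches.

    `left` and `right` are zero-argument functions returning bindings or
    None.  `right` is evaluated only if `left` succeeds, and on success
    the bindings from both sides are merged into one scope.
    """
    left_bindings = left()
    if left_bindings is None:
        return None
    right_bindings = right()
    if right_bindings is None:
        return None
    return {**left_bindings, **right_bindings}


match_xy = match_list(["x", "y"])
match_z = match_list(["z"])

# Models: expr1 =~ [x, y] && expr2 =~ [z]
scope = match_and(lambda: match_xy([1, 2]), lambda: match_z([3]))
print(scope)
```

An expansion that only passes along the value of the left or right operand,
as in the quoted proposal, has nowhere to put x, y, and z.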


	More messages, asynchronously, later,
	--MarkM