Re: The story of E, part 2 (fwd) Mark S. Miller (markm@caplet.com)
Mon, 12 Oct 1998 12:53:12 -0700

At 03:27 AM 10/11/98 -0700, Ka-Ping Yee wrote:
>Hi again.
>
>I apologize in advance for sounding whiny and nitpicky about
>small details. All these are probably much less important than
>the security and implementation issues surrounding the language,
>but i hope that my suggestions will help to produce something
>likable and popular.

Harumph! The difference between a good idea for a language and a good language is attention to nitpicky details. I do not accept your apology!

>On Fri, 9 Oct 1998, Mark S. Miller wrote:
>>
>> Thanks. Except for the color, these will be fixed with the next website
>> update. Care to suggest an alternate color?
>
>I think you're fine without any colour, actually, but if you
>want one, a subdued orange or green or burgundy might do.

Cool. How do I say that in html? In any case, check out the newest (0.6.2) look of the website. Only the controversial cosmetics of the first page have changed.

>> Wrt casting in general, I'm inclined against the "toFoo" approach because
>> it requires a named method per type you can cast to.
>
>Oh, on the general issue, i'd agree with you on that.
>
>However, it does make me wonder how you implement casting behaviour...
>if you associate it with the target type, how does the target type get
>the information it wants out of the object without a prearranged
>interface?

There cannot be any generic answer. Any coercions must be specific to the semantics of the types converted from and to, so it's ok if its implementation is specific to the protocol of the type being converted from.

Btw, if we wanted to put the shoe on the other foot, we could define a generic "as(type)" message, where a Java class object is acceptable as a type. In this case, we write it as:

(x*x + y*y) as(double) sqrt

I'm currently not inclined to do this.

BtwBtw, a related issue, implicit coercions, is the most terrifying in programming language design. Languages which are otherwise wonderful, eg Algol68, have lost it on this issue. Several languages which are terrible, eg C++ and PL/1, are worst on this issue. Java doesn't fall into this trap, but Java's ugliest part is the nearby type-based overload resolution issue. E hopes to mostly avoid this hairball.

>(I only picked "to"
>due to "toString". If i had the choice i would prefer "asString" too...)

Likewise.

>> Jeez I wish they hadn't defined "float" to mean "single precision IEEE
>> floating point"! I know of no remaining short word that simply means
>> "floating point" without specifying a precision.
>
>Well... i actually used "toFloat" in my original answer because
>"double" sounds even more precision-y than "float". If you don't
>want to specify a precision, i think "float" sounds more general,
>even though it's used by C. <shrug> Both are used by C, anyway...

Uh uh, "float" is firmly established by Java (and corroborated by modern Cs) as meaning *single precision* IEEE floating point number. I can't redefine this for the same reason I can't redefine a byte to be 9 bits. C++ caught on because C programmers thought "I already know most of this language", and Java caught on because C++ programmers thought the same thing. In areas where E doesn't need to teach new lessons I'd like to be not gratuitously incompatible with the C-like tradition, and especially with Java.

But I do hate their choice of terms.

>Where would the type names actually appear in E programs?

I need to figure out what the protocol of Type objects in E is, but I know this much: All Java Class objects will be made to act like E Type objects, and all E Type objects will act like SlotMakers. (ie, the protocol defined for SlotMaker is a subset of the protocol defined for Type, ie, Type is a subtype of SlotMaker.) A SlotMaker is an object that responds to a one-arg "makeSlot" message by returning an object presumed to act as a Slot. A Slot is an object that responds to a zero-argument "getValue" message. By definition, the value this returns is the current value of the Slot. (A slot that also responds to a one-arg "setValue" supports assignment.)

As documented in http://erights.org/doc/elang/elangmanual.pdf and to be implemented in 0.7 (the current version out there is 0.6.2), wherever you can have a defining occurrence of a variable name, you can instead have

Identifier : expr

in which case, on matching this pattern against a specimen, expr is first evaluated to a value presumed to act as a SlotMaker. This value is then asked to "makeSlot(specimen)", and the resulting object is bound to Identifier as the Slot holding Identifier's value in the resulting scope.
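
For readers more at home in Python, here is a minimal sketch of the Slot / SlotMaker protocol and of matching "Identifier : expr" against a specimen. It is not E's implementation: the class names, the snake_case method names, the `bind` helper, and the choice to have the SlotMaker reject specimens of the wrong type are all assumptions made for illustration.

```python
class Slot:
    """Holds a value; get_value stands in for E's zero-argument getValue."""

    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value


class SettableSlot(Slot):
    """A Slot that also supports assignment (E's one-arg setValue)."""

    def set_value(self, value):
        self._value = value


class TypeSlotMaker:
    """Hypothetical SlotMaker wrapping a Python type.

    Assumption: it refuses specimens that are not instances of that type.
    """

    def __init__(self, py_type):
        self._py_type = py_type

    def make_slot(self, specimen):
        if not isinstance(specimen, self._py_type):
            raise TypeError(f"{specimen!r} is not a {self._py_type.__name__}")
        return SettableSlot(specimen)


def bind(scope, identifier, slot_maker, specimen):
    """Model matching `identifier : expr` against a specimen.

    slot_maker is the already-evaluated value of expr. The result is a new
    scope in which identifier is bound to the Slot the maker returned.
    """
    new_scope = dict(scope)
    new_scope[identifier] = slot_maker.make_slot(specimen)
    return new_scope


scope = bind({}, "x", TypeSlotMaker(int), 3)
print(scope["x"].get_value())  # prints 3
```

The point of the sketch is only that the variable is bound to the Slot, not to the specimen itself, so reading the variable is a `get_value` on whatever object the SlotMaker chose to return.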

Btw, this is the only form of defining occurrence of a variable in the kernel language, which is both simpler and more powerful than the kernel I presented to you and your friends at the foresight offices. This simplification is directly a result of fretting about why you folks were uncomfortable with some parts of what I presented that day. Thanks!

>(Leading to, "Do i really hafta type out 'Character'? Could
>we just make the type name [and coercing function] 'char'?)

Yup. Another unpleasant Java issue. Java has way too many ways to talk about the scalar types. Just for characters, there's the type and class "Character" vs the type "char" and the class "Character.TYPE". I think my only route to sanity is to insist that, from E, there is only one character type, which I'll call "char". Altogether

	char, Character			collapse to	char

	boolean, Boolean		collapse to	boolean

	byte, short, int, long,
	Byte, Short, Integer,
	Long, BigInteger		collapse to	integer

	float, double,
	Float, Double			collapse to	double

	BigDecimal			banished



As with the horrible issue of "float" being historically used up by the wrong meaning, we have the corresponding issue with "integer". However, the history of "floating point number" as a concept with a term is only computer-science-long, so one can concede that computer science has ruined it. "integer" though has a long and proud history before precision-limited registers, or worse, integers in a modular field. On this, I'll go with Pythagoras and Peano over Kernighan, Ritchie, and Gosling.

To successfully shield the E programmer from the above distinctions, I'll be constantly at war with the Java libraries. It'll be a lot of work, but on this one I'm hopeful I can win.

>Other coercion issues:
>
> ? char("abcd") # should this be allowed?
> # result: 'a'
>

Bletch!! I can't imagine why anyone might think this is a good idea.

Btw, you are again saying "# result:" rather than "# value:". Which is better? (This would still be an easy change.)

>Possibly convenient, though i don't think it would be necessary
>as long as you can "abcd"[0]. I hope you consider strings to be
>sequences of characters with all the usual sequence methods:
>this is extremely convenient in Python (and *not* doing this is
>an almost unbelievable oversight in Perl).

I was about to say "yes, of course", but I just checked org.erights.e.meta.java.lang.StringSugar, and I haven't even implemented "get" (which is what square-bracket indexing expands to). Strings *will* be fixed to act in all ways like Tuples of chars.

> ? char(97) # should this be allowed?
> # result: 'a'
>
> ? int('a') # should this be allowed?
> # result: 97
>
>This seems nice. I like it, but if you think it is too easy to
>confuse int('4') with int("4") i could relent. I assume
>string('a') yields "a".
>
> ? 'a' ord
> # result: 97
>
>Another way to get the ordinal value of a character?

Good suggestions. I don't yet have a stance on this, except that I don't like int("4") or string('a').

>Can you easily generate number and character ranges?
>Python has the built-in "range" function, which can only
>generate ranges of integers (allowing a specified starting
>and ending point and positive or negative step). Perl has
>the ".." operator, such that 1..4 yields (1,2,3,4) and
>'s'..'u' yields ('s', 't', 'u').

I've got the same ".." operator, but I hadn't thought of extending it to characters. A good suggestion. Btw, whereas E's "x..y" expands to "x thru(y)" and means "from x inclusive thru y inclusive", E's "x..!y" expands to "x till(y)" and means the much more useful "from x inclusive till y exclusive". Closed-open intervals are almost always best.
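
A small Python sketch of the two interval forms, with the extension to characters suggested above. The function names `thru` and `till` follow the expansions just described; the character handling via `ord`/`chr` and the choice to return lists are assumptions, not E's behavior.

```python
def till(x, y):
    """Closed-open interval: from x inclusive till y exclusive (x..!y)."""
    if isinstance(x, str):
        # Assumed extension to single characters, by code point.
        return [chr(c) for c in range(ord(x), ord(y))]
    return list(range(x, y))


def thru(x, y):
    """Closed interval: from x inclusive thru y inclusive (x..y)."""
    if isinstance(x, str):
        return [chr(c) for c in range(ord(x), ord(y) + 1)]
    return list(range(x, y + 1))


print(thru(1, 4))      # [1, 2, 3, 4]
print(till(1, 4))      # [1, 2, 3]
print(thru("s", "u"))  # ['s', 't', 'u']

# Closed-open intervals compose without overlap or off-by-one fixes.
print(till(0, 3) + till(3, 5) == till(0, 5))  # True
```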

>Is it true that "abc" + 'd' == "abcd" and 'b' + 'c' == "bc"?
>What about "-" * 8 == "--------" (useful formatting idiom in Python)?

	Yes,	"abc" + 'd' == "abcd"
	No,	'b' + 'c' throws an exception
	but	"" + 'b' + 'c' == "bc"
	Yes,	"-" * 8 == "--------"


>Actually, in general in Python, any sequence works: [3,5,6] * 3 ==
>[3,5,6,3,5,6,3,5,6].

Yes as well.

>... If you
>want to reserve '*' so that vectors can [3,5,6] * 3 == [9,15,18], ... In
>Python ... multiplies only homogeneous vectors ...

E's Tuples and Vectors are a storage & collection abstraction, not a linear algebra abstraction. An E programmer could define a "matrix" such that

matrix([3,5,6]) * 3 == matrix([9,15,18])

and even

matrix([3,5]) * matrix([4,7]) == 3*4 + 5*7 == 47

I'm not suggesting this, but it isn't up to me. APLers, go wild.
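
To make the distinction concrete, here is a Python sketch of such a user-defined wrapper. The `Matrix` class is hypothetical and only covers the two cases shown: scaling by a number, and a dot product with another one-dimensional Matrix.

```python
class Matrix:
    """Hypothetical linear-algebra wrapper around a plain sequence."""

    def __init__(self, values):
        self.values = list(values)

    def __mul__(self, other):
        if isinstance(other, Matrix):
            # Matrix * Matrix: dot product of two equal-length vectors.
            if len(self.values) != len(other.values):
                raise ValueError("length mismatch")
            return sum(a * b for a, b in zip(self.values, other.values))
        # Matrix * scalar: scale every element.
        return Matrix(v * other for v in self.values)

    def __eq__(self, other):
        return isinstance(other, Matrix) and self.values == other.values


# The plain sequence keeps its collection meaning: repetition.
print([3, 5, 6] * 3 == [3, 5, 6, 3, 5, 6, 3, 5, 6])  # True
# The wrapper gives "*" its linear-algebra meaning.
print(Matrix([3, 5, 6]) * 3 == Matrix([9, 15, 18]))   # True
print(Matrix([3, 5]) * Matrix([4, 7]))                # 47
```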

>What are the "standard tuple" and "standard mapping" interfaces?
>I'm guessing get(key), put(key), del(key), len(), add(tuple/map)?
>Anything else?

Close.

For collections in general, how about the methods of http://erights.org/doc/javadoc/org.erights.e.elib.tables.Table.html but leaving out isIdentity, keyType, and valueType?

For mutable collections, we add "put(key, value)" (as in http://erights.org/doc/javadoc/org.erights.e.elib.tables.TableEditor.html#put and
http://erights.org/doc/javadoc/org.erights.e.meta.java.util.VectorSugar.html#put) and "clear" (as in
http://erights.org/doc/javadoc/org.erights.e.elib.tables.TableEditor.html#clear and OOPS!, neither Vector nor VectorSugar has a "clear" method. This will be fixed.).

Sequences (I do like that word) in addition have both slice(start) and slice(bound), and (for Java folk) define "length" as a synonym for "size".

"add" (or "+") is only used to combine Tuples, whereas "|" is only used to combine Mappings. Why? As discussed earlier, the Tuple

["foo", "bar"]

is like the Mapping

[0 => "foo", 1 => "bar"]

but notice that

["baz"] + ["foo", "bar"]

yields

["baz", "foo", "bar"]

which is like the Mapping

[0 => "baz", 1 => "foo", 2 => "bar"]

whereas

["baz"] asMapping | ["foo", "bar"] asMapping

yields

[0 => "baz", 1 => "bar"]

since the binding of zero to "foo" is occluded.
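
The same contrast in a short Python sketch. `as_mapping` and `union_left_wins` are hypothetical helper names; the rule that the left operand's binding wins on a key collision is inferred from the example above, not from a specification.

```python
def as_mapping(seq):
    """View a sequence as a mapping from index to element."""
    return dict(enumerate(seq))


def union_left_wins(left, right):
    """Combine two mappings; on a shared key the left binding occludes the right."""
    merged = dict(right)
    merged.update(left)
    return merged


# Concatenation renumbers the right-hand elements.
print(as_mapping(["baz"] + ["foo", "bar"]) == {0: "baz", 1: "foo", 2: "bar"})  # True

# Mapping union keeps the keys as they are, so index 0 collides.
combined = union_left_wins(as_mapping(["baz"]), as_mapping(["foo", "bar"]))
print(combined == {0: "baz", 1: "bar"})  # True
```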

>Any way to get an escaped copy of a string? In Python there
>is repr(), which produces:
>
> >>> s = "foo\tbar\n"
>
> >>> s
> 'foo\011bar\012'
>
> >>> print s
> foo bar
>
> >>> repr(s)
> "'foo\\011bar\\012'"
>
>Notice that repr(), not str(), is used for printing out results
>in the interactive loop. This means that you can directly cut
>and paste simple objects back into the editor, which can be very
>useful (numbers and strings, as well as lists, tuples, and
>dictionaries to any depth as long as they contain only numbers
>and strings). In Python all objects support both str() and repr().

I like this a lot. Smalltalk also had something like this, and I've also become increasingly irritated using the Java toString behavior as the print part of my read-eval-print loop. I agree in principle, but this will be a lot of work.

>This is really good for strings, and provides a sometimes useful
>distinction between a shorter display of an object and a complete
>representation of its state allowing reconstruction. (You can
>define your own __str__ and __repr__ methods for Python to call.)
>It can sometimes be a bother when it is impossible or undesirable
>to display complete state, in which case you end up defining
>__str__ and __repr__ to just do the same thing.

I don't understand this point??

>What is the default string conversion when toString is not given?

If the object is defined in E, then it's `<$behaviorName>`, where behaviorName is the name between the define and the methods:

	? define PointMaker(x, y) {
	>     define point {
	>         to getX {x}
	>         to getY {y}
	>     }
	> }
	# value: <PointMaker>
	
	? define pt := PointMaker(3, 5)
	# value: <point>


Given that we switch from using toString to asRepr or something, and once upgrade-for-prototyping is working, an interactive session could proceed as follows:

(In Elmer, we could correct our mistake by editing in place, and then hitting return again after the final close curly.)

	? define PointMaker(x, y) {
	>     define point {
	>         to asRepr {`PointMaker($x, $y)`}
	>         to getX {x}
	>         to getY {y}
	>     }
	> }
	# value: <PointMaker>

	? pt
	# value: PointMaker(3, 5)

E's syntax discourages anonymous closures so we can enhance the behavior of already instantiated objects, as above.

>Oh, and while we're on the subject of basic operations...
>
> ? 5 % 3
> # result: 2
>
> ? -1 % 3
> # result: 2 # what i want (Python & Perl & Tcl agree)
>
> # result: -1 # arrgghh! what is WRONG with these C people?!?!
>
>Please please please! (Possibly related: what is the difference
>between % and %%?)

This is exactly the difference. C actually doesn't specify which of the answers you get, but Java specifies -1. On the principle of avoiding unnecessary surprises, E's % gives -1 (and so corresponds to E's truncDivide), but E's wonderful %% gives 2 (and so corresponds to E's wonderful _/ or floorDivide).
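
A Python sketch of the two pairings for integers. The function names are made up for illustration; Python's own `//` and `%` already behave like the floor pair, so only the truncating pair has to be written out.

```python
def trunc_divide(a, b):
    """Integer division rounding toward zero (the pairing for %)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def trunc_rem(a, b):
    """Remainder matching trunc_divide; takes the sign of the dividend."""
    return a - b * trunc_divide(a, b)


def floor_divide(a, b):
    """Integer division rounding toward negative infinity (the pairing for %%)."""
    return a // b


def floor_mod(a, b):
    """Remainder matching floor_divide; takes the sign of the divisor."""
    return a % b


print(trunc_rem(5, 3), floor_mod(5, 3))    # 2 2
print(trunc_rem(-1, 3), floor_mod(-1, 3))  # -1 2

# Each pair satisfies a == b * quotient + remainder.
print(3 * trunc_divide(-1, 3) + trunc_rem(-1, 3))  # -1
print(3 * floor_divide(-1, 3) + floor_mod(-1, 3))  # -1
```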

>And what is _% ? I saw it in http://erights.org/doc/e/satan/index.html
>under the "Begin" section.

Historical. Will be fixed in the 0.6.3 version of the website. We thought that _% would be suggestive of its pairing with _/, but later decided pairing %% with ** for modular exponentiation was better. Originally, _/ was //, but this seemed too much like begin-comment to the target audience. E is stuck with "//" as a synonym for "#".

>Is there a really good motivation for providing the aliases
>"-" -> "subtract" or "negate", "!" -> "not", "~" -> "complement",
>"[]" -> "get" etc. as seen in the "call" and "method" sections of
>http://erights.org/doc/e/e-grammar.html? I am afraid this will
>just lead to confusion, especially subtractions which could look
>like "object<--(arg)", and only enlarges the grammar.

I don't get it. What's the alternative?

>> In the kernel language, yes, only the curlies of the object expression
>> enclose a behavior definition. In the full syntax, braces either enclose
>> a scoped expression, or they enclose a behavior definition, to be learned
>> on a case by case basis. Sorry, but nothing else seemed simpler. (Feel
>> free to take this as a challenge.)
>
>Hmmm... all the ones i saw seemed to create new scopes. What
>are the exceptions?

They all create new scopes. The difference is in what they enclose. The object expression encloses a set of methods followed by a sequence of matchers. The switch expression encloses a sequence of matchers. Once we switch to the new Mapping syntax, all current remaining forms enclose an expression. However, the reserved macro syntax provides for the creation of new forms of either type.

>I notice that "if" and "for" etc. all seem to *require* braces
>after them, according to the grammar page (you can't just have a
>single expr on its own like in C). I think this is good. This
>does imply, though, that the parentheses around the condition
>are unnecessary -- we could get rid of the extra punctuation in
>"if", "for", "while", "switch" to match "escape", "match", etc.

When the thing between the keyword and the open-curly is an expression, it's enclosed in parens. When it's a pattern, it isn't. This rule is enforced on macros, so that the E programmer (and tools) can look at code using an unknown macro and still do scope analysis! (Sort of a principle of least authority for syntactic extensions.) By making the built-in syntax work according to the same rules, we set up the right expectations.

>(Though, by the way, i don't understand how the "verb" in "for"
>is supposed to work.)

The grammar document is out-of-date. The verb is gone.

>I also don't understand the expansions for "||" and "&&" on the
>grammar page. I would have expected something like
>
> left || right
>
>->
>
> define __temp__ := left
> if (__temp__) { __temp__ } else { right }
>
>and
>
> left && right
>
>->
>
> define __temp__ := left
> if (__temp__) { right } else { __temp__ }
>

Remember that if

expr1 =~ [x, y] && expr2 =~ [z]

succeeds, the resulting scope needs to have x, y, and z bound appropriately. Your expansion doesn't do that.
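
A rough Python model of the requirement, not of E's actual expansion. Here a pattern match returns a dictionary of bindings on success and None on failure; `match_list` and `and_also` are hypothetical names. The conjunction has to hand back the bindings from both sides, which a plain if/else on a temporary cannot do.

```python
def match_list(specimen, names):
    """Model `specimen =~ [names...]`: bindings on success, None on failure."""
    if not isinstance(specimen, (list, tuple)) or len(specimen) != len(names):
        return None
    return dict(zip(names, specimen))


def and_also(left, right):
    """Model `left && right` where each side is a zero-arg callable.

    The right side is only evaluated if the left side succeeds, and the
    result exports the bindings of both sides to the enclosing scope.
    """
    left_bindings = left()
    if left_bindings is None:
        return None
    right_bindings = right()
    if right_bindings is None:
        return None
    return {**left_bindings, **right_bindings}


scope = and_also(
    lambda: match_list([1, 2], ["x", "y"]),
    lambda: match_list([3], ["z"]),
)
print(scope)  # {'x': 1, 'y': 2, 'z': 3}
```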

	More messages, asynchronously, later,

	--MarkM