The story of E, part 2 (fwd)

Ka-Ping Yee ping@lfw.org
Tue, 13 Oct 1998 02:02:31 -0700 (PDT)


On Mon, 12 Oct 1998, Mark S. Miller wrote:
> Harumph!  The difference between a good idea for a language and a good
> language is attention to nitpicky details.  I do not accept your apology!

Fine.  Have it your way.  :-P

:)

> >I think you're fine without any colour, actually, but if you
> >want one, a subdued orange or green or burgundy might do.
> 
> Cool.  How do I say that in html?  In any case, check out the newest
> (0.6.2) look of the website.  Only the controversial cosmetics of the first
> page have changed.

What?  You're asking me to make an artistic judgement?  Yikes.

Well, straining my feeble sense of colour matching, against the
currently faint-peachy background of the page, i think something
in the range of red-brown-burgundy would work.  Perhaps #b04020
if you like something more orangish, or #a02050 if you like
something more purplish.

But maybe you should really ask a girl.  They always seem to
know this stuff better.
 
> Uh uh, "float" is firmly established by Java (and corroborated by modern
> Cs) as meaning *single precision* IEEE floating point number.  I can't
> redefine this for the same reason I can't redefine a byte to be 9 bits.

Well, i wouldn't go that far.  I'd bet that in the vast majority of
cases where programmers use floating-point numbers, nobody really
does much thinking about their precision limits.  I doubt anyone
actually *relies* on the fact that a float has exactly the precision
it does.  (I don't even remember how many bits it is... i just think
of it as approximately 6 decimal digits.)  Ask 100 C or Java hackers
how many bits there are in a float mantissa and see if anyone knows,
or cares.  (On the other hand, all 100 had better know that there are
8 bits in a byte, or they should be fired or something...)
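
For the record, an IEEE single-precision float carries a 24-bit
significand (23 bits stored plus an implicit leading one), which is
where the "roughly 6 or 7 decimal digits" rule of thumb comes from.
A small Python sketch, assuming only the standard struct module; the
helper name to_single is made up for illustration:

```python
import struct


def to_single(x):
    """Round a Python float (an IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 2**24 fits in a 24-bit significand; 2**24 + 1 needs 25 bits and is rounded.
print(to_single(16777216.0) == 16777216.0)   # True
print(to_single(16777217.0) == 16777217.0)   # False

# 0.1 survives only to about 7 significant decimal digits in single precision.
print(to_single(0.1) == 0.1)                 # False
```

So the limit is real, even if few people could quote it from memory.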

As for the term "floating-point number", i'm sure it has been used
to refer to numbers of all different precisions in the history of
computer science.

I, for one, wouldn't be hurt one bit if you decided to increase the
precision of a float, since the E integer type already subsumes a
whole bunch of other integer types and the E floating-point type is
going to subsume the other floating-point types anyway.

Nonetheless it is not justified for me to *insist* that you change the
name, since your reasons are sound.  I am fine with float, but i am
also not *un*happy with double.  You likely hate "double" more than
i do; i'm just less afraid of "float".

My opinion is personal, and but one.  Perhaps it would be better to
ask other people what they think.  Has anyone else given a view on
this one?

> >Where would the type names actually appear in E programs?
> 
> I need to figure out what the protocol of Type objects in E is, but I know
> this much:  All Java Class objects will be made to act like E Type objects,
> and all E Type objects will act like SlotMakers.

Okay.  Sounds pretty good.  I think Type objects will at least
require a string name and an ordered "containment"/"isa" kind of
comparison.  They may not require an equality comparison... in
fact, it might even be a good idea to explicitly omit equality
comparison, so that subtypes can always stand in for their base
types.  (I don't know about this last idea -- it assumes a
level of discipline about inheritance that C++ programmers, at
least, usually do not achieve.)

I note a potentially troublemaking discontinuity here: it looks
like objects will not inherit from their Type objects, which
means that two (or possibly three) kinds of "isa" are needed:

    1.  an object "isa(n)" instance created by its Maker and
        inheriting from things created by parent Makers

    2.  an object "isa" thing associated with its Type object

    3.  a Type object "isa" subtype of its base types

Also note possible terminology quagmire around inheritance.
We can say "behaviour" instead of "class", but we need good
words to label all the relationships in the smalltalk-style
inheritance diagram.

Is it a requirement of all Types that they be able to create
objects without being given any constructor arguments?  Does
this lead to the possibility of objects being in some invalid
state until initialized?

> A SlotMaker is an object that responds to a one-arg
> "makeSlot" message by returning an object presumed to act as a Slot.

Given an assignable Slot, how do you make an object out of it?  (i.e.
a thing for which object := value will translate to slot setValue(value)).

> kernel language, which is both simpler and more powerful than the kernel I
> presented to you and your friends at the foresight offices.  This
> simplification is directly a result of fretting about why you folks were
> uncomfortable with some parts of what I presented that day.  Thanks!

Hooray!  A triumph for all.  Is the PDF document an up-to-date summary
of the new kernel language?

> 	char, Character		collapse to	char
> 
> 	boolean, Boolean		collapse to	boolean
> 
> 	byte, short, int, long,
> 	Byte, Short, Integer,
> 	Long, BigInteger		collapse to	integer
> 
> 	float, double,
> 	Float, Double			collapse to	double
> 
> 	BigDecimal			banished
 
A definite win!  Now that's what i call progress.

> To successfully shield the E programmer from the above distinctions, I'll
> be constantly at war with the Java libraries.  It'll be a lot of work, but
> on this one I'm hopeful I can win.

This looks to me like a battle worth fighting.  Better for us to
curse and grumble about the explosion of types in Java while
implementing E, than for generations of E programmers after us
to curse *us* for it. :)

> >Other coercion issues:
> >
> >    ? char("abcd")           # should this be allowed?
> >    # result: 'a'
> 
> Bletch!!  I can't imagine why anyone might think this is a good idea.

I agree.  Yuk.

> Btw, you are again saying "# result:" rather than "# value:".  Which is
> better?  (This would still be an easy change.)

Sorry, just a habit.  Again i have no particular attachment to
one or the other.  "# value:" does make sense, as E is eVALUating
what you typed in...

Hey, i wonder, what did you have in mind when you decided to
make the output look like a comment?  Anything actually
functional as opposed to just aesthetic?

Was there any intention of being able to copy and paste
multi-line blocks of interactive sessions?  It sure would be
neat if E knew enough to strip the leading prompt from each
line.  (It is fairly annoying to have to cut and paste each
line individually in Python when i make a mistake, and of
course this is exacerbated by Python's indent sensitivity
which doesn't occur here.)
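
A rough Python sketch of the prompt-stripping idea, assuming the
prompts are "? " for a new expression and "> " for a continuation
line, as in the sessions quoted in this thread.  The function name
strip_prompts is hypothetical, not anything E provides:

```python
def strip_prompts(session, prompts=("? ", "> ")):
    """Keep only the typed input of a pasted session, minus its prompts."""
    kept = []
    for line in session.splitlines():
        text = line.lstrip()
        for prompt in prompts:
            if text.startswith(prompt):
                kept.append(text[len(prompt):])
                break
        # Lines without a prompt (such as "# value: ..." output) are skipped.
    return "\n".join(kept)


pasted = (
    "? define PointMaker(x, y) {\n"
    ">     define point {\n"
    ">         to getX {x}\n"
    ">     }\n"
    "> }\n"
    "# value: <PointMaker>\n"
)
print(strip_prompts(pasted))
```

Since the output lines already look like comments, an interpreter
that skipped this step for them would also be safe when they are
pasted back in.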

> Strings *will* be fixed to act in all ways like Tuples of chars.

Yes!

> Good suggestions.  I don't yet have a stance on this, except that I don't
> like int("4") or string('a').  

Why not?  I am actually much more attached to these two
particular examples than the rest of the ones above.

How would you expect to convert from string input to integer?
This seems vital for a scripting language, for quick and
interactive use, etc. and it ought to be easy to do.

I really don't see what you have against 'a' -> "a" ... i
mean, surely string('a') is clearer and more consistent with
the rest of the type coercers than "" + 'a'... please 
enlighten me on your thinking.

Do strings concatenate when you juxtapose them?  I'm thinking
of long multi-line messages here, and also wondering whether a
special quoting mechanism might ever be handy.  (Python uses
'''triple quotes''' for strings that can span many lines.
Makes for easy editing of help messages and self-documentation
inside code.  Btw, any self-documentation conventions?)
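
For comparison, this is how the two Python mechanisms look side by
side.  It is only a sketch of Python's behaviour, not a proposal for
E syntax, and the message text is made up:

```python
# Adjacent string literals are concatenated by the Python compiler.
juxtaposed = (
    "usage: frob [options]\n"
    "  -h   show this help\n"
)

# A triple-quoted string keeps its line breaks as written.
triple_quoted = """usage: frob [options]
  -h   show this help
"""

print(juxtaposed == triple_quoted)   # True
```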
 
> I've got the same ".." operator, but I hadn't thought of extending it to
> characters.  A good suggestion.  Btw, whereas E's "x..y" expands to "x
> thru(y)" and means "from x inclusive thru y inclusive", E's "x..!y" expands
> to "x till(y)" and means the much more useful "from x inclusive till y
> exclusive".  Closed-open intervals are almost always best.

These are great!  I think ..! is a really cool syntactic innovation.
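
A quick Python sketch of why the closed-open form is so handy.  The
helper names thru and till are borrowed from the expansions quoted
above, but these definitions are assumptions written for integers
only, not E's actual implementation:

```python
def thru(x, y):
    """From x inclusive through y inclusive, like E's x..y."""
    return range(x, y + 1)


def till(x, y):
    """From x inclusive until y exclusive, like E's x..!y."""
    return range(x, y)


# The length of a closed-open interval is just the difference of its ends.
print(len(till(3, 8)))    # 5
print(len(thru(3, 8)))    # 6

# Adjacent closed-open intervals tile a range with no overlap and no gap.
print(list(till(0, 3)) + list(till(3, 6)) == list(till(0, 6)))   # True
```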

> E's Tuples and Vectors are a storage & collection abstraction, not a linear
> algebra abstraction.  An E programmer could define a "matrix" such that

I'm going to risk offense by repeating myself, and more strongly
emphasize that Vector ought to mean vector in the linear algebra
sense.  If you take that word away, what name are the scientists
going to use for a vector when they implement math/geometry
libraries?

There are other good words for sequences that are more generic
('list', 'tuple', 'array'), and i'm inclined to go with them.

> >What are the "standard tuple" and "standard mapping" interfaces?
> >I'm guessing get(key), put(key), del(key), len(), add(tuple/map)?
> >Anything else?
> 
> Close.  
> 
> For collections in general, how about the methods of
> http://erights.org/doc/javadoc/org.erights.e.elib.tables.Table.html but
> leaving out isIdentity, keyType, and valueType?  

Looks pretty good, except for the following things...

    containsKey: as in the other message, how about "maps"?

    each: okay, but i like a verb better: how about "iterate"?

    elements: if we're calling them keys and values, why not values()?
              also note elements() is confusing if sets are maps:
              "set elements()" will get you [null, null, null, ...]
              (I know, set keys() doesn't read that clearly either, but
              at least it might prevent one common misunderstanding.)

    ? items: Python dictionaries have an "items" method which can
             be very handy -- it produces the list of (key, value)
             pairs.  Although Python doesn't do it, it may be useful
             to turn a list into (index, value) pairs too.  Not sure
             if "items" is the best name for this, but it's not awful.
             Maybe "itemize".

             Then again, it may turn out that once you have pattern
             matching that lets you iterate with "for" over a mapping,
             you don't find yourself wanting "items" so much any more.
             Let me think about examples.
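
In the meantime, a small sketch of the Python behaviour being
described.  In present-day Python, items() returns a view rather than
a list, and the built-in enumerate() gives the (index, value) pairing
for sequences; the sample data reuses the colour values from earlier
in this message:

```python
colours = {"orangish": "#b04020", "purplish": "#a02050"}

# (key, value) pairs from a mapping.
print(list(colours.items()))

# (index, value) pairs from a sequence.
print(list(enumerate(["keys", "values", "get", "put"])))

# With tuple unpacking in "for", the pairs rarely need a name of their own.
for name, code in colours.items():
    print(name, code)
```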

keys, values, get, and put are all great.

Aren't we missing del(key) here?  Need this...

How do you deal with iterating over mutable things?  Or do you
just outlaw that?  (Seems like a decent answer, i suppose.)
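
For reference, present-day Python takes roughly the "outlaw it"
route for dictionaries: changing the size of a dict while iterating
over it raises RuntimeError, and the usual workaround is to iterate
over a snapshot of the keys.  A sketch (the function names are made
up, and this says nothing about what E should do):

```python
def delete_while_iterating(table):
    """Try to empty a dict while looping over it directly."""
    try:
        for key in table:
            del table[key]
    except RuntimeError:
        return "refused"
    return "allowed"


def delete_from_snapshot(table):
    """Empty a dict by looping over a copied list of its keys."""
    for key in list(table):
        del table[key]
    return table


print(delete_while_iterating({"a": 1, "b": 2}))   # refused
print(delete_from_snapshot({"a": 1, "b": 2}))     # {}
```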

> For mutable collections, we add "put(key, value)" (as in
> http://erights.org/doc/javadoc/org.erights.e.elib.tables.TableEditor.html#put
> and
> http://erights.org/doc/javadoc/org.erights.e.meta.java.util.VectorSugar.html#put)
> and "clear" (as in
> http://erights.org/doc/javadoc/org.erights.e.elib.tables.TableEditor.html#clear
> and OOPS!, neither Vector nor VectorSugar has a "clear" method.  This
> will be fixed.).

Works for me!

> Sequences (I do like that word) in addition have both slice(start) and
> slice(bound), and (for Java folk) define "length" as a synonym for "size".

Hmm.  Regarding "len" vs. "length" vs. "size" ... can we just pick
one and go with it everywhere?  Less guessing is probably better.
I'm fine with any of them -- if you want to go with "length" to
appease Java folk, that's great.

> "add" (or "+") is only used to combine Tuples, whereas "|" is only used to
> combine Mappings.

That all seems quite sensible.

> >This is really good for strings, and provides a sometimes useful
> >distinction between a shorter display of an object and a complete
> >representation of its state allowing reconstruction.  (You can
> >define your own __str__ and __repr__ methods for Python to call.)
> >It can sometimes be a bother when it is impossible or undesirable
> >to display complete state, in which case you end up defining
> >__str__ and __repr__ to just do the same thing.
> 
> I don't understand this point??

What i meant was -- yes, it can be really convenient to have __repr__
when everything can be printed reconstructably -- however, it can't
be relied upon.  When something can't be fully represented that way,
__repr__ starts to seem redundant.

If you're still interested in having a repr-type protocol on all
things, one possible way to make it go down easier is a Miranda
method that made asRepr call toString if you didn't supply your
own definition.
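
Sketched in Python rather than E, the fallback might look like this.
The class and method names (MirandaBase, to_string, as_repr) are
stand-ins chosen for the sketch, not an actual E or Python protocol:

```python
class MirandaBase:
    """Supplies default methods for objects that do not define their own."""

    def to_string(self):
        return f"<{type(self).__name__}>"

    def as_repr(self):
        # Default: no reconstructable form is known, so reuse to_string.
        return self.to_string()


class Opaque(MirandaBase):
    """Defines neither method, so both defaults apply."""


class Point(MirandaBase):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def as_repr(self):
        # This object can describe how to rebuild itself, so it overrides.
        return f"Point({self.x!r}, {self.y!r})"


print(Opaque().as_repr())      # <Opaque>
print(Point(3, 5).as_repr())   # Point(3, 5)
```

Objects that cannot be printed reconstructably then pay nothing for
the extra protocol.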

> >What is the default string conversion when toString is not given?
> 
> If the object is defined in E, then it's `<$behaviorName>`, where
> behaviorName is the name between the define and the methods:
> 
> 	? define PointMaker(x, y) {
> 	>     define point {
> 	>         to getX {x}
> 	>         to getY {y}
> 	>     }
> 	> }
> 	# value: <PointMaker>
> 	
> 	? define pt := PointMaker(3, 5)
> 	# <point>

I quite like this.  I think this would lead to the convention
that the defining occurrence of a behaviour would capitalize
the behaviour name, in this case "point".

> Given that we switch from using toString to asRepr or something, and once
> upgrade-for-prototyping is working, an interactive session could proceed as
> follows:
> 
> (In Elmer, we could correct our mistake by editing in place, and then
> hitting return again after the final close curly.)
> 
> 	? define PointMaker(x, y) {
> 	>     define point {
> 	>         to asRepr {`PointMaker($x, $y)`}
> 	>         to getX {x}
> 	>         to getY {y}
> 	>     }
> 	> }
> 	# value: <PointMaker>
> 
> 	? pt
> 	> PointMaker(3, 5)

Looks good to me.  Though note that the repr in this case relies
on PointMaker having the right definition in the current scope
(but trying to account for this only leads to a slippery slope).
And it relies on the substitution using asRepr for $x and $y...
which can't work for strings.  You may need to create a different
quasiparser for repr substitution if we proceed this way.
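
The string problem is easy to show with Python's formatted strings,
where plain substitution corresponds to the display form and the !r
conversion to the repr form.  This is only an analogy for the
quasiparser issue; the function names are invented:

```python
def naive_repr(x, y):
    """Substitute the display form of each argument."""
    return f"PointMaker({x}, {y})"


def careful_repr(x, y):
    """Substitute the repr form of each argument."""
    return f"PointMaker({x!r}, {y!r})"


# With plain substitution, numbers and strings become indistinguishable.
print(naive_repr(3, 5))          # PointMaker(3, 5)
print(naive_repr("3", "5"))      # PointMaker(3, 5)

# With repr substitution, the quotes survive.
print(careful_repr("3", "5"))    # PointMaker('3', '5')
```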

> This is exactly the difference.  C actually doesn't specify which of the
> answers you get, but Java specifies -1.  On principle of avoiding
> unnecessary surprises, E's % gives -1 (and so corresponds to E's
> truncDivide), but E's wonderful %% gives 2 (and so corresponds to E's
> wonderful _/ or floorDivide).

All very beautiful.

It really makes you wonder if anyone ever thought about this
issue when making up other languages.

Years from now people will work amongst programming languages
where the _/ vs. / and %% vs. % operator paradigm is ubiquitous,
and look back and say to each other, "what were those people
*thinking*?"
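
To make the two conventions concrete, here is a Python sketch.  The
quoted message does not repeat the operands, so -7 and 3 are an
assumption, picked because they give the -1 and 2 mentioned above.
Python's own // and % are the flooring pair; trunc_divide and
trunc_rem are hand-written stand-ins for the truncating pair:

```python
def trunc_divide(a, b):
    """Integer division rounding toward zero (the Java / convention)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def trunc_rem(a, b):
    """Remainder matching trunc_divide; takes the sign of the dividend."""
    return a - b * trunc_divide(a, b)


a, b = -7, 3
print(trunc_divide(a, b), trunc_rem(a, b))   # -2 -1
print(a // b, a % b)                         # -3 2

# Both pairs satisfy a == b * quotient + remainder.
print(a == b * trunc_divide(a, b) + trunc_rem(a, b))   # True
print(a == b * (a // b) + (a % b))                     # True
```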

> Historical.  Will be fixed in the 0.6.3 version of the website.  We thought
> that _% would be suggestive of its pairing with _/, but later decided
> pairing %% with ** for modular exponentiation was better.  Originally, _/
> was //, but this seemed too much like begin-comment to the target audience.
>  E is stuck with "//" as a synonym for "#".

Amusingly enough, i came across an article in which // was suggested as
a new operator in Python to solve exactly this problem.  But i think
_/ is nice.  (Wait a second, though... aren't underscores part of
identifiers?  Oh, no, they can't be, because of pattern ignore.
But are underscores permitted within identifiers?)

> >Is there a really good motivation for providing the aliases
> >"-" -> "subtract" or "negate", "!" -> "not", "~" -> "complement",
> >"[]" -> "get" etc. as seen in the "call" and "method" sections of 
> >http://erights.org/doc/e/e-grammar.html?  I am afraid this will
> >just lead to confusion, especially subtractions which could look
> >like "object<--(arg)", and only enlarges the grammar.
> 
> I don't get it.  What's the alternative?

object <- minus(arg).

I'm afraid that allowing object <- -(arg) will be too easy to
miss when reading code, and cause strange-looking (to the
beginner) error messages about lack of a minus operator or
something.

And, i am following the principle (closely related to the
"do what KeyKOS did" principle) of "let's do what Perl didn't" ;)
in that i am trying to avoid having too many different ways
to say the same thing.  We'll get people who prefer to write

    to get(index)

and people who prefer to write

    to [](index)

and they will get confused reading each other's code, have
religious wars on the topic, etc.

In my ideal world:

    object + argument          # operators are too common to avoid

    object add(argument)       # the second of only two ways to call add

    object <- add(argument)    # the one & only way to send add

    define object {
        to add(argument) {     # the one & only way to define such a method
            ...
        }
    }

> They all create new scopes.  The difference is in what they enclose.  The
> object expression encloses a set of methods followed by a sequence of
> matchers.  The switch expression encloses a sequence of matchers.

I think these are obvious and clear enough.  They are good.

> However, the reserved macro syntax provides for the creation
> of new forms of either type.
 
I remember vaguely that there was a macro syntax but i don't
think it was ever explained to me.  Do we need one?  Is it
well defined yet?  And where can i read the definition?

> When the thing between the keyword and the open-curly is an expression,
> it's enclosed in parens.  When it's a pattern, it isn't.  This rule is
> enforced on macros, so that the E programmer (and tools) can look at code
> using an unknown macro and still do scope analysis!

Now that is a cool idea.  If that really works, it's certainly
ample justification for a few parens.  But i do wonder whether
a language already as powerful as E really needs a macro language.
Have you run into cases yet where you have felt a need for one?

> 	More messages, asynchronously, later,

I like that.  It could kind of become a tagline for the E concurrency
scheme.

asynchronously yours,


Ping                                                 Got a PalmPilot?
<ping@lfw.org>                              http://www.lfw.org/pilot/