Wandering through the libraries

Mark S. Miller markm@caplet.com
Sun, 11 Oct 1998 22:17:25 -0700


At 04:06 AM 10/11/98 -0700, Ka-Ping Yee wrote:
>
>aan(String) 
>      Return rec prefixed by "a " or "an " according to a simple (and
>therefore inadequate) heuristic. 
>
>What's wrong with the heuristic?  

The code is:

    static public String aan(String rec) {
        if (rec.length() >= 1 && "aeiou".indexOf(rec.charAt(0)) != -1) {
            return "an " + rec;
        } else {
            return "a " + rec;
        }
    }

If you know a better heuristic, I'd love to hear it.  I write "an herb", "a
heuristic", and "a UFO", but this code won't.  Oops, it will say "a UFO"
since I forgot to be case insensitive!  Is this a bug or a feature?

What should I do with the "sometimes y"?
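
A minimal sketch of one possible refinement, not the code that ships with
E: compare case-insensitively and consult two small exception lists before
falling back to the first-letter test.  The class name AanSketch and the
word lists are assumptions made up for illustration; real lists would be
much longer, and the "sometimes y" question is left open.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class AanSketch {
    // Hypothetical exception lists, keyed by lower-cased first word.
    private static final Set<String> AN_WORDS =
        new HashSet<>(Arrays.asList("herb", "hour", "honest", "heir"));
    private static final Set<String> A_WORDS =
        new HashSet<>(Arrays.asList("ufo", "unicorn", "user", "one"));

    static String aan(String rec) {
        String trimmed = rec.trim();
        if (trimmed.isEmpty()) {
            return "a " + rec;  // same result as the original for empty input
        }
        // Only the first word decides the article; compare in lower case.
        String first = trimmed.split("\\s+", 2)[0].toLowerCase(Locale.ROOT);
        if (AN_WORDS.contains(first)) {
            return "an " + rec;
        }
        if (A_WORDS.contains(first)) {
            return "a " + rec;
        }
        // Fall back to the original first-letter heuristic.
        return ("aeiou".indexOf(first.charAt(0)) != -1 ? "an " : "a ") + rec;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"herb", "heuristic", "UFO", "Apple"}) {
            System.out.println(aan(s));
        }
    }
}
```

With these lists the sketch gives "an herb", "a heuristic", "a UFO", and
"an Apple"; any word missing from both lists still gets the simple
first-letter treatment.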


>Is this method really necessary?

I probably shouldn't have put it into String, but I'm not sure where I
should put it.  I HATE the annoying messages I get from programs that don't
get "a" vs "an" right on the simple cases.  I ended up doing this for one
of the messages I was internally calculating, so, rather than have everyone
implement their own separately broken & separately inadequate heuristic, I
thought I'd share.  Besides making the E implementation non-irritating in
this way, I'd like to see programs written in E be similarly non-irritating.



>Why is "/" named "approxDivide"?  All operations are limited by
>the precision of the registers anyway... does "/" do something
>different from the "/" we know, say, in C?

What's this register stuff??  We is talkin' SEMANTICS!!  On IEEE double
precision floating point numbers, "/" does indeed do what Java does, and
what I think C does by default -- double precision IEEE round to nearest.

As with Scheme, or the ancient tradition of mathematics before registers,
integers are simply integers, with no modulus or precision bounds (other
than the total size of the heap).  Given just double precision floating
point and unlimited precision integers, I cannot provide an accurate
integer division operation, so what are the alternatives?

As you mention in your later message, there's an issue in how a truncating
integer divide operation should deal with negative numbers.  C leaves this
to the implementor.  Less horribly, Java specifies the wrong answer.
Java's answer corresponds to E's truncDivide message
http://erights.org/doc/javadoc/org.erights.e.meta.java.math.BigIntegerSugar.html#truncDivide
The right answer is E's floorDivide
http://erights.org/doc/javadoc/org.erights.e.meta.java.math.BigIntegerSugar.html#floorDivide
but, since it's different, I can't call it "/".  Doug
Crockford suggested "_/", since the leading underbar was mnemonic for "floor".
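
To make the difference concrete, here is a small Java sketch of the two
roundings on java.math.BigInteger.  The method names follow the message
names above, but the bodies and the class name DivideSketch are assumptions
for illustration, not the BigIntegerSugar source.

```java
import java.math.BigInteger;

final class DivideSketch {
    // Rounds toward zero, which is what BigInteger.divide (and Java's
    // integer "/") does.
    static BigInteger truncDivide(BigInteger x, BigInteger y) {
        return x.divide(y);
    }

    // Rounds toward negative infinity.
    static BigInteger floorDivide(BigInteger x, BigInteger y) {
        BigInteger[] qr = x.divideAndRemainder(y);
        // The remainder carries the sign of x.  If it is nonzero and its
        // sign differs from y's, the exact quotient is negative and was
        // rounded up toward zero, so step down by one.
        if (qr[1].signum() != 0 && qr[1].signum() != y.signum()) {
            return qr[0].subtract(BigInteger.ONE);
        }
        return qr[0];
    }

    public static void main(String[] args) {
        BigInteger x = BigInteger.valueOf(-7);
        BigInteger y = BigInteger.valueOf(2);
        System.out.println(truncDivide(x, y)); // -3
        System.out.println(floorDivide(x, y)); // -4
    }
}
```

The two agree whenever the division is exact or the operands have the same
sign; they differ by one only for inexact quotients with mixed signs.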

On the conservative principle of minimizing surprise to Java programmers
for expressions accepted as valid by both Java and E, perhaps I should have
defined "/" on integers to mean what it means in Java -- E's truncDivide.
However, the expansion from operators to message names in E is insensitive
to data type, since E has no static type information. (**WARNING, possible
bogus argument ahead**)  Especially given name-based polymorphism, good o-o
practice is for a given message name to have a definable meaning across all
applicable types.

There is no semantics in common between what Java's "/" does on integers
and what it does on floating point numbers.  My choice: "/" is shorthand
for "approxDivide", and it means "the IEEE double precision floating point
number closest (by round to nearest) to the quotient of the two numbers,
whatever they are".  In E, "x / y" yields a double and "x _/ y" yields an
integer, independent of whether x or y are integers or floating point numbers.
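
A rough Java sketch of the contrast, under stated assumptions.  The class
name ApproxDivideSketch and the use of java.lang.Number are hypothetical.
Converting each operand to double before dividing is only an approximation
of the definition above: for integers too large to be represented exactly
as doubles, the result is not guaranteed to be the double nearest the true
quotient.

```java
import java.math.BigInteger;

final class ApproxDivideSketch {
    // Always yields a double, whatever numeric types come in.
    static double approxDivide(Number x, Number y) {
        return x.doubleValue() / y.doubleValue();
    }

    public static void main(String[] args) {
        // Java's "/" means two unrelated things depending on static type:
        System.out.println(7 / 2);     // 3   (integer truncation)
        System.out.println(7.0 / 2.0); // 3.5 (IEEE division)

        // The sketch gives one meaning for every combination:
        System.out.println(approxDivide(7, 2));                       // 3.5
        System.out.println(approxDivide(BigInteger.valueOf(7), 2.0)); // 3.5
    }
}
```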


>Is it necessary to introduce the extra method calls "isLTZero"
>"isGEQZero" etc. with comparisons?  Can't you just expect the
>compareTo method to return a familiar numeric value that you
>can test directly?

Let's say that compareTo always returned -1, 0, 1, or NaN.  What would you
have "x <= y" expand to?
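
One way to read the question, sketched in Java under assumptions: if the
result of compareTo were tested with another "<=", then "<=" would be
needed to define "<=", so a named predicate on the result ends the regress.
The names compareTo, isLTZero and isGEQZero come from this thread;
isLEQZero is assumed by analogy, and the class name CompareSketch is
hypothetical.

```java
final class CompareSketch {
    // Returns -1.0, 0.0, 1.0, or NaN when the operands are incomparable.
    static double compareTo(double x, double y) {
        if (Double.isNaN(x) || Double.isNaN(y)) {
            return Double.NaN;
        }
        return x < y ? -1.0 : (x > y ? 1.0 : 0.0);
    }

    // Predicates on the comparison result.  Every one is false for NaN,
    // so incomparable operands satisfy none of <, <=, >=.
    static boolean isLTZero(double c)  { return c < 0.0; }
    static boolean isLEQZero(double c) { return c <= 0.0; }
    static boolean isGEQZero(double c) { return c >= 0.0; }

    // What "x <= y" could expand to in this sketch.
    static boolean lessOrEqual(double x, double y) {
        return isLEQZero(compareTo(x, y));
    }

    public static void main(String[] args) {
        System.out.println(lessOrEqual(1.0, 2.0));        // true
        System.out.println(lessOrEqual(3.0, 2.0));        // false
        System.out.println(lessOrEqual(Double.NaN, 2.0)); // false
    }
}
```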


>I think calling atan2, min, max as methods on a double looks
>pretty weird... min and max in particular are probably best
>provided with a tuple or an unlimited number of arguments,
>aren't they?

Ok, I buy it for min and max.  What's your beef with atan2?  How about cos?


>What's the point of the "yourself" method?  (If anything, i
>would prefer that it be called "self"... i'd like not to personify
>my objects if i can help it.)

I love to personify them, especially Alice, Bob, and Carol!  "yourself" is
used in the implementation of partial-ordering in the network protocol, and
isn't expected to be otherwise useful.  It's called "yourself" only because
Smalltalk has a method with the same definition (used for a completely
different purpose), and they called it "yourself".  I won't call it "self"
because that is the conventional instance variable name by which an object
refers to itself.  Even though message names and variable names are in two
separate namespaces (verbs vs nouns) I like to avoid using the same name
for two different meanings.


>Why would one ever want fileUrl: instead of file:?  Is the syntax
>and behaviour of URIs in the language documented somewhere yet?

"fileUrl:x" would evaluate to a java.net.URL object
http://java.sun.com/products/jdk/1.2/docs/api/java/net/URL.html whereas
"file:x" would evaluate to a java.io.File object
http://java.sun.com/products/jdk/1.2/docs/api/java/io/File.html

I don't know when one would want to say fileUrl:x rather than file:x, but
since I was stealing the "file:" URI protocol specifier from the URL
mechanism, and it's visibly different, I thought I'd provide a substitute.


>Is an Ejector what we now call an escape?

An Ejector is what the escape form binds to the param.  In

	escape return {
		...
		if (...) {
			return(3)
		}
		...
	}

within the curlies, "return" is bound to an Ejector.  What do you think
about renaming Ejector "EscapeHatch"?
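
For readers more at home in Java, the escape form can be imitated with an
exception that unwinds to the matching block.  This is only an analogy
under stated assumptions, not how E implements it: the names EscapeSketch,
escape, Ejector and eject are hypothetical.

```java
import java.util.function.Function;

final class EscapeSketch {
    // Carries the ejected value up to the matching escape block.
    private static final class Ejection extends RuntimeException {
        final Object source;
        final Object value;
        Ejection(Object source, Object value) {
            super(null, null, false, false); // no message, no stack trace
            this.source = source;
            this.value = value;
        }
    }

    // Stand-in for what the escape form binds to its parameter.
    static final class Ejector<T> {
        void eject(T value) {
            throw new Ejection(this, value);
        }
    }

    // Runs body with a fresh Ejector.  Yields the ejected value if the
    // body ejects, otherwise whatever the body returns.
    @SuppressWarnings("unchecked")
    static <T> T escape(Function<Ejector<T>, T> body) {
        Ejector<T> ejector = new Ejector<>();
        try {
            return body.apply(ejector);
        } catch (Ejection e) {
            if (e.source != ejector) {
                throw e; // belongs to some enclosing escape
            }
            return (T) e.value;
        }
    }

    public static void main(String[] args) {
        boolean early = args.length == 0;
        Integer result = EscapeSketch.<Integer>escape(ret -> {
            if (early) {
                ret.eject(3);
            }
            return 0;
        });
        System.out.println(result);
    }
}
```

Checking the source of the Ejection lets nested escapes unwind to the
right block, which is the property the analogy is meant to show.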



>Does print() accept multiple arguments?  If it does, what does
>it join them with?

No.


>Further to editCopy vs. editcopy: consider basic language calls
>all lowercase, cf. isa, println?

Good point.  They're now "isA" and "printLn".  (just kidding)  Yes, I have
already stepped onto this slippery slope, but editcopy is just too far.  It
hurts my eyes.  At Xanadu, at first we weren't consistent about midCaps,
and it drove us nuts.  Programmers already have to remember too many
exceptions to what should have been simple rules.  Let's stick with a
simple rule for midCaps.


>Is there an easy "x in set" test?  cf. Python
>
>    if platform in ['windows', 'winnt', 'win95']:
>        print "Warning: your computer will probably crash today."

	if (["windows", "winnt", "win95"] asSet containsKey(platform)) {

>Python even has special parsing so you can write:
>
>    if x not in [1, 2, 3]:
>        ...

	if (! [1, 2, 3] asSet containsKey(x)) {


>Also, Python permits chaining comparison operators so you
>can do a quick-n-easy range test:
>
>    if 3 < x < y <= 8:
>        ...

I do like this.  How do they do it?  What's the BNF?


>If there isn't any other way to know whether a particular key
>is associated with a value in a mapping, i would suggest, say,
>"haskey" or something short instead of "containsKey" to avoid
>a lot of verbiage when checking keys to avoid exceptions.

I will continue to support containsKey, as Java folk expect it.  For
reasons stated earlier, I hate "haskey", and "hasKey" isn't a big enough
improvement to introduce it *in addition to* containsKey.  However, if a
good short single word can be found....?


	Thanks for wonderful questions & suggestions!
	--MarkM