[E-Lang] Operators #2: Comparison
Mark S. Miller
markm@caplet.com
Fri, 06 Apr 2001 18:16:12 -0700
On Python page http://www.python.org/doc/current/ref/customization.html , it
seems the only relevant expansion name is "__cmp__". This seems very
similar to E's current expansion of "<", "<=", "<=>", ">=", and ">"
involving "compareTo", except for something I don't understand.
In E currently:
x < y x compareTo(y) belowZero()
x <= y x compareTo(y) atMostZero()
x <=> y x compareTo(y) isZero()
x >= y x compareTo(y) atLeastZero()
x > y x compareTo(y) aboveZero()
How does Python expand these? Presumably it uses "__cmp__" rather than
"compareTo". But instead of the five messages on the right, the only thing
I could find is "__nonzero__", which doesn't help.
General Meaning:
When x and y, after coercions, are both members of a (type? field? group?
help me out here folks) T that has its own defined "natural" order (which must
be an irreflexive partial order), then
x op__cmp(y)
should return either an integer or a float64, which therefore must either be
negative, zero, positive, or NaN, indicating that x < y, x <=> y, x > y, or
x is incomparable to y, respectively, in this natural order. (Note that NaN
responds to all five zero-comparison queries with false.)
Specific Meanings:
Scalars
* On integers and chars: the obvious meaning. integers are fully ordered, so
op__cmp can respond with an integer.
* On booleans: an error. I don't think it's particularly clarifying to say
that false < true, so op__cmp isn't defined on booleans.
* On float64s: NaNs are incomparable to all float64s including themselves,
and so are also irreflexive.
"NaN op__cmp(f) -> NaN" and "f op__cmp(NaN) -> NaN".
All floats other than NaN (including positive and negative infinity) are
fully ordered as expected. When f1 and f2 and not NaNs, then
f1 op__cmp(f2) -> non-NaN according to this full order. Note the following
two disagreements between E-equality (sameness) and equivalence among float64s:
NaN == NaN vs NaN op__cmp(NaN) -> NaN, so !(NaN <=> NaN)
NaN is the same as itself, but is incomparable to itself.
-0.0 != 0.0 vs -0.0 <=> 0.0
-0.0 is not the same as 0.0, but is equivalent to 0.0
Subset-ness
* On Java classes (and by extension perhaps, on declared E types, should we
ever declare such things, as we probably will): x <= y means that x is a
declared subtype of y. If y is a class, then x is y or a class that extends
a class that's <= y. If y is an interface, then x is y or x is a class or
interface that extends or implements a class that's <= y. x < y means that
x is a declared strict subtype of y, that x < y && !(x <=> y).
* On regions: x <= y means that the set of members of region x is a subset
of the set of members of region y.
* On sets: x <= y means x is a subset of y.
Lists
I think this issue breaks my attempt to unify lists and multisets. Lists
should be ordered according to a lexicographic ordering of their elements.
While this is pleasant in general, it's only compelling in one case,
Strings, but here it seems we have an inescapable conflict. For strings,
x <= y should be by lexicographic ordering on the character codes. If Strings
are also multisets of their characters, then we'd have a conflicting claim
on the meaning of x <= y -- that the characters in x are a subset of the
characters in y. Not only is this a conflict, but the latter meaning is
sufficiently strange that I hereby withdraw my suggested unification of list
and multiset.
So, on lists: x op__cmp(y) compares them lexicographically -- compares
their elements pairwise left to right until it encounters the first pair
that aren't equivalent. The answer to the overall compareTo is negative,
positive of NaN, exactly as the first non-equivalence is one of these. If
the two lists are of equal length and everywhere pairwise equivalent, then
the two lists are equivalent. (When the only disagreement is length, we
generalize the answer from String. What is it?)
Maps
It is plausible either to not define op__cmp on Maps, or to define it
according to the keys of the Map considered as a set. After some voice
conversations with Dean, I'm inclined toward the first answer: not to define
it on Maps. If you want to operate on the keys of the maps as sets, ask the
maps for their keys.