Happy Happy Joy Joy (was: On to Hydro)
Mark S. Miller
markm@caplet.com
Sat, 19 Aug 2000 10:34:19 -0700
Believe it or not, here's a simple solution that satisfies all the
constraints, including
1) support for sort algorithms that require a strict weak ordering (which
I'll call a "full-ordering" below),
2) support for IEEE ordering (which I'll call an "irreflexive partial
ordering"), and
3) only one set of comparison operators in E that satisfy all the normal
equations.
4) normal E programmers need never understand most of these fine
distinctions.
The key was an observation of Tyler's:
MarkM wrote:
>> Also, since floating point values are only partially
>> ordered, your proposal would have floating point values not
>> respond to "<". This is unacceptable.
Tyler wrote:
>Huh? With the exception of NaN, floating point values have a total
>ordering.
Given a set (such as floating point values) of elements that one may compare
with each other, the comparison operator must satisfy at least the minimum
acceptable ordering contract, ie, the universal contract. Since we want to
accept IEEE floating point values as such a set, and since we're confident
it's the worst case, we define the universal contract as an "irreflexive
partial ordering", which we define as a partial ordering in which an element
must either be equivalent or incomparable to itself. (NaN is incomparable to
itself.)
Restating Tyler's observation, for any pair taken from that set, a and b,
either that pair is fully ordered or it's incomparable. (This holds even
when a and b are the same.)
Now let's say that we have a pair-sorter function that accepts only fully
ordered pairs and returns the pair sorted into this full order. If this
function is handed an incomparable pair, it must throw an exception rather
than returning a pair.
define pairFullSort(a, b) :any {
define comparison := a compareTo(b)
if (comparison atMostZero) {
[a, b]
} else if (comparison isNaN) {
throw("incomparable")
} else { # must be aboveZero
[b, a]
}
}
Note that this would have been a bit harder to understand if written with the
relational operators. compareTo/1 saves the day!
Generalizing from pairs, Let's say we wish to write a sort function that
takes a set of elements and, if they are fully ordered, returns a sort of
them, but otherwise throws an exception. Let's call this a full-sort. (We
leave aside for now whether the library should be based on full-sorts, as
Tyler effectively advocates, or partial-sorts, as Dean advocates.)
Different comparison-based full-sort algorithms will make a different subset
of all possible comparisons of the elements. However, if the set of
elements is fully ordered, any comparison based full-sort algorithm must
make enough comparisons to establish what this full ordering is, and
therefore enough to establish that we do have a full ordering.
In order to blow up on a non-full ordering, all these full-sort functions
need to do is use compareTo as their comparator and blow up if the answer
isNaN. By blowing up, they are fail-stop on the first evidence of
non-full-ordering, and so any existing comparison-based full-sort algorithm
may be safely enhanced with this test. Of course, if the algorithm works by
modifying a data structure in place, when blowing up, either these mods need
to be unwound or (more likely) the modified data structure needs to be
thrown away.
From a Java perspective, if we have a type whose elements we statically know
to be fully ordered, we can represent this knowledge statically simply by
having that type's compareTo be declared as returning int, as is already the
case for all compareTo's in the Java standard libraries. (In Java >= 1.2,
they made these compareTo's Java-polymorphic as well by introducing
java.lang.Comparable.)
If we wish to reduce the overhead of full-sorting by not paying the price of
the isNaN test on types we statically know are fully-ordered, we can do the
"case-splitting" optimization -- test early which case we're in and execute
a version of the code specialized for this case. In this case, the general
full-sort must test isNaN, but the specialized full-sort does not. The
wrapper full-sort chooses one or the other based on the declared return type
of the comparator.
Happy Happy Joy Joy!
--MarkM