Happy Happy Joy Joy (was: On to Hydro)

Mark S. Miller markm@caplet.com
Sat, 19 Aug 2000 10:34:19 -0700


Believe it or not, here's a simple solution that satisfies all the 
constraints, including 

1) support for sort algorithms that require a strict weak ordering (which 
     I'll call a "full-ordering" below), 
2) support for IEEE ordering (which I'll call an "irreflexive partial 
     ordering"), and 
3) only one set of comparison operators in E that satisfy all the normal 
     equations.
4) normal E programmers need never understand most of these fine 
     distinctions.

The key was an observation of Tyler's:

MarkM wrote:
 >>  Also, since floating point values are only partially
 >> ordered, your proposal would have floating point values not
 >> respond to "<".  This is unacceptable.

Tyler wrote:
 >Huh? With the exception of NaN, floating point values have a total
 >ordering. 

Given a set (such as floating point values) of elements that one may compare 
with each other, the comparison operator must satisfy at least the minimum 
acceptable ordering contract, ie, the universal contract.  Since we want to 
accept IEEE floating point values as such a set, and since we're confident 
it's the worst case, we define the universal contract as an "irreflexive 
partial ordering", which we define as a partial ordering in which an element 
must either be equivalent or incomparable to itself. (NaN is incomparable to 
itself.)

Restating Tyler's observation, for any pair taken from that set, a and b, 
either that pair is fully ordered or it's incomparable.  (This holds even 
when a and b are the same.)

Now let's say that we have a pair-sorter function that accepts only fully 
ordered pairs and returns the pair sorted into this full order.  If this 
function is handed an incomparable pair, it must throw an exception rather 
than returning a pair.

     define pairFullSort(a, b) :any {
         define comparison := a compareTo(b)
         if (comparison atMostZero) {
             [a, b]
         } else if (comparison isNaN) {
             throw("incomparable")
         } else { # must be aboveZero
             [b, a]
         }
     }

Note that this would have been a bit harder to understand if written with the 
relational operators.  compareTo/1 saves the day!

Generalizing from pairs, Let's say we wish to write a sort function that 
takes a set of elements and, if they are fully ordered, returns a sort of 
them, but otherwise throws an exception.  Let's call this a full-sort.  (We 
leave aside for now whether the library should be based on full-sorts, as 
Tyler effectively advocates, or partial-sorts, as Dean advocates.)

Different comparison-based full-sort algorithms will make a different subset 
of all possible comparisons of the elements.  However, if the set of 
elements is fully ordered, any comparison based full-sort algorithm must 
make enough comparisons to establish what this full ordering is, and 
therefore enough to establish that we do have a full ordering.

In order to blow up on a non-full ordering, all these full-sort functions 
need to do is use compareTo as their comparator and blow up if the answer 
isNaN.  By blowing up, they are fail-stop on the first evidence of 
non-full-ordering, and so any existing comparison-based full-sort algorithm 
may be safely enhanced with this test.  Of course, if the algorithm works by 
modifying a data structure in place, when blowing up, either these mods need 
to be unwound or (more likely) the modified data structure needs to be 
thrown away.

 From a Java perspective, if we have a type whose elements we statically know 
to be fully ordered, we can represent this knowledge statically simply by 
having that type's compareTo be declared as returning int, as is already the 
case for all compareTo's in the Java standard libraries.  (In Java >= 1.2, 
they made these compareTo's Java-polymorphic as well by introducing 
java.lang.Comparable.)

If we wish to reduce the overhead of full-sorting by not paying the price of 
the isNaN test on types we statically know are fully-ordered, we can do the 
"case-splitting" optimization -- test early which case we're in and execute 
a version of the code specialized for this case.  In this case, the general 
full-sort must test isNaN, but the specialized full-sort does not.  The 
wrapper full-sort chooses one or the other based on the declared return type 
of the comparator.

     Happy Happy Joy Joy!
     --MarkM