[swift-evolution] [Idea] Bringing the partial/total ordering distinction into Comparable

Mon Apr 25 18:06:54 CDT 2016

on Sat Apr 23 2016, Brent Royal-Gordon <swift-evolution at swift.org> wrote:

> Currently, Comparable looks like this:
>
> 	public protocol Comparable : Equatable {
> 	  /// A [strict total order](http://en.wikipedia.org/wiki/Total_order#Strict_total_order)
> 	  /// over instances of `Self`.
> 	  @warn_unused_result
> 	  func < (lhs: Self, rhs: Self) -> Bool
>
> 	  @warn_unused_result
> 	  func <= (lhs: Self, rhs: Self) -> Bool
>
> 	  @warn_unused_result
> 	  func >= (lhs: Self, rhs: Self) -> Bool
>
> 	  @warn_unused_result
> 	  func > (lhs: Self, rhs: Self) -> Bool
> 	}
>
> Simple and straightforward, but not actually accurate. In a strict
> total order, all elements are ordered, but that's not true of the
> current Comparable. For instance, floating-point NaNs are not ordered.
>
> The FloatingPoint proposal (SE-0067,
> <https://github.com/apple/swift-evolution/blob/master/proposals/0067-floating-point-protocols.md>)
> suggests that Comparable's requirements should be weakened so that
> only "normal" members of types need to be ordered, while "exceptional"
> members like NaNs are permitted to violate the rules. In practice,
> though, this ends up making algorithms give bizarre and incorrect
> results:
>
> 	Welcome to Apple Swift version 2.2 (swiftlang-703.0.18.1 clang-703.0.29). Type :help for assistance. 
> 	  1> let numbers = Array(0.0.stride(to: 1.0, by: 0.2)) + [.NaN] + Array(1.0.stride(to: 0.0, by: -0.25)) 
> 	numbers: [Double] = 10 values {
> 	  [0] = 0
> 	  [1] = 0.20000000000000001
> 	  [2] = 0.40000000000000002
> 	  [3] = 0.60000000000000009
> 	  [4] = 0.80000000000000004
> 	  [5] = NaN
> 	  [6] = 1
> 	  [7] = 0.75
> 	  [8] = 0.5
> 	  [9] = 0.25
> 	}
> 	  2> numbers.sort()
> 	$R1: [Double] = 10 values {
> 	  [0] = 0
> 	  [1] = 0.20000000000000001
> 	  [2] = 0.40000000000000002
> 	  [3] = 0.60000000000000009
> 	  [4] = 0.80000000000000004
> 	  [5] = NaN
> 	  [6] = 0.25
> 	  [7] = 0.5
> 	  [8] = 0.75
> 	  [9] = 1
> 	}
>
> (Note that the behavior is actually much stranger than simply having the NaN act as a partition of the list—try sorting `Array(0.0.stride(to: 1.0, by: 0.1)) + [.NaN] + Array(0.0.stride(to: 1.0, by: 0.1))` to see what I mean. I'm sure there are sensible implementation reasons why `sort()` behaves this way, but they aren't really relevant to this discussion.)
>
> To address this, FloatingPoint introduces an ad-hoc mechanism: there is a `totalOrder` method in the protocol which actually *does* sort NaNs in a useful way. But because this is ad-hoc, it can't be extended to other, non-floating-point types. And since it's not part of Comparable, the plain `sort()` (well, `sorted()` in Swift 3) method on Sequences of Comparable elements doesn't use it. That's not great; the type system shouldn't lead us astray like this.
>
> I think we should go in the other direction. Rather than weakening Comparable's promises, I think we should instead strengthen and clarify them.
>
> In short, I propose we:
>
> * Introduce a new `<=>` operator which implements a strict total
> ordering on the Comparable type. Rather than returning a `Bool`, it
> returns a new `Order` type which is similar to
> `NSComparisonResult`. This provides a semantic hint that non-ordering
> is not an option for `<=>`.

That has been my plan for some time.

> * Introduce a new `<>` operator which captures the concept of two
> values being unordered relative to one another. For example, `1.0 <>
> .nan` would be `true`.

Cute.  I'm not sure we need this operator, though.  Is this a test you
actually want in real code often enough to make it worth having?

> * Define the friendly comparison operators like `<` and `==` as being
> a partial order covering all values which are not `<>`.  

Natch'.

> * Include default implementations such that you only need to define
> `<=>`; `<>` is always `false` by default, and the other operators call
> through to `<=>` and `<>` to determine the correct values to return.

I think you can do the default implementations of `<`, `>`, `<=`, `>=`,
`==` and `!=` using just `<=>`.  Why involve `<>`?

> * Redefine functions like `sorted(_:)` and `max(_:)` to take a total
> ordering function returning an `Order`, not a partial ordering
> function returning a `Bool`. 

They don't accept partial ordering functions today.

> In other words, you would pass `<=>` instead of `<`.

One problem I haven't come to a conclusion about is the total- vs
strict-weak- ordering question.  For me, this area is where most of the
uncertainty about pursuing something like this lies, and once we get it
sorted out I'd be very happy to move ahead.  I'm sure it's doable but I
just haven't had the time to devote.  In short:

Almost all types can reasonably implement a total order, but almost any
algorithm that can be implemented with just a binary comparison
predicate (sort, binary search, partition, et al) only needs a strict
weak ordering to produce sensible results, because as far as the
ordering is concerned, unordered values appear to be equivalent.

In fact it seems to me that the distinction between a total ordering and
a strict-weak ordering is *only* observable in the presence of an
equality comparison.  When an algorithm takes a single comparison
function (whether it has binary or trinary results) and doesn't
additionally require Equatable conformance, there is no reason
whatsoever to require a total ordering, and you can easily argue that it
is meaningless to do so.

[If you're reading along and wondering what all these kinds of orderings are
about, I once figured that out and wrote a blog post about it:
http://web.archive.org/web/20120422220137/http://cpp-next.com/archive/2010/02/order-i-say/]

> Here's what the new `Comparable` might look like:
>
> 	public enum Order {
> 	  case firstEarlier
> 	  case bothEqual
> 	  case firstLater
> 	}
>
> 	/// Instances of conforming types can be compared using relational operators.
> 	/// 
> 	/// Comparable includes both a total order, which sorts all possible values, 
> 	/// and a partial order, which compares only "normal" or "common" values.
> 	/// The partial order may consider some elements "unordered" and return `false` 
> 	/// for all operations.
> 	/// 
> 	/// The `<=>` operator implements the total order; the others implement the 
> 	/// partial order. You may define only the total order, and `Comparable` will 
> 	/// provide default implementations which use it. You may also define both the 
> 	/// `<=>` operator and the `<>` "unordered" operator, and Comparable will 
> 	/// provide default implementations for the rest of the partial order which them. 
> 	/// You may also choose to implement the `<`, `>`, `<=`, `>=`, `==`, and 
> 	/// `!=` operators to completely customize the implementation.
> 	public protocol Comparable : Equatable {
> 	  /// A [total order](http://en.wikipedia.org/wiki/Total_order#Strict_total_order)
> 	  /// over instances of `Self`. In a total order, no element is permitted to be 
> 	  /// unordered relative to any other.
> 	  @warn_unused_result
> 	  func <=> (lhs: Self, rhs: Self) -> Order
>
> 	  /// Returns `true` if, to partial order operators like `<` and `==`, `lhs` is 
> 	  /// unordered relative to `rhs`.
> 	  @warn_unused_result
> 	  func <> (lhs: Self, rhs: Self) -> Bool
>
> 	  /// Returns `true` if `lhs` is less than `rhs`. Should be consistent with `<=>` except
> 	  /// when the elements are unordered relative to each other.
> 	  @warn_unused_result
> 	  func < (lhs: Self, rhs: Self) -> Bool
>
> 	  /// Returns `true` if `lhs` is greater than `rhs`. Should be consistent with `<=>` except
> 	  /// when the elements are unordered relative to each other.
> 	  @warn_unused_result
> 	  func > (lhs: Self, rhs: Self) -> Bool
>
> 	  /// Returns `true` if `lhs` is less than or equal to `rhs`. Should be consistent with `<=>` 
> 	  /// except when the elements are unordered relative to each other.
> 	  @warn_unused_result
> 	  func <= (lhs: Self, rhs: Self) -> Bool
>
> 	  /// Returns `true` if `lhs` is greater than or equal to `rhs`. Should be consistent with `<=>` except
> 	  /// when the elements are unordered relative to each other.
> 	  @warn_unused_result
> 	  func >= (lhs: Self, rhs: Self) -> Bool
> 	}
>
> Some APIs on Order which might be useful:
>
> 	public extension Order {
> 	  /// Returns the equivalent order for the two arguments reversed.
> 	  func reversed() -> Order {…}
> 	  /// Returns `x` and `y` reordered according to `self`, with the earlier one first.
> 	  func reorder<T>(_ x: T, _ y: T) -> (T, T) {…}
> 	  /// Returns `x` and `y` reordered with the earlier one first.
> 	  static func reorder<T: Comparable>(_ x: T, _ y: T) -> (T, T) {…}
> 	}
>
> Alternate designs:
>
> * The `<>` operator is arguably not very obvious, or too confusable
> with some languages' use of that operator for "not equals". It could
> instead be a different operator, an instance method, or a class
> method.
>
> * It might make sense to instead use `<>` to say "is comparable" and
> `!<>` to say "is incomparable".
>
> * It may also be better to define Comparable such that certain
> *values* are incomparable with any value, rather than certain *pairs*
> of values being incomparable. If so, we would want an `isIncomparable`
> property instead of a method or function. That works for
> `FloatingPoint`, but it might not suit other types. (For instance,
> with the `<>` operator in place, `String.Index` could be made
> incomparable with indices from other strings, but all `String.Index`es
> would still have a total order. That design wouldn't be possible with
> an `isIncomparable` property.)
>
> * The `<=>` operator is common from other languages, but it might
> still be too jargony. One interesting design for this would be to
> expose the total order as a method on `Comparable` which is used as an
> implementation hook for an `Order.init(_:_:)` initializer.
>
> * The cases of Order are highly bikesheddable. I like these names more
> than `ascending` and `descending` because I have an easier time
> understanding what they mean, but others might disagree.
>
> * I'm also toying with the idea that the partial order, which includes
> `==`, may have a looser definition of equality than the total order;
> this would mean that, for instance, `String`'s total order could fall
> back to `UnicodeScalar.value` comparison to distinguish between
> strings which have equal graphemes. I'm not sure how useful that would
> be in practice, though.
>
>
> Any thoughts?

-- 
Dave