[swift-evolution] [pitch] Comparison Reform

Sat Apr 15 15:12:46 CDT 2017

on Thu Apr 13 2017, Xiaodi Wu <swift-evolution at swift.org> wrote:

> Getting this sorted out is definitely a worthwhile effort. I do have
> thoughts about this proposal:
>
> I continue to have reservations about an identical spelling (e.g. `==`)
> giving two different answers with the same values of the same type,
> depending on the generic context. It is a very *clever* design, but it is
> also a very *subtle* behavior that I can see leading to much confusion and
> befuddlement for any user who is not well versed *both* in the intricacies
> of IEEE floating point *and* in the intricacies of Swift.

I can't help but think that the concern over confusion here is not
informed by any realistic situations.  Please describe

> Actually, the fact that this behavior cannot even be achieved without
> currently non-existent compiler features means that it is not possible
> to understand what's truly going on without reading *this document*,

This doesn't seem like a reasonable argument.  New compiler features get
documented outside of the proposals they come from.  Nobody's going to
have to go read a proposal to understand what @implements means.

> after mastering *both* IEEE floating point *and* Swift
> generics/protocols/extensions/static vs. dynamic dispatch. All to use
> `==` correctly. 

I don't understand this argument.  The *whole* point of this proposal is
that you use `==` correctly *without* the need for any special knowledge.

> Which is to say, most people will simply not even know if they happen
> to be using the `==` they did not intend to use.

Most people aren't aware that IEEE comparison is quirky and don't know
what they intend with respect to those semantics, but anyone who *is*
aware of his intention has an easy way to understand what's happening.
Does this code know it's operating on floating point numbers?  If so,
it's IEEE.  If not, it's an equivalence relation.

> I think consideration should be given to a design that achieves a
> user-facing but not onerous differentiation between level 1 and level 2
> equality. However, the only one I can think of is essentially a different
> shade of the `PartiallyComparable` alternative already outlined in the
> document. 

I am *deeply* opposed to such a protocol.  It is purely syntactic in
nature and thus useless for correctly constraining generic algorithms.
People will use it anyway, resulting in algorithms that statically
accept, but are incorrect for, floating point.  In my opinion there's
only one feasible answer that doesn't use the static/dynamic distinction
we've proposed: throw IEEE semantics under the bus, making it available
only under a different syntax.

This would be a drastic move whose primary downside is that floating
point code ported from C would need to be carefully audited and may
become less easy to read.  But at least it would be viable.

> Yet I cannot help but think that the rejected alternative may be
> advantageous in one key aspect. `FloatingPoint` comparison is in a
> sense "less refined" (not exactly precise language, I know) than the
> level 2 ordering proposed here, at least in the sense that the latter
> offers more semantic guarantees about the relationships between
> comparison operators.  

No, they are effectively refinement siblings.  If it was a hierarchical
relationship, we could just make the floating point types conform to
Equatable.  But floating point types are *required* by IEEE to have
comparison semantics that conflict with Equatable.

> It's weird that the less refined `FloatingPoint` refines the more
> refined `Comparable`, 

Do you want to be able to sort floating point numbers without providing
a comparison predicate (one that has to be spelled less obviously than
"<")?  If so, floating point numbers must be Comparable.  If not, we
could talk about breaking this refinement relationship.

> and I think the acrobatics with compiler support illustrate how the
> design is actively working against Swift's overarching direction.

It's not more acrobatic than things we already do in the standard
library to ensure that people transparently see the right behavior (see
.lazy), and we could probably even find ways to do it without language
features.  It would be less principled, harder to understand, and more
fragile than designing a language feature that addresses the need
directly.

> Anyway, so much for that about the design overall.
>
> As for nitpicking details, I agree with others and think it's a poor
> precedent to say that we're going to use an Obj-C name because it's not
> clearly worse than the obvious Swift API guideline-compliant name. When
> that's the case, it seems to me that the whole point of having Swift API
> naming guidelines and making that huge migration from Swift 2 to 3 was so
> that going forward we could name things consistently using one consensus
> style. `compare(_:)` does not merit a term-of-art exception when the Swift
> name is clearly `compared(to:)`.
>
> On Thu, Apr 13, 2017 at 3:17 PM, Ben Cohen via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>>
>> Hi swift evolution,
>>
>> Here’s another pitch, for The Propoosal Formerly Known As Spaceship.
>> Comparison Reform
>>
>>    - Proposal: SE-NNNN
>>    - Authors: Robert Widmann <https://github.com/codafi>, Jaden Geller
>>    <https://github.com/jadengeller>, Harlan Haskins
>>    <https://github.com/harlanhaskins>, Alexis Beingessner
>>    <https://github.com/Gankro>, Ben Cohen
>>    <https://github.com/airspeedswift>
>>    - Status: *Awaiting review*
>>    - Review manager: TBD
>>
>> Introduction
>>
>> This proposal is for changes that we believe should be made to the
>> existing comparison system by:
>>
>>    - Making FloatingPoint comparison context sensitive, so that its
>>    Comparable conformance provides a proper total ordering.
>>    - Introducing a new ternary-valued compare(_ other: Self) ->
>>    ComparisonResult method.
>>    - Removing unnecessary customization points from Comparable.
>>
>> Motivation
>>
>> The motivation comes from several independent points:
>>
>> 1: The standard comparison operators have an intuitive meaning to
>> programmers. Swift encourages encoding that in an implementation of
>> Comparable that respects the rules of a total order
>> <https://en.wikipedia.org/wiki/Total_order>. The standard library takes
>> advantage of these rules to provide consistent implementations for sorting
>> and searching generic collections of Comparable types.
>>
>> Not all types behave so well in this framework, unfortunately. There are
>> cases where the semantics of a total order cannot apply while still
>> maintaining the traditional definition of “comparison” for these types.
>> Take, for example, sorting an array of Floats. Today, Float’s instance of
>> Comparable follows IEEE-754 and returns false for all comparisons of NaN.
>> In order to sort this array, NaN s are considered outside the domain of <,
>> and the order of a sorted array containing them is unspecified. Similarly,
>> a Dictionary keyed off floats can leak entries and memory.
>>
>> 2: Generic algorithms in the Swift Standard Library that make use of the
>> current Comparable protocol may have to make extra comparisons to
>> determine the ordering of values when <, ==, and > should have different
>> behaviours. Having a central operation to return complete ordering
>> information should provide a speedup for these operations.
>>
>> 3: The existing comparison operators don’t “generalize” well. There’s no
>> clean way to add a third or fourth argument to < to ask for non-default
>> semantics. An example where this would be desirable would be specifying the
>> locale or case-sensitivity when comparing Strings.
>>
>> 4: Comparable is over-engineered in the customization points it provides:
>> to our knowledge, there’s no good reason to ever override >=, >, or <=.
>> Each customization point bloats vtables and mandates additional dynamic
>> dispatch.
>>
>> 5: When quickly writing a Comparable type, it is easier to implement a
>> single ternary statement than to separately implement == and <.
>> Proposed solutionComparisonResult
>>
>> Foundation’s ComparisonResult type will be mapped into Swift as
>>
>> @objc public enum ComparisonResult : Int {
>>   case orderedAscending = -1
>>   case orderedSame = 0
>>   case orderedDescending = 1
>> }
>>
>> Comparable
>>
>> Comparable will be changed to have a new ternary comparison method: compare(_
>> other: Self) -> ComparisonResult. x.compare(y) specifies where to place x
>> relative to y. So if it yields .orderedAscending, then x comes before y.
>> This will be considered the new “main” dispatch point of Comparable that
>> implementors should provide.
>>
>> Most code will continue to use < or ==, as it will be optimal for their
>> purposes. However code that needs to make a three-way branch on comparison
>> can use the potentially more efficient compare. Note that compare is only
>> expected to be more efficient in this specific case. If a two-way branch is
>> all that’s being done, < will be more efficient in many cases (if only
>> because it’s easier for the optimizer).
>>
>> For backwards compatibility reasons, compare will have a default
>> implementation defined in terms of <, but to enable only using compare, <
>>  and == will also have default implementations in terms of compare.
>>
>> The compiler will verify that either compare, or < and ==, are provided
>> by every type that claims to conform to Comparable. This will be done in
>> some unspecified way unavailable outside the standard library (it can be
>> made available to in the future, but that’s an unnecessary distraction for
>> this proposal).
>>
>> Types that wish to provide comparison “variants” can do so naturally by
>> adding compare methods with additional arguments. e.g. String.compare(_
>> other: Self, in: Locale) -> ComparisonResult. These have no
>> language-level connection to Comparable, but are still syntactically
>> connected, implying the same total order semantics. This makes them easier
>> to discover, learn, and migrate to.
>>
>> To reduce bloat, the operators <=, >=, and > will be removed from the set
>> of requirements that the Comparable protocol declares. These operators
>> will however continue to exist with the current default implementations.
>> FloatingPoint
>>
>> No changes will be made to the FloatingPoint protocol itself. Instead,
>> new extensions will be added to it to change the behaviour of comparison.
>>
>> The new behaviour centers around the fact that compare(_: Self) ->
>> ComparisonResult will provide a total ordering that’s consistent with
>> Level 2 in the IEEE 754 (2008) spec. This is mostly the same as the
>> standard (Level 1) IEEE ordering, except:
>>
>>    - -0 < +0
>>    - NaN == NaN
>>    - NaN > +Inf (an arbitrary choice, NaN can be placed anywhere in the
>>    number line)
>>
>> Level 2’s distinguishing of -0 and +0 is a bit strange, but is consistent
>> with Equatable’s Substitutability requirement. -0 and +0 have different
>> behaviours: 1/-0 = -Inf while 1/+0 = +Inf. The main problem this can lead
>> to is that a keyed collection may have two “0” entries. In practice this
>> probably won’t be a problem because it’s fairly difficult for the same
>> algorithm to produce both -0 and +0. Any algorithm that does is also
>> probably concerned with the fact that 1.0E-128 and 2.0E-128 are
>> considered distinct values.
>>
>> Note: IEEE specifies several other potential total orderings: level 3,
>> level 4, and the totalOrder predicate. For our purposes, these orderings
>> are too aggressive in distinguishing values that are semantically
>> equivalent in Swift. For most cases, the relevant issue is that they
>> distinguish different encodings of NaN. For more exotic encodings that
>> don’t guarantee normalization, these predicates also consider 10.0e0 <
>> 1.0e1 to be true. An example where this can occur is *IEEE-754 decimal
>> coded floating point*, which FloatingPoint is intended to support.
>>
>> We will then make the comparison operators (<, <=, ==, !=, >=, >)
>> dispatch to one of compare(_:) or FloatingPoint’s IEEE comparison methods
>> (isLess, isEqual, isLessThanOrEqualTo) based on the context.
>>
>>    - If the context knows the type is FloatingPoint, then level 1
>>    ordering will be used.
>>    - If the context only knows the type is Comparable or Equatable, then
>>    level 2 ordering will be used.
>>
>> This results in code that is explicitly designed to work with
>> FloatingPoint types getting the expected IEEE behaviour, while code that is
>> only designed to work with Comparable types (e.g. sort and Dictionary)
>> gets more reasonable total ordering behaviour.
>>
>> To clarify: Dictionary and sort won’t somehow detect that they’re being
>> used with FloatingPoint types and use level 1 comparisons. Instead they
>> will unconditional use level 2 behaviour. For example:
>>
>> let nan = 0.0/0.0
>> func printEqual<T: Equatable>(_ x: T, _ y: T) {
>>   print(x == y)
>> }
>> func printEqualFloats<T: FloatingPoint>(_ x: T, _ y: T) {
>>   print(x == y)
>> }
>> print(nan == nan)          // false, (concrete)
>> printEqual(nan, nan)       // true,  (generic Equatable but not FloatingPoint)
>> printEqualFloats(nan, nan) // false, (generic FloatingPoint)
>>
>> If one wishes to have a method that works with all Equatable/Comparable
>> types, but uses level 1 semantics for FloatingPoint types, then they can
>> simply provide two identical implementations that differ only in the bounds:
>>
>> let nan = 0.0/0.0
>> func printEqual<T: Equatable>(_ x: T, _ y: T) {
>>   print(x == y)
>> }
>> func printEqual<T: FloatingPoint>(_ x: T, _ y: T) {
>>   print(x == y)
>> }
>>
>> printEqual(0, 0)           // true (integers use `<T: Equatable>` overload)
>> printEqual(nan, nan)       // false (floats use `<T: FloatingPoint>` overload)
>>
>> As a result of this change, hashing of floats must be updated to make all
>> NaNs hash equally. -0 and +0 will also no longer be expected to hash
>> equally. (Although they might as an implementation detail – equal values
>> must hash the same, unequal values *may* hash the same.)
>> Misc Standard Library
>>
>> Types that conform to Comparable should be audited for places where
>> implementing or using Comparable would be a win. This update can be done
>> incrementally, as the only potential impact should be performance. As an
>> example, a default implementation of compare(_:) for Array will likely be
>> suboptimal, performing two linear scans to determine the result in the
>> worst-case. (See the default implementation provided in the detailed
>> design.)
>>
>> Some free functions will have <T: FloatingPoint> overloads to better
>> align with IEEE-754 semantics. This will be addressed in a follow-up
>> proposal. (example: min and max)
>> Detailed Design
>>
>> The protocols will be changed as follows:
>>
>> ComparisonResult, currently a type found in Foundation, will be sunk into
>> the Swift Standard Library:
>>
>> @objc public enum ComparisonResult: Int, Equatable {
>>   case orderedAscending = -1
>>   case orderedSame = 0
>>   case orderedDescending = 1
>> }
>> public protocol Comparable: Equatable {
>>   func compare(_ other: Self) -> ComparisonResult
>>
>>   static func < (lhs: Self, rhs: Self) -> Bool
>> }
>> extension Comparable {
>>   func compare(_ other: Self) -> ComparisonResult {
>>     if self == other {
>>       return .orderedSame
>>     } else if self < other {
>>       return .orderedAscending
>>     } else {
>>       return .orderedDescending
>>     }
>>   }
>> }
>> public func < <T: Comparable>(lhs: T, rhs: T) -> Bool {
>>   return lhs.compare(rhs) == .orderedAscending
>> }
>> // IEEE comparison operators (these implementations already exist in std)extension FloatingPoint {
>>   public static func == (lhs: T, rhs: T) -> Bool {
>>     return lhs.isEqual(to: rhs)
>>   }
>>
>>   public static func < (lhs: T, rhs: T) -> Bool {
>>     return lhs.isLess(than: rhs)
>>   }
>>
>>   public static func <= (lhs: T, rhs: T) -> Bool {
>>     return lhs.isLessThanOrEqualTo(rhs)
>>   }
>>
>>   public static func > (lhs: T, rhs: T) -> Bool {
>>     return rhs.isLess(than: lhs)
>>   }
>>
>>   public static func >= (lhs: T, rhs: T) -> Bool {
>>     return rhs.isLessThanOrEqualTo(lhs)
>>   }
>> }
>>
>> // Comparable comparison operators (provides a total ordering)extension FloatingPoint {
>>   @_inline
>>   public func compare(_ other: Self) -> ComparisonResult {
>>     // Can potentially be implemented more efficiently -- this is just the clearest version
>>     if self.isLess(than: other) {
>>       return .orderedAscending
>>     } else if other.isLess(than: self) {
>>       return .orderedDescending
>>     } else {
>>       // Special cases
>>
>>       // -0 < +0
>>       if self.isZero && other.isZero {
>>         // .plus == 0 and .minus == 1, so flip ordering to get - < +
>>         return (other.sign as Int).compare(self.sign as Int)
>>       }
>>
>>       // NaN == NaN, NaN > +Inf
>>       if self.isNaN {
>>         if other.isNaN {
>>           return .orderedSame
>>         } else {
>>           return .orderedDescending
>>         }
>>       } else if other.isNaN {
>>         return .orderedAscending
>>       }
>>
>>       // Otherwise equality agrees with normal IEEE
>>       return .orderedSame
>>     }
>>   }
>>
>>   @_implements(Equatable.==)
>>   public static func _comparableEqual(lhs: Self, rhs: Self) -> Bool {
>>     lhs.compare(rhs) == .orderedSame
>>   }
>>
>>   @_implements(Comparable.<)
>>   public static func _comparableLessThan(lhs: Self, rhs: Self) -> Bool {
>>     lhs.compare(rhs) == .orderedDescending
>>   }
>> }
>>
>> Note that this design mandates changes to the compiler:
>>
>>    - @_implements (or an equivalent mechanism) must be implemented to get
>>    the context-sensitive FloatingPoint behaviour.
>>    - The compiler must verify that either == and <, or compare(_:) is
>>    overridden by every type that conforms to Comparable.
>>
>> Source compatibility
>>
>> Users of ComparisonResult will be able to use it as normal once it
>> becomes a standard library type.
>>
>> Existing implementors of Comparable will be unaffected, though they
>> should consider implementing the new compare method as the default
>> implementation may be suboptimal.
>>
>> Consumers of Comparable will be unaffected, though they should consider
>> calling the compare method if it offers a performance advantage for their
>> particular algorithm.
>>
>> Existing implementors of FloatingPoint should be unaffected – they will
>> automatically get the new behaviour as long as they aren’t manually
>> implementing the requirements of Equatable/Comparable.
>>
>> Existing code that works with floats may break if it’s relying on some
>> code bounded on <T: Equatable/Comparable>providing IEEE semantics. For
>> most algorithms, NaNs would essentially lead to unspecified behaviour, so
>> the primary concern is whether -0.0 == +0.0 matters.
>> ABI stability
>>
>> This must be implemented before ABI stability is declared.
>> Effect on API resilience
>>
>> N/A
>> Alternatives ConsideredSpaceship
>>
>> Early versions of this proposal aimed to instead provide a <=> operator
>> in place of compare. The only reason we moved away from this was that it
>> didn’t solve the problem that comparison didn’t generalize.
>>
>> Spaceship as an operator has a two concrete benefits over compare today:
>>
>>    - It can be passed to a higher-order function
>>    - Tuples can implement it
>>
>> In our opinion, these aren’t serious problems, especially in the long term.
>>
>> Passing <=> as a higher order function basically allows types that aren’t
>> Comparable, but do provide <=>, to be very ergonomically handled by
>> algorithms which take an optional ordering function. Types which provide
>> the comparable operators but don’t conform to Comparable are only pervasive
>> due to the absence of conditional conformance. We shouldn’t be designing
>> our APIs around the assumption that conditional conformance doesn’t exist.
>>
>> When conditional conformance is implemented, the only
>> should-be-comparable-but-aren’t types that will remain are tuples, which
>> we should potentially have the compiler synthesize conformances for.
>>
>> Similarly, it should one day be possible to extend tuples, although this
>> is a more “far future” matter. Until then, the (T, T) -> Bool predicate
>> will always also be available, and < can be used there with the only
>> downside being a potential performance hit.
>> Just Leave Floats Alone
>>
>> The fact that sorting floats leads to a mess, and storing floats can lead
>> to memory leaks and data loss isn’t acceptable.
>> Just Make Floats Only Have A Total Order
>>
>> This was deemed too surprising for anyone familiar with floats from any
>> other language. It would also probably break a lot more code than this
>> change will.
>> Just Make Floats Not Comparable
>>
>> Although floats are more subtle than integers, having places where
>> integers work but floats don’t is a poor state of affairs. One should be
>> able to sort an array of floats and use floats as keys in data structures,
>> even if the latter is difficult to do correctly.
>> PartialComparable
>>
>> PartialComparable would essentially just be Comparable without any stated
>> ordering requirements, that Comparable extends to provide ordering
>> requirements. This would be a protocol that standard IEEE comparison could
>> satisfy, but in the absence of total ordering requirements,
>> PartialComparable is effectively useless. Either everyone would consume
>> PartialComparable (to accept floats) or Comparable (to have reasonable
>> behaviour).
>>
>> The Rust community adopted this strategy to little benefit. The Rust libs
>> team has frequently considered removing the distinction, but hasn’t because
>> doing it backwards compatibly would be complicated. Also because merging
>> the two would just lead to the problems Swift has today.
>> Different Names For compare and ComparisonResult
>>
>> A few different variants for ComparisonResult and its variants were
>> considered:
>>
>>    - Dropping the ordered part of ComparisonResult’s cases e.g. .ascending
>>    - Naming of ComparisonResult as SortOrder
>>    - enum Ordering { case less, equal, greater } (as used by Rust
>>    <https://doc.rust-lang.org/std/cmp/enum.Ordering.html>)
>>    - Case values of inOrder, same, outOfOrder
>>
>> The choice of case names is non-trivial because the enum shows up in
>> different contexts where different names makes more sense. Effectively, one
>> needs to keep in mind that the “default” sort order is ascending to map
>> between the concept of “before” and “less”.
>>
>> The before/after naming to provide the most intuitive model for custom
>> sorts – referring to ascending or less is confusing when trying to
>> implement a descending ordering. Similarly the inOrder/outOfOrder naming
>> was too indirect – it’s more natural to just say where to put the element.
>> If the enum should focus on the sorting case, calling it SortOrder would
>> help to emphasize this fact.
>>
>> This proposal elects to leave the existing Foundation name in-place. The
>> primary motivation for this is that use of the compare function will be
>> relatively rare. It is expected that in most cases users will continue to
>> make use of == or <, returning boolean values (the main exception to this
>> will be in use of the parameterized String comparisons). As such, the
>> source compatibility consequences of introducing naming changes to an
>> existing type seems of insufficient benefit.
>>
>> The method compare(_:) does not fully comport with the API naming
>> guidelines. However, it is firmly established with current usage in
>> Objective-C APIs, will be fairly rarely seen/used (users will usually
>> prefer <, == etc), and alternatives considered, for example compared(to:),
>> were not a significant improvement.
>> Add Overloads for (T, T) -> ComparisonResult
>>
>> It would be slightly more ergonomic to work with ComparisonResult if
>> existing methods that took an ordering predicate also had an overload for (T,
>> T) -> ComparisonResult. As it stands, a case-insensitive sort must be
>> written as follows:
>>
>> myStrings.sort { $0.compare(_ other: $1, case: .insensitive) == .orderedAscending }
>>
>> With the overload, one could write:
>>
>> myStrings.sort { $0.compare($1, case: .insensitive) }
>>
>> we decided against providing these overloads because:
>>
>>    - The existing algorithms in the standard library can’t benefit from
>>    them (only binary comparisons).
>>    - They bloat up the standard library (and any library which intends to
>>    match our API guidelines).
>>    - They potentially introduce confusion over “which” comparison
>>    overload to use.
>>
>> And because we can change our mind later without concern for source or ABI
>> stability, as these overloads would be additive.
>> Future Work
>>
>> This section covers some topics which were briefly considered, but were
>> identified as reasonable and possible to defer to future releases.
>> Specifically they should be backwards compatible to introduce even after
>> ABI stability. Two paths that are worth exploring:
>> Ergonomic Generalized Comparison for Keyed Containers
>>
>> Can we make it ergonomic to use an (arbitrary) alternative comparison
>> strategy for a Dictionary or a BinaryTree? Should they be type-level
>> Comparators, or should those types always store a (Key, Key) ->
>> ComparisonResult closure?
>>
>> We can avoid answering this question because Dictionary is expected to
>> keep a relatively opaque (resilient) ABI for the foreseeable future, as
>> many interesting optimizations will change its internal layout. Although if
>> the answer is type-level, then Default Generic Parameters must be accepted
>> to proceed down this path.
>> ComparisonResult Conveniences
>>
>> There are a few conveniences we could consider providing to make
>> ComparisonResult more ergonomic to manipulate. Such as:
>>
>> // A way to combine orderingsfunc ComparisonResult.breakingTiesWith(_ order: () -> ComparisonResult) -> ComparisonResult
>>
>> array.sort {
>>   $0.x.compare($0.y)
>>   .breakingTiesWith { $0.y.compare($1.y) }
>>   == .orderedAscending
>> }
>>
>> and
>>
>> var inverted: ComparisonResult
>> // A perhaps more "clear" way to express reversing order than `y.compared(to: x)`
>> x.compare(y).inverted
>>
>> But these can all be added later once everyone has had a chance to use
>> them.
>>
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>
>>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>

-- 
-Dave