[swift-evolution] [pitch] Comparison Reform

Xiaodi Wu xiaodi.wu at gmail.com
Sat Apr 15 23:49:18 CDT 2017


On Sat, Apr 15, 2017 at 10:32 PM, Dave Abrahams via swift-evolution <
swift-evolution at swift.org> wrote:

>
> on Sat Apr 15 2017, Xiaodi Wu <swift-evolution at swift.org> wrote:
>
> > On Sat, Apr 15, 2017 at 3:12 PM, Dave Abrahams via swift-evolution <
> > swift-evolution at swift.org> wrote:
> >
> >>
> >> on Thu Apr 13 2017, Xiaodi Wu <swift-evolution at swift.org> wrote:
> >>
> >> > Getting this sorted out is definitely a worthwhile effort. I do have
> >> > thoughts about this proposal:
> >> >
> >> > I continue to have reservations about an identical spelling (e.g.
> `==`)
> >> > giving two different answers with the same values of the same type,
> >> > depending on the generic context. It is a very *clever* design, but
> it is
> >> > also a very *subtle* behavior that I can see leading to much confusion
> >> and
> >> > befuddlement for any user who is not well versed *both* in the
> >> intricacies
> >> > of IEEE floating point *and* in the intricacies of Swift.
> >>
> >> I can't help but think that the concern over confusion here is not
> >> informed by any realistic situations.  Please describe
> >>
> >
> > To be clear, I'm not claiming that my concerns about the proposal
> outweigh
> > my enthusiasm for it.
> >
> > But here, the confusion I'm concerned about stems from the essential
> > conclusion by the proposal authors that types (including, but not
> > necessarily only limited to FP types) which are ordinarily compared in a
> > way that treats certain "special values" differently must also present an
> > alternative notion of comparison that accounts for all possible
> > values.
>
> That may be a conclusion, but it's not an assumption.  For example, it's
> totally reasonable that there is a value of Int (i.e. 0) for which the
> requirements of division don't hold.  We say that 0 is outside the
> domain of / when used as a divisor, and we tried to get away with saying
> that NaN was outside the domain of ==.  However, it's also reasonable to
> trap on integer division by zero.
>
> What we have is a situation where values that “misbehave” when given
> IEEE semantics occur in normal code and are expected to interoperate
> with other floating point values under normal circumstances (such as
> when sorting), and not only interoperate but give reasonable results.
>
> Now, having thought about this a bit more myself, here is a real case
> where confusion might occur:
>
>   if values.contains(.NaN) {
>     print(values.filter { $0 != .NaN }) // Surprise, NaN is printed!
>   }
>
> I find this result truly loathsome, but it seems to me that the only
> reasonable cure is giving == equivalence relation semantics under all
> circumstances.
>
> > The special casing of certain values already makes comparison
> > operations difficult to understand; I guess I'm simply stating what is
> > unavoidably true, that having a "non-special-cased" comparison *in
> > addition* to that adds additional difficulty.
>
> Yup.
>
> > One example where this comes to mind is this: `XCTAssertEqual(+0.0,
> -0.0)`
> > will pass or fail depending on whether XCTAssertEqual provides a
> > FP-specific specialization. This is something that cannot be determined
> by
> > reading the testing code itself: that is, it requires not merely
> knowledge
> > about whether or not the call site "knows" that it's operating on a FP
> > type, but also knowledge about whether XCTest itself is FP-aware.
>
> Yep.
>
> > The issue increases in difficulty when it comes to non-stdlib types.
> > Suppose I were to write a Rational type. (I am writing a Rational type,
> but
> > we can talk about that later.) This Rational type is capable of
> > representing NaN (`(0 / 0 as Rational<Int>).isNaN == true`) and, for
> > consistency with FP types, `Rational<Int>.nan != Rational<Int>.nan`.
>
> Hey, it's your funeral! ;-)
>
> > If this proposal is implemented, *I* would be responsible for making
> > XCTest Rational-aware, vending `XCTAssertEqual<Rational<T>>(_:_:)`.
>
> Now you're mixing apples and oranges with NaN and +/-0, the latter of
> which doesn't apply to rationals.  The way you make XCTest aware of NaN
> is to do the same thing that floating point did: vend the static version
> of == separately from the dynamic version, using @_implements (and get
> us to drop the underscore).
>
> XCTestAssertEqual is a perfect example of a generic function: it is (or
> should be!) concerned with verifying that results match expectations,
> not that “==” behaves according to the specific quirks of a type.
> Anything else would make
>
>    XCTestAssertEqual(resultOfComputation, NaN)
>
> meaningless.


Hmm, I will admit I have not tried it because I have always assumed that
the assertion above is meaningless, and that it ought to be.
`XCTAssertTrue(resultOfComputation.isNaN)` is the way I've been testing my
NaN's.

> In fact, to ensure that users of Rational get the expected behavior
> > everywhere, I would have to vend special Rational-aware versions of
> > every stdlib, core library, and _third-party_ function that is
> > FP-aware, which it is not possible to do.
>
> I'm sorry, I don't follow.  Give me an example, please, and also clearly
> define “FP-awareness.”
>

By FP awareness, I mean that in comparing two FP values a function uses
`FloatingPoint.==` as opposed to `Comparable.==`. For example, I expect
`XCTAssertEqual<T : FloatingPoint>(_:_:)` to be vended as part of XCTest,
in order to make sure that `XCTAssertEqual(resultOfComputation,
Double.nan)` always fails.

>> Actually, the fact that this behavior cannot even be achieved without
> >> > currently non-existent compiler features means that it is not possible
> >> > to understand what's truly going on without reading *this document*,
> >>
> >> This doesn't seem like a reasonable argument.  New compiler features get
> >> documented outside of the proposals they come from.  Nobody's going to
> >> have to go read a proposal to understand what @implements means.
> >
> > I do not mean that I seriously believe @_implements will never be
> > documented. (Although, as an underscored attribute, it does not need to
> be
> > documented outside of the apple/swift repo, in the same way that
> > @_transparent and its ilk are not documented outside the repo.)
>
> Either way, we either end up documenting @_implements or special magic
> behavior for floating point types.  Somebody already needs to document
> the latter for floating point types (.NaN != .NaN), frankly, even though
> IEEE is a
> commonly implemented standard.
>
> > I am simply saying that fully understanding how `Comparable` and
> > `FloatingPoint` interact with each other will require learning one
> > additional advanced feature of the language than is required now,
>
> Agreed, it will require learning something new.
>
> > and that this feature does not even currently exist.
>
> I don't understand the relevance of that last point though.


Not critical to the argument. Just that it will impact all users of Swift.
It's not as though some particularly studious user of Swift 3 will be
rewarded with a head start because he or she has devoted time to careful
study of the language.


>
> >
> >
> >> > after mastering *both* IEEE floating point *and* Swift
> >> > generics/protocols/extensions/static vs. dynamic dispatch. All to use
> >> > `==` correctly.
> >>
> >> I don't understand this argument.  The *whole* point of this proposal is
> >> that you use `==` correctly *without* the need for any special
> knowledge.
> >>
> >> > Which is to say, most people will simply not even know if they happen
> >> > to be using the `==` they did not intend to use.
> >>
> >> Most people aren't aware that IEEE comparison is quirky and don't know
> >> what they intend with respect to those semantics, but anyone who *is*
> >> aware of his intention has an easy way to understand what's happening.
> >> Does this code know it's operating on floating point numbers?  If so,
> >> it's IEEE.  If not, it's an equivalence relation.
> >>
> >> > I think consideration should be given to a design that achieves a
> >> > user-facing but not onerous differentiation between level 1 and level
> 2
> >> > equality. However, the only one I can think of is essentially a
> different
> >> > shade of the `PartiallyComparable` alternative already outlined in the
> >> > document.
> >>
> >> I am *deeply* opposed to such a protocol.  It is purely syntactic in
> >> nature and thus useless for correctly constraining generic algorithms.
> >
> > But there do exist types for which some pairs of values cannot be
> compared
> > to each other. A generic algorithm that blows up if a value cannot be
> > compared to another value should not operate on such types, no?
>
> I don't know what you mean by “blows up.” If you mean, “traps,” I think
> that's fine.  If you mean, “misbehaves,” I think that's really hard to
> justify unless it's easy for users to keep these values from ever
> entering their program.  The algorithm itself doesn't know all the types
> on which it can operate, so its documentation isn't in a position to
> warn users about this precondition.  Therefore, the caller, who got the
> values from somewhere and may be generic itself, is in no position to
> check for the problem.  The only cure is to bottleneck production of
> values of that type and ensure that those values never get beyond some
> boundary in the code.  I don't think that's practical in the case of
> floating point, not least because ordinary operations on legit values
> can produce the problematic values.
>
> >> People will use it anyway, resulting in algorithms that statically
> >> accept, but are incorrect for, floating point.  In my opinion there's
> >> only one feasible answer that doesn't use the static/dynamic distinction
> >> we've proposed: throw IEEE semantics under the bus, making it available
> >> only under a different syntax.
> >>
> >> This would be a drastic move whose primary downside is that floating
> >> point code ported from C would need to be carefully audited and may
> >> become less easy to read.  But at least it would be viable.
> >>
> >
> > Yes I agree that it would be a much more drastic move that would make
> > Swift difficult to use for numerics, and I, at least, would not want
> > to go down that road.
>
> I'd rather not, myself.  The choices are not good:
>
> 1. Complexity and occasional surprises that can be documented and
>    explained.
> 2. Silent misbehavior.
> 3. Separate syntax for distinct semantics.
>
> #2 is the only one I find completely unacceptable.
>
> >> > Yet I cannot help but think that the rejected alternative may be
> >> > advantageous in one key aspect. `FloatingPoint` comparison is in a
> >> > sense "less refined" (not exactly precise language, I know) than the
> >> > level 2 ordering proposed here, at least in the sense that the latter
> >> > offers more semantic guarantees about the relationships between
> >> > comparison operators.
> >>
> >> No, they are effectively refinement siblings.  If it was a hierarchical
> >> relationship, we could just make the floating point types conform to
> >> Equatable.  But floating point types are *required* by IEEE to have
> >> comparison semantics that conflict with Equatable.
> >>
> >> > It's weird that the less refined `FloatingPoint` refines the more
> >> > refined `Comparable`,
> >>
> >> Do you want to be able to sort floating point numbers without providing
> >> a comparison predicate (one that has to be spelled less obviously than
> >> "<")?  If so, floating point numbers must be Comparable.  If not, we
> >> could talk about breaking this refinement relationship.
> >>
> >> > and I think the acrobatics with compiler support illustrate how the
> >> > design is actively working against Swift's overarching direction.
> >>
> >> It's not more acrobatic than things we already do in the standard
> >> library to ensure that people transparently see the right behavior (see
> >> .lazy), and we could probably even find ways to do it without language
> >> features.  It would be less principled, harder to understand, and more
> >> fragile than designing a language feature that addresses the need
> >> directly.
> >
> > I have an incipient idea. It begins with:
> >
> > enum ComparisonResult {
> >   case orderedAscending, orderedSame, orderedDescending, unordered
> > }
> >
> > I am sure there is something I am missing which makes this design a
> > horrible idea, but I'm interested in hearing them.
>
> It's not a horrible idea, but to be fair it's not really a design yet,
> either.  You haven't said anything about what it's supposed to mean, how
> it is supposed to be used, how people write, use, and compose generic
> algorithms with it, how it deals with floating point, etc.
>

I've evolved my thinking based on the discussion. Let's see:

```
public enum ComparisonResult : Equatable {
  case orderedAscending, equivalent, orderedDescending, unordered
  // I have renamed one case, in the hopes of emphasizing that two values
  // that compare `equivalent` are not merely ordered the same, but should
  // satisfy the three conditions of an equivalence relation.
}
// I will have to leave the question of how to bridge from Obj-C
// NSComparisonResult up to more capable hands.

public protocol Comparable : Equatable {
  func compared(to other: Self) -> ComparisonResult
}
// This will have to be modified as necessarily to support compiler magic
// necessary for source-compatibility with Swift 3

extension Comparable {
  public static func == (lhs: Self, rhs: Self) -> Bool {
    let comparison = lhs.compared(to: rhs)
    precondition(comparison != .unordered)
    return comparison == .equivalent
  }

  public static func < (lhs: Self, rhs: Self) -> Bool {
    let comparison = lhs.compared(to: rhs)
    precondition(comparison != .unordered)
    return comparison == .orderedAscending
  }
  // etc.

  // Something I thought I'd never want to see, but on reflection not
terrible:
  public static func &== (lhs: Self, rhs: Self) -> Bool {
    return lhs.compared(to: rhs) == .equivalent
  }

  public static func &< (lhs: Self, rhs: Self) -> Bool {
    return lhs.compared(to: rhs) == .orderedAscending
  }
  // etc.
}

extension FloatingPoint : Comparable {
  public func compared(to other: Self) -> ComparisonResult {
    if isNaN || other.isNaN { return .unordered }
    if isLess(than: other) { return .orderedAscending }
    if other.isLess(than: self) { return .orderedDescending }

    // On reconsideration, probably actually least surprising if +0.0 ==
-0.0
    // no matter what. It does not matter that division can give different
    // results for two equivalent values, only that the three rules of an
    // equivalence relation will hold, and it does.
    //
    // If a user is really savvy to the sign of zero, there is
FloatingPoint.sign,
    // which is necessary in any case for working with FP operations that
    // distinguish between +0 and -0. I can't think of a generic algorithm
over
    // all Comparable types that could make use of the distinction between
+0
    // and -0.
    return .equivalent
  }
}
```

In this design, `==`, `<`, etc. correspond to IEEE "signaling" operators,
while `&==`, `&<`, etc. correspond to C-style operators, which IEEE calls
"quiet" and actually notates using "?==", "?<", etc.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20170415/8f0e4b6c/attachment.html>


More information about the swift-evolution mailing list