[swift-evolution] [Draft][Proposal] Formalized Ordering

Dave Abrahams dabrahams at apple.com
Fri Jul 22 18:34:09 CDT 2016

on Fri Jul 22 2016, Tony Allevato <swift-evolution at swift.org> wrote:

> On Fri, Jul 22, 2016 at 2:52 PM Dave Abrahams via swift-evolution <
> swift-evolution at swift.org> wrote:
>> on Fri Jul 22 2016, Tony Allevato <swift-evolution at swift.org> wrote:
>> > I like a lot of this, but the changes to Equatable are where I get stuck.
>> > What are the scenarios where areSame is useful *outside* the context of
>> the
>> > proposed new Comparable interface?
>> >
>> > I ask because changing the requirement for Equatable to areSame instead
>> of
>> > == seems like a backwards change to me. There are plenty of unorderable
>> > types where == is the obvious thing you want to implement, and this makes
>> > it less obvious. It also adds a named method to a protocol to serve the
>> > purpose of an operator, which I've been fighting hard against in SE-0091
>> > (even though you keep the global one and delegate to it).
>> >
>> > There are two concepts at play here: comparability and orderability.
>> 99.99%
>> > of the time, they are identical.
>> The concepts are “domain-specific semantics” vs “semantics that is
>> useful in generic contexts.”  Yes, they are usually identical.
>> > Your proposal mentions one place where they're not: IEEE floating
>> > point numbers, because there exists an element in that space, NaN,
>> > that doesn't satisfy an equivalence relation at all.
>> It's not limited to NaN.  The +0/-0 distinction can be tricky as well.
>> > But it's still reasonable to want a stable ordering with those
>> > included.
>> It's also reasonable to want to search for those in a collection or use
>> them as hash keys.  I'm pointing this out because it goes to the
>> definition of equality, which sorting in general does not.
>> > In the proposal as it's written right now, the individual inequality
>> > operators are implemented in terms of <=>. That won't work for
>> > FloatingPoint, because (NaN < x) and (NaN >= x) should both be false but
>> > the default implementations provided would make the latter true. So
>> > FloatingPoint would still have to provide its own implementations of *all
>> > of the (in)equality operators*, not just ==, in order to have the correct
>> > definition w.r.t. to IEEE 754. I didn't see that called out anywhere in
>> the
>> > write-up.
>> That's my error, actually. I wasn't thinking straight when I proposed a
>> change to the proposal that I claimed dropped the need for the other
>> operators.
>> > That being said, don't get me wrong—there's still a lot about this
>> proposal
>> > that I like :)  Here's what I'm thinking (which is mostly what you have
>> > written, with some tweaks):
>> >
>> > 1) Don't change Equatable. I don't see a need to distinguish between
>> > equivalence and equality on its own (if there is one, please let me
>> > know!).
>> There is, because for algorithms that require Equatable to have any kind
>> of meaningful semantics the equivalence relation requirement must be
>> fulfilled, and prominent types exist whose `==` operator is not an
>> equivalence relation.
> Thanks for the detailed reply, Dave—this and some of your earlier replies
> to this thread have helped me understand the use cases for this
> distinction. So the argument is that something like this:
>     [ 1.0, -2.0, Double.NaN, 4.0 ].contains(Double.NaN)
> should return true because the argument is in the same equivalence class as
> the element, even though NaN == NaN currently returns false? 

Yes... but much more than that, too.

You should be *allowed* to use algorithms like `contains` on arrays of
`Double` with predictable results. Today, `Double` “conforms” to
`Equatable` but `==` has the wrong semantics.  So, `Double` lies to the
rest of the library, and therefore, technically, any uses of Double with
those algorithms has unspecified behavior.  

We can fudge around it a bit and document that NaN isn't “in the domain
of `==`,” similar to the way `0 as Int` isn't in the domain of
divisors. That effectively means that you're allowed to use `contains`
on an array of `Double`, with predictable results, so long as it
contains no NaNs... and otherwise, unspecified behavior.  But IMO that's
rather awful.

> That seems totally reasonable, and I think it would be very helpful
> for the proposal to specifically call out some of these
> scenarios—right now it focuses mostly on ordering, which led to my
> confusion.
> To take this further, let's say I have a data structure where the elements
> are a type with multiple fields, and the ordering is determined by a single
> one of those fields (the "key"). 

If the ordering is a total ordering over keys, this is a strict weak
ordering over elements, where all elements with the same key fall into
the same equivalence class.  If you consider the other fields to be
inessential to the value of your elements (c.f. Array's capacity), you
can call it a total ordering over your elements.

> Would a reasonable definition of equivalence in this model be one
> where a is equivalent to b if a.key == b.key, regardless of the values
> of the other fields, and equality is implemented by comparing all the
> fields? 

The proposal as written specifies that `a <=> b` is a total ordering and
it is consistent with `areSame(a, b)`.  That means that the definition of
`areSame` that only compares keys would be allowed iff you consider the
other fields to be inessential to the value of your elements.

Equality via `==` is allowed to have whatever domain-specific semantics
you like. Best practice would be for `==` to match `areSame`
exactly. Second-best practice would be for it to not match but still be
an equivalence relation.  Unfortunately floats follow “worst practice”

> This is similar to how C++ STL's definition of equivalence for ordered
> collections falls out of the expression !(a < b) && !(b < a), since
> you could conceivably implement < and == with the same distinctions
> there.
> I would be completely supportive of using `===` for this equivalence
> instead of areSame, based on your argument about identity. Would we need an
> escape hatch for people who absolutely need to know whether two instances
> occupy the same address? 

`ObjectIdentifier(a) == ObjectIdentifier(b)` already works.

> If I had to choose equivalence vs. same-address as the one to make
> more verbose, same-address seems like the obvious choice to me.
>> > As it stands today, I think the proposal "leaks" ordering concepts into
>> > Equatable when it shouldn't.
>> I don't see any evidence for that, and I don't even believe you've said
>> anything here to support that point of view.
>> > 2) Comparable defines <=>, as proposed, but *also* defines <, >, <=, >=.
>> A
>> > protocol extension provides defaults for <, >, <=, >=, ==, and !=
>> > implemented in terms of <=>. This lets most implementors of Comparable
>> > implement <=> and get everything else for free, but it also lets types
>> > replace individual operators with customized implementations (see #4
>> below)
>> > easily *within* the type (SE-0091).
>> Check
>> > 3) Comparable should be documented to imply that the default behavior is
>> to
>> > link the behavior of <=> to the individual comparisons, but that it can
>> be
>> > changed, meaning that only <=> must define a total ordering and the
>> > individual comparison operators need not.
>> Yes, the doc comments are missing from the proposal.
>> > 4) The very few types, like FloatingPoint, that need to provide
>> > domain-specific behavior to do the obvious/intended thing for users can
>> and
>> > should override <, >, <=, >=, ==, and !=. This should be called out
>> > explicitly, and it would *not* affect ordering.
>> Depends what you mean by “affect ordering.”  Clearly if you sort Floats
>> using < explicitly, it will have an effect.
>> > I think it's entirely reasonable to have (NaN == NaN) return false and
>> > (NaN != NaN) return true but (NaN <=> NaN) return .same without
>> > introducing another areSame concept, because the former is demanded by
>> > IEEE 754.  5) Algorithms that rely on a total order, like sorts, must
>> > be implemented in terms of <=>, not in terms of the individual
>> > operators, because of the possibility that the definitions can be
>> > severed above.
>> But you're forgetting algorithms that require an equivalence relation,
>> which is basically everything that's constrained to Equatable.
>> > As mentioned below, the one thing that a three-way comparison loses is
>> the
>> > easy ability to pass > instead of < to reverse the ordering, but it's
>> > trivial to write a function that does this and I think it should be
>> > included as part of the proposal. Something like this (may be typos, I'm
>> > writing it in Gmail):
>> >
>> > public func reverse<C: Comparable>(ordering: (C, C) -> Ordering) -> (C,
>> C)
>> > -> Ordering {
>> >   return { lhs, rhs in
>> >     switch ordering(lhs, rhs) {
>> >     case .ascending: return .descending
>> >     case .descending: return .ascending
>> >     case .same: return .same
>> >   }
>> > }
>> >
>> > (Comedy alternative: Add a second operator, >=<. But that might be
>> pushing
>> > it.)
>> Agreed, we should do something about this use case.
>> --
>> Dave
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution


More information about the swift-evolution mailing list