[swift-dev] Rationalizing FloatingPoint conformance to Equatable

Xiaodi Wu xiaodi.wu at gmail.com
Fri Oct 27 01:24:42 CDT 2017

On Fri, Oct 27, 2017 at 1:09 AM, Jonathan Hull <jhull at gbis.com> wrote:

> On Oct 26, 2017, at 8:16 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
> On Thu, Oct 26, 2017 at 4:34 PM, Jonathan Hull <jhull at gbis.com> wrote:
>> On Oct 26, 2017, at 11:47 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>> On Thu, Oct 26, 2017 at 1:30 PM, Jonathan Hull <jhull at gbis.com> wrote:
>>> Now you are just being rude. We all want Swift to be awesome… let’s try
>>> to keep things civil.
>> Sorry if my reply came across that way! That wasn't at all the intention.
>> I really mean to ask you those questions and am interested in the answers:
>> Thank you for saying that. I haven’t been sleeping well, so I am probably
>> a bit grumpy.
>> Unless I misunderstand, you're arguing that your proposal is superior to
>> Rust's design because of a new operator that returns `Bool?` instead of
>> `Bool`; if so, how is it that you haven't reproduced Rust's design problem,
>> only with the additional syntax involved in unwrapping the result?
>> Two things:
>> 1) PartialEq was available in generic contexts and it provided the IEEE
>> comparison. Our IEEE comparison (which I am calling ‘&==‘ for now) is not
>> available in generic contexts beyond FloatingPoint. If we were to have this
>> in a generic context beyond FloatingPoint, then we would end up with the
>> same issue that Rust had.
> What I'm saying is that we *must* have this available in generic contexts
> beyond FloatingPoint, such as on Numeric, for reasons I've described and
> which I'll elaborate on shortly.
> I disagree pretty strongly with this.
> I get that that is your point of view, but I really don’t think it is
> possible to have everything here at the same time.  Nothing prevents you
> from adding this conformance in your own code (though I wouldn’t recommend
> it).
> 2) It is actually semantically different. This MostlyEquatable protocol
>> returns nil when the guarantees of the relation would be violated… and the
>> author has to decide what to do with that.  Depending on the use case, the
>> best course of action may be to: treat it as false, trap, throw, or
>> branch.  Swift coders are used to this type of decision when encountering
>> optionals.
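[As a concrete sketch of the design being described — the names `MostlyEquatable` and `==?` come from this thread, but the conformances shown are illustrative assumptions, not a worked-out proposal:]

```swift
infix operator ==? : ComparisonPrecedence

/// Hypothetical protocol: the comparison returns nil when the
/// guarantees of an equivalence relation (e.g. reflexivity) fail
/// to hold for the given operands.
protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

/// Genuinely Equatable types conform trivially: the relation
/// always holds, so the result is never nil.
extension Int: MostlyEquatable {
    static func ==? (lhs: Int, rhs: Int) -> Bool? {
        return lhs == rhs
    }
}

/// For floating point, reflexivity fails when both sides are NaN.
extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs == rhs  // stands in for the thread's `&==`
    }
}
```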
>> And if, as I understand, your argument is that your design is superior to
>> Rust's *because* it requires unwrapping, then isn't the extent to which
>> people will avoid using the protocol unintentionally also equally and
>> unavoidably the same extent to which it makes Numeric more cumbersome?
>> It isn’t that unwrapping is meant to be a deterrent, it is that there are
>> cases where the Equivalence relation may fail to hold, and the programmer
>> needs to deal with those (when working in a generic context).  Failure to
>> do so leads to subtle bugs.
>> Numeric has to use ‘==?’ because there are cases where the relation will
>> fail. I’d love for it to conform to Equatable, but it really doesn’t if you
>> look at it honestly, because it can run into cases where reflexivity
>> doesn’t hold, and we have to deal with those cases.
> Well, it's another thing entirely if you want Numeric not to be Equatable
> (or, by that token, Comparable). Yes, it'd be correct, but that'd be a
> surprising and user-hostile design.
> Yes, that is what I am saying. Numeric can’t actually conform to Equatable
> (without lying), so let’s be up front about it.  It does, however, conform
> to this new idea of MostlyEquatable, so we can use that for our generic
> needs.  MostlyEquatable semantically provides everything Equatable does…
> but with the extra possibility that the relation may not hold (it actually
> gives you additional information).  Everything that is possible with
> Equatable is also possible with MostlyEquatable (just not with the same
> number of machine instructions).
> Everything I have said here applies to Comparable as well, and I have a
> similar solution in mind that I didn’t want to clutter the discussion with.
> I also want to point out that you still have full speed in both Equatable
> contexts and in FloatingPoint contexts. It is just in generic code that
> mixes the two that we have some inefficiency because of the differing
> guarantees. This is true of generic code in general.
> As I said above, the typical ways to handle that nil would be: treat it as
>> false, trap, throw, or branch.  The current behavior is equivalent to
>> "treat it as false”, and yes, that is the right thing for some algorithms
>> (and you can still do that). But there are also lots of algorithms that
>> need to trap or throw on NaN, or branch to handle it differently.  The
>> current behavior also silently fails, which is why the bugs are so hard to
>> track down.
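[In code, those four strategies might read as follows — this assumes a hypothetical `==?` on `Double` as described in the thread; all function and type names here are illustrative:]

```swift
infix operator ==? : ComparisonPrecedence

// Hypothetical failable comparison: nil when both operands are NaN.
func ==? (lhs: Double, rhs: Double) -> Bool? {
    if lhs.isNaN && rhs.isNaN { return nil }
    return lhs == rhs
}

struct UndefinedComparison: Error {}

// 1) Treat nil as false -- reproduces today's silent IEEE behavior.
func equalTreatingNaNAsUnequal(_ a: Double, _ b: Double) -> Bool {
    return (a ==? b) == true
}

// 2) Trap -- for call sites known to be NaN-free.
func equalOrTrap(_ a: Double, _ b: Double) -> Bool {
    return (a ==? b)!
}

// 3) Throw -- surface the broken guarantee to the caller.
func equalOrThrow(_ a: Double, _ b: Double) throws -> Bool {
    guard let result = (a ==? b) else { throw UndefinedComparison() }
    return result
}

// 4) Branch -- handle the undefined case explicitly.
func describeComparison(_ a: Double, _ b: Double) -> String {
    switch a ==? b {
    case true?:  return "equal"
    case false?: return "not equal"
    case nil:    return "undefined (NaN)"
    }
}
```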
> That is inherent to the IEEE definition of "quiet NaN": the operations
> specified in that standard are required to silently accept NaN.
> Premature optimization is the root of all evil.
>> You said it was impossible, so I gave you a very quick example showing
>>> that the current behavior was still possible.  I wasn’t recommending that
>>> everyone should only ever use that example for all things.
>>> For FloatingPoint, ‘(a &== b) == true’ would mimic the current behavior
>>> (bugs and all). It may not hold for all types.
>> Oops, that should be ‘==?’ (which returns an optional).  I am getting
>> tired, it is time for bed.
>> No, the question was how it would be possible to have these guarantees
>> hold for `Numeric`, not merely for `FloatingPoint`, as the purpose is to
>> use `Numeric` for generic algorithms. This requires additional semantic
>> guarantees on what you propose to call `&==`.
>> Well, they hold for FloatingPoint and anything which is actually
>> Equatable. Those are the only things I can think of that conform to Numeric
>> right now, but I can’t guarantee that someone won’t later add a type to
>> Numeric which also fails to actually conform to equatable in some different
>> way.
>> To be fair, anything that breaks this would also break current algorithms
>> on Numeric anyway.
> This doesn't answer my question. If `(a ==? b) == true` is the only way to
> spell what's currently spelled `==` in a generic context, then `Numeric`
> must make such semantic guarantees as are necessary to guarantee that this
> spelling behaves in that way for all conforming types, or else it would not
> be possible to write generic numeric algorithms that operate on any
> `Numeric`-conforming type. What would those guarantees have to be?
> You don’t have those guarantees now.
> ‘(a ==? b) == true’ is one possible way to get the current behavior for
> FloatingPoint.  It should hold for all FloatingPoint.  It should hold for
> all Numeric things which are FloatingPoint or Integer (or anything
> Equatable).  But if someone comes up with a new exotic type *which doesn’t
> conform properly to Equatable*, then all bets are off.  But then it would
> also break current code assuming the current IEEE behavior…
> But let’s say you have an algorithm you are certain is free of NaNs (maybe
> you filter them at an earlier stage).  Well then you could say '(a ==?
> b)!’.  An easy argument could also be made for allowing ‘a ==! b’ so you
> don’t have to wrap/unwrap.
> or you might use 'guard let’ to have an early exit when NaN == NaN is
> discovered.
> There are also other ways to get the current behavior. For example, you
> could cast to FloatingPoint and use '&==‘ directly.
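[The guard-let early exit might look like this — the `==?` definition and `allEqual` are illustrative assumptions, not part of the proposal text:]

```swift
infix operator ==? : ComparisonPrecedence

// Hypothetical failable comparison for Double, per the thread:
// nil when reflexivity cannot hold (both operands NaN).
func ==? (lhs: Double, rhs: Double) -> Bool? {
    if lhs.isNaN && rhs.isNaN { return nil }
    return lhs == rhs
}

// Guard-let early exit: bail out of the whole computation the
// first time a comparison is undefined.
func allEqual(_ values: [Double], to target: Double) -> Bool? {
    for value in values {
        guard let equal = (value ==? target) else { return nil }
        if !equal { return false }
    }
    return true
}
```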
> The whole point is that you have to put thought into how you want to deal
>>> with the optional case where the relation’s guarantees have failed.
>>> If you need full performance, then you would have separate overrides on
>>> Numeric for members which conform to FloatingPoint (where you could use
>>> &==) and Equatable (where you could use ==). As you get more generic, you
>>> lose opportunities for optimization. That is just the nature of generic
>>> code. The nice thing about Swift is that you have an opportunity to
>>> specialize if you want to optimize more. Once things like conditional
>>> conformances come online, all of this will be nicer, of course.
>> This is a non-starter then. Protocols must enable useful generic code.
>> What you're basically saying is that you do not intend for it to be
>> possible to use methods on `Numeric` to ask about level 1 equivalence in a
>> way that would not be prohibitively expensive. This, again, eviscerates the
>> purpose of `Numeric`.
>> I don’t consider it “prohibitively expensive”.  I mean, dictionaries
>> return an optional.  Lots of things return optionals.  I have to deal with
>> them all over the place in Swift code.
>> I think having the tradeoff of having quicker to write code vs more
>> performant code is completely reasonable.  Ideally everything would happen
>> instantly, but we really can’t get away from making *some* tradeoffs here.
>> If I just need something that works, I can use ==? and handle the nil
>> cases.  If unwrapping an optional is untenable from a speed perspective in
>> a particular case for some reason, then I think it is completely reasonable
>> to have the author additionally write optimized versions specializing based
>> on additional information which is known (e.g. FloatingPoint or Equatable).
> No, it's not the cost of unwrapping the result, it's the cost of computing
> the result, which is much higher than the single machine instruction that
> is IEEE floating-point equivalence. The point of `Numeric` is to make it
> possible to write generic algorithms that do meaningful math with either
> integer or floating-point types. If the only way to write such an algorithm
> with reasonable performance is to specialize one version for integers and
> another for floating-point values, then `Numeric` serves no purpose as a
> protocol.
> Well, the naive implementation of ==? for floats would be:
> static func ==? (lhs: Self, rhs: Self) -> Bool? {
>     if lhs.isNaN && rhs.isNaN { return nil }
>     return lhs &== rhs
> }
> But we might very easily be able to play compiler tricks to speed that up
> in certain cases.  For example, we could have some underscored subtype of
> Float or compiler annotation when the compiler can reason it won’t be NaN
> (e.g. constants or floats created from literals).  In those cases, it could
> just use the machine version directly. At the very least, comparing against
> literals should be able to be retain single instruction status.  The
> programmer shouldn’t have to worry about that though.
> I don’t think it is reasonable to expect a single machine instruction in
> all generic contexts.  Faster is better, but the nature of generic code is
> that you have to accept some inefficiency in exchange for being able to
> write code once across multiple types with varying guarantees.  My main
> point was that much/all of the efficiency can be reclaimed where needed by
> doing extra programming work.
> Also, even with ==? instead of ==, Numeric is far from useless.  For
> example, we can generically create math formulas using +,-, and *.  In
> fact, if Numeric’s only utility was ==, we would spell it Equatable.
> Finally, once features from the generics manifesto come online, it might
> be possible to regain Equatable conformance in some cases and not others.
> So, for example, you would be able to write == against a literal, but would
> still have to use ==? when both sides could be NaN.  That is for the future
> though...
> Note that I am mostly talking about library code here.  Once you build up
> a library of functions on Numeric that handle this correctly, you can use
> those functions as building blocks, and you aren’t even worrying about ==
> for the most part.  For example, if we build a version of index(of:) on
> collection which works for our MostlyEquatable protocol, then we can pass
> Numeric to it generically.  Whether they decided it was important enough to
> put in an optimization for FloatingPoint or not, it doesn’t affect the way
> we call it.  It could even have only a generic version for years, and then
> gain an optimization later if it became important.
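[A sketch of that index(of:)-style building block — the protocol shape and the method name `firstIndexMatching` are assumptions made for illustration:]

```swift
infix operator ==? : ComparisonPrecedence

protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs == rhs
    }
}

extension Collection where Element: MostlyEquatable {
    /// index(of:)-style search on the failable comparison: an
    /// undefined comparison (nil) is simply not a match, so NaN
    /// never "equals" its way into a result.
    func firstIndexMatching(_ target: Element) -> Index? {
        var i = startIndex
        while i != endIndex {
            if (self[i] ==? target) == true { return i }
            i = index(after: i)
        }
        return nil
    }
}
```

A caller could then write `values.firstIndexMatching(2.0)` for any `MostlyEquatable` element type, Numeric included, without caring whether a FloatingPoint fast path exists underneath.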
> You cannot do this for most collection algorithms, because they are mostly
> protocol extension methods that can be shadowed but not overridden. But
> again, that's not what I'm talking about. I'm talking about writing
> _generic numeric algorithms_, not using numeric types with generic
> collection algorithms.
> Well, for something like index(of:) it would actually be using
> FloatingPoint’s notion of '==?’.  Working with Numeric would just fall out
> for free.
> As for writing generic numeric algorithms, my point was that you can use
> the building blocks of other algorithms written for Numeric. But nothing is
> stopping you writing code on Numeric which does everything it does now
> (just using ==? and handling the possibility of nil).  That may not always
> get you code which boils down to a single machine instruction, but that is
> true of generic code in general.  If performance is critical, then you have
> the option to optimize on top of the generic version.
> As I said above, there are also things the compiler can do here in the
> generic case, so I don’t think the situation is as dire as you say.
> The point I'm making here, again, is that there are legitimate uses for
>> `==` guaranteeing partial equivalence in the generic context. The
>> approximation being put forward over and over is that generic code always
>> requires full equivalence and concrete floating-point code always requires
>> IEEE partial equivalence. That is _not true_. Some generic code (for
>> instance, that which uses `Numeric`) relies on partial equivalence
>> semantics and some floating-point code can nonetheless benefit from a
>> notion of full equivalence.
>> I mean, it would be nice if Float could truly conform to Equatable, but
>> it would also be nice if I didn’t have to check for null pointers.  It
>> would certainly be faster if instead of unwrapping optionals, I could just
>> use pointers directly.  It would even work most of the time… because I
>> would be careful to remember to add checks where they were really
>> important… until I forget, and then there is a bug!  This kind of premature
>> optimization has cost our economy literally trillions of dollars.
>> We have optionals for exactly this reason in Swift.  It forces us to take
>> those things which will "work fine most of the time”, and consider the case
>> where it won’t.  I know it is slightly faster not to consider that case,
>> but that is exactly why this is a notorious source of bugs.
>> You write as though it's a foregone conclusion that Float cannot conform
> to Equatable. I disagree. My starting point is that Float *can*--and in
> fact *must*--conform to Equatable; the question I'm asking is, how must
> Equatable be designed such that this can be possible?
> Equatable conformance (and equivalence relations in general) requires
> reflexivity.  IEEE == is not reflexive.  QED.

Without replying yet to the remainder of this response, as a matter of
defining what it is we're debating, what you state is both true and does
not preclude Float conforming to Equatable.

Yes, an equivalence relation requires reflexivity. Yes, Equatable
conformance should guarantee an equivalence relation. But, as I stated in
my initial message, one question to be answered is: "(A) Must
`Equatable.==` be a full equivalence relation?" Note the part about `==`.
That much is not settled. My take is: no, the equivalence relation
guaranteed by conformance to Equatable does not need to be spelled `==`.

Reflexivity is actually a really important guarantee to write generic
> code.  Removing it as a guarantee would cripple Equatable.  You couldn’t
> write index(of:). You couldn’t write contains(). You couldn’t write
> Dictionary.  Hashing, in general, would break.
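[The flip side is observable today: because IEEE `==` is not reflexive, the stdlib's Equatable-based collection algorithms already cannot find NaN:]

```swift
// Float/Double's current Equatable conformance already exhibits the
// problem: NaN != NaN, so ==-based algorithms silently miss it.
let values: [Double] = [1.0, .nan, 2.0]

let foundByEquality = values.contains(.nan)        // false: == never matches NaN
let indexByEquality = values.firstIndex(of: .nan)  // nil, for the same reason

// A representation-level predicate is needed to find it:
let foundByPredicate = values.contains { $0.isNaN }  // true
```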
> The closest thing to your starting point is the MostlyEquatable protocol I
> have described. That provides the relation, but also allows for it to fail
> to hold.  We are talking FloatingPoint here, but I honestly think it would
> be useful to a host of more complex types as well, which don’t quite fit
> into Equatable.  We should also keep our current notion of Equatable around
> as well, so types which actually meet it (e.g. Int) don’t have to worry
> about a case which will never happen.
> Both concepts must be exposed in a protocol-based manner to accommodate
>> all use cases. It will not do to say that exposing both concepts will
>> confuse the user, because the fact remains that both concepts are already
>> and unavoidably exposed, but sometimes without a way to express the
>> distinction in code or any documentation about it. Disappearing the notion
>> of partial equivalence from protocols removes legitimate use cases.
>> On the contrary, I am saying we should make the difference explicit.
>> On Oct 26, 2017, at 11:01 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>> On Thu, Oct 26, 2017 at 11:50 AM, Jonathan Hull <jhull at gbis.com> wrote:
>>>> On Oct 26, 2017, at 9:40 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>> On Thu, Oct 26, 2017 at 11:38 AM, Jonathan Hull <jhull at gbis.com> wrote:
>>>>> On Oct 26, 2017, at 9:34 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>> On Thu, Oct 26, 2017 at 10:57 AM, Jonathan Hull <jhull at gbis.com>
>>>>> wrote:
>>>>>> On Oct 26, 2017, at 8:19 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>>> On Thu, Oct 26, 2017 at 07:52 Jonathan Hull <jhull at gbis.com> wrote:
>>>>>>> On Oct 25, 2017, at 11:22 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>>>> On Wed, Oct 25, 2017 at 11:46 PM, Jonathan Hull <jhull at gbis.com>
>>>>>>> wrote:
>>>>>>>> As someone mentioned earlier, we are trying to square a circle
>>>>>>>> here. We can’t have everything at once… we will have to prioritize.  I feel
>>>>>>>> like the precedent in Swift is to prioritize safety/correctness with an
>>>>>>>> option ignore safety and regain speed.
>>>>>>>> I think the 3 point solution I proposed is a good compromise that
>>>>>>>> follows that precedent.  It does mean that there is, by default, a small
>>>>>>>> performance hit for floats in generic contexts, but in exchange for that,
>>>>>>>> we get increased correctness and safety.  This is the exact same tradeoff
>>>>>>>> that Swift makes for optionals!  Any speed lost can be regained by
>>>>>>>> providing a specific override for FloatingPoint that uses ‘&==‘.
>>>>>>> My point is not about performance. My point is that `Numeric.==`
>>>>>>> must continue to have IEEE floating-point semantics for floating-point
>>>>>>> types and integer semantics for integer types, or else existing uses of
>>>>>>> `Numeric.==` will break without any way to fix them. The whole point of
>>>>>>> *having* `Numeric` is to permit such generic algorithms to be written. But
>>>>>>> since `Numeric.==` *is* `Equatable.==`, we have a large constraint on how
>>>>>>> the semantics of `==` can be changed.
>>>>>>> It would also conform to the new protocol and have its Equatable
>>>>>>> conformance deprecated. Once we have conditional conformances, we can add
>>>>>>> Equatable back conditionally.  Also, while we are waiting for that, Numeric
>>>>>>> can provide overrides of important methods when the conforming type is
>>>>>>> Equatable or FloatingPoint.
>>>>>>> For example, if someone wants to write a generic function that works
>>>>>>>> both on Integer and FloatingPoint, then they would have to use the new
>>>>>>>> protocol which would force them to correctly handle cases involving NaN.
>>>>>>> What "new protocol" are you referring to, and what do you mean about
>>>>>>> "correctly handling cases involving NaN"? The existing API of `Numeric`
>>>>>>> makes it possible to write generic algorithms that accommodate both integer
>>>>>>> and floating-point types--yes, even if the value is NaN. If you change the
>>>>>>> definition of `==` or `<`, currently correct generic algorithms that use
>>>>>>> `Numeric` will start to _incorrectly_ handle NaN.
>>>>>>> #1 from my previous email (shown again here):
>>>>>>> Currently, I think we should do 3 things:
>>>>>>>>> 1) Create a new protocol with a partial equivalence relation with
>>>>>>>>> signature of (T, T)->Bool? and automatically conform Equatable things to it
>>>>>>>>> 2) Deprecate Float, etc’s… Equatable conformance with a warning
>>>>>>>>> that it will eventually be removed (and conform Float, etc… to the partial
>>>>>>>>> equivalence protocol)
>>>>>>>>> 3) Provide an '&==‘ relation on Float, etc… (without a protocol)
>>>>>>>>> with the native Float IEEE comparison
>>>>>>> In this case, #2 would also apply to Numeric.  You can think of the
>>>>>>> new protocol as a failable version of Equatable, so in any case where it
>>>>>>> can’t meet equatable’s rules, it returns nil.
>>>>>> Again, Numeric makes possible the generic use of == with
>>>>>> floating-point semantics for floating-point values and integer semantics
>>>>>> for integer values; this design would not.
>>>>>> Correct.  I view this as a good thing, because another way of saying
>>>>>> that is: “it makes possible cases where == sometimes conforms to the rules
>>>>>> of Equatable and sometimes doesn’t."  Under the solution I am advocating,
>>>>>> Numeric would instead allow generic use of '==?’.
>>>>>> I suppose an argument could be made that we should extend ‘&==‘ to
>>>>>> Numeric from FloatingPoint, but then we would end up with the Rust
>>>>>> situation you were talking about earlier…
>>>>> This would break any `Numeric` algorithms that currently use `==`
>>>>> correctly. There are useful guarantees that are common to integer `==` and
>>>>> IEEE floating-point `==`; namely, they each model equivalence of their
>>>>> respective types at roughly what IEEE calls "level 1" (as numbers, rather
>>>>> than as their representation or encoding). Breaking that utterly
>>>>> eviscerates `Numeric`.
>>>>> Nope.  They would continue to work as they always have, but would have
>>>>> a deprecation warning on them.  The authors of those algorithms would have
>>>>> a full deprecation cycle to update the algorithms.  Fixits would be
>>>>> provided to make conversion easier.
>>>> After the deprecation cycle, Numeric would no longer guarantee a
>>>> common "level 1" comparison for conforming types.
>>>> It would, using ==?, you would just be forced to deal with the
>>>> possibility of the Equality relation not holding.  '(a ==? b) == true'
>>>> would mimic the current behavior.
>>> What are the semantic guarantees required of `==?` such that this would
>>> be guaranteed to be the current behavior? How would this be implementable
>>> without being so costly that, in practice, no generic numeric algorithms
>>> would ever use such a facility?
>>> Moreover, if `(a ==? b) == true` guarantees the current behavior for all
>>> types, and all currently Equatable types will conform to this protocol,
>>> haven't you just reproduced the problem seen in Rust's `PartialEq`, only
>>> now with clumsier syntax and poorer performance?
>>> Is it the _purpose_ of this design to make it clumsier and less
>>> performant so people don't use it? If so, to the extent that it is an
>>> effective deterrent, haven't you created a deterrent to the use of Numeric
>>> to an exactly equal extent?

More information about the swift-dev mailing list