[swift-dev] Rationalizing FloatingPoint conformance to Equatable

Fri Oct 27 01:09:01 CDT 2017

> On Oct 26, 2017, at 8:16 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
> 
> On Thu, Oct 26, 2017 at 4:34 PM, Jonathan Hull <jhull at gbis.com> wrote:
> 
>> On Oct 26, 2017, at 11:47 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>> 
>> On Thu, Oct 26, 2017 at 1:30 PM, Jonathan Hull <jhull at gbis.com> wrote:
>> Now you are just being rude. We all want Swift to be awesome… let’s try to keep things civil.
>> 
>> Sorry if my reply came across that way! That wasn't at all the intention. I really mean to ask you those questions and am interested in the answers:
> 
> Thank you for saying that. I haven’t been sleeping well, so I am probably a bit grumpy.
> 
>> Unless I misunderstand, you're arguing that your proposal is superior to Rust's design because of a new operator that returns `Bool?` instead of `Bool`; if so, how is it that you haven't reproduced Rust's design problem, only with the additional syntax involved in unwrapping the result?
> 
> Two things:
> 
> 1) PartialEq was available in generic contexts and it provided the IEEE comparison. Our IEEE comparison (which I am calling ‘&==’ for now) is not available in generic contexts beyond FloatingPoint. If we were to have this in a generic context beyond FloatingPoint, then we would end up with the same issue that Rust had.
> 
> What I'm saying is that we *must* have this available in generic contexts beyond FloatingPoint, such as on Numeric, for reasons I've described and which I'll elaborate on shortly.

I disagree pretty strongly with this. 

I get that that is your point of view, but I really don’t think it is possible to have everything here at the same time.  Nothing prevents you from adding this conformance in your own code (though I wouldn’t recommend it).

> 2) It is actually semantically different. This MostlyEquatable protocol returns nil when the guarantees of the relation would be violated… and the author has to decide what to do with that.  Depending on the use case, the best course of action may be to: treat it as false, trap, throw, or branch.  Swift coders are used to this type of decision when encountering optionals. 
> 
> 
>> And if, as I understand, your argument is that your design is superior to Rust's *because* it requires unwrapping, then isn't the extent to which people will avoid using the protocol unintentionally also equally and unavoidably the same extent to which it makes Numeric more cumbersome?
> 
> It isn’t that unwrapping is meant to be a deterrent, it is that there are cases where the Equivalence relation may fail to hold, and the programmer needs to deal with those (when working in a generic context).  Failure to do so leads to subtle bugs.
> 
> Numeric has to use ‘==?’ because there are cases where the relation will fail. I’d love for it to conform to Equatable, but it really doesn’t if you look at it honestly, because it can run into cases where reflexivity doesn’t hold, and we have to deal with those cases.
> 
> Well, it's another thing entirely if you want Numeric not to be Equatable (or, by that token, Comparable). Yes, it'd be correct, but that'd be a surprising and user-hostile design.

Yes, that is what I am saying. Numeric can’t actually conform to Equatable (without lying), so let’s be up front about it.  It does, however, conform to this new idea of MostlyEquatable, so we can use that for our generic needs.  MostlyEquatable semantically provides everything Equatable does… but with the extra possibility that the relation may not hold (it actually gives you additional information).  Everything that is possible with Equatable is also possible with MostlyEquatable (just not with the same number of machine instructions).
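
To make the shape of the idea concrete, here is a minimal sketch of what such a protocol could look like. It is only an illustration under stated assumptions: `MostlyEquatable` and `==?` are the hypothetical names used in this thread, not an existing standard library API, and the `Double` and `Int` conformances are written by hand here because "automatically conform Equatable things" is not expressible in today's Swift.

```swift
// Hypothetical operator from this thread; it is not part of the standard library.
infix operator ==? : ComparisonPrecedence

// Hypothetical protocol: a failable counterpart to Equatable.
protocol MostlyEquatable {
    /// Returns nil when the equivalence relation cannot be upheld for these operands.
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        // NaN compared with NaN is where reflexivity fails, so report nil.
        if lhs.isNaN && rhs.isNaN { return nil }
        // Assumption: the built-in IEEE `==` stands in for the proposed `&==`.
        return lhs == rhs
    }
}

extension Int: MostlyEquatable {
    // Int is genuinely Equatable, so the relation always holds.
    static func ==? (lhs: Int, rhs: Int) -> Bool? {
        return lhs == rhs
    }
}

let x: Double = 1.5
let three: Int = 3
let four: Int = 4

let same = x ==? x                           // the relation holds
let mixed = Double.nan ==? x                 // the relation holds
let undefined = Double.nan ==? Double.nan    // the relation fails
let ints = three ==? four                    // always non-nil for Int
```

The caller gets the usual answer wherever the relation holds, plus one extra piece of information (nil) in the case where it does not.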

Everything I have said here applies to Comparable as well, and I have a similar solution in mind that I didn’t want to clutter the discussion with.

I also want to point out that you still have full speed in both Equatable contexts and in FloatingPoint contexts. It is just in generic code that mixes the two that we have some inefficiency because of the differing guarantees. This is true of generic code in general.

> As I said above, the typical ways to handle that nil would be: treat it as false, trap, throw, or branch.  The current behavior is equivalent to “treat it as false”, and yes, that is the right thing for some algorithms (and you can still do that). But there are also lots of algorithms that need to trap or throw on NaN, or branch to handle it differently.  The current behavior also silently fails, which is why the bugs are so hard to track down.
> 
> That is inherent to the IEEE definition of "quiet NaN": the operations specified in that standard are required to silently accept NaN.
> 
> Premature optimization is the root of all evil.
> 
> 
>> You said it was impossible, so I gave you a very quick example showing that the current behavior was still possible.  I wasn’t recommending that everyone should only ever use that example for all things.
>> 
>> For FloatingPoint, ‘(a &== b) == true’ would mimic the current behavior (bugs and all). It may not hold for all types.
> 
> Oops, that should be ‘==?’ (which returns an optional).  I am getting tired, it is time for bed.
> 
> 
>> No, the question was how it would be possible to have these guarantees hold for `Numeric`, not merely for `FloatingPoint`, as the purpose is to use `Numeric` for generic algorithms. This requires additional semantic guarantees on what you propose to call `&==`.
> 
> Well, they hold for FloatingPoint and anything which is actually Equatable. Those are the only things I can think of that conform to Numeric right now, but I can’t guarantee that someone won’t later add a type to Numeric which also fails to actually conform to Equatable in some different way.
> 
> To be fair, anything that breaks this would also break current algorithms on Numeric anyway.
> 
> This doesn't answer my question. If `(a ==? b) == true` is the only way to spell what's currently spelled `==` in a generic context, then `Numeric` must make such semantic guarantees as are necessary to guarantee that this spelling behaves in that way for all conforming types, or else it would not be possible to write generic numeric algorithms that operate on any `Numeric`-conforming type. What would those guarantees have to be? 

You don’t have those guarantees now.  

‘(a ==? b) == true’ is one possible way to get the current behavior for FloatingPoint.  It should hold for all FloatingPoint.  It should hold for all Numeric things which are FloatingPoint or Integer (or anything Equatable).  But if someone comes up with a new exotic type *which doesn’t conform properly to Equatable*, then all bets are off.  But then it would also break current code assuming the current IEEE behavior…

But let’s say you have an algorithm you are certain is free of NaNs (maybe you filter them at an earlier stage).  Well then you could say ‘(a ==? b)!’.  An easy argument could also be made for allowing ‘a ==! b’ so you don’t have to wrap/unwrap.

Or you might use ‘guard let’ to have an early exit when NaN == NaN is discovered.

There are also other ways to get the current behavior. For example, you could cast to FloatingPoint and use ‘&==’ directly.
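
The options above (treat as false, throw, branch, or trap) can be sketched roughly as follows. This is an illustration, not a definitive design: `==?` is the hypothetical operator from this thread, implemented here as a free function on `Double` using the built-in IEEE `==`, and the helper names (`equalOrFalse`, `equalOrThrow`, `describe`, `RelationFailed`) are made up for the example.

```swift
// Hypothetical operator from this thread; it is not part of the standard library.
infix operator ==? : ComparisonPrecedence

// Stand-in implementation for Double: nil when both operands are NaN.
func ==? (lhs: Double, rhs: Double) -> Bool? {
    if lhs.isNaN && rhs.isNaN { return nil }
    return lhs == rhs
}

// Hypothetical error type for the "throw" strategy.
struct RelationFailed: Error {}

// 1) Treat nil as false: mimics the current behavior of == on floats.
func equalOrFalse(_ a: Double, _ b: Double) -> Bool {
    return (a ==? b) == true
}

// 2) Throw: exit early with an error when the relation does not hold.
func equalOrThrow(_ a: Double, _ b: Double) throws -> Bool {
    guard let result = a ==? b else { throw RelationFailed() }
    return result
}

// 3) Branch: handle the failed case as its own outcome.
func describe(_ a: Double, _ b: Double) -> String {
    switch a ==? b {
    case .some(true): return "equal"
    case .some(false): return "not equal"
    case .none: return "undefined"
    }
}

// 4) Trap: `(a ==? b)!` would crash on NaN vs NaN, so it only suits
//    code that has already filtered NaN out.
```

Each strategy is a deliberate choice made at the call site, which is the point being argued for.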

>> The whole point is that you have to put thought into how you want to deal with the optional case where the relation’s guarantees have failed.
>> 
>> If you need full performance, then you would have separate overrides on Numeric for members which conform to FloatingPoint (where you could use &==) and Equatable (where you could use ==). As you get more generic, you lose opportunities for optimization. That is just the nature of generic code. The nice thing about Swift is that you have an opportunity to specialize if you want to optimize more. Once things like conditional conformances come online, all of this will be nicer, of course.
>> 
>> This is a non-starter then. Protocols must enable useful generic code. What you're basically saying is that you do not intend for it to be possible to use methods on `Numeric` to ask about level 1 equivalence in a way that would not be prohibitively expensive. This, again, eviscerates the purpose of `Numeric`.
> 
> I don’t consider it “prohibitively expensive”.  I mean, dictionaries return an optional.  Lots of things return optionals.  I have to deal with them all over the place in Swift code.
> 
> I think the tradeoff between code that is quicker to write and code that is more performant is completely reasonable.  Ideally everything would happen instantly, but we really can’t get away from making *some* tradeoffs here.
> 
> If I just need something that works, I can use ==? and handle the nil cases.  If unwrapping an optional is untenable from a speed perspective in a particular case for some reason, then I think it is completely reasonable to have the author additionally write optimized versions specializing based on additional information which is known (e.g. FloatingPoint or Equatable).
> 
> No, it's not the cost of unwrapping the result, it's the cost of computing the result, which is much higher than the single machine instruction that is IEEE floating-point equivalence. The point of `Numeric` is to make it possible to write generic algorithms that do meaningful math with either integer or floating-point types. If the only way to write such an algorithm with reasonable performance is to specialize one version for integers and another for floating-point values, then `Numeric` serves no purpose as a protocol.

Well, the naive implementation of ==? for floats would be:

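	static func ==? (lhs: Self, rhs: Self) -> Bool? {
		// Reflexivity would fail here (NaN is not equal to itself), so report that the relation does not hold
		if lhs.isNaN && rhs.isNaN {return nil}
		// Otherwise defer to the proposed IEEE comparison
		return lhs &== rhs
	}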

But we might very easily be able to play compiler tricks to speed that up in certain cases.  For example, we could have some underscored subtype of Float or compiler annotation when the compiler can reason it won’t be NaN (e.g. constants or floats created from literals).  In those cases, it could just use the machine version directly. At the very least, comparing against literals should be able to retain single-instruction status.  The programmer shouldn’t have to worry about that though.

I don’t think it is reasonable to expect a single machine instruction in all generic contexts.  Faster is better, but the nature of generic code is that you have to accept some inefficiency in exchange for being able to write code once across multiple types with varying guarantees.  My main point was that much/all of the efficiency can be reclaimed where needed by doing extra programming work.
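
As a rough illustration of reclaiming efficiency with extra work, one could write a generic version against the failable relation and add a more constrained overload for floating-point types. This is a sketch under assumptions: `MostlyEquatable` and `==?` are the hypothetical names from this thread, `occurrences(of:in:)` is a made-up helper, and the built-in IEEE `==` stands in for the proposed `&==`.

```swift
// Hypothetical operator and protocol from this thread; not standard library API.
infix operator ==? : ComparisonPrecedence

protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs == rhs
    }
}

extension Int: MostlyEquatable {
    static func ==? (lhs: Int, rhs: Int) -> Bool? { return lhs == rhs }
}

// Generic version: works for anything MostlyEquatable, treating nil as "no match".
func occurrences<T: MostlyEquatable>(of target: T, in values: [T]) -> Int {
    var count = 0
    for value in values where (value ==? target) == true { count += 1 }
    return count
}

// More constrained overload: chosen for floating-point types at concrete call sites,
// and uses the plain IEEE comparison directly.
func occurrences<T: MostlyEquatable & FloatingPoint>(of target: T, in values: [T]) -> Int {
    var count = 0
    for value in values where value == target { count += 1 }
    return count
}

let doubles: [Double] = [1.0, 2.0, 2.0, .nan]
let ints: [Int] = [1, 2, 2, 3]

let doubleCount = occurrences(of: 2.0, in: doubles)
let nanCount = occurrences(of: Double.nan, in: doubles)
let intCount = occurrences(of: 2, in: ints)
```

The call sites look the same either way; only the library author has to decide whether the specialized overload is worth writing.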

Also, even with ==? instead of ==, Numeric is far from useless.  For example, we can generically create math formulas using +, -, and *.  In fact, if Numeric’s only utility was ==, we would spell it Equatable.
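
For instance, a formula written only in terms of +, -, and * needs no comparison at all. The following is a small sketch (the function name `evaluatePolynomial` is made up for illustration) that evaluates a polynomial for any `Numeric` type using Horner's method:

```swift
// Evaluates c0 + c1*x + c2*x^2 + ... using only Numeric's arithmetic.
// Coefficients are given in ascending order of power.
func evaluatePolynomial<T: Numeric>(_ coefficients: [T], at x: T) -> T {
    var result: T = 0
    // Horner's method: fold from the highest power down.
    for coefficient in coefficients.reversed() {
        result = result * x + coefficient
    }
    return result
}

let intResult = evaluatePolynomial([1, 2, 3], at: 2)          // 1 + 2*2 + 3*4
let doubleResult = evaluatePolynomial([0.5, 1.5], at: 2.0)    // 0.5 + 1.5*2
```

The same code serves integers and floating-point values, and a NaN input simply propagates through the arithmetic.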

Finally, once features from the generics manifesto come online, it might be possible to regain Equatable conformance in some cases and not others.  So, for example, you would be able to write == against a literal, but would still have to use ==? when both sides could be NaN.  That is for the future though...

>> Note that I am mostly talking about library code here.  Once you build up a library of functions on Numeric that handle this correctly, you can use those functions as building blocks, and you aren’t even worrying about == for the most part.  For example, if we build a version of index(of:) on Collection which works for our MostlyEquatable protocol, then we can pass Numeric to it generically.  Whether they decided it was important enough to put in an optimization for FloatingPoint or not, it doesn’t affect the way we call it.  It could even have only a generic version for years, and then gain an optimization later if it became important.
> 
> 
> You cannot do this for most collection algorithms, because they are mostly protocol extension methods that can be shadowed but not overridden. But again, that's not what I'm talking about. I'm talking about writing _generic numeric algorithms_, not using numeric types with generic collection algorithms.

Well, for something like index(of:) it would actually be using FloatingPoint’s notion of '==?’.  Working with Numeric would just fall out for free.
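
A sketch of what such a building block might look like, assuming the hypothetical `MostlyEquatable` protocol and `==?` operator from this thread. The method name `firstIndex(mostlyEqualTo:)` is invented for the example, and treating nil as "no match" is just one of the possible policies:

```swift
// Hypothetical operator and protocol from this thread; not standard library API.
infix operator ==? : ComparisonPrecedence

protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs == rhs
    }
}

extension Collection where Element: MostlyEquatable {
    /// Returns the first index whose element is definitely equal to `target`.
    /// Positions where the relation fails (nil) are skipped in this sketch.
    func firstIndex(mostlyEqualTo target: Element) -> Index? {
        for index in indices where (self[index] ==? target) == true {
            return index
        }
        return nil
    }
}

let samples: [Double] = [1.0, .nan, 3.0]
let found = samples.firstIndex(mostlyEqualTo: 3.0)
let missing = samples.firstIndex(mostlyEqualTo: Double.nan)
```

A different version could instead return early or throw when it meets nil; callers of the method would not need to change.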

As for writing generic numeric algorithms, my point was that you can use the building blocks of other algorithms written for Numeric. But nothing is stopping you writing code on Numeric which does everything it does now (just using ==? and handling the possibility of nil).  That may not always get you code which boils down to a single machine instruction, but that is true of generic code in general.  If performance is critical, then you have the option to optimize on top of the generic version.

As I said above, there are also things the compiler can do here in the generic case, so I don’t think the situation is as dire as you say.

>> The point I'm making here, again, is that there are legitimate uses for `==` guaranteeing partial equivalence in the generic context. The approximation being put forward over and over is that generic code always requires full equivalence and concrete floating-point code always requires IEEE partial equivalence. That is _not true_. Some generic code (for instance, that which uses `Numeric`) relies on partial equivalence semantics and some floating-point code can nonetheless benefit from a notion of full equivalence.
> 
> I mean, it would be nice if Float could truly conform to Equatable, but it would also be nice if I didn’t have to check for null pointers.  It would certainly be faster if instead of unwrapping optionals, I could just use pointers directly.  It would even work most of the time… because I would be careful to remember to add checks where they were really important… until I forget, and then there is a bug!  This kind of premature optimization has cost our economy literally trillions of dollars.
> 
> We have optionals for exactly this reason in Swift.  It forces us to take those things which will “work fine most of the time”, and consider the case where it won’t.  I know it is slightly faster not to consider that case, but that is exactly why this is a notorious source of bugs.
> 
> You write as though it's a foregone conclusion that Float cannot conform to Equatable. I disagree. My starting point is that Float *can*--and in fact *must*--conform to Equatable; the question I'm asking is, how must Equatable be designed such that this can be possible?

Equatable conformance (and equivalence relations in general) requires reflexivity.  IEEE floating-point comparison is not reflexive (NaN is not equal to itself).  QED.

Reflexivity is actually a really important guarantee for writing generic code.  Removing it as a guarantee would cripple Equatable.  You couldn’t write index(of:). You couldn’t write contains(). You couldn’t write Dictionary.  Hashing in general would break.
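
A short demonstration of how today's Equatable-based algorithms behave when reflexivity fails, using only current standard library API:

```swift
let nan = Double.nan
let values: [Double] = [1.0, nan, 3.0]

// IEEE comparison: NaN is not equal to itself.
let reflexive = (nan == nan)

// Algorithms built on == cannot find an element that is plainly there.
let containsNaN = values.contains(nan)
let indexOfNaN = values.firstIndex(of: nan)

// The element can only be located by asking about NaN explicitly.
let explicitIndex = values.firstIndex(where: { $0.isNaN })
```

Here `reflexive` and `containsNaN` are false and `indexOfNaN` is nil, even though the array holds a NaN at index 1.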

The closest thing to your starting point is the MostlyEquatable protocol I have described. That provides the relation, but also allows for it to fail to hold.  We are talking FloatingPoint here, but I honestly think it would be useful to a host of more complex types as well, which don’t quite fit into Equatable.  We should also keep our current notion of Equatable around as well, so types which actually meet it (e.g. Int) don’t have to worry about a case which will never happen.

>> Both concepts must be exposed in a protocol-based manner to accommodate all use cases. It will not do to say that exposing both concepts will confuse the user, because the fact remains that both concepts are already and unavoidably exposed, but sometimes without a way to express the distinction in code or any documentation about it. Disappearing the notion of partial equivalence from protocols removes legitimate use cases.
> 
> On the contrary, I am saying we should make the difference explicit.
> 
> 
>> 
>>> On Oct 26, 2017, at 11:01 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>> 
>>> On Thu, Oct 26, 2017 at 11:50 AM, Jonathan Hull <jhull at gbis.com> wrote:
>>> 
>>>> On Oct 26, 2017, at 9:40 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>> 
>>>> On Thu, Oct 26, 2017 at 11:38 AM, Jonathan Hull <jhull at gbis.com> wrote:
>>>> 
>>>>> On Oct 26, 2017, at 9:34 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>> 
>>>>> On Thu, Oct 26, 2017 at 10:57 AM, Jonathan Hull <jhull at gbis.com> wrote:
>>>>> 
>>>>>> On Oct 26, 2017, at 8:19 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> On Thu, Oct 26, 2017 at 07:52 Jonathan Hull <jhull at gbis.com> wrote:
>>>>>>> On Oct 25, 2017, at 11:22 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>>>> 
>>>>>>> On Wed, Oct 25, 2017 at 11:46 PM, Jonathan Hull <jhull at gbis.com> wrote:
>>>>>>> As someone mentioned earlier, we are trying to square a circle here. We can’t have everything at once… we will have to prioritize.  I feel like the precedent in Swift is to prioritize safety/correctness with an option to ignore safety and regain speed.
>>>>>>> 
>>>>>>> I think the 3 point solution I proposed is a good compromise that follows that precedent.  It does mean that there is, by default, a small performance hit for floats in generic contexts, but in exchange for that, we get increased correctness and safety.  This is the exact same tradeoff that Swift makes for optionals!  Any speed lost can be regained by providing a specific override for FloatingPoint that uses ‘&==’.
>>>>>>> 
>>>>>>> My point is not about performance. My point is that `Numeric.==` must continue to have IEEE floating-point semantics for floating-point types and integer semantics for integer types, or else existing uses of `Numeric.==` will break without any way to fix them. The whole point of *having* `Numeric` is to permit such generic algorithms to be written. But since `Numeric.==` *is* `Equatable.==`, we have a large constraint on how the semantics of `==` can be changed. 
>>>>>> 
>>>>>> It would also conform to the new protocol and have its Equatable conformance deprecated. Once we have conditional conformances, we can add Equatable back conditionally.  Also, while we are waiting for that, Numeric can provide overrides of important methods when the conforming type is Equatable or FloatingPoint.
>>>>>> 
>>>>>> 
>>>>>>> For example, if someone wants to write a generic function that works both on Integer and FloatingPoint, then they would have to use the new protocol which would force them to correctly handle cases involving NaN.
>>>>>>> 
>>>>>>> What "new protocol" are you referring to, and what do you mean about "correctly handling cases involving NaN"? The existing API of `Numeric` makes it possible to write generic algorithms that accommodate both integer and floating-point types--yes, even if the value is NaN. If you change the definition of `==` or `<`, currently correct generic algorithms that use `Numeric` will start to _incorrectly_ handle NaN.
>>>>>> 
>>>>>> 
>>>>>> #1 from my previous email (shown again here):
>>>>>>>> Currently, I think we should do 3 things:
>>>>>>>> 
>>>>>>>> 1) Create a new protocol with a partial equivalence relation with signature of (T, T)->Bool? and automatically conform Equatable things to it
>>>>>>>> 2) Deprecate the Equatable conformance of Float, etc. with a warning that it will eventually be removed (and conform Float, etc. to the partial equivalence protocol)
>>>>>>>> 3) Provide an ‘&==’ relation on Float, etc. (without a protocol) with the native Float IEEE comparison
>>>>>> 
>>>>>> 
>>>>>> In this case, #2 would also apply to Numeric.  You can think of the new protocol as a failable version of Equatable, so in any case where it can’t meet equatable’s rules, it returns nil.
>>>>>> 
>>>>>> Again, Numeric makes possible the generic use of == with floating-point semantics for floating-point values and integer semantics for integer values; this design would not.
>>>>> 
>>>>> Correct.  I view this as a good thing, because another way of saying that is: “it makes possible cases where == sometimes conforms to the rules of Equatable and sometimes doesn’t.”  Under the solution I am advocating, Numeric would instead allow generic use of ‘==?’.
>>>>> 
>>>>> I suppose an argument could be made that we should extend ‘&==’ to Numeric from FloatingPoint, but then we would end up with the Rust situation you were talking about earlier…
>>>>> 
>>>>> This would break any `Numeric` algorithms that currently use `==` correctly. There are useful guarantees that are common to integer `==` and IEEE floating-point `==`; namely, they each model equivalence of their respective types at roughly what IEEE calls "level 1" (as numbers, rather than as their representation or encoding). Breaking that utterly eviscerates `Numeric`.
>>>> 
>>>> Nope.  They would continue to work as they always have, but would have a deprecation warning on them.  The authors of those algorithms would have a full deprecation cycle to update the algorithms.  Fixits would be provided to make conversion easier.
>>>> 
>>>> After the deprecation cycle, Numeric would no longer guarantee a common "level 1" comparison for conforming types.
>>> 
>>> It would, using ==?, you would just be forced to deal with the possibility of the Equality relation not holding.  '(a ==? b) == true' would mimic the current behavior.
>>> 
>>> What are the semantic guarantees required of `==?` such that this would be guaranteed to be the current behavior? How would this be implementable without being so costly that, in practice, no generic numeric algorithms would ever use such a facility?
>>> 
>>> Moreover, if `(a ==? b) == true` guarantees the current behavior for all types, and all currently Equatable types will conform to this protocol, haven't you just reproduced the problem seen in Rust's `PartialEq`, only now with clumsier syntax and poorer performance?
>>> 
>>> Is it the _purpose_ of this design to make it clumsier and less performant so people don't use it? If so, to the extent that it is an effective deterrent, haven't you created a deterrent to the use of Numeric to an exactly equal extent?
