[swift-evolution] [Draft] Rename Sequence.elementsEqual

Michael Ilseman milseman at apple.com
Mon Oct 16 11:19:01 CDT 2017



> On Oct 16, 2017, at 7:20 AM, Thorsten Seitz via swift-evolution <swift-evolution at swift.org> wrote:
> 
> 
> 
> Am 16.10.2017 um 07:19 schrieb Xiaodi Wu <xiaodi.wu at gmail.com <mailto:xiaodi.wu at gmail.com>>:
> 
>> On Sun, Oct 15, 2017 at 11:57 PM, Thorsten Seitz <tseitz42 at icloud.com <mailto:tseitz42 at icloud.com>> wrote:
>> 
>> 
>> Am 16.10.2017 um 00:41 schrieb Xiaodi Wu via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>>:
>> 
>>> On Sun, Oct 15, 2017 at 2:32 PM, Kevin Nattinger <swift at nattinger.net <mailto:swift at nattinger.net>> wrote:
>>>> […]
>>>> Sets, as a mathematical concept, have no intrinsic order. However, instances of `Set`, which can be iterated over, *do* have at least one order which can be said to be intrinsic in the following sense: as long as iteration is possible, no API design can prevent that order from being observed and associated with the instance. Put another way, if you can use an instance of a type in a for...in loop, intrinsic to that functionality is a publicly visible order.
>>> 
>>> You keep saying this, I keep saying it’s only a technical “order” that is an implementation detail
>>> 
>>> You keep saying it's an implementation detail, which it emphatically is *not*. It's a *public guarantee* by the protocol that you can use an instance of `Set` in a `for...in` loop, thereby exposing a publicly visible order. An implementation detail is something
>> 
>> Being able to use a Set in a for...in loop does *not* make it ordered! The purpose is is just being able to do something with each element. That a for...loop works sequentially is just a side effect. Just imagine we had parallelized for...in loops.
>> 
>> No, it is not at all a "side effect." A for...in loop is a way of controlling the flow of code which accesses elements in a sequence one after another, and the correct behavior of code inside the loop depends on these semantics. A "parallel for" loop would be a totally different thing; arbitrary for...in loops can't be automatically "upgraded" to a "parallel for" loop because they have different semantics, and types that support "parallel for" would likely have to conform to a protocol other than `Sequence`.
> 
> Exactly.
> 
>>> that could go away with an alternative implementation. By contrast, no implementation that permits an instance of `Set` being iterated over in a `for...in` loop can avoid exposing at least one publicly visible order, because it's not a matter of implementation. Put another way, by allowing iteration, a type necessarily exposes at least one publicly visible order and thereby models a sequence in at least one way.
>> 
>> Wrong. I could easily implement Set or another type to iterate in a random order over its elements, so each iteration would expose a different (meaningless) order.
>> 
>> No, you cannot implement `Set` in this way because it conforms to `Collection`, which guarantees a multi-pass sequence. Set *must expose the same order on every iteration*.
> 
> Set conforming to Collection is even worse than just conforming to Sequence as a quote from the documentation shows: "In addition to the operations that collections inherit from the Sequence protocol, you gain access to methods that depend on accessing an element at a specific position in a collection."
> Clearly the elements of a Set do not have specific positions.
> 

That’s not at all clear to me, could you elaborate? My understanding is that elements of Set definitely *do* have a position, and that’s why you can use an index on a set to retrieve the element. The same index on the same set retrieves the same element.


> Furthermore my argument still stands for types derived from Sequence, so sidestepping it by pointing out that the situation is even worse for the current implementation of Set won't work. Can you explain why a type derived from Sequence which will iterate over its elements in random order should have methods like dropFirst()?
> 
> 
>>  
>> This demonstrates that being able to be used in a for...in loop is about doing somthing with each element and not about element order.
>> 
>> Again, Sequence *already doesn't* require the order to be meaningful or even repeatable.
> 
> I know. That's why I am arguing that certain methods of Sequence make no sense. At least we have reached that common ground. 
> So why do you insist on Sequence having methods like dropFirst() if the notion of "first" does not make any sense?
> 
>> Notice that, above, I said at least one order. A conforming type could
>> expose as many different orders as iterations, but that changes nothing. I can still map an instance of that type to get an array of elements, and that array will reveal some order which is the result of the inner workings of the type.
>> 
>> The latter is an additional property which should be expressed in an additional protocol like Kevin suggested.
>> 
>> What useful generic algorithms would this protocol support that are not already possible?
> 
> It would allow expressing generic algorithms depending on an order.
> 
> -Thorsten
> 
> 
>> 
>> -Thorsten
>> 
>> 
>> 
>>>  
>>> and can’t be relied upon for anything and so we shouldn’t provide methods that rely on it. I think this part of the discussion has reached the “agree to disagree” stage.
>>> 
>>>> […]
>>>> You’re a fan of the principal of least surprise. Tell me, which would be less surprising: Set.dropFirst() actually drops a random element, or Set doesn’t have a dropFirst()? And if you think dropFirst() removing an element at random is not surprising, please explain why.
>>>> 
>>>> I think Set.dropFirst removing the first element that I observe on iteration is the least surprising answer, because Swift tells me that the stdlib Set models a set but that it is also a sequence.
>>> 
>>> Your logic is backwards. You’re saying it’s “least surprising” because that’s how it’s currently implemented, not that it should be implemented that way because it’s least surprising.
>>> 
>>> No, I'm saying it's least surprising because any type that supports iterated access thereby exposes an order--not as an implementation detail but as a matter of public API--and in the absence of any other order, "first" must refer to that order so exposed.
>>>>>>> […]
>>>> 
>>>> And that’s PRECISELY why lexicographicallyEqual does not make sense to apply to unordered sets. There is no lexicographical comparison possible, so why do you keep insisting they should have a method that falsely claims to lexicographically compare them?
>>>> 
>>>> I agree! It doesn't make sense if no comparison is possible! But Swift tells me that a `Set` is a `Sequence`!
>>> 
>>> You keep making the circular argument that a Set should do things because it currently does them. If you want to argue against changing things, argue that things shouldn’t be changed, not that the current implementation is correct because it is the current implementation.
>>> 
>>> No, I'm arguing that `Set`, by supporting iterated access, is not wrong to conform to a protocol called `Sequence` because it does have an intrinsic and publicly observable order, which is not an accident of a particular implementation but is inherent to any type that supports iterated access. Now, whether it's the *best choice* to conform `Set` to `Sequence` and offer order-dependent functions is a matter of taste, but it isn't *wrong*.
>>>> […]
>>>> You will always have to account for this possibility, because Swift's `Equatable` explicitly allows "special values" to be not equal to themselves. This is, at least in part, in order to accommodate the IEEE decree that NaN != NaN:
>>> 
>>>> 
>>>> ```
>>>> let x = [Double.nan]
>>>> x.elementsEqual(x) // false
>>>> ```
>>> 
>>> NaN is special, one-shot and unordered sequences are not. Unless you think that all unordered and single-pass sequences should compare false for `elementsEqual`, this is irrelevant for any sequence that doesn’t contain NaN and well-defined (false) for any that does.
>>> 
>>> Certainly, not all single-pass sequences should compare false to themselves, but some should: for instance, an infinite single-pass stream of all 1's should compare true to itself, but an infinite single-pass stream of alternating 1's and 0's should compare false to itself. If you write generic code that calls `elementsEqual`, it is pervasively incorrect to test for identity by assuming that elementsEqual will return true on reflexive comparison. NaN is only one of many reasons why such code would blow up.
>>>  
>>> 
>>>> Changing this behavior is way beyond the scope of this thread (and has been the topic of hours (actually, weeks and weeks) of fun on this list previously).
>>> 
>>> Yes, I’ve seen the discussion on NaN and Comparable. It’s not the same discussion.
>>> 
>>>> […]
>>>>> It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.
>>>> 
>>>> Wouldn't it then suffice to document, say, that a set's iteration order is the insertion order?
>>> 
>>> Now this actually gave me pause. I guess it does match what I said, but I still take issue with the fact that two Sets could compare `==` but not `elementsEqual`. I think that defining iteration order as insertion order adds a piece of publicly documented state that goes beyond what a Set really is. What you describe is really an OrderedSet, just without the random-access manipulation.
>>> 
>>> a) There is no semantic requirement on the part of `==` to be equivalent to an elementwise comparison when it is defined on a collection; in fact, one could imagine that some exotic sequence might legitimately define equality in a way that has nothing to do with elementwise comparison. Put another way, `==` returning `true` does not imply `elementsEqual` returning `true`, and `elementsEqual` returning `true` does not imply `==` returning `true`. This applies equally to ordered collections and is independent of the question of how to model unordered collections.
>>> 
>>> b) You keep writing that some Foo is really some Bar, but those are really just names. What would be the harm if Swift's `Set` indeed simply models an ordered set without random-access manipulation?
>>> 
>>> I’ll have to mull this over to see if I can come up with a coherent and (more) specific requirement for what makes an Iterable a Sequence, since clearly “documented” isn’t enough.  Perhaps something along the lines that any two Sequences that compare equal must iterate the same.
>>> 
>>>> […]
>>>> Apple documentation calls this one of the "order-dependent" methods. It is surely acceptable for a type that conforms to an order-dependent protocol to have methods that are order-dependent; they do, however, have to be clearly order-dependent to avoid confusion on unordered types.
>>> 
>>> I’m not clear on what you’re trying to get across here. It seems you’re saying unordered types shouldn’t have order-dependent methods, which is exactly what I’ve been arguing.
>>> 
>>> No, I'm saying, essentially, that there are no truly unordered types in Swift; `Set` and `Dictionary` lead double lives modeling unordered collections on the one hand and ordered collections on the other. The order-dependent methods can continue to exist; they just need to be clearly named so that users know when they're using an instance of `Set` in the manner of an unordered collection and when they're using an instance of `Set` in the manner of an ordered collection.
>>>  
>>> 
>>>>  [...]
>>>> Then there are all the methods that imply a specific order of iteration. If the “sequence” is unordered, who knows what you’ll get? It is incredibly easy for an engineer to write a method that implicitly relies on a passed sequence being intrinsically ordered and another engineer to pass it an unordered “sequence.”  The first engineer could neglect to document the dependency, or even not notice it; or the second engineer could just fail to read the documentation thoroughly enough.  There is currently no way for the compiler to enforce passing only an object that is (or at least claims to be) intrinsically ordered.
>>>> 
>>>> It is also incredibly easy for such an engineer to use `for...in` instead to accomplish the same task, generic over ordered and unordered sequences whatever you name such distinguished protocols. I think your beef really still boils down to Set being compatible with `for...in` at all, as Jon acknowledges.
>>> 
>>> Not providing ordered functions for unordered collections makes the developers think about what they actually need. If any object will do, they can use for…in, .makeIterator().next(), or an `anyObject()` we provide as a convenience. If they actually need the first from some specific order, it’s a reminder they need to sort the objects first to get the right one.
>>> 
>>> The whole point of protocol hierarchies is to enable useful generic algorithms. Here, the purpose of having a protocol that unites both ordered and unordered collections is to permit the writing of generic algorithms that operate on both; a user would want the first item from an ordered collection or an arbitrary item (but the same one on multiple passes) from an unordered collection. The name for that is currently `first`. Brent demonstrated a trivial one-line example of such a use.
>>> 
>>> That’s particularly useful for functions that actually need an ordered sequence; using OrderedSequence instead of Iterable (just as placeholders) would be a firm reminder not to pass in an unordered collection.
>>> 
>>>>> […]
>>>> 
>>>> 
>>>> As I said, you're welcome to tackle the protocol hierarchy, but I really doubt it's within the realm of realistic endpoints for Swift 5. I'm just trying to propose a narrowly targeted pragmatic solution to one specific limited harm that might be deliverable by the next point release. As a great man once said, Swift is a pragmatic language.
>>> 
>>> If you want a pragmatic solution, fix the bug in functionality, don’t try and rename the method to something obscure to cover it up.
>>> 
>>> What I'm arguing is that there *is no bug in functionality*, only a naming problem. It is true that the current protocol hierarchy would not be my preferred design, but that's a matter of taste in terms of, again, where to draw the line between too much modeling or not enough. But that's not tantamount to a *bug*.
>>>  
>>> If you want to limit the harm, override `equalObjects` on unordered sequences to use `==` (very strongly preferred), or always `false` (less desirable, but at least consistent)
>>> 
>>>>> […]
>>> 
>>>> 
>>>> The Swift stdlib deliberately eschews modeling everything in protocol hierarchies with the highest level of granularity. There's some fudging, deliberately, to find a happy medium between obtuse and approachable, between too many/too specialized and not enough. For example, I pushed for protocols such as `Field` and `Ring` at the top of the numeric hierarchy, which might allow complex number types to slot into the hierarchy more sensibly, for example. But we have a compromise protocol `Numeric` which doesn't quite have the same guarantees but is much more approachable. Notice that we also don't break down numeric protocols into `Addable`, `Subtractable`, etc.; we also have that fudge factor built into `Equatable`, as I mentioned.
>>> 
>>> Eh, one or two corner cases on a protocol is probably fine. What’s not fine is over half (Sequence) or almost all (Collection) the methods not being applicable.  There is a very clear gap there. We don’t need to fix everything, but this is something that can and should be addressed.
>>> 
>>> This would be based on the premise that an instance of `Set` has no intrinsic order; I disagree for the reasons above.
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org <mailto:swift-evolution at swift.org>
>>> https://lists.swift.org/mailman/listinfo/swift-evolution <https://lists.swift.org/mailman/listinfo/swift-evolution>
>> 
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20171016/c8f68679/attachment.html>


More information about the swift-evolution mailing list