[swift-evolution] [Draft] Rename Sequence.elementsEqual

Xiaodi Wu xiaodi.wu at gmail.com
Tue Oct 17 07:44:10 CDT 2017

On Tue, Oct 17, 2017 at 00:56 Thorsten Seitz <tseitz42 at icloud.com> wrote:

> Am 17.10.2017 um 00:13 schrieb Xiaodi Wu <xiaodi.wu at gmail.com>:
> On Mon, Oct 16, 2017 at 14:21 Thorsten Seitz <tseitz42 at icloud.com> wrote:
>> Am 16.10.2017 um 16:20 schrieb Xiaodi Wu via swift-evolution <
>> swift-evolution at swift.org>:
>> On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull at gbis.com> wrote:
>>> On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>> On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull at gbis.com> wrote:
>>>> On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>>  That ordering can be arbitrary, but it shouldn’t leak internal
>>>>>> representation such that the method used to create identical things affects
>>>>>> the outcome of generic methods because of differences in internal
>>>>>> representation.
>>>>>>>  It would be better to say that the iteration order is well-defined.
>>>>>>> That will almost always mean documented, and usually predictable though
>>>>>>> obviously e.g. RNGs and iterating in random order will not be predictable
>>>>>>> by design.
>>>>>>>> That's actually more semantically constrained than what Swift calls
>>>>>>>> a `Collection` (which requires conforming types to be multi-pass and(?)
>>>>>>>> finite). By contrast, Swift's `SpongeBob` protocol explicitly permits
>>>>>>>> conforming single-pass, infinite, and/or unordered types.
>>>>>>>> I think you’re talking about Sequence here, I’ve lost track of your
>>>>>>>> nonsense by now. Yes, the current Swift protocol named Sequence allows
>>>>>>>> unordered types. You seem to keep asserting that but not actually
>>>>>>>> addressing my argument, which is *that allowing Sequences to be
>>>>>>>> unordered with the current API is undesired and actively harmful, and
>>>>>>>> should* *therefore** be changed*.
>>>>>>> What is harmful about it?
>>>>>>> After thinking about it, I think the harmful bit is that unordered
>>>>>>> sequences are leaking internal representation (In your example, this is
>>>>>>> causing people to be surprised when two sets with identical elements are
>>>>>>> generating different sequences/orderings based on how they were created).
>>>>>>> You are correct when you say that this problem is even true for for-in.
>>>>>> I would not say it is a problem. Rather, by definition, iteration
>>>>>> involves retrieving one element after another; if you're allowed to do that
>>>>>> with Set, then the elements of a Set are observably ordered in some way.
>>>>>> Since it's not an OrderedSet--i.e., order doesn't matter--then the only
>>>>>> sensible conclusion is that the order of elements obtained in a for...in
>>>>>> loop must be arbitrary. If you think this is harmful, then you must believe
>>>>>> that one should be prohibited from iterating over an instance of Set.
>>>>>> Otherwise, Set is inescapably a Sequence by the Swift definition of
>>>>>> Sequence. All extension methods on Sequence like drop(while:) are really
>>>>>> just conveniences for common things that you can do with iterated access;
>>>>>> to my mind, they're essentially just alternative ways of spelling various
>>>>>> for...in loops.
>>>>>> I think an argument could be made that you shouldn’t be able to
>>>>>> iterate over a set without first defining an ordering on it (even if that
>>>>>> ordering is somewhat arbitrary).  Maybe we have something like a
>>>>>> “Sequenc(e)able” protocol which defines things which can be turned into a
>>>>>> sequence when combined with some sort of ordering.  One possible ordering
>>>>>> could be the internal representation (At least in that case we are calling
>>>>>> it out specifically).  If I had to say
>>>>>> “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely
>>>>>> be less surprised when it returns false even though setA == setB.
>>>>> Well, that's a totally different direction, then; you're arguing that
>>>>> `Set` and `Dictionary` should not conform to `Sequence` altogether. That's
>>>>> fine (it's also a direction that some of us explored off-list a while ago),
>>>>> but at this point in Swift's evolution, realistically, it's not within the
>>>>> realm of possible changes.
>>>>> I am actually suggesting something slightly different.  Basically, Set
>>>>> and Dictionary’s conformance to Collection would have a different
>>>>> implementation.  They would conform to another protocol declaring that they
>>>>> are unordered. That protocol would fill in part of the conformance to
>>>>> sequence/collection using a default ordering, which is mostly arbitrary,
>>>>> but guaranteed to produce the same ordering for the same list of elements
>>>>> (even across collection types).  This would be safer, but a tiny bit slower
>>>>> than what we have now (We could also potentially develop a way for
>>>>> collections like set to amortize the cost). For those who need to recover
>>>>> speed, the new protocol would also define a property which quickly returns
>>>>> a sequence/iterator using the internal ordering (I arbitrarily called it
>>>>> .arbitraryOrder).
>>>>> I believe it would not be source breaking.
>>>> That is indeed something slightly different.
>>>> In an ideal world--and my initial understanding of what you were
>>>> suggesting--Set and Dictionary would each have a member like `collection`,
>>>> which would expose the underlying data as a `SetCollection` or
>>>> `DictionaryCollection` that in turn would conform to `Collection`;
>>>> meanwhile, Set and Dictionary themselves would not offer methods such as
>>>> `prefix`, or indexing by subscript, which are not compatible with being
>>>> unordered. For those who want a particular ordering, there'd be something
>>>> like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) ->
>>>> {Set|Dictionary}Collection`.
>>>> What you suggest here instead would be minimally source-breaking.
>>>> However, I'm unsure of where these guarantees provide benefit to justify
>>>> the performance cost. Certainly not for `first` or `dropFirst(_:)`, which
>>>> still yields an arbitrary result which doesn't make sense for something
>>>> _unordered_. We *could* have an underscored customization point named
>>>> something like `_customOrderingPass` that is only invoked from
>>>> `elementsEqual` or other such methods to pre-rearrange the internal
>>>> ordering of unordered collections in some deterministic way before
>>>> comparison. Is that what you have in mind?
>>>> Something like that.  Whatever we do, there will be a tradeoff between
>>>> speed, correctness, and ergonomics.
>>>> My suggestion trades speed for correctness, and provides a way to
>>>> recover speed through additional typing (which is slightly less ergonomic).
>>> You haven't convinced me that this is at all improved in "correctness."
>>> It trades one arbitrary iteration order for another on a type that tries to
>>> model an unordered collection.
>>>> We could do something like you suggest. I don’t think the method would
>>>> need to be underscored… the ordering pass could just be a method on the
>>>> protocol which defines it as unordered.  Then we could provide a special
>>>> conformance for things where order really matters based on adherence to
>>>> that protocol.  That might be an acceptable tradeoff.  It would give us
>>>> speed at the cost of having the correct implementation being less ergonomic
>>>> and more error prone (you have to remember to check that it is unordered
>>>> and call the ordering method when it mattered).
>>>> I’d still be a bit worried that people would make incorrect generic
>>>> algorithms based on expecting an order from unordered things, but at least
>>>> it would be possible for them check and handle it correctly.  I think I
>>>> could get behind that tradeoff/compromise, given where we are in the swift
>>>> process and Swift's obsession with speed (though I still slightly prefer
>>>> the safer default).  At least the standard library would handle all the
>>>> things correctly, and that is what will affect the majority of programmers.
>>> What is an example of such an "incorrect" generic algorithm that would
>>> be made correct by such a scheme?
>>> To start with, the one you gave as an example at the beginning of this
>>> discussion: Two sets with identical elements which have different internal
>>> storage and thus give different orderings as sequences.  You yourself have
>>> argued that the confusion around this is enough of a problem that we need
>>> to make a source-breaking change (renaming it) to warn people that the
>>> results of the ‘elementsEqual’ algorithm are undefined for sets and
>>> dictionaries.
>> No, I am arguing that the confusion about ‘elementsEqual’ is foremost a
>> problem with its name; the result of this operation is not at all undefined
>> for two sets but actually clearly defined: it returns true if two sets have
>> the same elements in the same iteration order, which is a publicly
>> observable behavior of sets (likewise dictionaries).
>> But it is a behavior which has absolutely no meaning at all because the
>> order does not depend on the elements of the set but on the history of how
>> the set has been reached its current state.
>> So why should I ever use this method on a set?
>> What is the use case?
> One example: you can use it to check an instance of Set<Float> to
> determine if it has a NaN value. (The “obvious” way of doing it is not
> guaranteed to work since NaN != NaN.)
> How would I do that? I'd rather expect to use a property isNaN on Float to
> do that.


I don’t see why a non-source-breaking change is suddenly off-limits.
>>> But more than that, any generic algorithm which is assuming that the
>>> sequence is coming from an ordered source (i.e. many things using
>>> first/last).  Some uses of first are ok because the programmer actually
>>> means ‘any’, but anywhere where they actually mean first/last may be
>>> problematic.
>> Such as...?
>> Currently, there is no way to test for ordered-ness, so there is no way
>>> for even a careful programmer to mitigate this problem.  By adding a
>>> protocol which states that something is unordered, we can either branch on
>>> it, or create a separate version of an algorithm for things which conform.
>> It is clearly the case that Swift’s protocol hierarchy fits sets and
>> collections imperfectly; however, it is in the nature of modeling that
>> imperfections are present. The question is not whether it is possible to
>> incur performance, API surface area, and other trade-offs to make the model
>> more faithful, but rather whether this usefully solves any problem. What is
>> the problem being mitigated? As I write above, Swift’s Set and Dictionary
>> types meet the semantic requirements for Collection and moonlight as
>> ordered collections. What is a generic algorithm on an ordered collection
>> that is  “not OK” for Set and Dictionary? (“elementsEqual”, as I’ve said,
>> is not such an example.)
>> On the contrary, `elementsEqual` is exactly such an example, because it
>> makes no sense to use it on a Set.
>> let s1 = Set([1,2,3,4,5,6])
>> let s2 = Set([6,5,4,3,2,1])
>> Both sets have different iteration orders. Comparing those sets with some
>> other collection using `elementsEqual` will give no meaningful result
>> because the order - and therefore the result of `elementsEqual` - is in
>> effect random.
> No, it is not such an example; it’s misleadingly named but works
> correctly—that is, its behavior matches exactly the documented behavior,
> which relies on only the semantic guarantees of Sequence, which Set
> correctly fulfills.
> Fulfills to the letter. Again, what can you do with it if the result is
> random??

The result is not random.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20171017/ba8ce20f/attachment.html>

More information about the swift-evolution mailing list