[swift-evolution] [Draft] Rename Sequence.elementsEqual
Xiaodi Wu
xiaodi.wu at gmail.com
Mon Oct 16 15:07:52 CDT 2017
On Mon, Oct 16, 2017 at 13:15 David Sweeris <davesweeris at mac.com> wrote:
>
> On Oct 16, 2017, at 09:21, Michael Ilseman <milseman at apple.com> wrote:
>
>
>
> On Oct 16, 2017, at 8:46 AM, David Sweeris via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>
> On Oct 16, 2017, at 07:20, Xiaodi Wu via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>
> On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull at gbis.com> wrote:
>
>>
>> On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>
>> On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull at gbis.com> wrote:
>>
>>>
>>> On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>
>>>> That ordering can be arbitrary, but it shouldn’t leak internal
>>>>> representation such that the method used to create identical things affects
>>>>> the outcome of generic methods because of differences in internal
>>>>> representation.
>>>>>
>>>>>>
>>>>>>
>>>>>> It would be better to say that the iteration order is well-defined.
>>>>>> That will almost always mean documented, and usually predictable though
>>>>>> obviously e.g. RNGs and iterating in random order will not be predictable
>>>>>> by design.
>>>>>>
>>>>>>>
>>>>>>> That's actually more semantically constrained than what Swift calls
>>>>>>> a `Collection` (which requires conforming types to be multi-pass and(?)
>>>>>>> finite). By contrast, Swift's `SpongeBob` protocol explicitly permits
>>>>>>> conforming single-pass, infinite, and/or unordered types.
>>>>>>>
>>>>>>>
>>>>>>> I think you’re talking about Sequence here, I’ve lost track of your
>>>>>>> nonsense by now. Yes, the current Swift protocol named Sequence allows
>>>>>>> unordered types. You seem to keep asserting that but not actually
>>>>>>> addressing my argument, which is *that allowing Sequences to be
>>>>>>> unordered with the current API is undesired and actively harmful, and
>>>>>>> should* *therefore** be changed*.
>>>>>>>
>>>>>>
>>>>>> What is harmful about it?
>>>>>>
>>>>>>
>>>>>> After thinking about it, I think the harmful bit is that unordered
>>>>>> sequences are leaking internal representation (In your example, this is
>>>>>> causing people to be surprised when two sets with identical elements are
>>>>>> generating different sequences/orderings based on how they were created).
>>>>>> You are correct when you say that this problem is even true for for-in.
>>>>>>
>>>>>
>>>>> I would not say it is a problem. Rather, by definition, iteration
>>>>> involves retrieving one element after another; if you're allowed to do that
>>>>> with Set, then the elements of a Set are observably ordered in some way.
>>>>> Since it's not an OrderedSet--i.e., order doesn't matter--then the only
>>>>> sensible conclusion is that the order of elements obtained in a for...in
>>>>> loop must be arbitrary. If you think this is harmful, then you must believe
>>>>> that one should be prohibited from iterating over an instance of Set.
>>>>> Otherwise, Set is inescapably a Sequence by the Swift definition of
>>>>> Sequence. All extension methods on Sequence like drop(while:) are really
>>>>> just conveniences for common things that you can do with iterated access;
>>>>> to my mind, they're essentially just alternative ways of spelling various
>>>>> for...in loops.
>>>>>
>>>>>
>>>>> I think an argument could be made that you shouldn’t be able to
>>>>> iterate over a set without first defining an ordering on it (even if that
>>>>> ordering is somewhat arbitrary). Maybe we have something like a
>>>>> “Sequenc(e)able” protocol which defines things which can be turned into a
>>>>> sequence when combined with some sort of ordering. One possible ordering
>>>>> could be the internal representation (At least in that case we are calling
>>>>> it out specifically). If I had to say
>>>>> “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely
>>>>> be less surprised when it returns false even though setA == setB.
>>>>>
>>>>
>>>> Well, that's a totally different direction, then; you're arguing that
>>>> `Set` and `Dictionary` should not conform to `Sequence` altogether. That's
>>>> fine (it's also a direction that some of us explored off-list a while ago),
>>>> but at this point in Swift's evolution, realistically, it's not within the
>>>> realm of possible changes.
>>>>
>>>>
>>>> I am actually suggesting something slightly different. Basically, Set
>>>> and Dictionary’s conformance to Collection would have a different
>>>> implementation. They would conform to another protocol declaring that they
>>>> are unordered. That protocol would fill in part of the conformance to
>>>> sequence/collection using a default ordering, which is mostly arbitrary,
>>>> but guaranteed to produce the same ordering for the same list of elements
>>>> (even across collection types). This would be safer, but a tiny bit slower
>>>> than what we have now (We could also potentially develop a way for
>>>> collections like set to amortize the cost). For those who need to recover
>>>> speed, the new protocol would also define a property which quickly returns
>>>> a sequence/iterator using the internal ordering (I arbitrarily called it
>>>> .arbitraryOrder).
>>>>
>>>> I believe it would not be source breaking.
>>>>
>>>
>>> That is indeed something slightly different.
>>>
>>> In an ideal world--and my initial understanding of what you were
>>> suggesting--Set and Dictionary would each have a member like `collection`,
>>> which would expose the underlying data as a `SetCollection` or
>>> `DictionaryCollection` that in turn would conform to `Collection`;
>>> meanwhile, Set and Dictionary themselves would not offer methods such as
>>> `prefix`, or indexing by subscript, which are not compatible with being
>>> unordered. For those who want a particular ordering, there'd be something
>>> like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) ->
>>> {Set|Dictionary}Collection`.
>>>
>>> What you suggest here instead would be minimally source-breaking.
>>> However, I'm unsure of where these guarantees provide benefit to justify
>>> the performance cost. Certainly not for `first` or `dropFirst(_:)`, which
>>> still yields an arbitrary result which doesn't make sense for something
>>> _unordered_. We *could* have an underscored customization point named
>>> something like `_customOrderingPass` that is only invoked from
>>> `elementsEqual` or other such methods to pre-rearrange the internal
>>> ordering of unordered collections in some deterministic way before
>>> comparison. Is that what you have in mind?
>>>
>>>
>>>
>>> Something like that. Whatever we do, there will be a tradeoff between
>>> speed, correctness, and ergonomics.
>>>
>>> My suggestion trades speed for correctness, and provides a way to
>>> recover speed through additional typing (which is slightly less ergonomic).
>>>
>>
>> You haven't convinced me that this is at all improved in "correctness."
>> It trades one arbitrary iteration order for another on a type that tries to
>> model an unordered collection.
>>
>>
>>> We could do something like you suggest. I don’t think the method would
>>> need to be underscored… the ordering pass could just be a method on the
>>> protocol which defines it as unordered. Then we could provide a special
>>> conformance for things where order really matters based on adherence to
>>> that protocol. That might be an acceptable tradeoff. It would give us
>>> speed at the cost of having the correct implementation being less ergonomic
>>> and more error prone (you have to remember to check that it is unordered
>>> and call the ordering method when it mattered).
>>>
>>> I’d still be a bit worried that people would make incorrect generic
>>> algorithms based on expecting an order from unordered things, but at least
>>> it would be possible for them check and handle it correctly. I think I
>>> could get behind that tradeoff/compromise, given where we are in the swift
>>> process and Swift's obsession with speed (though I still slightly prefer
>>> the safer default). At least the standard library would handle all the
>>> things correctly, and that is what will affect the majority of programmers.
>>>
>>
>> What is an example of such an "incorrect" generic algorithm that would be
>> made correct by such a scheme?
>>
>>
>> To start with, the one you gave as an example at the beginning of this
>> discussion: Two sets with identical elements which have different internal
>> storage and thus give different orderings as sequences. You yourself have
>> argued that the confusion around this is enough of a problem that we need
>> to make a source-breaking change (renaming it) to warn people that the
>> results of the ‘elementsEqual’ algorithm are undefined for sets and
>> dictionaries.
>>
>
> No, I am arguing that the confusion about ‘elementsEqual’ is foremost a
> problem with its name; the result of this operation is not at all undefined
> for two sets but actually clearly defined: it returns true if two sets have
> the same elements in the same iteration order, which is a publicly
> observable behavior of sets (likewise dictionaries).
>
>
> How is the iteration order of an unordered set or dictionary “publicly
> observable”? If either is implemented such that it can asynchronously
> optimize its storage (maybe by rebalancing a tree or merging two
> non-contiguous array segments or something), its iteration order could
> change *without* changing what values it contains. Seems like consecutive
> calls to “elementsEquals” (or whatever we’re calling it) should return the
> same answer, if we don’t add, remove, or mutate elements.
>
>
> Sets are values. If you add, remove, or mutate any elements you have a
> different Set and thus a potentially different ordering of elements.
>
>
> From the “value semantics” PoV, yes. But from the “unordered collection of
> values” PoV, Sets/Dictionaries, being unordered, are semantically free to
> rearrange the in-memory ordering of their elements *without* user
> intervention.
>
No, they are not semantically free to do so. The semantics of Collection
forbid it, because the iteration order must be multi-pass. As long as the
value is unchanged, the iteration order is unchanged. That is a documented,
public guarantee of the API.
Maybe we need to add a mechanism for expressing the idea of a “value type
> which has referential tendencies“ for these “managed” types of values?
>
> Perhaps a less abstract example would be a “RemoteDictionary”, which maps
> each key to a remote server where the value is actually stored... I would
> expect it to iterate over its values in whatever order it gets them back
> from all the servers. And since dictionaries are unordered and network
> conditions & server loads can change (quickly), I wouldn’t expect
> consecutive iterations to necessarily be in the same order.
>
> - Dave Sweeris
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20171016/3eaa0a02/attachment.html>
More information about the swift-evolution
mailing list