[swift-evolution] [Draft] Rename Sequence.elementsEqual

Xiaodi Wu xiaodi.wu at gmail.com
Sun Oct 15 14:22:52 CDT 2017


On Sun, Oct 15, 2017 at 14:14 Thorsten Seitz <tseitz42 at icloud.com> wrote:

> Am 15.10.2017 um 10:38 schrieb Xiaodi Wu via swift-evolution <
> swift-evolution at swift.org>:
>
> On Sun, Oct 15, 2017 at 2:29 AM, Kevin Nattinger <swift at nattinger.net>
> wrote:
>
>>
>> On Oct 14, 2017, at 7:54 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>
>> On Sat, Oct 14, 2017 at 6:17 PM, Kevin Nattinger <swift at nattinger.net>
>> wrote:
>>
>>> […]
>>>> * A Swift `Sequence` is, to put it simplistically, a thing that can be
>>>> iterated over in a `for...in` loop. If it would make you happy, for the
>>>> rest of the discussion, let's suppose we called the protocol `ForLoopable`
>>>> instead of `Sequence`.
>>>>
>>>>
>>>> ForLoopable is so ugly. Since we’re just iterating over the elements,
>>>> how about, oh, say, `Iterable`? Hey, that looks familiar.
>>>>
>>>
>>> I'm not trying to bikeshed the name of `Sequence`. I'm picking an
>>> intentionally unwieldy name for the purposes of discussing the semantics of
>>> this particular protocol. The point is that the underlying issue has
>>> nothing to do with the name; it can be `Iterable` or it can be `SpongeBob`
>>> for all I care.
>>>
>>>
>>> I’m not trying to bikeshed the name either, The underlying issue is that
>>> (what is currently) Sequence actually encompasses two separate
>>> functionalities, and those functionalities need to be separated with their
>>> separate semantic requirements documented. “Sequence: Iterable,”
>>> “OrderedSequence: Sequence,” “SpongeBob: ForLoopable,” the names are 100%
>>> irrelevant at this point; what’s important is that one is not necessarily
>>> ordered and the other guarantees an order.
>>>
>>>
>>
>> What are the "two separate functionalities”?
>>
>>
>> Iteration, with convenience methods that don’t imply or rely on an order
>> that may not be there; and convenience methods applicable to sequences that
>> do have an intrinsic order.
>>
>
> Sets, as a mathematical concept, have no intrinsic order. However,
> instances of `Set`, which can be iterated over, *do* have at least one
> order which can be said to be intrinsic in the following sense: as long as
> iteration is possible, no API design can prevent that order from being
> observed and associated with the instance. Put another way, if you can use
> an instance of a type in a for...in loop, intrinsic to that functionality
> is a publicly visible order.
>
>
> I disagree. Sets are value types, therefore two instances of `Set` are
> equal if they contain the same elements. An intrinsic order should
> therefore only depend on the elements contained and should be the same for
> two instances of `Set` which are equal.
> This is not the case, though, as you can easily check in a playground by
> looking at Set([1,2,3,4,5,6]) and Set([6,5,4,3,2,1]) which represent the
> same value and are equal but do *not* have the same order.
>
>
>
>
>> All the extension methods on Sequence are ways of spelling things that
>> you can write in a few lines of code using a `for...in` loop; they're in
>> the stdlib to allow a more functional style which some prefer. If you
>> accept that a type should support iteration with a `for...in` loop, then
>> what is your basis for claiming that these extension methods are "separate
>> functionalities”?
>>
>>
>> Just because you *can* express something in code doesn’t mean you should,
>> or that it’s correct. It is objectively false to say a Set has a first or
>> last object, because the objects therein have no order. You can take a
>> random object from the set and call it “first”, but that doesn’t make that
>> a correct definition of Set.first. A Set has no order, a specific iteration
>> has an “order” only in the sense that all and only the objects in the set
>> have to come out one at a time, but that doesn’t mean the Set itself has an
>> order, specifically a first or last object.
>>
>
> Since Set conforms to Collection, it is guaranteed that if one element of
> an instance of Set comes out first one time, it'll come out first every
> time from that instance. If it helps, think of Swift's Set as modeling
> (imperfectly, as all models must) both a mathematical set and a multi-pass
> sequence, just as Swift's Int models both an integer and a sequence of bits.
>
> You’re a fan of the principal of least surprise. Tell me, which would be
>> less surprising: Set.dropFirst() actually drops a random element, or Set
>> doesn’t have a dropFirst()? And if you think dropFirst() removing an
>> element at random is not surprising, please explain why.
>>
>
> I think Set.dropFirst removing the first element that I observe on
> iteration is the least surprising answer, because Swift tells me that the
> stdlib Set models a set but that it is also a sequence.
>
>
> The latter is exactly the problem Kevin did point out. A Set is an
> Iterable (in the sense that I can iterate over its elements with the order
> being a meaningless random side effect) but it is *not* a Sequence (in the
> sense that the order conveys any meaning).
>

Swift's Sequence protocol does not require the order of iteration to
"convey any meaning"; it doesn't even require it to be deterministic.


> -Thorsten
>
>
> […]
>>>>
>>>> * If a type `T` conforms to `ForLoopable` and an instance `t` of that
>>>> type has at least one element, then *something* has to be the first element
>>>> in a `for element in t { ... }` loop. Put another way, every instance of a
>>>> type that conforms to `ForLoopable` must have at least one publicly
>>>> observable order (although, intriguingly, I'm not sure it has to be a
>>>> repeatable one). It is possible, therefore, to have a semantic answer to
>>>> the question of which element is `first` or (if finite) `last`; one can
>>>> also `drop(while:)`, etc., and perform lexicographical comparisons.
>>>>
>>>>
>>>> As a side effect of Swift being a procedural language each iteration
>>>> happens to occur in some order, yes, but that order is meaningless and
>>>> reflects nothing about the Set itself.  In fact, I’d say that *`first`,
>>>> `last`, etc. are not even defined on the original Set per se, only on the
>>>> specific order that a particular iteration resulted in*. And that
>>>> order is not necessarily predictable, nor necessarily stable, as you
>>>> yourself said.
>>>>
>>>> Consider an Iterable that gives a different order every time it’s
>>>> iterated.
>>>> Should calling `.first` or `last` give a different object every time?
>>>> That’s absurd.
>>>> Should an object lexicographically compare not equal to itself? Even
>>>> more absurd.
>>>>
>>>
>>> What's your basis for saying that such behavior is absurd? It is
>>> explicitly permitted for instances of types conforming to `SpongeBob` to be
>>> single-pass and/or infinite. For a single-pass `SpongeBob`, `first` will
>>> certainly return a different value every time it is invoked.
>>>
>>>
>>> Is `first` mutating? No. Should it be? No! `first` and `last` are a peek
>>> at the state of the object.
>>>
>>
>> You're right, `first` should not be mutating; that's actually an
>> important design consideration, as Ole pointed out, and it's not actually
>> available on `Sequence` for that reason. However, `first { _ in true }` is
>> available and is potentially mutating, as are all methods on Sequence by
>> design.
>>
>> Is `elementsEqual` (or *shudder* lexicographicallyEqual) reflexive? IMO
>>> it clearly should be. Especially with the “lexicographically” part—from
>>> everything I can find, a lexicographical ordering is by definition
>>> reflexive. Do you have a citation for the idea that lexicographical
>>> equality can legitimately be non-reflexive?
>>>
>>
>> Clearly (tautologically?), such a function should be reflexive for any
>> argument ordered with respect to itself. However, if there is no
>> lexicographical comparison possible, then a thing cannot compare
>> lexicographically equal to anything, even itself.
>>
>>
>> And that’s PRECISELY why lexicographicallyEqual does not make sense to
>> apply to unordered sets. There is no lexicographical comparison possible,
>> so why do you keep insisting they should have a method that falsely claims
>> to lexicographically compare them?
>>
>
> I agree! It doesn't make sense if no comparison is possible! But Swift
> tells me that a `Set` is a `Sequence`!
>
>>
>> A random number generator fulfills all the semantic requirements of
>> conforming to `SpongeBob`, and in fact I do just that in NumericAnnex
>> <https://github.com/xwu/NumericAnnex/blob/master/Sources/PRNG.swift#L53>.
>> `first` gives a different value every time, and a randomly generated
>> `SpongeBob` would unsurprisingly compare lexicographically not equal to
>> itself.
>>
>>
>> > IMO that’s a bug in the implementation; lexicographical equality is
>> reflexive, period.
>>
>> > Presumably the `elementsEqual` method contains something along these
>> lines (take from SequenceAlgorithms.swift.gyb):
>>
>>     var iter1 = self.makeIterator()
>>     var iter2 = other.makeIterator()
>>
>> > By creating two iterators, you’re mutating while iterating. Turns out
>> there’s even a warning against this in Sequence.swift:
>>
>> /// Using Multiple Iterators
>> /// ========================
>> ///
>> /// Whenever you use multiple iterators (or `for`-`in` loops) over a
>> single
>> /// sequence, be sure you know that the specific sequence supports
>> repeated
>> /// iteration, either because you know its concrete type or because the
>> /// sequence is also constrained to the `Collection` protocol.
>> ///
>> /// Obtain each separate iterator from separate calls to the sequence's
>> /// `makeIterator()` method rather than by copying. Copying an iterator is
>> /// safe, but advancing one copy of an iterator by calling its `next()`
>> method
>> /// may invalidate other copies of that iterator. `for`-`in` loops are
>> safe in
>> /// this regard.
>>
>> *> The default implementation of elementsEqual is therefore unsafe* because
>> it has the potential for using an invalidated iterator.
>>
>> You are misunderstanding the warning in the second paragraph here. The
>> implementation (not a default implementation, unless I'm mistaken, as it
>> cannot be overridden)
>>
>> makes each iterator using separate calls to `makeIterator()`, just as the
>> documentation tells you to do. Calling next() on one iterator does not
>> invalidate the other iterator, because the second is not a copy of the
>> first.
>>
>>
>> Indeed, I misread that comment.
>> That said, is there a well-defined behavior when iterating a one-shot
>> sequence with two iterators?
>> (It *is* a default implementation, btw)
>>
>
> Not sure about that; I don't see a protocol requirement, in which case it
> can only be shadowed in a concrete type but it can't be overridden. How to
> accommodate single-pass sequences is an interesting question. Off the top
> of my head, an iterator would have to be or wrap a reference type.
>
>>
>> You are, however, right that calling `rng.elementsEqual(rng)` is not
>> advised. It isn't unsafe; it's just not useful. That said, calling
>> `array.elementsEqual(array)` is equally safe and equally useless, and the
>> uselessness of such a reflexive comparison is neither here nor there.
>>
>>
>> Funny how you complain about my code being useless and yet you insist
>> below, "If it's not providing you with utility, then don't use it.”
>> Regardless, you’re wrong to dismiss this case. `foo.elementsEqual(foo)`
>> on its own makes little sense, sure, but you could easily find yourself in
>> a method comparing two iterators you obtained from elsewhere, and
>> occasionally they happen to be the identical object. Allowing an iterator
>> to return `false` for .elementsEqual on itself is unexpected and dangerous.
>>
>
> You will always have to account for this possibility, because Swift's
> `Equatable` explicitly allows "special values" to be not equal to
> themselves. This is, at least in part, in order to accommodate the IEEE
> decree that NaN != NaN:
>
> ```
> let x = [Double.nan]
> x.elementsEqual(x) // false
> ```
>
> Changing this behavior is way beyond the scope of this thread (and has
> been the topic of hours (actually, weeks and weeks) of fun on this list
> previously).
>
> On the other hand, if I have a collection of objects that I want iterated
>>> in a particular order, I can use a container that iterates in a specific,
>>> known, well-defined way, and use that to construct the sequence of
>>> objects.  That’s clearly an Iterable collection, but the guarantee is
>>> stronger than that. Since it iterates objects in a specific sequence, the
>>> logical way to express that would be `Sequence: Iterable`. Again, we’ve
>>> seen that before.
>>>
>>> Now, since a Sequence is guaranteed to iterate the same every time,
>>> suddenly our `first`, `last`, `drop*`, etc. methods have a meaning inherent
>>> to the collection itself, rather than a specific iteration.
>>>
>>
>> What you call a "Sequence" here would have to be multi-pass, finite, and
>> ordered.
>>
>>
>> > Ordered, yes, but it’s only admittedly poor wording that suggests
>> multi-pass, and I don’t think anything there suggests finite.
>>
>> If a Sequence is "guaranteed to iterate the same every time," then surely
>> it must be multi-pass; what's the alternative?
>>
>>
>> Not sure if you just missed the very next sentence or are actively
>> ignoring it just to be argumentative. I already acknowledged that that
>> phrase didn’t convey the meaning I intended, and a Sequence is not and
>> should not be required to be multi-pass.
>>
>
> I entirely misunderstood your next sentence as asserting that being
> multi-pass makes the iteration order well-defined.
>
>>
>>
>>> It would be better to say that the iteration order is well-defined. That
>>> will almost always mean documented, and usually predictable though
>>> obviously e.g. RNGs and iterating in random order will not be predictable
>>> by design.
>>>
>> Wouldn't it then suffice to document, say, that a set's iteration order
> is the insertion order?
>
>>
>>> That's actually more semantically constrained than what Swift calls a
>>> `Collection` (which requires conforming types to be multi-pass and(?)
>>> finite). By contrast, Swift's `SpongeBob` protocol explicitly permits
>>> conforming single-pass, infinite, and/or unordered types.
>>>
>>>
>>> I think you’re talking about Sequence here, I’ve lost track of your
>>> nonsense by now. Yes, the current Swift protocol named Sequence allows
>>> unordered types. You seem to keep asserting that but not actually
>>> addressing my argument, which is *that allowing Sequences to be
>>> unordered with the current API is undesired and actively harmful, and
>>> should* *therefore** be changed*.
>>>
>>
>> What is harmful about it?
>>
>>
>> Well, for one, the issue you raised this thread about—two sets that are
>> `==` could return either true or false for `elementsEqual`, depending on
>> how they arrived at their current state. That’s not acceptable, and the
>> problem isn’t with the name of the method.
>>
>
> Apple documentation calls this one of the "order-dependent" methods. It is
> surely acceptable for a type that conforms to an order-dependent protocol
> to have methods that are order-dependent; they do, however, have to be
> clearly order-dependent to avoid confusion on unordered types.
>
>
>> Then there are all the methods that imply a specific order of iteration.
>> If the “sequence” is unordered, who knows what you’ll get? It is incredibly
>> easy for an engineer to write a method that implicitly relies on a passed
>> sequence being intrinsically ordered and another engineer to pass it an
>> unordered “sequence.”  The first engineer could neglect to document the
>> dependency, or even not notice it; or the second engineer could just fail
>> to read the documentation thoroughly enough.  There is currently no way for
>> the compiler to enforce passing only an object that is (or at least claims
>> to be) intrinsically ordered.
>>
>
> It is also incredibly easy for such an engineer to use `for...in` instead
> to accomplish the same task, generic over ordered and unordered sequences
> whatever you name such distinguished protocols. I think your beef really
> still boils down to Set being compatible with `for...in` at all, as Jon
> acknowledges.
>
>>
>>
>>> As long as it is possible to iterate over a `SpongeBob`, it is
>>> meaningful to ask what element is first obtained upon iteration or to drop
>>> the first element obtained upon iteration.
>>>
>>> And as long as iteration is not required to be repeatable (and it
>>> isn't), it is perfectly acceptable for these algorithms to return a
>>> different result every time.
>>>
>>>
>>> It is “meaningful” in the sense that it can technically be programmed.
>>> The actual results are meaningless beyond returning or dropping a random*
>>> element.
>>>
>>> *: Don’t nitpick the word “random”, you know exactly what I mean. It’s
>>> just shorter and no less clear than “technically more-or-less deterministic
>>> but not documented, not generally predictable, and probably but not
>>> necessarily consistent from one call to the next."
>>>
>>>
>> I fail to see the issue here. If it's not providing you with utility,
>> then don't use it.
>>
>>
>> I have no problem with functions I don’t use provided they are
>> well-defined and reasonably accurately named. Functions requiring an order
>> on unordered collections don’t pass that bar.
>>
>
> As I said, you're welcome to tackle the protocol hierarchy, but I really
> doubt it's within the realm of realistic endpoints for Swift 5. I'm just
> trying to propose a narrowly targeted pragmatic solution to one specific
> limited harm that might be deliverable by the next point release. As a
> great man once said, Swift is a pragmatic language.
>
>> Since Collections do guarantee multi-pass iteration, Brent's example of
>> `set.dropFirst().reduce(set.first!, ...)` provides just one instance where
>> an unordered Collection can profitably make use of `first`. Permitting
>> generic algorithms that can operate on either arrays or sets, for example,
>> is the desired effect of having such a protocol; a generic algorithm that
>> takes a Collection can ask for the first element, and in the case of an
>> unordered Collection, an arbitrary element will do just fine.
>>
>>
>> The generic algorithms should be on a protocol that specifies everything
>> they require. If one can work on anything you can iterate over, put it on
>> Iterable. If another requires the objects to be ordered, put it on
>> Sequence. Need to express that an algorithm requires a multi-pass sequence?
>> Make a MultiPassSequence protocol and put the algorithm on an extension
>> containing that requirement. Use protocols to express requirements, as they
>> were designed for. Don’t just tack a bunch of methods onto a protocol that
>> isn’t sufficient to describe their requirements and say, “oh, by the way,
>> only use this method if your implementation meets these conditions…"
>>
>
> The benefits are likely outweighed by the costs of such an approach taken
> to completion, because there are many axes to differentiate. The protocol
> hierarchy for collections is already daunting, leading to monstrosities
> such as `MutableRangeReplaceableRandomAccessSlice`. It stretches the bounds
> of sensibility to have a
> `LazyUnorderedInfiniteMultiPassMutableRangeReplaceableRandomAccessSlice`.
>
> The Swift stdlib deliberately eschews modeling everything in protocol
> hierarchies with the highest level of granularity. There's some fudging,
> deliberately, to find a happy medium between obtuse and approachable,
> between too many/too specialized and not enough. For example, I pushed for
> protocols such as `Field` and `Ring` at the top of the numeric hierarchy,
> which might allow complex number types to slot into the hierarchy more
> sensibly, for example. But we have a compromise protocol `Numeric` which
> doesn't quite have the same guarantees but is much more approachable.
> Notice that we also don't break down numeric protocols into `Addable`,
> `Subtractable`, etc.; we also have that fudge factor built into
> `Equatable`, as I mentioned.
>
>>
>>> `first` is the first object in the Sequence. It doesn’t matter how the
>>>> sequence came to be in that order; it doesn’t matter whether or not the
>>>> sequence has already been iterated or how many times. `first` is the first
>>>> object that is, was, and always will be presented by the Sequence’s
>>>> Iterator. (Until the collection is mutated, obviously).
>>>>
>>>> *To summarize,*
>>>> A Set has no intrinsic order. You can iterate over it, and a specific
>>>> iteration of a set has an order, but that order is not tied to the Set
>>>> itself beyond including all and only the items therein. Therefore, the Set
>>>> itself has no intrinsic `first`, `last`, lexicographical comparison, etc.;
>>>> only its iterations do, and they are not themselves Sets.
>>>> A Sequence does have an intrinsic order. The order of iteration
>>>> reflects the order inherent to the Sequence. Therefore, a Sequence has a
>>>> `first`, `last`, lexicographical comparison, etc.
>>>>
>>>> Just in case it’s not obvious, `Set` here is pretty much
>>>> interchangeable with any other unordered iterable.
>>>>
>>>
>>> What you want to call a "Sequence" is what Swift calls a `Collection`,
>>> with additional restrictions. What you want to be called "Iterable" is what
>>> Swift calls `Sequence` (or now, `SpongeBob`). Clearly, shuffling names will
>>> not make these protocols support any more functionality, so that can be put
>>> aside.
>>>
>>>
>>> No, no, no! What I want to call “Iterable” is specified below. It is
>>> about HALF of what’s currently in Sequence—the half that has to do with
>>> iterating, whence the name.
>>> What I want to call Sequence is precisely what Swift now calls
>>> Sequence—the methods that are in Iterable by virtue of adopting Iterable,
>>> PLUS some methods that only make sense on iterable groups of objects where
>>> the iteration order is well-defined.
>>>
>>>
>>> Now, with that out of the way, why do you think that only `Collection`
>>> types should have `first` and `last`? These helper properties and methods
>>> are simply convenient ways to spell things that can be done with
>>> `for...in`--the whole point of supplying them is to allow people to work
>>> with these types in a more functional style.
>>>
>>>
>>> Apparently “collection" was a bad choice of word. What I clearly meant
>>> was not the current Swift Collection protocol, but rather an unordered
>>> assemblage of objects. UnorderedCollection, perhaps, or if that’s still
>>> going to cause issues, try UnorderedAssemblage.  What `first` and `last`
>>> really mean in an UnorderedAssemblage is give me some object from the
>>> assembled objects, I don’t care which one. For which it’s much more clear
>>> to have an `anyObject()` as on NSSet; as another user has pointed out,
>>> `assemblage.makeIterator().next()` works just as well. (I just checked, and
>>> that’s actually how `first` is implemented. But it’s on Collection, which
>>> is guaranteed to be multipass,)
>>>
>>
>> Again, the point of having a protocol-based design is to allow useful
>> _generic_ algorithms to be written; that `first` and `last` would be
>> equivalent to an arbitrary element in the case that a collection is
>> unordered is not at all an argument against these types conforming to
>> `Collection`; if anything, it's an argument for it.
>>
>>
>> If a protocol demands the first object, you should give it the first
>> object. If you don’t have a first object, maybe you shouldn’t conform to
>> the protocol. If the protocol really just needs any old object, call it
>> `anyObject`.
>>
>
> Sure, but we *do* have a first element; it just happens to be the first
> that is obtainable on iteration. That you could make a good case for any
> other element to be first doesn't mean that this one isn't a perfectly
> cromulent first.
>
> Just as `Sequence.underestimatedCount` is equivalent to `Collection.count`
>> for types that conform to `Collection`, or the instance
>> `BinaryInteger.bitWidth` is equivalent to a static `bitWidth` for types
>> that conform to `FixedWidthInteger`.
>>
>>
>> I don’t see how those are relevant, they all mean what they claim to,
>> unlike Set.first/dropFirst/etc.
>>
>> public protocol Iterable {
>>>>>>>   associatedtype Iterator: IteratorProtocol
>>>>>>>   func map<T>(...) -> [T] // Iterable where .Iterator.Element == T
>>>>>>>   func filter(...) -> [Iterator.Element] // Iterable where
>>>>>>> .Iterator.Element == Self.Iterator.Element
>>>>>>>   func forEach(...)
>>>>>>>   func makeIterator() -> Iterator
>>>>>>>   var underestimatedCount: Int { get }
>>>>>>> }
>>>>>>>
>>>>>>> public protocol Sequence: Iterable { // Maybe OrderedSequence just
>>>>>>> to make the well-defined-order requirement explicit
>>>>>>>   associatedtype SubSequence
>>>>>>>   func dropFirst(...)   -> SubSequence   // Sequence where
>>>>>>> .Iterator.Element == Self.Iterator.Element
>>>>>>>   func dropLast(...)    -> SubSequence   //    " "
>>>>>>>   func drop(while...)   -> SubSequence   //    " "
>>>>>>>   func prefix(...)      -> SubSequence   //    " "
>>>>>>>   func prefix(while...) -> SubSequence   //    " "
>>>>>>>   func suffix(...)      -> SubSequence   //    " "
>>>>>>>   func split(...where...)  -> [SubSequence] // Iterable where
>>>>>>> .Iterator.Element == (Sequence where .Iterator.Element ==
>>>>>>> Self.Iterator.Element)
>>>>>>> }
>>>>>>>
>>>>>>
>>>> And just to be explicit,
>>>> struct Set: Iterable {…}
>>>> struct Dictionary: Iterable {…}
>>>> struct Array: Sequence {…}
>>>> etc.
>>>>
>>>> Hopefully at some point:
>>>> struct OrderedSet: Sequence {…}
>>>>
>>>
>>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20171015/ce60f39e/attachment.html>


More information about the swift-evolution mailing list