[swift-evolution] [Pitch] Remove destructive consumption from Sequence

Xiaodi Wu xiaodi.wu at gmail.com
Fri Jul 1 12:08:14 CDT 2016


On Fri, Jul 1, 2016 at 11:51 AM, Dave Abrahams via swift-evolution <
swift-evolution at swift.org> wrote:

>
> on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>
> >> On Jun 30, 2016, at 12:26 PM, Dave Abrahams <dabrahams at apple.com>
> > wrote:
> >>
> >>
> >> on Wed Jun 29 2016, Haravikk <swift-evolution-AT-haravikk.me> wrote:
> >>
> >
> >>>> On 29 Jun 2016, at 00:10, Matthew Johnson via swift-evolution <
> swift-evolution at swift.org> wrote:
> >>>>
> >>>> Swift is a language that embraces value semantics.  Many common
> >>>> iterators *can* be implemented with value semantics.  Just because we
> >>>> can’t implement *all* iterators with value semantics doesn’t mean we
> >>>> should require them to have reference semantics.  It just means you
> >>>> can’t *assume* value semantics when working with iterators in generic
> >>>> code unless / until we have a way to specify a value semantics
> >>>> constraint.  That’s not necessarily a bad thing especially when it
> >>>> leaves the door open to interesting future possibilities.
> >>>>
> >>>> -Matthew
> >>>
> >>> I'm kind of undecided about this personally. I think one of the
> >>> problems with Swift is that the only indication that you have a
> >>> reference type is that you can declare it as a constant, yet still
> >>> call mutating methods upon it, this isn't a very positive way of
> >>> identifying it however. This may be more of a GUI/IDE issue though, in
> >>> that something being a class isn't always that obvious at a glance.
> >>>
> >>> I wonder, could we somehow force iterators stored in variables to be
> >>> passed via inout? This would make it pretty clear that you're using
> >>> the same iterator and not a copy in all cases, encouraging you to
> >>> obtain another if you really do need to perform multiple passes.
> >>
> >> I'm going to push single-pass iteration on the stack briefly and talk
> >> about the topic that's been under discussion here: infinite multipass
> >> sequences.
> >>
> >> ## Fitting “Infinite Multipass” Into the Model
> >>
> >> It remains to be decided whether it's worth doing, but if it's to
> >> happen
> >
> > I definitely think it’s worth doing.
>
> Opinions are nice, but rationales are better.  How will we understand
> *why* it's worth doing?
>
> > I really appreciate the attention that the library team has given to
> > this.
> >
> >> , the standard library team thinks the right design is roughly
> >> this:
> >>
> >>  /// A multipass sequence that may be infinite
> >>  protocol Collection {
> >>
> >>    // Only eager algorithms that can terminate available here
> >>    func index(where predicate: (Element)->Bool) -> Index
> >>
> >>    // all lazy algorithms available here
> >>    var lazy: ...
> >>
> >>    var startIndex: Index
> >>    var endIndex: Index // possibly not reachable from startIndex
> >>
> >>    associatedtype SubSequence : Collection
> >>    // do we need an associated FiniteSubsequence, e.g. for prefixes?
> >>  }
> >>
> >>  protocol FiniteCollection : Collection {
> >>
> >>    // All eager algorithms available here
> >>    func map(...) ->
> >>    var count: ...
> >>  }
> >>
> >>  protocol BidirectionalCollection : Collection { ... }
> >>
> >>  protocol RandomAccessCollection : BidirectionalCollection { … }
> >
> > Does this design entirely break with the relationship between
> > collections and iterators (dropping `makeIterator` as a protocol
> > requirement)?  If so, would for..in (over collections) be built on top
> > of indices and use `formIndex(after:)`?  Would it require a finite
> > collection (unless we add `until` to the language and then allow
> > `for..in..until` to work with infinite collections)?
>
> All of these points are up for discussion.  John McCall pointed out to
> me that an index-based for..in would make it possible to implement
>
>   for inout x in y { mutate(&x) }
>
> > Would we still retain `IndexingIterator`even if we break the
> > relationship in the protocol requirements?
>
> Yes: it should be possible to implement Collection algorithms in terms
> of Iterator algorithms, and IndexingIterator provides the means.  That
> said, I think the makeIterator requirement does little harm, especially
> when it can be defaulted for Collections.
>
> > Would it still be possible to do things like zip a multi-pass sequence
> > with a single-pass sequence (assuming we keep single-pass sequences or
> > add them back eventually)?  This seems like a use case worth
> > supporting in some way.
>
> Yes.  If you can create an Iterator from a Collection, and you can zip
> Iterators, you can do this.
>
> > One subtle change I think this implies is that things like
> > `LazyFilterSequence` can implement `makeIterator` with constant
> > complexity, deferring the O(N) complexity to the first call to `next`.
>
> I don't believe that's a difference, though I could be wrong.
>
> > `startIndex` for `LazyFilterCollection` currently has O(N) complexity.
> > The complexity of a complete iteration doesn’t change and probably
> > isn’t a big deal, but it’s worth noting.
>
> Filtered collection views always require a bit of hand-waving around
> performance guarantees; I don't think that changes.
>
> > I’ve been looking at some code that wraps a sequence and considering
> > how it would be impacted.  With iterators it looks like this:
> >
> > guard let element = base.next()
> >   else { return nil }
> >
> > With collections and indices it would look something like this:
> >
> > base.formIndex(after: &index)
> > guard index != baseEndIndex
> >    else { return endIndex }
> >
> > let element = base[index]
> >
> > That’s not too bad but it is more verbose.
>
> Sequence today is a single-pass thing.  If you are wrapping Sequence
> today presumably you'd wrap an Iterator tomorrow, and you wouldn't have
> to deal with indices.
>
> > If we’re going to push people towards collections and indices we
> > should try to make common patterns like “update the iteration state
> > and return the next element if there is one" simpler.
>
> That's IndexingIterator.
>
> > This could be accomplished with an extension method along these lines:
> >
> > guard let element = base.formIndex(after: &index,
> > .returningOptionalElement)
> >     else { return endIndex }
> >
> > With an implementation something like:
> >
> > enum FormIndexResult {
> >     .returningOptionalElement
> > }
> > extension Collection {
> >     func formIndex(after i: inout Self.Index, _ result:
> > FormIndexResult) -> Self.Element?
> > }
> >
> > This would provide similar functionality to `IndexingIterator` without
> > coupling the storage of `elements` and `position` (which is important
> > if you’re wrapping a collection and need to wrap the collection and
> > its indices independently).
>
> I'm afraid I don't understand.  Could you be more explicit about what
> you have in mind?
>
> >> Q: Why should there be indices on an infinite multipass sequence?
> >> A: Because the operations on indices apply equally well whether the
> >>   sequence is finite or not.  Find the index of a value in the
> >>   sequence, slice the sequence, find again, etc.
> >>
> >> Q: Why is there an endIndex on an infinite seque?
> >> A: So you can write algorithms such as index(where:) once.
> >>
> >> Q: Why not allow endIndex to have a different type from startIndex?
> >>
> >> A: It appears to offer insufficient benefit for the associated
> >>    complexity in typical usage.  A classic use case that argues for a
> >>    different endIndex type is the null-terminated C string.  But you
> >>    can't index one of those safely without actually counting the
> >>    length,
> >>    and once you've done that you can make the endIndex an Int.
> >
> > It’s also worth nothing that we can use `Optional` with `nil` as the
> > `endIndex` sentinel if necessary.
>
> True, that's a useful technique when there's no underlying storage in
> the collection (e.g. a fibonacci sequence)
>
> >>
> >> ## Single Pass Iteration
> >>
> >> The refinement relationship between Sequence and Collection is
> >> problematic, because it means either:
> >>
> >> a) algorithms such as map on single-pass sequences claim to be
> >>   nonmutating even though it's a lie (status quo)
> >>
> >> b) those algorithms can't be used on immutable (“let bound”)
> >>   multipass sequences. IMO that would be totally unacceptable.
> >>
> >> If we drop the refinement, we can have a saner world.  We also don't
> >> need to separate Sequence and Iterator anymore.  We can simply drop
> >> Sequence altogether, and the protocol for single-pass iteration
> >> becomes Iterator.
> >
> > Makes sense to me.
> >
> >>
> >> ### Mutation and Reference Semantics
> >>
> >> Everything in Swift is copiable via `let copy = thing` (let's please
> >> not argue over the definition of copy for classes; this is the one
> >> built into the lowest level of the language—I refer to the other one,
> >> that requires allocation, as “clone”).
> >>
> >> Anything you do with a sequence that's truly single-pass mutates the
> >> sequence *and of its copies*.  Therefore, such a type *fundamentally*
> >> has reference semantics. One day we may be able to model single-pass
> >> sequences with “move-only” value types, which cannot be copied. You
> >> can find move-only types in languages like Rust and C++, but they are
> >> not supported by Swift today.  So it seems reasonable that all
> >> Iterators in Swift today should be modeled as classes.
> >
> > I think this makes a lot of sense in the model you are proposing.  All
> > multipass structures are collections.  Any sequence that can only
> > support a single pass is modeled as an iterator which inherently has
> > identity.  Making this distinction strong will prevent any confusion.
> >
> >> The fact that Swift doesn't have a mutation model for classes,
> >> though, means that mutating methods on a class constrained protocol
> >> can't be labeled as such.  So consuming operations on a
> >> class-constrained Iterator protocol would not be labeled as mutating.
> >>
> >> The standard library team is currently trying to evaluate the
> >> tradeoffs in this area.  One possibility under consideration is
> >> simply dropping support for single-pass sequences until Swift can
> >> support move-only value types and/or gets a mutation model for class
> >> instances.  It would be very interesting to know about any real-world
> >> models of single-pass sequences that people are using in Swift, since
> >> we don't supply any in the standard library.
> >
> > I’m happy to see you mention a mutation model for class instances!  (I
> > don’t mean to sidetrack the discussion, but would love to see that
> > someday)
> >
> > I don’t have any objection to dropping support for single-pass
> > sequences temporarily.  It’s possible that I would feel differently if
> > I was making use of them in my own code but I’m not.
>
> On second thought, I believe it is important to have a way to support
> existing “partially formed” multipass sequences that don't expose
> copying or equality for their iteration states.  Iterator is the right
> way to do that.  So I think we need to keep Iterator around.
>
> > In the meantime, people would be able to implement their own protocol
> > for single pass sequences.  What they would lose is for..in as well as
> > the standard library algorithms.  I’m not sure how many people this
> > would impact or how big the impact would be for them.  We have seen a
> > couple of examples in this discussion, but probably not enough to
> > asses the overall impact.
> >
> > One thing you don’t mention here is a distinction between finite and
> > infinite single-pass sequences (iterators).  I don’t know if the
> > finite / infinite distinction is as important here, but wanted to
> > point it out.  Obviously if we remove support single-pass sequences
> > now we could defer that discussion until we’re ready to bring back
> > support for them.
>
> There are a few possible answers I can think of:
>
> 1. Do the “obvious” thing and create a separate protocol for finite
>    single-pass sequences
>
> 2. Decide that the combination of infinite and single-pass is rare
>    enough (/dev/urandom, temperature sensor) that it's better to just
>    ask people handling them to be careful and not, e.g., try to “count”
>    them.
>

Not really feeling sufficiently in my element (excuse the pun) to comment
on most of this discussion, but here I thought I'd chime in. What's
interesting about your two examples (/dev/urandom, temperature sensor) is
that, though single-pass, they should be insensitive to destructive
consumption, no? By which I mean, if some function returns 5 elements from
the "sequence", in both scenarios it would be undetectable whether it
consumes 5, 10 or 15 elements in the process, IIUC. Are there other
examples of infinite, single-pass sequences where destructive consumption
would make a difference?


>
> 3. Decide that everything on a single-pass sequence is lazy. Since you
>    can only take a single pass anyway, people won't observe their
>    closures being called more often than necessary, which was the main
>    motivator for making map, filter, et. al eager on collections without
>    an explicit .lazy.
>
> Implications of #3:
>
> * Any “partially-formed” multipass sequences (modeling only Iterator)
>   would be free to expose an accurate underestimatedCount, thereby
>   optimizing the process of copying into an array. The lazy filter
>   Iterator adaptor would have an underestimatedCount of 0.
>
> * All algorithms that require multiple passes, such as sorted(), would
>   be unavailable on Iterator.  You'd have to construct an Array (or
>   other MutableCollection) and sort that in-place.  Of course,
>   constructing an Array from an Iterator could still go on forever if
>   the Iterator turned out to be infinite, which means, at some level #3
>   is just a refinement of #2 that makes it less error-prone.
>
> --
> Dave
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160701/8ace6b4d/attachment.html>


More information about the swift-evolution mailing list