<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 1, 2016 at 11:51 AM, Dave Abrahams via swift-evolution <span dir="ltr"><<a href="mailto:swift-evolution@swift.org" target="_blank">swift-evolution@swift.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>
on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:<br>
<br>
>> On Jun 30, 2016, at 12:26 PM, Dave Abrahams <<a href="mailto:dabrahams@apple.com">dabrahams@apple.com</a>><br>
> wrote:<br>
>><br>
>><br>
>> on Wed Jun 29 2016, Haravikk <swift-evolution-AT-haravikk.me> wrote:<br>
>><br>
><br>
>>>> On 29 Jun 2016, at 00:10, Matthew Johnson via swift-evolution <<a href="mailto:swift-evolution@swift.org">swift-evolution@swift.org</a>> wrote:<br>
>>>><br>
>>>> Swift is a language that embraces value semantics. Many common<br>
>>>> iterators *can* be implemented with value semantics. Just because we<br>
>>>> can’t implement *all* iterators with value semantics doesn’t mean we<br>
>>>> should require them to have reference semantics. It just means you<br>
>>>> can’t *assume* value semantics when working with iterators in generic<br>
>>>> code unless / until we have a way to specify a value semantics<br>
>>>> constraint. That’s not necessarily a bad thing especially when it<br>
>>>> leaves the door open to interesting future possibilities.<br>
>>>><br>
>>>> -Matthew<br>
>>><br>
>>> I'm kind of undecided about this personally. I think one of the<br>
>>> problems with Swift is that the only indication that you have a<br>
>>> reference type is that you can declare it as a constant, yet still<br>
>>> call mutating methods upon it, this isn't a very positive way of<br>
>>> identifying it however. This may be more of a GUI/IDE issue though, in<br>
>>> that something being a class isn't always that obvious at a glance.<br>
>>><br>
>>> I wonder, could we somehow force iterators stored in variables to be<br>
>>> passed via inout? This would make it pretty clear that you're using<br>
>>> the same iterator and not a copy in all cases, encouraging you to<br>
>>> obtain another if you really do need to perform multiple passes.<br>
>><br>
>> I'm going to push single-pass iteration on the stack briefly and talk<br>
>> about the topic that's been under discussion here: infinite multipass<br>
>> sequences.<br>
>><br>
>> ## Fitting “Infinite Multipass” Into the Model<br>
>><br>
>> It remains to be decided whether it's worth doing, but if it's to<br>
>> happen<br>
><br>
> I definitely think it’s worth doing.<br>
<br>
</div></div>Opinions are nice, but rationales are better. How will we understand<br>
*why* it's worth doing?<br>
<div><div class="h5"><br>
> I really appreciate the attention that the library team has given to<br>
> this.<br>
><br>
>> , the standard library team thinks the right design is roughly<br>
>> this:<br>
>><br>
>> /// A multipass sequence that may be infinite<br>
>> protocol Collection {<br>
>><br>
>> // Only eager algorithms that can terminate available here<br>
>> func index(where predicate: (Element)->Bool) -> Index<br>
>><br>
>> // all lazy algorithms available here<br>
>> var lazy: ...<br>
>><br>
>> var startIndex: Index<br>
>> var endIndex: Index // possibly not reachable from startIndex<br>
>><br>
>> associatedtype SubSequence : Collection<br>
>> // do we need an associated FiniteSubsequence, e.g. for prefixes?<br>
>> }<br>
>><br>
>> protocol FiniteCollection : Collection {<br>
>><br>
>> // All eager algorithms available here<br>
>> func map(...) -><br>
>> var count: ...<br>
>> }<br>
>><br>
>> protocol BidirectionalCollection : Collection { ... }<br>
>><br>
>> protocol RandomAccessCollection : BidirectionalCollection { … }<br>
><br>
> Does this design entirely break with the relationship between<br>
> collections and iterators (dropping `makeIterator` as a protocol<br>
> requirement)? If so, would for..in (over collections) be built on top<br>
> of indices and use `formIndex(after:)`? Would it require a finite<br>
> collection (unless we add `until` to the language and then allow<br>
> `for..in..until` to work with infinite collections)?<br>
<br>
</div></div>All of these points are up for discussion. John McCall pointed out to<br>
me that an index-based for..in would make it possible to implement<br>
<br>
for inout x in y { mutate(&x) }<br>
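For illustration, a hand-written equivalent of that hypothetical `for inout` loop, iterating by index over a `MutableCollection` (the loop body here is a stand-in for `mutate(&x)`):<br>

```swift
// Hand-written equivalent of a hypothetical `for inout x in y`,
// iterating by index so each element can be mutated in place.
// `y` must be a MutableCollection for the subscript assignment.
var y = [1, 2, 3]
var i = y.startIndex
while i != y.endIndex {
    y[i] *= 10               // stands in for `mutate(&x)`
    y.formIndex(after: &i)
}
// y == [10, 20, 30]
```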
<span class=""><br>
> Would we still retain `IndexingIterator`even if we break the<br>
> relationship in the protocol requirements?<br>
<br>
</span>Yes: it should be possible to implement Collection algorithms in terms<br>
of Iterator algorithms, and IndexingIterator provides the means. That<br>
said, I think the makeIterator requirement does little harm, especially<br>
when it can be defaulted for Collections.<br>
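For reference, a simplified sketch of that bridge (modeled on, but not identical to, the standard library's `IndexingIterator`; the name is changed here to avoid shadowing the real type):<br>

```swift
// A Collection can default makeIterator() by walking its own indices.
// Simplified sketch; the real IndexingIterator lives in the stdlib.
struct SimpleIndexingIterator<Base: Collection>: IteratorProtocol {
    let base: Base
    var position: Base.Index

    init(_ base: Base) {
        self.base = base
        self.position = base.startIndex
    }

    mutating func next() -> Base.Element? {
        guard position != base.endIndex else { return nil }
        defer { base.formIndex(after: &position) }
        return base[position]
    }
}

var it = SimpleIndexingIterator([10, 20, 30])
// it.next() yields 10, 20, 30, then nil
```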
<span class=""><br>
> Would it still be possible to do things like zip a multi-pass sequence<br>
> with a single-pass sequence (assuming we keep single-pass sequences or<br>
> add them back eventually)? This seems like a use case worth<br>
> supporting in some way.<br>
<br>
</span>Yes. If you can create an Iterator from a Collection, and you can zip<br>
Iterators, you can do this.<br>
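A sketch of that composition using today's standard library names (the string iterator stands in for a genuinely single-pass source):<br>

```swift
// Zipping a multipass collection with a single-pass iterator by
// first lowering the collection to an iterator.
let multipass = [1, 2, 3]
var singlePass = "abc".makeIterator()   // stand-in for a one-shot source
var collectionIt = multipass.makeIterator()

var pairs: [(Int, Character)] = []
while let n = collectionIt.next(), let c = singlePass.next() {
    pairs.append((n, c))
}
// pairs == [(1, "a"), (2, "b"), (3, "c")]
```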
<span class=""><br>
> One subtle change I think this implies is that things like<br>
> `LazyFilterSequence` can implement `makeIterator` with constant<br>
> complexity, deferring the O(N) complexity to the first call to `next`.<br>
<br>
</span>I don't believe that's a difference, though I could be wrong.<br>
<span class=""><br>
> `startIndex` for `LazyFilterCollection` currently has O(N) complexity.<br>
> The complexity of a complete iteration doesn’t change and probably<br>
> isn’t a big deal, but it’s worth noting.<br>
<br>
</span>Filtered collection views always require a bit of hand-waving around<br>
performance guarantees; I don't think that changes.<br>
<span class=""><br>
> I’ve been looking at some code that wraps a sequence and considering<br>
> how it would be impacted. With iterators it looks like this:<br>
><br>
> guard let element = base.next()<br>
> else { return nil }<br>
><br>
> With collections and indices it would look something like this:<br>
><br>
> base.formIndex(after: &index)<br>
> guard index != baseEndIndex<br>
> else { return endIndex }<br>
><br>
> let element = base[index]<br>
><br>
> That’s not too bad but it is more verbose.<br>
<br>
</span>Sequence today is a single-pass thing. If you are wrapping Sequence<br>
today presumably you'd wrap an Iterator tomorrow, and you wouldn't have<br>
to deal with indices.<br>
<span class=""><br>
> If we’re going to push people towards collections and indices we<br>
> should try to make common patterns like “update the iteration state<br>
> and return the next element if there is one" simpler.<br>
<br>
</span>That's IndexingIterator.<br>
<span class=""><br>
> This could be accomplished with an extension method along these lines:<br>
><br>
> guard let element = base.formIndex(after: &index,<br>
> .returningOptionalElement)<br>
> else { return endIndex }<br>
><br>
> With an implementation something like:<br>
><br>
> enum FormIndexResult {<br>
> case returningOptionalElement<br>
> }<br>
> extension Collection {<br>
> func formIndex(after i: inout Self.Index, _ result:<br>
> FormIndexResult) -> Self.Element?<br>
> }<br>
><br>
> This would provide similar functionality to `IndexingIterator` without<br>
> coupling the storage of `elements` and `position` (which is important<br>
> if you’re wrapping a collection and need to wrap the collection and<br>
> its indices independently).<br>
<br>
</span>I'm afraid I don't understand. Could you be more explicit about what<br>
you have in mind?<br>
<span class=""><br>
>> Q: Why should there be indices on an infinite multipass sequence?<br>
>> A: Because the operations on indices apply equally well whether the<br>
>> sequence is finite or not. Find the index of a value in the<br>
>> sequence, slice the sequence, find again, etc.<br>
>><br>
>> Q: Why is there an endIndex on an infinite sequence?<br>
>> A: So you can write algorithms such as index(where:) once.<br>
>><br>
>> Q: Why not allow endIndex to have a different type from startIndex?<br>
>><br>
>> A: It appears to offer insufficient benefit for the associated<br>
>> complexity in typical usage. A classic use case that argues for a<br>
>> different endIndex type is the null-terminated C string. But you<br>
>> can't index one of those safely without actually counting the<br>
>> length,<br>
>> and once you've done that you can make the endIndex an Int.<br>
><br>
> It’s also worth noting that we can use `Optional` with `nil` as the<br>
> `endIndex` sentinel if necessary.<br>
<br>
</span>True, that's a useful technique when there's no underlying storage in<br>
the collection (e.g. a Fibonacci sequence).<br>
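A minimal sketch of that technique (names are illustrative, and a full `Collection` conformance would additionally need `Comparable` indices):<br>

```swift
// nil plays the role of endIndex for an infinite Fibonacci view:
// no amount of advancing ever reaches it.
struct FibIndex: Equatable {
    var pair: [Int]?     // nil == the unreachable endIndex sentinel
}

struct Fibonacci {
    var startIndex: FibIndex { FibIndex(pair: [0, 1]) }
    var endIndex: FibIndex { FibIndex(pair: nil) }
    func formIndex(after i: inout FibIndex) {
        guard let p = i.pair else { return }
        i.pair = [p[1], p[0] + p[1]]
    }
    subscript(i: FibIndex) -> Int { i.pair![0] }
}

let fib = Fibonacci()
var i = fib.startIndex
var prefix5: [Int] = []
for _ in 0..<5 {
    prefix5.append(fib[i])
    fib.formIndex(after: &i)
}
// prefix5 == [0, 1, 1, 2, 3]
```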
<div><div class="h5"><br>
>><br>
>> ## Single Pass Iteration<br>
>><br>
>> The refinement relationship between Sequence and Collection is<br>
>> problematic, because it means either:<br>
>><br>
>> a) algorithms such as map on single-pass sequences claim to be<br>
>> nonmutating even though it's a lie (status quo)<br>
>><br>
>> b) those algorithms can't be used on immutable (“let bound”)<br>
>> multipass sequences. IMO that would be totally unacceptable.<br>
>><br>
>> If we drop the refinement, we can have a saner world. We also don't<br>
>> need to separate Sequence and Iterator anymore. We can simply drop<br>
>> Sequence altogether, and the protocol for single-pass iteration<br>
>> becomes Iterator.<br>
><br>
> Makes sense to me.<br>
><br>
>><br>
>> ### Mutation and Reference Semantics<br>
>><br>
>> Everything in Swift is copiable via `let copy = thing` (let's please<br>
>> not argue over the definition of copy for classes; this is the one<br>
>> built into the lowest level of the language—I refer to the other one,<br>
>> that requires allocation, as “clone”).<br>
>><br>
>> Anything you do with a sequence that's truly single-pass mutates the<br>
>> sequence *and all of its copies*. Therefore, such a type *fundamentally*<br>
>> has reference semantics. One day we may be able to model single-pass<br>
>> sequences with “move-only” value types, which cannot be copied. You<br>
>> can find move-only types in languages like Rust and C++, but they are<br>
>> not supported by Swift today. So it seems reasonable that all<br>
>> Iterators in Swift today should be modeled as classes.<br>
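An aside illustrating the shared-state behavior described above (the type is made up for this example):<br>

```swift
// Copies of a class-based iterator share consumption state, which is
// exactly the semantics a truly single-pass source has.
final class LineReader: IteratorProtocol {
    private var remaining: [String]
    init(_ lines: [String]) { self.remaining = lines }
    func next() -> String? {
        remaining.isEmpty ? nil : remaining.removeFirst()
    }
}

let a = LineReader(["one", "two", "three"])
let b = a            // a "copy" -- but both names refer to one object
_ = a.next()         // consumes "one" through either view
// b.next() == "two": b observes a's consumption
```

Note also that `next()` above is not marked `mutating`, which previews the labeling problem with class-constrained protocols discussed below.<br>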
><br>
> I think this makes a lot of sense in the model you are proposing. All<br>
> multipass structures are collections. Any sequence that can only<br>
> support a single pass is modeled as an iterator which inherently has<br>
> identity. Making this distinction strong will prevent any confusion.<br>
><br>
>> The fact that Swift doesn't have a mutation model for classes,<br>
>> though, means that mutating methods on a class constrained protocol<br>
>> can't be labeled as such. So consuming operations on a<br>
>> class-constrained Iterator protocol would not be labeled as mutating.<br>
>><br>
>> The standard library team is currently trying to evaluate the<br>
>> tradeoffs in this area. One possibility under consideration is<br>
>> simply dropping support for single-pass sequences until Swift can<br>
>> support move-only value types and/or gets a mutation model for class<br>
>> instances. It would be very interesting to know about any real-world<br>
>> models of single-pass sequences that people are using in Swift, since<br>
>> we don't supply any in the standard library.<br>
><br>
> I’m happy to see you mention a mutation model for class instances! (I<br>
> don’t mean to sidetrack the discussion, but would love to see that<br>
> someday)<br>
><br>
> I don’t have any objection to dropping support for single-pass<br>
> sequences temporarily. It’s possible that I would feel differently if<br>
> I was making use of them in my own code but I’m not.<br>
<br>
</div></div>On second thought, I believe it is important to have a way to support<br>
existing “partially formed” multipass sequences that don't expose<br>
copying or equality for their iteration states. Iterator is the right<br>
way to do that. So I think we need to keep Iterator around.<br>
<span class=""><br>
> In the meantime, people would be able to implement their own protocol<br>
> for single pass sequences. What they would lose is for..in as well as<br>
> the standard library algorithms. I’m not sure how many people this<br>
> would impact or how big the impact would be for them. We have seen a<br>
> couple of examples in this discussion, but probably not enough to<br>
> assess the overall impact.<br>
><br>
> One thing you don’t mention here is a distinction between finite and<br>
> infinite single-pass sequences (iterators). I don’t know if the<br>
> finite / infinite distinction is as important here, but wanted to<br>
> point it out. Obviously if we remove support single-pass sequences<br>
> now we could defer that discussion until we’re ready to bring back<br>
> support for them.<br>
<br>
</span>There are a few possible answers I can think of:<br>
<br>
1. Do the “obvious” thing and create a separate protocol for finite<br>
single-pass sequences<br>
<br>
2. Decide that the combination of infinite and single-pass is rare<br>
enough (/dev/urandom, temperature sensor) that it's better to just<br>
ask people handling them to be careful and not, e.g., try to “count”<br>
them.<br></blockquote><div><br></div><div>Not really feeling sufficiently in my element (excuse the pun) to comment on most of this discussion, but here I thought I'd chime in. What's interesting about your two examples (/dev/urandom, temperature sensor) is that, though single-pass, they should be insensitive to destructive consumption, no? By which I mean, if some function returns 5 elements from the "sequence", in both scenarios it would be undetectable whether it consumes 5, 10 or 15 elements in the process, IIUC. Are there other examples of infinite, single-pass sequences where destructive consumption would make a difference?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
3. Decide that everything on a single-pass sequence is lazy. Since you<br>
can only take a single pass anyway, people won't observe their<br>
closures being called more often than necessary, which was the main<br>
motivator for making map, filter, et al. eager on collections without<br>
an explicit .lazy.<br>
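A sketch of what an iterator-level lazy `map` could look like under #3 (the adaptor name is hypothetical):<br>

```swift
// The transform runs only as elements are pulled, so nobody can
// observe closures being called more often than necessary.
struct LazyMapIterator<Base: IteratorProtocol, T>: IteratorProtocol {
    var base: Base
    let transform: (Base.Element) -> T
    mutating func next() -> T? {
        base.next().map(transform)
    }
}

var doubled = LazyMapIterator(base: [1, 2, 3].makeIterator(),
                              transform: { $0 * 2 })
// doubled.next() yields 2, 4, 6, then nil
```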
<br>
Implications of #3:<br>
<br>
* Any “partially-formed” multipass sequences (modeling only Iterator)<br>
would be free to expose an accurate underestimatedCount, thereby<br>
optimizing the process of copying into an array. The lazy filter<br>
Iterator adaptor would have an underestimatedCount of 0.<br>
<br>
* All algorithms that require multiple passes, such as sorted(), would<br>
be unavailable on Iterator. You'd have to construct an Array (or<br>
other MutableCollection) and sort that in-place. Of course,<br>
constructing an Array from an Iterator could still go on forever if<br>
the Iterator turned out to be infinite, which means, at some level #3<br>
is just a refinement of #2 that makes it less error-prone.<br>
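For what it's worth, the stdlib already has a bridge for that materializing step (`IteratorSequence`), e.g.:<br>

```swift
// To use a multipass algorithm like sorted(), first materialize the
// iterator's elements into an Array (which of course never finishes
// if the iterator is infinite).
let it = [3, 1, 2].makeIterator()
let materialized = Array(IteratorSequence(it))
let ordered = materialized.sorted()
// ordered == [1, 2, 3]
```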
<div class="HOEnZb"><div class="h5"><br>
--<br>
Dave<br>
_______________________________________________<br>
swift-evolution mailing list<br>
<a href="mailto:swift-evolution@swift.org">swift-evolution@swift.org</a><br>
<a href="https://lists.swift.org/mailman/listinfo/swift-evolution" rel="noreferrer" target="_blank">https://lists.swift.org/mailman/listinfo/swift-evolution</a><br>
</div></div></blockquote></div><br></div></div>