[swift-evolution] Proposal: CollectionType.cycle property for an infinite sequence

Kevin Ballard kevin at sb.org
Tue Dec 29 13:17:11 CST 2015

On Tue, Dec 29, 2015, at 08:14 AM, plx via swift-evolution wrote:
> Personally I’d say this should be a -1 for standard-library inclusion.
> Swift’s not really built to handle infinite sequences right now; until they are handed better by the standard library convenience methods for creating them shouldn’t be in the standard library.

As far as I can tell, the only way in which it's "not really built" to handle this is that there are multiple constructs that attempt to eagerly consume an entire sequence, and using these with an infinite sequence would end up looping forever. But I don't consider that to be a particularly serious problem. You could "fix" it by refactoring SequenceType into two protocols, SequenceType (for possibly-infinite sequences) and FiniteSequenceType (for known-finite sequences) and then going over the entire standard library and updating various spots to use FiniteSequenceType, except this would be very limiting (sequences that are not known if they're infinite to the compiler could still be valid for various eager algorithms if the programmer knows it will be finite in practice).

> You’d also want to call `fatalError` for at least `reduce`, `reverse`, `sort`, `split`(?), `flatMap`, `dropLast`, `suffix`, and `forEach`.

You can only do it for the ones defined in the protocol, not ones defined in extensions. This means map, filter, forEach, and suffix.

split is actually perfectly implementable for a CycleSequence, although it does need a custom implementation. split is bounded by at most Int.max splits, which means it is guaranteed to terminate even for an infinite sequence (although the final subsequence does need to be infinite[1]). Even if there are no separators in the cycle, it can just return the cycle itself.

[1] Interestingly, the current implementation actually dumps the remainder into an Array and returns that (wrapped in AnySequence), which is curious because it would be perfectly legal for it to just wrap the generator up in AnySequence and return that instead. I'm tempted to submit a PR to change that now, as it just seems like unnecessary work to use an array.

> `startsWith` and `elementsEqual` and `lexicographicComparison` are all broken if you call them like e.g. `self.startsWith(self)`.

That's true, but what do you really expect when you're calling them with two infinite sequences? It's not so much that they're broken as it is that you're creating an infinite loop without any way to break out. And FWIW, lexicographicalCompare is actually something you might want to explicitly support on infinite sequences if you know your sequences aren't equal and want to find out which one is less than the other.

There are plenty of ways to shoot yourself in the foot. I don't think infinite sequences are really the big problem you're making them out to be.

> You can conceivably implement a non-crashing `contains`, `minElement` and `maxElement` on `CycleSequence` by calling through to the wrapped collection, but that’ll seemingly evaporate as soon as your `CycleSequence` winds up hidden inside an `AnySequence`.

You can't override those anyway in a generic context, because they're not members of the protocol, they're just extensions. You could implement them such that your implementation is called when working on the concrete CycleSequence type, but I'm not sure if that's a great idea to do that when the actual behavior differs from calling it generically on SequenceType (well, triggering a fatalError() instead of an infinite loop is fine because they're both Bottom, but returning a valid result in one context and looping infinitely in the other seems bad).

Of course, these methods could actually be moved into the protocol itself, which would let us override them. I'm not entirely sure why they aren't in the protocol to begin with, I guess because there's not much need for overriding these outside of infinite sequences (well, finite sorted sequences could provide an optimized min/maxElement, and an optimized version of contains(_: Self.Generator.Element), but maybe there's tradeoffs to doing this, e.g. maybe there's some reason why having a large protocol witness table is a bad idea).

> Which illustrates why this is a -1 for me; there's nothing wrong with the functionality in isolation and there’s nothing wrong with infinite sequences, but the standard library should play well with itself, and this wouldn’t play well with the rest of the standard library.

Ultimately, there's not much difference between an infinite sequence and a sequence of Int.max elements. The latter is finite, but it's so massive (especially on 64-bit) that any kind of eager processing is going to hit the same problems as an infinite sequence. Every problem you describe will be a problem with the simple sequence `(0..<Int.max)` as well.

-Kevin Ballard

> That opinion could change as the language changes or the standard library evolves.
> > On Dec 28, 2015, at 1:20 AM, Kevin Ballard via swift-evolution <swift-evolution at swift.org> wrote:
> > 
> > ## Introduction
> > 
> > Add a new property `cycle` to CollectionType that returns an infinite SequenceType that yields the elements of the collection in a loop.
> > 
> > ## Motivation
> > 
> > It's sometimes useful to be able to have an infinite sequence. For example, `CollectionOfOne(x).cycle` could be used to have an infinite sequence of a single element (similar to Repeat but without a count). A common use for infinite sequences is zipping with a finite sequence. As far as I'm aware, the stdlib does not currently provide any way to create such an infinite sequence.
> > 
> > ## Proposed solution
> > 
> > Extend CollectionType with a new property `cycle` that yields a type that conforms to SequenceType. This sequence yields each element of the collection in an infinite loop.
> > 
> > ## Detailed design
> > 
> > 2 new types would be added:
> > 
> > struct CycleSequence<Base : CollectionType> : LazySequenceType { ... }
> > struct CycleGenerator<Base : CollectionType> : GeneratorType { ... }
> > 
> > CollectionType would be extended with a property:
> > 
> > extension CollectionType {
> >    public var cycle: CycleSequence<Self> { get }
> > }
> > 
> > This is an extension of CollectionType instead of SequenceType because it requires a multi-pass sequence (and SequenceType does not provide that guarantee). The returned type conforms to SequenceType instead of CollectionType because there is no possible `endIndex` that satisfies the requirement of being reachable from `startIndex` by zero or more applications of `successor()`.
> > 
> > Because the default eager versions of map and filter will execute forever on an infinite sequence, CycleSequence conforms to LazySequenceType instead of SequenceType in order to provide lazy versions of those functions. Additionally, it will provide implementations of the eager versions that simply trigger a fatalError(), as the alternative is an infinite loop that consumes more and more memory.
> > 
> > ## Impact on existing code
> > 
> > None
> > 
> > -Kevin Ballard
> > _______________________________________________
> > swift-evolution mailing list
> > swift-evolution at swift.org
> > https://lists.swift.org/mailman/listinfo/swift-evolution
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

More information about the swift-evolution mailing list