[swift-evolution] Proposal: CollectionType.cycle property for an infinite sequence

Tue Dec 29 18:54:46 CST 2015

On Tue, Dec 29, 2015, at 04:11 PM, Dave Abrahams wrote:
> 
> > On Dec 27, 2015, at 11:20 PM, Kevin Ballard via swift-evolution <swift-evolution at swift.org> wrote:
> > 
> > ## Introduction
> > 
> > Add a new property `cycle` to CollectionType that returns an infinite SequenceType that yields the elements of the collection in a loop.
> 
> The name should give an explicit indication that the result is infinite, e.g. “repeatedForever".  

All of the precedent I'm aware of calls this operation "cycle".

Haskell: http://hackage.haskell.org/package/base-4.8.1.0/docs/Prelude.html#v:cycle
Rust: https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.cycle
Ruby: http://ruby-doc.org/core-2.2.3/Enumerable.html#method-i-cycle
Python: https://docs.python.org/3/library/itertools.html?highlight=cycle#itertools.cycle
D: http://dlang.org/phobos/std_range.html#cycle

I'm not sure offhand what other languages to even look to for this.

> > ## Motivation
> > 
> > It's sometimes useful to be able to have an infinite sequence. For example, `CollectionOfOne(x).cycle` could be used to have an infinite sequence of a single element (similar to Repeat but without a count). A common use for infinite sequences is zipping with a finite sequence. As far as I'm aware, the stdlib does not currently provide any way to create such an infinite sequence.
> 
> If this is, as I suspect, the primary use case, I would prefer a much clearer incantation than CollectionOfOne(x).repeatedForever.  The part before the parentheses is mostly irrelevant.  I suppose [x].repeatedForever could work, but needing an array allocation makes me sad.  I wish I had something better to suggest… This may in fact be the best we can do.

Honestly, I'd actually like to take Repeat and remove the count. I've never used this type and I've never seen any code that uses this type. The current counted behavior of Repeat could then be recovered by saying `Repeat(elt).prefix(count)`. Of course, this would only repeat a single element, so we'd still need Cycle to repeat sequences (which is why I suggested `CollectionOfOne(x).cycle` as that has the same behavior, although of course Repeat then just becomes a special case of `CollectionOfOne(x).cycle.prefix(count)`).

> > ## Proposed solution
> > 
> > Extend CollectionType with a new property `cycle` that yields a type that conforms to SequenceType. This sequence yields each element of the collection in an infinite loop.
> > 
> > ## Detailed design
> > 
> > 2 new types would be added:
> > 
> > struct CycleSequence<Base : CollectionType> : LazySequenceType { ... }
> > struct CycleGenerator<Base : CollectionType> : GeneratorType { ... }
> > 
> > CollectionType would be extended with a property:
> > 
> > extension CollectionType {
> >    public var cycle: CycleSequence<Self> { get }
> > }
> > 
> > This is an extension of CollectionType instead of SequenceType because it requires a multi-pass sequence (and SequenceType does not provide that guarantee).
> 
> You can copy the elements into an array in that case.  Whether or not we should provide that is an open question, but we do similar things for, e.g., reverse().

You can, but I prefer not to hide array creation like that whenever possible. If I need to cycle some SequenceType I can just wrap it in Array myself, e.g. `Array(seq).cycle`.

> > The returned type conforms to SequenceType instead of CollectionType because there is no possible `endIndex` that satisfies the requirement of being reachable from `startIndex` by zero or more applications of `successor()`.
> 
> Yeah… I’m not sure we want to do so, but we should consider loosening that requirement.  After all, x.repeatedForever.prefix(1000) is in principle a perfectly cromulent collection that shouldn’t require copying x’s elements into new storage.

Loosening this breaks the `count` property. There's no valid value of `count` that can be returned for an infinite sequence. The property could of course just loop infinitely (or fatalError), but that strikes me as being a much worse idea than e.g. map() looping forever, because with map() we have a lazy replacement but there is no lazy replacement for count.

We could of course give CycleSequence a separate prefix() function that returns a collection (but the prefix() from SequenceType must return SubSequence, which cannot be a collection).

> > Because the default eager versions of map and filter will execute forever on an infinite sequence, CycleSequence conforms to LazySequenceType instead of SequenceType in order to provide lazy versions of those functions.
> 
> It would arguably be more appropriate to only provide repeatedForever on instances of LazySequenceType.  The idea has always been that users are surprised when they see their map/filter closure’s side-effects happen multiple times, or at odd times, so we make them write “.lazy” to specifically opt into that behavior.  I’ve always had mixed feelings about this, thinking that maybe it would be better to educate people about laziness, but that’s what we currently do.
>
> We could weasel out of the “multiple side-effects” problem by declaring that since the result is not a collection you can only make one pass through any part of the result, so if your side-effect over-fires, it’s on you.  I wouldn’t be in favor of making this sequence *actually* single-pass though, and this doesn’t solve the “side-effects at odd times” issue.

Good point. I figured that an infinite sequence is "obviously" lazy, but we do have a precedent right now of requiring `lazy` to get lazy sequences. I also have mixed feelings about this (the strongest argument in favor of the status quo is being able to use @noescape functions for the array versions, but I end up having to use `lazy` very often anyway to avoid intermediate array creation anyway).

The reason why I didn't put this on LazySequenceType to begin with is so far we only require `lazy` directly before invoking an operation that takes a closure. `cycle()` doesn't take a closure, so no need to ask that it be lazy. But since operations chained off of it (like `map()`) should be lazy, it makes sense to require the `lazy` anyway.

> > Additionally, it will provide implementations of the eager versions that simply trigger a fatalError(), as the alternative is an infinite loop that consumes more and more memory.
> 
> Why do this?  What would these eager APIs look like?

The only reason to do this is because the default implementations of the functions will crash the program anyway, after entering an infinite loop and gobbling up memory. So I figured it's nicer to just fatalError() immediately as we can provide a much better error that way.

Of course, technically, these eager functions could terminate anyway, if they throw an error. I still think fatalError() is a good idea because nobody writes code that calls one of these methods and expects to get an error in 100% of cases, but if this is a contentious point we can easily remove the fatalError() implementations from the proposal.

-Kevin Ballard