[swift-evolution] Removing enumerated?

Ben Cohen ben_cohen at apple.com
Tue Jan 31 13:16:29 CST 2017


I think whether enumerated() is justified as a method on Sequence is one example of a wider question which definitely needs some discussion, which is: where should the standard library draw the line in providing convenience functions that can easily be composed from other functions in the std lib? Here’s another example: 

SE-100 <https://github.com/apple/swift-evolution/blob/master/proposals/0100-add-sequence-based-init-and-merge-to-dictionary.md> is a proposal to add an init to Dictionary from a sequence of key/value pairs. It’s a commonly requested feature, and IMO much needed and should be added as soon as we move to the appropriate phase in Swift’s evolution. 

Another commonly requested Dictionary feature is similar: add a Dictionary.init that takes a sequence, and a closure that maps that sequence to keys. This is useful, for example, when you have a sequence of objects that you frequently need to index into via one property on those objects, so you want to build a fast lookup cache using that property.

Now, if we implement SE-100, that second request can be easily composed. It would be something like Dictionary(sequence.lazy.map { (key: $0.someProperty, value: $0) } )

Some people look at that line of code and think sure, that’s what I’d do and it’s easy enough that the second helper shouldn’t be added as it’s superfluous. Others look at it and say that it is unreadable clever-code FP nonsense, and we should just add the helper method because most programmers wouldn’t be able to read or write that easily.

As we expand (and maybe contract :) the standard library, this kind of question comes up a lot, so it is worth setting out some criteria for judging these “helper” methods. Here’s my take on such a list (warning: objectivity and subjectivity blended together in the below).

1. Is it truly a frequent operation?

The operation needs to carry its weight. Even once we have ABI stability, so the size of the std lib becomes less of a concern as it could ship as part of the OS, we still need to keep helper method growth under control. APIs bristling with methods like an over-decorated Xmas tree are bad for usability. As mentioned in the String manifesto, String+Foundation currently has over 200 methods/properties. Helpers are no good if you can’t find them to use them.

Someone mentioned that they actually don’t find themselves using enumerated() all that often. I suspect enumerated in its current form isn’t all that useful. In a quick (and non-scientific) review of search results for its use on GitHub, nearly all the examples I see of it are either 1) misuse – see more below, or 2) use of it to perform the equivalent of in-place map where the index is used to subscript into the array and replace it with a transformed element.

I think the std lib needs an in-place map, and if enumerated() is removed, this one most-common use case should be added at the same time.

2. Is the helper more readable? Is the composed equivalent obvious at a glance?

When an operation is very common, the big win with a helper method is it creates a readable vocabulary. If enumerated() is very common, and everyone sees it everywhere, it makes that code easier to read at a glance.

That said, I think that the alternative – zip(0…, sequence) – is just as readable once you are familiar with zip and ranges, two concepts that, IMO at least, it is important that every Swift programmer learn about early on. I would even go so far as to say that enumerated is harmful if it discourages new users from discovering zip.

OTOH, an example that I think is strongly justified on readability grounds is Sequence.all(match:), a function that checks that every element in a sequence matches a predicate. This is by far the most common extension for Sequence on GitHub. Yes, you can write this as !seq.contains(!predicate) but that far less readable – especially since it requires you write it with a closure that !’s the predicate i.e. !seq.contains { !predicate($0) }, because we don’t have a ! for composing with predicate functions (though maybe we should).

3. Does the helper have the flexibility to cover all common cases?

This, I think, is where enumerated() really starts to fall down.

Sometimes you want to start from not-zero, so maybe it should be Sequence.enumerated(startingFrom: Int = 0)
Sometimes you want non-integer indices so maybe we should add indexed().
Sometimes you want it the other way around (not for a for loop but for passing the elements directly into a mapping function) so now you need a flip.
Sometimes you want to enumeration to run backwards in some way.

Less of a problem if you’re already accustomed to composing your enumeration:

Enumerate starting from 1: zip(1…, c)
Enumerate indices: zip(c.indices, c)
Need the pair to be the other way around: zip(c, 0…)
Need the enumeration to run the other way: zip((0..<c.count).reversed,c) or zip(0…,c).reversed() or zip(0…,c.reversed()).

Similarly, for the Dictionary helper – what if you want a sequence, and a closure to map keys to values, rather than values to keys?

(zip also needs to retain its collection-ness, which is filed as SR-3760, a great starter proposal if anyone wants to take it on :)

4. Is there a correctness trap with the composed equivalent? Is there a correctness trap with the helper?

As others noted on the thread, enumerated() used for indices encourages a possible correctness error:

for (idx, element) = someSlice.enumerated() { } // slices aren’t zero based!

Now, chances are that since out-of-bounds subscripting traps on arrays, this bug would be caught early on. But maybe not – especially if you ended up using it on an UnsafeBufferPointer slice (perhaps dropped in for performance reasons to replace an Array) and now all of a sudden you have a memory access violation that could go undetected.

On the flip side: many operations on collections that use indices are hard to get right, especially ones that involve mutating as you go, like removing elements from a collection based on some criteria, where you have to work around index invalidation or fencepost errors. For this reason, the std lib probably ought to have a RangeReplaceableCollection.remove(where:) method so people don’t have to reinvent it and risk getting it wrong.

5. Is there a performance trap with the composed equivalent? Or with the helper?

The composed example of Dictionary from a sequence+closure, you need to remember the .lazy part in sequence.lazy.map to avoid creating a temporary array for no reason. A helper method might lift that burden from the user. remove(where:) can easily be accidentally quadtratic, or perform needless element copies when there’s little or nothing to remove.

Counter example: the fact that map is eager and chaining it creates temporary arrays needlessly is a performance problem caused by introducing it as a helper method.

6. Does the helper actually encourage misuse?

This is a very common pattern when searching GitHub for uses of enumerated(): 

for (index, _) in collection.enumerated() { 
    mutate collect[index] in-place i.e. collection[index] += 1
}.

(or even they assign the element then don’t use it, for which specific case we don’t currently emit a warning)

What the user really needs is just:

for index in collection.indices { etc. }

I expect if the standard way to do enumerated was with zip, people wouldn’t do this as often. In-place map would be even better.

8. Can a native implementation be made more performant than the equivalent composition?

Dictionary.mapValues(transform: (Value)->T) can be implemented internally much more efficiently than just composing it with map and the key/value sequence initializer, because the layout of the hash table storage can be re-used in the new dictionary, even when the Value type is different.

> On Jan 31, 2017, at 6:24 AM, Chris Eidhof via swift-evolution <swift-evolution at swift.org> wrote:
> 
> Hey everyone,
> 
> I've organized a number of Swift workshops over the last two years. There are a couple of things that keep coming up, and a couple of mistakes that I see people making over and over again. One of them is that in almost every workshop, there's someone who thinks that `enumerated()` returns a list of (index, element) pairs. This is only true for arrays. It breaks when using array slices, or any other kind of collection. In our workshops, I sometimes see people doing something like `x.reversed().enumerated()`, where `x` is an array, and somehow it produces behavior they don't understand.
> 
> A few ways I think this could be improved:
> 
> - Move enumerated to Array
> - Change enumerated to return `(Index, Iterator.Element)` (this would mean we at least need to move it to collection)
> - Remove enumerated
> - Keep things as is
> 
> In any case, just wanted to share my experience (gained from teaching people). 
> 
> -- 
> Chris Eidhof
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20170131/0b36038b/attachment.html>


More information about the swift-evolution mailing list