[swift-evolution] [Review] SE-0065 A New Model for Collections and Indices

Wed Apr 13 16:26:03 CDT 2016

on Tue Apr 12 2016, Brent Royal-Gordon <swift-evolution at swift.org> wrote:

> Yes, I totally agree that `Collection.subscript(_: Index)` should not
> be Optional.
>
> But I think that index-manipulation methods like `successor(of:)` are
> a different story. It is normal and expected that, when you alter an
> index, you will occasionally hit the boundaries of the
> collection. There certainly are cases where you know a particular
> index manipulation is safe, but most index manipulations need to be
> guarded by something like:
>
> 	while index < collection.endIndex {
> 		let nextIndex = collection.successor(of: index)
> 		…
> 		index = nextIndex
> 	}

I disagree that this should be considered a “guard.”  You are testing
against another index—a basic feature of what makes something an index.
It's probably a bit more likely that this other index is at the end of
the collection, but it might not be.  For example, look at the algorithm
for reversing a collection in place:

  extension BidirectionalCollection where Self : MutableCollection {
    mutating func reverse() {
      if isEmpty { return } // early exit eliminates a branch in the loop
      var i = startIndex
      var j = predecessor(of: endIndex)
      while i < j {
        swap(&self[i], &self[j])
        formSuccessor(&i)
        formPredecessor(&j)
      }
    }
  }

[Why isn't this in the stdlib?! Someone file a bug and write a proposal,
quick!]

> In these cases, it would be better if the `successor(of:)` method was
> designed in a way that acknowledged and encapsulated the bounds check
> that is usually required when it is used:
>
> 	while let nextIndex = collection.successor(of: index) {
> 		…
> 		index = nextIndex
> 	}

I disagree; it doesn't make sense that I should have to check for nil
when I'm not really interested in the end of the collection as in my
example above.  Many other nontrivial algorithms will have the same
characteristic.  If all you need is to loop through indices to the end,
there are lots of ways to do it.  

Looking closer, what you've got here is actually a *highly* unusual
case, where you want a pair of indices that you want to always point to
successive elements.  I do not know of any algorithm that would use this
except maybe bubblesort, and frankly I cannot see any reason to change
the standard library to support it.  Am I missing something?

> Given the difficulties of statically detecting index invalidation, I
> totally agree that (as you discussed in a section I've snipped) we
> can't statically prove indexes are safe. But we can, at the point
> where we generate an index, easily check if that index is *currently*
> valid. And it's something that most callers will have to do anyway if
> we don't do it ourselves.

I really disagree with the assertion that most callers will need to do
this.  It certainly isn't a normal thing to do in any algorithm I know
of.  If you think I'm wrong (not unheard of!), I suggest you code up the
method you've requested and try to apply it in actual code that
manipulates indices, and show us how it improved the code.

>> We will change the index(_:stepsFrom:limitedBy:) overload to return
>> an optional, and we will see what other implications it has, and how
>> it fits into the rest of the system.
>
> I'm glad to hear you'll evaluate this option, and I think it can give
> us both what we want from this API.
>
> I think having the most high-level operations incorporate bounds
> checks, while the lower-level ones don't, is a good compromise. 

That may be the pattern you discern in Swift's bounds checks, but I just
want to be very clear that it's not a criterion we use to make the
determination.  I would love it if we had a more systematic way to make
the choice to insert bounds checks or not, but for now it remains
something of an art, trying to balance usability and performance
concerns.

> If we encourage people to use `index(_:stepsFrom:limitedBy:)` unless
> they know what they're doing, naïve clients will get an implicit
> bounds check, while sophisticated, speed-sensitive clients can use
> methods like `successor(of:)` which require them to check bounds
> manually.

It's not a bounds check.  There are lots of ways to pass a limit that's
outside the bounds of the collection, and you can pass a limit that's in
the opposite direction from the offset.  If they happen to pass the
collection's endIndex and a positive offset, it degenerates to a bounds
check, but that's not what this method is for.

I will encourage people to use high-level algorithms where possible, so
they're not doing index manipulations directly.  Anyone doing low-level
index manipulations is probably implementing an algorithm, and I'll
encourage them to do whatever will be most efficient, and then test the
algorithm on some very strict models of Collection, that check
everything.  You can find some of these in the StdlibUnittest library in
the swift source tree.

> (There might even be a case for offering bounds-checked
> `successor(of:limitedBy:)` and `predecessor(of:limitedBy:)` methods to
> give people bounds-checked alternatives to all three.)
>
>> Thanks again, Brent.
>
> Thank you!

-- 
Dave