[swift-evolution] [Review] SE-0065 A New Model for Collections and Indices

plx plxswift at icloud.com
Thu Apr 14 18:11:30 CDT 2016


> On Apr 14, 2016, at 12:12 PM, Dave Abrahams via swift-evolution <swift-evolution at swift.org> wrote:
>>> 
>>> No, you can't, at least not usefully.  An Index that's at the end of one
>>> collection is in the middle of another, or with a suitably-modified version
>>> of the same collection.  
>> 
>> Sure, in certain concrete scenarios it’s possible for one collection’s
>> indices to have such relationships to some other collection.
>> 
>> But, what of it? 
>> 
>> In a generic context you can’t assume this; 
> 
> That's incorrect.  A slice's indices are *documented* as having a
> particular relationship to those of the thing it was sliced from.  This
> applies everywhere.  A dictionary's keys and values use the same indices
> as the dictionary itself, and have a correspondence.

You’re right, of course; I rarely use slices and completely overlooked them.

I also phrased it badly, b/c what I was trying to express is that code like the below is (I think?) unlikely to work generically:

  extension Collection where Element:Equatable {

     // plz don’t do this
     func hasValueMismatch(with other: Self, at index: Index) -> Bool {
       return self[index] != other[index]
     }

     // plz don’t do this either
    func hasValueMismatch<K:Collection where K.Index == Index, K.Element == Self.Element>(with other: K, at index: Index) -> Bool { 
      return self[index] != other[index]
    }

  }

…(you would’t write the above anyway, but it illustrates the kind of "generic context" I had in mind when I wrote it).

> 
>> in a concrete context you naturally have more information.
>> 
>> Slices would become problematic, I’ll grant.
>> 
>>> var x = [1, 2]
>>> let i = x.index(1, stepsFrom: x.startIndex)
>>> x.removeLast()
>>> x[i]           // fatal error: Index out of range
>> 
>> Indices can become invalid; this imposes preconditions. I don’t get
>> it.
> 
> My point is that whether i is at the end or not cannot be encoded in i.

I see the miscommunication, now. Of course you can’t encode that.

I’ve put a couple examples down below as a last effort at communicating what I’m getting at it. 

>>> The converse is also true: subscripting on a collection's endIndex is
>>> sometimes just fine, even with no mutation in sight.
>>> 
>>> let a = (0..<10).reversed()
>>> print(Array(a))      // “[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]”
>>> 
>>> let b = a.prefix(9)
>>> print(Array(b))      // “[9, 8, 7, 6, 5, 4, 3, 2, 1]”
>>> 
>>> print(a[b.endIndex]) // “0” (correct, supported behavior)
>> 
>> I believe we are back to “subscripting one collection with *another*
>> collection's `endIndex`, no?
> 
> Totally legit, as mentioned above.  a.prefix(9) returns a slice of a.
> 
>> Are there any circumstances where a collection *can* be
>> usefully-subscripted with its *own* `endIndex`?
> 
> var a = [1]
> let i = a.endIndex
> a.append(2)
> print(a[i]) // “2”
> 
>>> 
>>> Of course,
>>> 
>>> b[b.endIndex]        // As a matter of QOI: fatal error: out of bounds: index >= endIndex
>>> 
>>>> 
>>>> It would’ve been awkward to do this under the previous status quo—e.g. even for
>>>> arrays your indices would have to have a back-reference to get the count, and
>>>> thus couldn’t be plain integers—but the collection will now always be present to
>>>> provide such info.
>>>> 
>>>> Cons:
>>>> 
>>>> - more overhead than “bare” indices
>>>> - doesn’t address invalidation (but what does, really?)
>>>> 
>>>> Pros:
>>>> 
>>>> - easier in some ways to handle things like e.g 0…Int.max
>>>> - the endIndex equivalent *never* invalidates 
>>>> - compile-time help for end-index checking
>>>> 
>>>> Overall this *would* bring the treatment of indices closer to that for `?`—e.g.,
>>>> redefine the core type to omit the `nil`-like value, 
>>> 
>>> Sorry, but that's the opposite of what `?` is doing: it *adds* a nil
>>> value.  
>> 
>> …I must have been unclear.
>> 
>> Step 1: Define T* = { "all memory addresses” (nil included) }
>> Step 2: Define T = T* \ { nil } (e.g. "non-null pointers")
>> 
>> …is what I was trying to summarize via “redefine the core type to omit
>> the `nil`-like value” (which is the important part here).
> 
> Sorry, that's still unclear to me.  I just don't see what you're getting
> at.
> 
>> Anyways, having `endIndex` directly inhabit the same type as the
>> “good” indices has some pros and some cons; it’s not an IMHO one-sided
>> situation as with `nil`.
> 
> Maybe, but my point is that many things in the current model are
> incompatible with the other arrangement.  If you wanted to change the
> arrangement, you'd need to re-think the current model from the ground
> up, including index invalidation, how algorithms interact, the
> relationship of slices to the thing they're sliced from, etc...
> 
> So what you're suggesting is an interesting hypothesis, but to me it's
> not by any means obviously workable.

You’re completely right about slices. I’ll provide a couple concrete examples before addressing the rest.

Here are three collection-combinators (or adapters I think you’d call them):

  // Collection with elements of A, then elements of B.
  struct ChainCollection<A:Collection,B:Collection> : Collection {
    let a: A; let b: B;
  }

  // Collection with elements `(a,b)` for each pair in the cartesian product of `A` and `B`.
  struct ProductCollection<A:Collection,B:Collection> : Collection {
    let a: A; let b: B;
  }

  // Collection with adjacent elements from A
  struct AdjacentElementCollection<A:Collection> : Collection { 
    let a: A
  }

…each of which I’ve declared `: Collection` but each which will still need some suitable `Index` implementation.

Here’s one way to write these indices (henceforth, the V1 indices):

  // `endIndex` will be .InB(b.endIndex)
  enum ChainCollectionIndex<A:Collection,B:Collection> {
    // Precondition: the index isn’t `a.endIndex`.
    case InA(A.Index)
    case InB(B.Index) 
  }

  // `endIndex` will have both `.aIndex` and `.bIndex` equal to their “source”'s `endIndex`
  struct ProductCollectionIndex<A:Collection,B:Collection> {
    let aIndex: A.Index
    let bIndex: B.Index
  }

  // `endIndex` will be both of these set to `a.endIndex`
  // - all other situations expect `upper` is `lower`’s successor, and both != `endIndex`
  struct AdjacentElementCollectionIndex<A:Collection> {
    let lower: A.Index
    let upper: A.Index
  }

…(I trust the index-manipulation boilerplate is easy to fill-in).

There’s absolutely nothing wrong with the above! Each of these types has the capability to represent the “good” indices, and also has a reasonable way to represent `endIndex` values.

But, they could also be written like so (henceforth, the V2 indices):

  enum ChainCollectionIndex<A:Collection,B:Collection> {
    // Precondition: the index isn’t `a.endIndex`.
    case InA(A.Index)
    // Precondition: the index isn’t `b.endIndex`.
    case InB(B.Index)
    // `endIndex` sentinel
    case End
  }

  enum ProductCollectionIndex<A:Collection,B:Collection> {
    // Precondition: neither index is the source collection’s `endIndex`
    case Item(A.Index, B.Index)
    // `endIndex` sentinel
    case End
  }

  enum AdjacentElementCollectionIndex<A:Collection> {
    // Precondition: `upper` is `lower`’s successor, *both* are != `a.endIndex`
    case Adjacency(lower: A.Index, upper: A.Index)
    // `endIndex` sentinel
    case End
  }

…each of which is essentially the V1 version, except now with a dedicated `endIndex` value tacked-on. 

Tacking on the dedicated `endIndex` isn’t necessary, but at least for me taking V2-style approaches has been very advantageous. 

Most of the advantage has been from being able to enforce stricter invariants on the non-endIndex indices, which generally makes the code simpler and also easier to reason about; I’m also fortunate in that I’m not working on the standard library, and thus can choose how heavily to weight “time to correct implementation” vis-a-vis “proximity to maximum possible efficiency”.

The above indices are drawn from what are admittedly simple combinators/adaptors, but they feel like representative examples for me; even for fancier, actually-custom things it’s almost always been much more natural to go with a V2-style index (and sometimes no natural V1-style approach even exists).

I’ve done a fair amount of experimenting with the model implied above—moving `endIndex` into a dedicated sentinel, etc.—and although it’s pretty easy overall to translate back and forth between the two models, the “alternative” approach *at best* comes out a wash...at best. 

For the most part, invalidation is about the same between the two models for all non-end-index—for such indices, the same mutations invalidate the same indices under either approach.

What *is* different between the two is that a cached `endIndex` will never become valid due to mutation, e.g. this won’t happen:

  // status quo:
  var items = [“a”, “b”]
  let cachedEndIndex = items.endIndex // this is just `2`
  items.append[“c”]
  items[cachedEndIndex] // returns “c"

  // dedicated `endIndex`:
  var items = [“a”, “b”]
  let cachedEndIndex = items.endIndex
  items.append[“c”]
  items[cachedEndIndex] // goes boom, b/c `cachedEndIndex` is still the `endIndex`

…and things like this wouldn't work the same way:

  // status quo:
  var items = [“a”, “b”]
  let cachedEndIndex = items.endIndex // this is just `2`
  items.insert(“c”, at:  cachedEndIndex) // [“a”, “b”, “c”]
  items.insert(“d”, at:  cachedEndIndex) // [“a”, “b”, “d”, “c”]

  // dedicated `endIndex`
  var items = [“a”, “b”]
  let cachedEndIndex = items.endIndex // this is `.End`
  items.insert(“c”, at:  cachedEndIndex) // [“a”, “b”, “c”]
  items.insert(“d”, at:  cachedEndIndex) // [“a”, “b”, “c", “d”]

…because the end-index sentinel would *always* refer to the logical end of the collection.

Slices would truly be problematic here. For the rest, it’s hard to see insurmountable difficulties—especially given the ease of converting between the two approaches, given access to the collection—but I’d expect it to be far clunkier and a tad slower.

>> On the one hand, in my own experience so far, it’s definitely been the
>> case that most custom collections I’d done have had indices that’re
>> effectively the `SaferIndex` above; it’s been rather rare that there’s
>> been a natural “1 past the rest” value to use of the same type as is
>> used to describe the position of a “good” index.
>> 
>>> Seriously, just because Swift has Optionals and they're useful for
>>> safety in some scenarios (compared with allowing everything to be
>>> nullable) does not mean that it's going to be “Swiftier” to apply a
>>> similar pattern everywhere.
>>> 
>>>> use an enum to reintroduce that value when necessary—than to `!`.
>>>> 
>>>> I don’t think the above is an *improvement* over the proposal, but it’s a route
>>>> that could have been taken.
>>> 
>>> I believe it would be hard to make such a design work at all, and if you
>>> could make it work I think you'd end up with exactly the problem this
>>> proposal aims to solve: references inside indices.  So, I don't think
>>> it's even a possibility, really.
>> 
>> I can’t say I see the impossibility. I definitely have experienced the
>> clunkiness.
>> 
>> This is getting too involved for a hypothetical I was explaining, but
>> not advocating.
> 
> I will happily agree to drop this topic :-)

It’s dropped, now; I only felt the need to reply one more time b/c I could tell I’d previously failed to communicate clearly-enough. 

> 
>> This proposal and the new design is a good design!
> 
> Thanks!

I really do mean it! Just look at those examples and think of how many redundant back-references they used to need...

> 
>>>> 
>>>> 
>>>>       To help illustrate the claim, here’s a strawman “safe” API—for
>>>>       illustration
>>>>       only, not advocacy!—that would be safer and thus perhaps more “Swift-y”:
>>>> 
>>>>   I think there's a prevalent misunderstanding (IOW, I don't mean to
>>>>   single out this post or this poster) about what “safe” means in Swift
>>>>   and what the features of a Swifty API are and should be. This
>>>>   is a big topic worthy of much more time than I can devote here, but
>>>>   here's a thought to start with:
>>>> 
>>>>   A Swifty API helps you reason effectively about the correctness of your
>>>>   code, and in part that means we provide enough preconditions on
>>>>   arguments to avoid complicating result types, and code to handle
>>>>   results, with optional-ness.
>>>> 
>>>>   -- 
>>>>   Dave
>>>> 
>>>>   _______________________________________________
>>>>   swift-evolution mailing list
>>>>   swift-evolution at swift.org
>>>>   https://lists.swift.org/mailman/listinfo/swift-evolution
>>>> 
>>>> _______________________________________________
>>>> swift-evolution mailing list
>>>> swift-evolution at swift.org
>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>> 
>>> -- 
>>> Dave
>>> 
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>> 
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
> 
> -- 
> Dave
> 
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution



More information about the swift-evolution mailing list