[swift-evolution] [Discussion] Dictionary Key as Index

Brent Royal-Gordon brent at architechies.com
Fri Apr 15 20:57:53 CDT 2016


>> * Given just an `Index`, you need to be able to calculate the next `Index`, and check if it lies before the `endIndex`. The complexity of this is hidden in your example because you're relying on a traditional `DictionaryIndex` to perform these tasks. Thus, the `Index` type in your dictionary design would be some sort of wrapper-around-the-key, not the key itself. That's the source of the `subscript(_: Index) -> Value` which apparently confused you.
> 
> This was an implementation detail, see the comment at the top of the file. If SE0065 is implemented then the previous/next/lastIndex can be found from the collection in constant amortized time. This is possible after SE0065.
> 
> I wasn't confused, but I was unsure of the best approach. Array and Dictionary probably need different Indexable-style protocols.

I suspect that's a non-starter.

The Collection protocol encapsulates this semantic:

	1. The type is a Sequence of finite length which always (unless it's mutated) contains the same elements in the same order.

	2. The Index represents a location in the Sequence.

	3. Indexing with an Index retrieves the element at that location.

	4. You can retrieve the successor index to a given index; sub-protocols provide other ways to transform indices.

	5. You can == one Index with another, and if they match, they refer to the same element.

	6. With SE-0065, you can also < one Index with another to see which one comes earlier in the collection.

Both Array and Dictionary (as well as other types like Set and String (or rather, String's views)) need to support these semantics. Array and String are ordered collections—any element can appear at any index—so the Index is the main means by which we access the collection's elements. Set and Dictionary are not ordered, so we have other ways to access their elements. This difference is 

Also note that SE-0065 makes this harder, not easier. Requirement #6—that Index must be Comparable—would require Dictionary keys to be Comparable and Dictionary iteration to happen in sorted order. Since the natural order of Dictionary key access is (roughly) by hashValue, this would make iterating over the Dictionary very slow, or would require a separate, sorted storage for the keys, or something along those lines.

The alternative to that is to disconnect Index from Key. If you do that, then Index is basically just the index of the element in the linear storage backing the Dictionary, and iteration is almost as fast as looping over an Array—you just have to skip over unused buckets. That's a much saner way to handle things. So this:

> I also think that, ideally, `Index == Key` for a Dictionary.

Turns out to be incorrect. Iterating over a Dictionary by Key is not a particularly good solution.

Alternatively, you might think of it this way: what you describe here:

> There may be an index-like type which allows lookup with a lower constant overhead (bucket index), but it's likely to only be used by extensions, if at all.

*is* what Collection is for.

> I'm fine for `Array(newDictionary)` to discard keys, there's always `Array(newDictionary.enumerate())`. I think it's more consistent for filter to work on values. I think all collections would also benefit from versions of map/filter/reduce etc. that also have an Index (so you could get every second value of an array, for example).

As I mentioned (and brought up on another thread for discussion), that's not what `enumerate()` is for.

> I'd like the protocol to have associatedtype FilterType, for NewDictionary it would be defined as something like:
>     typealias FilterType = NewDictionary<Key,Value>
> This would allow filter, reduce, etc. to preserve the keys.
> 
> * Your `map` is overloading `SequenceType.map` merely by return type. That means you can no longer assign it directly to a newly-declared variable, or use it in many other contexts where the type has to be inferred.
> 
> Overloading is an implementation detail at the moment, as mentioned in the comment at the top of the file. Ideally this would be Self.MapType once we have generic associatedtypes. For dictionaries:
>     typealias MapType<T> = NewDictionary<Key,T>

The use of "typealias" here is misleading, because these aren't really typealiases—they're associated types. Generic associated types are not currently a feature of Swift. I'm not sure if they're even in the generics manifesto.

> Either way I think all collections would benefit from a property that exposes a sequence of (Index,Value) pairs.

Agreed.

>> * The keys are disposable and the values are the important part—most operations on a Dictionary would throw away the keys and use only the values.
> 
> Agreed, this may add complexity to generic/default implementations, or require changes to some of the protocols (my preference is protocol changes). 

I'm not necessarily saying that a Value-oriented Dictionary is wrong, but I'm not sure it's right, either.

>> * Index still would not be a Key—it would be some kind of instance which wrapped or converted into a Key.
> 
> After SE0065 they can be the same.

They can't; see above.

-- 
Brent Royal-Gordon
Architechies



More information about the swift-evolution mailing list