[swift-evolution] Dropping Comparable requirement for indices

Fri Jul 8 17:32:43 CDT 2016

on Fri Jul 08 2016, plx <swift-evolution at swift.org> wrote:

>> On Jul 6, 2016, at 7:33 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>> 
>> I for one would be interested in your elaboration. Based on Dave's comments, I'm pretty convinced that the Comparable requirement is best left in place.
>
> I am short on time (and will remain so for several more weeks) so this should be seen as essentially a fire-and-forget message. 
>
> I'll preface the rest with the disclaimer that I don't think any of
> the assessments already made are *wrong* on the facts...I simply (a)
> disagree on what to make of them and also (b) think there is more cost
> to requiring `Comparable` than has been noted so far.
>
> For (a), I agree with the following pair of facts:
>
> - there are "collections" for which the "natural" choice of index is awkward to make `Comparable` (thus making conformance to `Collection` awkward)...
> - such "non-comparable indices" can easily be made comparable when
> needed, e.g. by stapling an `Int` onto each "natural" index (or via
> some similar augmentation)
>
> ...but to me this argues in favor of dropping the requirement:
>
> - dropping it allows more "collections" to become proper `Collection`s (while still using the "natural" index)
> - when you specifically need comparable indices, they are easy to add (so why not *not* pay for comparability except when you specifically need it)
>
> For (b), I think that the long-term costs of techniques like
> "`Int`-stapling" are actually rather more severe than just the
> overhead of the stapled-on `Int` (etc.); to illustrate this, consider
> linked-lists, since they've already been brought up.
>
> I'll state without proof that the following are reasonable:
>
> - if we only need `Equatable` indices:
>   - the list nodes are the natural choice of "index"
>   - the list nodes are *robust* indices (very few updates invalidate them)
> - if we instead need `Comparable` indices:
>   - the easiest index is a pair like `(list-node, counter)`
>   - these indices are *fragile* indices (invalidated after most updates)
>
> ...and proceed to my point:
>
> It's not hard to imagine at some point introducing a protocol for a
> mutable collection with *robust* indices -- e.g., behaving like the
> "natural" indices we could get away with here if we only needed
> `Equatable` -- and providing either (or both):
>
> - improved generic implementations of standard mutation operations
> - additional mutation operations that are only practical with robust indices
>
> ...and the unfortunate consequence of the `Comparable`-index
> requirement is that our linked-list would **not** be able to adopt
> such a protocol -- and benefit from such algorithms, etc. -- b/c that
> requirement means that the `Index` it will expose in a generic setting
> is the more-fragile, “stapled index” instead of the more-robust,
> “natural index” we could have used otherwise.
>
> This has been discussing linked-lists but I think it generalizes;
> e.g. for other things that “could be ‘collections’” without the
> `Comparable`-`Index` requirement, you can easily make indices that
> *do* satisfy the `Comparable` requirement…at the cost of winding up
> with comparatively-fragile indices that are a bit pessimal for use
> with mutation operations. I’ll speculate that the issues being
> alluded-to for trees are roughly similar: it’s not that it’s hard to
> make the tree indices comparable, it’s that there’s no good way to do
> that without making the resulting indices “artificially” fragile.
>
> And so on. 
>
> So to conclude, I don’t really disagree that the issues raised by
> non-comparable indices are real issues, but I *do* think the
> cost/benefit analysis should include the cost of having "locked-in"
> the use of fragile, easily-invalidated indices vis-a-vis what might
> otherwise have been used. 

Agreed, and that's basically what motivated the investigation of
dropping the Comparable requirement.  One thing to remember, though:
this ability to avoid invalidating indices would not apply to all
collections.  The most-common (and usually best)
RangeReplaceableCollection, Array, can't support it.  

The ability to maintain a stable notion of “position” in the collection
after it is restructured is a special property that only really applies
to “node-based” collections where each element gets a separate node.
Individual data structures with this property are free to provide their
own *additional* Index-like type that is less fragile.

Are there any generic algorithms over such data structures that take
advantage of index stability?  I don't know of any.  If they exist, it's
a potential concern for the protocol design.  If they don't, it's
something we can just add to individual concrete types.  

The concern that comparability may impose a performance penalty on some
data structures is a bigger one for me.  However, since node-based
collections tend to have terrible locality-of-reference, I feel sort of
OK about the idea that people who care about performance will be using
data structures containing at least some contiguous allocations
(e.g. Deques, B+ trees...)

I find either decision to have some distasteful effects on the design,
but at the moment I think the effects of dropping the Comparable
requirement are worse.

> It’s not amazingly convincing without more-concrete examples, but this
> is all I have time for.

Thanks for making some time despite your busy schedule!

-- 
Dave