[swift-evolution] Proposal: Python's indexing and slicing

Dave Abrahams dabrahams at apple.com
Mon Dec 21 13:56:04 CST 2015


> On Dec 19, 2015, at 8:52 PM, Kevin Ballard via swift-evolution <swift-evolution at swift.org> wrote:
> 
> On Fri, Dec 18, 2015, at 02:39 PM, Dave Abrahams via swift-evolution wrote:
>>  
>> Yes, we already have facilities to do most of what Python can do here, but one major problem IMO is that the “language” of slicing is so non-uniform: we have [a..<b], dropFirst, dropLast, prefix, and suffix.  Introducing “$” for this purpose could make it all hang together and also eliminate the “why does it have to be so hard to look at the 2nd character of a string?!” problem.  That is, use the identifier “$” (yes, that’s an identifier in Swift) to denote the beginning-or-end of a collection.  Thus,
>>  
>>   c[c.startIndex.advancedBy(3)] => c[$+3]     // Python: c[3]
>>   c[c.endIndex.advancedBy(-3)]  => c[$-3]     // Python: c[-3]
>>  
>>   c.dropFirst(3) => c[$+3...]    // Python: c[3:]
>>   c.dropLast(3)  => c[..<$-3]    // Python: c[:-3]
>>   c.prefix(3)    => c[..<$+3]    // Python: c[:3]
>>   c.suffix(3)    => c[$-3...]    // Python: c[-3:]
>>  
>> It even has the nice connotation that, “this might be a little more expen$ive than plain indexing” (which it might, for non-random-access collections).  I think the syntax is still a bit heavy, not least because of “..<” and “...”, but the direction has potential. 
>>  
>>  I haven’t had the time to really experiment with a design like this; the community might be able to help by prototyping and using some alternatives.  You can do all of this outside the standard library with extensions.
>  
> Interesting idea.
>  
> One downside is it masks potentially O(N) operations (ForwardIndex.advancedBy()) behind the + operator, which is typically assumed to be an O(1) operation.

Yeah, but the “$” is sufficiently unusual that it doesn’t bother me too much.
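For reference, a small sketch of the cost being discussed, written against current Swift rather than the Swift 2 API quoted above (`index(_:offsetBy:)` is the later spelling of `advancedBy`). String is not random-access, so reaching the 2nd character means walking from the start, and that walk is what `$+1` would hide:

```swift
let s = "swift"

// String indices cannot be computed in O(1); this walks one character
// forward from startIndex, which is O(n) in the offset.
let secondIndex = s.index(s.startIndex, offsetBy: 1)
let second = s[secondIndex]

print(second) // prints "w"
```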

> Also, the $+3 syntax suggests that it requires there to be at least 3 elements in the sequence, but prefix()/suffix()/dropFirst/etc. all take maximum counts, so they operate on sequences of fewer elements.

For indexing, $+3 would impose that requirement.  For slicing, it wouldn’t.  I’m not sure what about the syntax suggests that exceeding bounds would be an error.
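A minimal sketch of that distinction, assuming current Swift and two hypothetical helpers (`element(fromStart:)` and `slice(droppingFromStart:)` are illustrative names, not standard library API): indexing requires the element to exist, while slicing clamps to the end the way dropFirst does.

```swift
extension Collection {
    /// Indexing: the element at offset `n` must exist.
    /// An out-of-range offset is a precondition failure.
    func element(fromStart n: Int) -> Element {
        precondition(n >= 0, "offset must be non-negative")
        return self[index(startIndex, offsetBy: n)]
    }

    /// Slicing: `n` is a maximum count, clamped to endIndex,
    /// so short collections simply yield an empty slice.
    func slice(droppingFromStart n: Int) -> SubSequence {
        precondition(n >= 0, "offset must be non-negative")
        let i = index(startIndex, offsetBy: n, limitedBy: endIndex) ?? endIndex
        return self[i...]
    }
}

let c = Array("abcde")
print(c.element(fromStart: 3))                    // prints "d"
print(String(c.slice(droppingFromStart: 3)))      // prints "de"
print(c.slice(droppingFromStart: 10).isEmpty)     // prints "true"
```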

> There's also some confusion with using $ for both start and end. What if I say c[$..<$]? We'd have to infer from position that the first $ is the start and the second $ is the end, but then what about c[$+n..<$+m]? We can't treat the usage of + as meaning "from start" because the argument might be negative. And if we use the overall sign of the operation/argument together, then the expression `$+n` could mean from start or from end, which comes right back to the problem with Python syntax.

There’s a problem with Python syntax?  I’m guessing you mean that c[a:b] can have very different interpretations depending on whether a and b are positive or negative?

First of all, I should say: that doesn’t really bother me.  The 99.9% use case for this operation uses literal constants for the offsets, and I haven’t heard of it causing confusion for Python programmers.  That said, if we wanted to address it, we could easily require n and m above to be literals, rather than Ints (which incidentally guarantees it’s an O(1) operation).  That has upsides and downsides of course.
>  
> I think Jacob's idea has some promise though:
>  
> c[c.startIndex.advancedBy(3)] => c[fromStart: 3]
> c[c.endIndex.advancedBy(-3)] => c[fromEnd: 3]
>  
> But naming the slice operations is a little trickier. We could actually just go ahead and re-use the existing method names for those:
>  
> c.dropFirst(3) => c[dropFirst: 3]
> c.dropLast(3) => c[dropLast: 3]
> c.prefix(3) => c[prefix: 3]
> c.suffix(3) => c[suffix: 3]
>  
> That's not so compelling, since we already have the methods, but I suppose it makes sense if you want to try and make all slice-producing methods use subscript syntax (which I have mixed feelings about).

Once we get efficient in-place slice mutation (via slice addressors), it becomes a lot more compelling, IMO.  But I still don’t find the naming terribly clear, and I don’t love that one needs to combine two subscript operations in order to drop the first and last element or take just elements 3..<5.

Even if we need separate symbols for “start” and “end” (e.g. using “$” for both might just be too confusing for people in the end, even if it works otherwise), I still think a generalized form that allows ranges to be used everywhere for slicing is going to be much easier to understand than this hodgepodge of words we use today.

> But the [fromStart:] and [fromEnd:] subscripts seem useful.

Yeah… I really want a unified solution that covers slicing as well as offset indexing.
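One way such a prototype might look as an extension outside the standard library, assuming current Swift. Everything here is hypothetical: the `Offset` type, its `start`/`end` cases (separate symbols instead of a single “$”), and the `from:upTo:` subscript are illustrative names, and the range-operator spelling is left out to keep the sketch short.

```swift
/// Hypothetical offset type: a position counted from either end.
enum Offset {
    case start(Int) // n elements after startIndex
    case end(Int)   // n elements before endIndex
}

extension BidirectionalCollection {
    /// Resolves an offset to an index, clamped to the collection's bounds.
    func clampedIndex(_ offset: Offset) -> Index {
        switch offset {
        case .start(let n):
            precondition(n >= 0, "offset must be non-negative")
            return index(startIndex, offsetBy: n, limitedBy: endIndex) ?? endIndex
        case .end(let n):
            precondition(n >= 0, "offset must be non-negative")
            return index(endIndex, offsetBy: -n, limitedBy: startIndex) ?? startIndex
        }
    }

    /// Offset indexing: the element must exist.
    subscript(offset: Offset) -> Element {
        switch offset {
        case .start(let n): return self[index(startIndex, offsetBy: n)]
        case .end(let n):   return self[index(endIndex, offsetBy: -n)]
        }
    }

    /// Slicing: both bounds are clamped; a crossed range yields an empty slice.
    subscript(from lower: Offset, upTo upper: Offset) -> SubSequence {
        let lo = clampedIndex(lower)
        let hi = clampedIndex(upper)
        return self[lo..<Swift.max(lo, hi)]
    }
}

let letters = Array("abcdef")
print(letters[.start(1)])                              // prints "b"
print(letters[.end(1)])                                // prints "f"
print(String(letters[from: .start(1), upTo: .end(1)])) // prints "bcde"
print(String(letters[from: .start(3), upTo: .start(5)])) // prints "de"
```

With this shape, dropping the first and last element or taking elements 3..<5 is a single subscript, and the start/end distinction no longer depends on the sign of the argument.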

-Dave


