[swift-evolution] Proposal: Python's indexing and slicing
Dave Abrahams
dabrahams at apple.com
Mon Dec 21 13:56:04 CST 2015
> On Dec 19, 2015, at 8:52 PM, Kevin Ballard via swift-evolution <swift-evolution at swift.org> wrote:
>
> On Fri, Dec 18, 2015, at 02:39 PM, Dave Abrahams via swift-evolution wrote:
>>
>> Yes, we already have facilities to do most of what Python can do here, but one major problem IMO is that the “language” of slicing is so non-uniform: we have [a..<b], dropFirst, dropLast, prefix, and suffix. Introducing “$” for this purpose could make it all hang together and also eliminate the “why does it have to be so hard to look at the 2nd character of a string?!” problem. That is, use the identifier “$” (yes, that’s an identifier in Swift) to denote the beginning-or-end of a collection. Thus,
>>
>> c[c.startIndex.advancedBy(3)] => c[$+3] // Python: c[3]
>> c[c.endIndex.advancedBy(-3)] => c[$-3] // Python: c[-3]
>>
>> c.dropFirst(3) => c[$+3...] // Python: c[3:]
>> c.dropLast(3) => c[..<$-3] // Python: c[:-3]
>> c.prefix(3) => c[..<$+3] // Python: c[:3]
>> c.suffix(3) => c[$-3...] // Python: c[-3:]
>>
>> It even has the nice connotation that, “this might be a little more expen$ive than plain indexing” (which it might, for non-random-access collections). I think the syntax is still a bit heavy, not least because of “..<” and “...”, but the direction has potential.
>>
>> I haven’t had the time to really experiment with a design like this; the community might be able to help by prototyping and using some alternatives. You can do all of this outside the standard library with extensions.
>
> Interesting idea.
>
> One downside is it masks potentially O(N) operations (ForwardIndex.advancedBy()) behind the + operator, which is typically assumed to be an O(1) operation.
Yeah, but the “$” is sufficiently unusual that it doesn’t bother me too much.
> Also, the $+3 syntax suggests that it requires there to be at least 3 elements in the sequence, but prefix()/suffix()/dropFirst/etc. all take maximum counts, so they operate on sequences of fewer elements.
For indexing, $+3 would make that requirement. For slicing, it wouldn’t. I’m not sure why you say the syntax suggests that exceeding the bounds would be an error.
> There's also some confusion with using $ for both start and end. What if I say c[$..<$]? We'd have to infer from position that the first $ is the start and the second $ is the end, but then what about c[$+n..<$+m]? We can't treat the usage of + as meaning "from start" because the argument might be negative. And if we use the overall sign of the operation/argument together, then the expression `$+n` could mean from start or from end, which comes right back to the problem with Python syntax.
There’s a problem with Python syntax? I’m guessing you mean that c[a:b] can have very different interpretations depending on whether a and b are positive or negative?
First of all, I should say: that doesn’t really bother me. The 99.9% use case for this operation uses literal constants for the offsets, and I haven’t heard of it causing confusion for Python programmers. That said, if we wanted to address it, we could easily require n and m above to be literals, rather than Ints (which incidentally guarantees it’s an O(1) operation). That has upsides and downsides of course.
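To make the "literals only" option concrete, here is a minimal sketch, assuming current Swift rather than the Swift 2 spellings used above. Every name in it (LiteralOffset, StartMarker, fromStart, StartOffset) is made up for illustration, and it is not airtight: someone could still call the literal initializer by hand.

```swift
// Hypothetical: an offset that call sites are meant to spell as an integer literal.
struct LiteralOffset: ExpressibleByIntegerLiteral {
    let value: Int
    init(integerLiteral value: Int) { self.value = value }
}

// Hypothetical marker standing in for "the start of the collection".
struct StartMarker {}
let fromStart = StartMarker()

// Hypothetical: a start-relative position built from a literal offset.
struct StartOffset { let value: Int }

// Only a LiteralOffset is accepted on the right, so an Int variable is rejected.
func + (marker: StartMarker, offset: LiteralOffset) -> StartOffset {
    return StartOffset(value: offset.value)
}

extension Collection {
    // Strict indexing: traps if the offset runs past the end.
    subscript(position: StartOffset) -> Element {
        return self[index(startIndex, offsetBy: position.value)]
    }
}

let letters = ["a", "b", "c", "d"]
let fourth = letters[fromStart + 3]   // compiles: 3 is a literal; value is "d"
// let n = 3
// letters[fromStart + n]             // does not compile: no + taking an Int
```

The downside shows up in the commented-out lines: any offset computed at run time is shut out entirely.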
>
> I think Jacob's idea has some promise though:
>
> c[c.startIndex.advancedBy(3)] => c[fromStart: 3]
> c[c.endIndex.advancedBy(-3)] => c[fromEnd: 3]
> But naming the slice operations is a little trickier. We could actually just go ahead and re-use the existing method names for those:
>
> c.dropFirst(3) => c[dropFirst: 3]
> c.dropLast(3) => c[dropLast: 3]
> c.prefix(3) => c[prefix: 3]
> c.suffix(3) => c[suffix: 3]
>
> That's not so compelling, since we already have the methods, but I suppose it makes sense if you want to try and make all slice-producing methods use subscript syntax (which I have mixed feelings about).
Once we get efficient in-place slice mutation (via slice addressors), it becomes a lot more compelling, IMO. But I still don’t find the naming terribly clear, and I don’t love that one needs to combine two subscript operations in order to drop the first and last element or take just elements 3..<5.
Even if we need separate symbols for “start” and “end” (e.g. using “$” for both might just be too confusing for people in the end, even if it works otherwise), I still think a generalized form that allows ranges to be used everywhere for slicing is going to be much easier to understand than this hodgepodge of words we use today.
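A rough sketch of the separate-symbols variant, written against current Swift (index(_:offsetBy:limitedBy:) instead of advancedBy). Anchor, AnchorRange, and clampedIndex are placeholder names, not anything in the standard library, and only the two-sided ..< form is covered. Indexing traps when out of bounds, while slicing clamps the way prefix/suffix/dropFirst/dropLast do.

```swift
// Hypothetical: a position measured from the start or from the end of a collection.
struct Anchor {
    var fromEnd: Bool
    var distance: Int
    static let start = Anchor(fromEnd: false, distance: 0)
    static let end = Anchor(fromEnd: true, distance: 0)
}

func + (a: Anchor, n: Int) -> Anchor {
    return Anchor(fromEnd: a.fromEnd, distance: a.distance + n)
}

func - (a: Anchor, n: Int) -> Anchor {
    return Anchor(fromEnd: a.fromEnd, distance: a.distance - n)
}

// Hypothetical: a half-open range between two anchors.
struct AnchorRange {
    var lower: Anchor
    var upper: Anchor
}

func ..< (lower: Anchor, upper: Anchor) -> AnchorRange {
    return AnchorRange(lower: lower, upper: upper)
}

extension BidirectionalCollection {
    // Resolve an anchor to an index, clamping to the collection's bounds.
    func clampedIndex(_ a: Anchor) -> Index {
        let base = a.fromEnd ? endIndex : startIndex
        let limit = a.distance < 0 ? startIndex : endIndex
        return index(base, offsetBy: a.distance, limitedBy: limit) ?? limit
    }

    // Indexing is strict: an out-of-bounds anchor traps.
    subscript(a: Anchor) -> Element {
        let base = a.fromEnd ? endIndex : startIndex
        return self[index(base, offsetBy: a.distance)]
    }

    // Slicing is lenient: both ends are clamped, and a reversed range is empty.
    subscript(r: AnchorRange) -> SubSequence {
        let lo = clampedIndex(r.lower)
        let hi = clampedIndex(r.upper)
        return self[lo ..< Swift.max(lo, hi)]
    }
}

let sample = [10, 20, 30, 40, 50]
let word = "hello"

let second = word[Anchor.start + 1]                      // "e"
let thirdFromEnd = sample[Anchor.end - 3]                // 30
let dropFirst3 = sample[Anchor.start + 3 ..< Anchor.end] // [40, 50]
let dropLast3 = sample[Anchor.start ..< Anchor.end - 3]  // [10, 20]
let middle = sample[Anchor.start + 1 ..< Anchor.end - 1] // [20, 30, 40]
let clamped = sample[Anchor.start ..< Anchor.start + 9]  // all five elements
```

The middle line is the case that takes two chained operations with today's named methods: dropping the first and last element is one range expression here.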
> But the [fromStart:] and [fromEnd:] subscripts seem useful.
Yeah… I really want a unified solution that covers slicing as well as offset indexing.
-Dave