[swift-evolution] Proposal: Python's indexing and slicing

Kevin Ballard kevin at sb.org
Tue Dec 22 14:06:31 CST 2015


On Mon, Dec 21, 2015, at 08:28 PM, Donnacha Oisín Kidney wrote:
> Why not make the “forgiving” version the default? I mean, the majority of Python-style composable slicing would be happening on arrays and array slices, for which there’s no performance overhead, and the forgiving version would seem to suit the “safe-by-default” philosophy. I’ve seen mistakes like this:
>
> let ar = [1, 2, 3, 4, 5]
> let arSlice = ar[2..<5]
> arSlice[1]
>
> on a few occasions, for instance. I would think something like this:
>
> let ar = [, 1, 2, 3, 4, 5]
>
> let arSlice = ar[2...] // [3, 4, 5]
> arSlice[..<3] // [2, 3, 4]
> arSlice[...3] // [2, 3, 4, 5]
> arSlice[direct: 2] // 2
> arSlice[] // 2
>
> Would be what was expected from most programmers learning Swift, while
> leaving the unforgiving option open to those who need it.

You seem to be arguing against the notion that array slices preserve the
indexing of the base array, but that's not what's under discussion here.
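
For reference, the index-preserving behaviour in question can be sketched as follows. This is written in current Swift rather than the Swift 2 of this thread, and reuses the names from the quoted example.

```swift
let ar = [1, 2, 3, 4, 5]
let arSlice = ar[2..<5]                  // ArraySlice<Int> containing 3, 4, 5

// The slice shares the base array's indices, so its valid indices are 2..<5.
let firstIndex = arSlice.startIndex                    // 2
let firstElement = arSlice[arSlice.startIndex]         // 3
let secondElement = arSlice[arSlice.startIndex + 1]    // 4

// arSlice[1] compiles but traps at run time: 1 is not a valid index here.
```

The "mistake" in the quoted message is the commented-out last line: the offset has to be applied to `startIndex`, not used as a raw index.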

-Kevin Ballard

>> On 22 Dec 2015, at 03:29, Dave Abrahams via swift-evolution <swift-evolution at swift.org> wrote:
>>
>>>
>>> On Dec 21, 2015, at 1:51 PM, Kevin Ballard <kevin at sb.org> wrote:
>>>
>>> On Mon, Dec 21, 2015, at 11:56 AM, Dave Abrahams wrote:
>>>>
>>>>> On Dec 19, 2015, at 8:52 PM, Kevin Ballard via swift-evolution <swift-evolution at swift.org> wrote:
>>>>>
>>>>> On Fri, Dec 18, 2015, at 02:39 PM, Dave Abrahams via swift-evolution wrote:
>>>>>>
>>>>>> Yes, we already have facilities to do most of what Python can do
>>>>>> here, but one major problem IMO is that the “language” of slicing
>>>>>> is so non-uniform: we have [a..<b], dropFirst, dropLast, prefix,
>>>>>> and suffix.  Introducing “$” for this purpose could make it all
>>>>>> hang together and also eliminate the “why does it have to be so
>>>>>> hard to look at the 2nd character of a string?!” problem.  That
>>>>>> is, use the identifier “$” (yes, that’s an identifier in Swift)
>>>>>> to denote the beginning-or-end of a collection.  Thus,
>>>>>>
>>>>>> c[c.startIndex.advancedBy(3)] => c[$+3]        // Python: c[3]
>>>>>> c[c.endIndex.advancedBy(-3)]  => c[$-3]        // Python: c[-3]
>>>>>>
>>>>>> c.dropFirst(3) => c[$+3...]     // Python: c[3:]
>>>>>> c.dropLast(3)  => c[..<$-3]     // Python: c[:-3]
>>>>>> c.prefix(3)    => c[..<$+3]     // Python: c[:3]
>>>>>> c.suffix(3)    => c[$-3...]     // Python: c[-3:]
>>>>>>
>>>>>> It even has the nice connotation that, “this might be a little
>>>>>> more expen$ive than plain indexing” (which it might, for
>>>>>> non-random-access collections).  I think the syntax is still a
>>>>>> bit heavy, not least because of “..<” and “...”, but the
>>>>>> direction has potential.
>>>>>>
>>>>>> I haven’t had the time to really experiment with a design like
>>>>>> this; the community might be able to help by prototyping and
>>>>>> using some alternatives.  You can do all of this outside the
>>>>>> standard library with extensions.
>>>>>
>>>>> Interesting idea.
>>>>>
>>>>> One downside is it masks potentially O(N) operations
>>>>> (ForwardIndex.advancedBy()) behind the + operator, which is
>>>>> typically assumed to be an O(1) operation.
>>>>
>>>> Yeah, but the “$” is sufficiently unusual that it doesn’t bother me
>>>> too much.
>>>>
>>>>> Also, the $+3 syntax suggests that it requires there to be at
>>>>> least 3 elements in the sequence, but
>>>>> prefix()/suffix()/dropFirst/etc. all take maximum counts, so they
>>>>> operate on sequences of fewer elements.
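
The "maximum count" behaviour mentioned here can be checked with a short sketch in current Swift. The array and variable names are illustrative only.

```swift
let short = [1, 2]

// The count passed to these methods is an upper bound, not a requirement.
let firstThree = Array(short.prefix(3))             // [1, 2]
let lastThree = Array(short.suffix(3))              // [1, 2]
let withoutFirstThree = Array(short.dropFirst(3))   // []
let withoutLastThree = Array(short.dropLast(3))     // []

// By contrast, indexing is strict: short[short.startIndex + 3] would trap.
```

So a `$+3` that is meant to replace both `prefix(3)` and plain indexing would have to be forgiving in one position and strict in the other.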
>>>>
>>>> For indexing, $+3 would make that requirement.  For slicing, it
>>>> wouldn’t.  I’m not sure why you say something about
>>>> the _syntax_ suggests exceeding bounds would be an error.
>>>
>>> Because there's no precedent for + behaving like a saturating
>>> addition, not in Swift and not, to my knowledge, anywhere else
>>> either. The closest example that comes to mind is floating-point
>>> numbers eventually ending up at Infinity, but that's not really
>>> saturating addition, that's just a consequence of Infinity +
>>> anything == Infinity. Nor do I think we should be establishing
>>> precedent of using + for saturating addition, because that would be
>>> surprising to people.
>>
>> To call this “saturating addition” is an…interesting…interpretation.
>> I don’t view it that way at all.  The “saturation,” if there is any,
>> happens as part of subscripting.  You don’t even know what the
>> “saturation limit” is until you couple the range expression with the
>> collection.
>>
>> In my view, the addition is part of an EDSL that represents a
>> notional position offset from the start or end, then the subscript
>> operation forgivingly trims these offsets as needed.
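
One way to read this description is sketched below in current Swift. Everything here is hypothetical: `Offset`, `resolve`, and the `forgiving:upTo:` subscript are illustrative names, not anything proposed in the thread, and a plain type stands in for "$" because current Swift does not accept a bare "$" as an identifier. The point is only that "+" builds a value and the trimming happens in the subscript.

```swift
// Hypothetical stand-in for "$": an offset from the start or the end of
// whatever collection it is later applied to.
struct Offset {
    enum Anchor { case start, end }
    var anchor: Anchor
    var distance: Int

    static let start = Offset(anchor: .start, distance: 0)
    static let end = Offset(anchor: .end, distance: 0)

    // "+" and "-" only record a distance; no index is computed here.
    static func + (lhs: Offset, rhs: Int) -> Offset {
        return Offset(anchor: lhs.anchor, distance: lhs.distance + rhs)
    }
    static func - (lhs: Offset, rhs: Int) -> Offset {
        return lhs + (-rhs)
    }
}

extension BidirectionalCollection {
    // Turns an offset into a real index, trimming it to the bounds.
    func resolve(_ offset: Offset) -> Index {
        switch offset.anchor {
        case .start:
            let distance = Swift.max(offset.distance, 0)
            return index(startIndex, offsetBy: distance, limitedBy: endIndex) ?? endIndex
        case .end:
            let distance = Swift.min(offset.distance, 0)
            return index(endIndex, offsetBy: distance, limitedBy: startIndex) ?? startIndex
        }
    }

    // Hypothetical forgiving slice: an inverted range yields an empty slice.
    subscript(forgiving lower: Offset, upTo upper: Offset) -> SubSequence {
        let lo = resolve(lower)
        let hi = resolve(upper)
        return lo <= hi ? self[lo..<hi] : self[lo..<lo]
    }
}

let numbers = [1, 2, 3, 4, 5]
let tail = Array(numbers[forgiving: Offset.start + 3, upTo: Offset.end])     // like dropFirst(3)
let head = Array(numbers[forgiving: Offset.start, upTo: Offset.end - 3])     // like dropLast(3)
let empty = Array(numbers[forgiving: Offset.start + 10, upTo: Offset.end])   // trimmed, not a trap
```

Note that `resolve` still costs O(n) on collections that are not random-access, which is the concern raised elsewhere in the thread.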
>>
>>> Additionally, I don't think adding a $ to an array slice expression
>>> should result in a behavioral difference, e.g.
>>> array[3..<array.endIndex] and array[$+3..<$] should behave the same
>>
>> I see your point, but don’t (necessarily) agree with you there.  “$”
>> here is used as an indicator of several things, including
>> not-necessarily-O(1) and forgiving slicing.  We could introduce a
>> label just to handle that:
>>
>> array[forgivingAndNotO1: $+3..<$]
>>
>> but it doesn’t look like a win to me.
>>
>>>
>>>>> There's also some confusion with using $ for both start and end.
>>>>> What if I say c[$..<$]? We'd have to infer from position that the
>>>>> first $ is the start and the second $ is the end, but then what
>>>>> about c[$+n..<$+m]? We can't treat the usage of + as meaning "from
>>>>> start" because the argument might be negative. And if we use the
>>>>> overall sign of the operation/argument together, then the
>>>>> expression `$+n` could mean from start or from end, which comes
>>>>> right back to the problem with Python syntax.
>>>>
>>>> There’s a problem with Python syntax?  I’m guessing you mean that
>>>> c[a:b] can have very different interpretations depending on whether
>>>> a and b are positive or negative?
>>>
>>> Exactly.
>>>
>>>> First of all, I should say: that doesn’t really bother me.  The
>>>> 99.9% use case for this operation uses literal constants for the
>>>> offsets, and I haven’t heard of it causing confusion for Python
>>>> programmers.  That said, if we wanted to address it, we could
>>>> easily require n and m above to be literals, rather than Ints
>>>> (which incidentally guarantees it’s an O(1) operation).  That has
>>>> upsides and downsides of course.
>>>
>>> I don't think we should add this feature in any form if it only
>>> supports literals.
>>>
>>>>> I think Jacob's idea has some promise though:
>>>>>
>>>>> c[c.startIndex.advancedBy(3)] => c[fromStart: 3]
>>>>> c[c.endIndex.advancedBy(-3)] => c[fromEnd: 3]
>>>>
>>>>> But naming the slice operations is a little trickier. We could
>>>>> actually just go ahead and re-use the existing method names for
>>>>> those:
>>>>>
>>>>> c.dropFirst(3) => c[dropFirst: 3]
>>>>> c.dropLast(3) => c[dropLast: 3]
>>>>> c.prefix(3) => c[prefix: 3]
>>>>> c.suffix(3) => c[suffix: 3]
>>>>>
>>>>> That's not so compelling, since we already have the methods, but I
>>>>> suppose it makes sense if you want to try and make all
>>>>> slice-producing methods use subscript syntax (which I have mixed
>>>>> feelings about).
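
A minimal sketch of the labelled subscripts, assuming current Swift, where `index(_:offsetBy:)` is the present-day spelling of `advancedBy`. The subscripts are hypothetical extensions, not standard library API, and only one of the slice-producing variants is shown.

```swift
extension Collection {
    // Hypothetical: the element `offset` steps after startIndex.
    subscript(fromStart offset: Int) -> Element {
        return self[index(startIndex, offsetBy: offset)]
    }

    // Hypothetical slice-producing variant that forwards to the existing method.
    subscript(dropFirst count: Int) -> SubSequence {
        return dropFirst(count)
    }
}

extension BidirectionalCollection {
    // Hypothetical: the element `offset` steps before endIndex.
    subscript(fromEnd offset: Int) -> Element {
        return self[index(endIndex, offsetBy: -offset)]
    }
}

let word = "evolution"
let fourth = word[fromStart: 3]        // "l"
let thirdFromEnd = word[fromEnd: 3]    // "i"
let rest = word[dropFirst: 3]          // "lution"
```

The element subscripts are strict (an out-of-range offset traps), while the `dropFirst:` form inherits the forgiving behaviour of the method it wraps.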
>>>>
>>>> Once we get efficient in-place slice mutation (via slice
>>>> addressors), it becomes a lot more compelling, IMO.  But I still
>>>> don’t find the naming terribly clear, and I don’t love that one
>>>> needs to combine two subscript operations in order to drop the
>>>> first and last element or take just elements 3..<5.
>>>
>>> You can always add more overloads, such as
>>>
>>> c[dropFirst: 3, dropLast: 5]
>>>
>>> but I admit that there's a bunch of combinations here that would
>>> need to be added.
>>>
>>
>> My point is that we have an English language soup that doesn’t
>> compose naturally.  Slicing in Python is much more elegant and
>> composes well.  If we didn’t currently have 6 separate methods (7
>> including subscript for index-based slicing) for handling this, that
>> need to be separately documented and understood, I wouldn’t be so
>> eager to replace the words with an EDSL, but in this case IMO it is
>> an overall simplification.
>>
>>> My concern over trying to make it easier to take elements 3..<5 is
>>> that incrementing indexes is verbose for a reason, and adding a
>>> feature that makes it really easy to index into any collection by
>>> using integers is a bad idea as it will hide O(N) operations behind
>>> code that looks like O(1). And hiding these operations makes it
>>> really easy to accidentally turn an O(N) algorithm into an O(N^2)
>>> algorithm.
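
To illustrate the concern, here is a small sketch in current Swift using `String`, which is not a random-access collection. The variable names are illustrative only. Both loops produce the same result, but the first repeats an O(N) walk on every iteration.

```swift
let text = "swift-evolution"

// Quadratic: every offset lookup walks from startIndex again.
var slow = ""
for offset in 0..<text.count {
    let current = text.index(text.startIndex, offsetBy: offset)
    slow.append(text[current])
}

// Linear: advance a single index one step at a time.
var fast = ""
var position = text.startIndex
while position != text.endIndex {
    fast.append(text[position])
    position = text.index(after: position)
}
```

With an integer-offset subscript, the first loop would shrink to something that looks like ordinary array indexing, which is exactly how the extra cost gets hidden.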
>>
>> As I’ve said, I consider the presence of “$” to be enough of an
>> indicator that something co$tly is happening, though I’m open to
>> other ways of indicating it.  I’m trying to strike a balance
>> between “rigorous” and “easy to use,” here.  Remember that Swift
>> has to work in playgrounds and for beginning programmers, too.  I
>> am likewise unsatisfied with the (lack of) ease-of-use of String
>> (e.g. for lexing and parsing tasks), and have made improving
>> it a priority for Swift 3.  I view fixing the slicing interface as
>> part of that job.
>>
>>>> Even if we need separate symbols for “start” and “end” (e.g. using
>>>> “$” for both might just be too confusing for people in the end,
>>>> even if it works otherwise), I still think a generalized form that
>>>> allows ranges to be used everywhere for slicing is going to be much
>>>> easier to understand than this hodgepodge of words we use today.
>>>
>>> I'm tempted to say that if we do this, we should use two different
>>> sigils, and more importantly we should not use + and - but instead
>>> use methods on the sigils like advancedBy(), as if the sigils were
>>> literally placeholders for the start/end index. That way we won't
>>> write code that looks O(1) when it's not. For example:
>>>
>>> col[^.advancedBy(3)..<$]
>>>
>>> Although we'd need to revisit the names a little, because
>>> $.advancedBy(-3) is a bit odd when we know that $ can't ever take a
>>> non-negative number for that.
>>>
>>> Or maybe we should just use $ instead as a token that means "the
>>> collection being indexed", so you'd actually say something like
>>>
>>> col[$.startIndex.advancedBy(3)..<$.startIndex.advancedBy(5)]
>>
>> I really like that direction, but I don’t think it does enough to
>> solve the ease-of-use problem; I still think the result looks and
>> feels horrible compared to Python for the constituencies
>> mentioned above.
>>
>> I briefly implemented this syntax, which was intended to suggest
>> repeated incrementation:
>>
>> col.startIndex++3 // col.startIndex.advancedBy(3)
>>
>> I don’t think that is viable, especially now that we’ve dropped “++”
>> and “--”. But this syntax
>>
>> col[$.start⛄️3..<$.start⛄️5]
>>
>> begins to be interesting for some definition of ⛄️.
>>
>>> This solves the problem of subscripting a collection without having
>>> to store it in a local variable, without discarding any of the
>>> intentional index overhead. Of course, if the goal is to make index
>>> operations more concise this doesn't really help much, but my
>>> argument here is that it's hard to cut down on the verbosity without
>>> hiding O(N) operations.
>>
>> That ship has already sailed somewhat, because e.g. every Collection
>> has to have a count property, which can be O(N).  But I still like to
>> uphold it where possible.  I just don’t think the combination of “+”
>> and “$” necessarily has such a strong O(1) connotation… especially
>> because the precedent for seeing those symbols together is regexps.
>>
>>>
>>> -Kevin Ballard
>>>
>>>>> But the [fromStart:] and [fromEnd:] subscripts seem useful.
>>>> Yeah… I really want a unified solution that covers slicing as well
>>>> as offset indexing.
>>>>
>>>> -Dave
>>>>
>>>
>>
>> -Dave
>>
>>
>>