[swift-evolution] Pitch: String Index Overhaul

Jordan Rose jordan_rose at apple.com
Tue May 30 16:58:33 CDT 2017


> On May 30, 2017, at 14:53, Dave Abrahams <dabrahams at apple.com> wrote:
> 
> 
> on Tue May 30 2017, Jordan Rose <jordan_rose-AT-apple.com> wrote:
> 
>> My knee-jerk reaction is to say it's too late in Swift 4 for this kind
>> of change, but with that out of the way, I'm most concerned about what
>> it means to have, say, a UTF-8 index that's not on a UTF-16 boundary.
>> 
>> let str = "言"
>> let oneUnitIn = str.utf8.index(after: str.utf8.startIndex)
>> let trailingBytes = str.utf8[oneUnitIn...]
> 
> This is not new; it exists today.

Yes, I think that’s valuable. What’s different is that it’s not a String.Index.


> 
>> What can I do with 'oneUnitIn'? 
> 
> All the usual stuff; we're not proposing to change what you can do with
> it.

By changing the type, you have increased the scope of where an index can be used. What happens when I use it in one of the other views and it’s not on a boundary?

(I suspect the answer is “it traps” but the proposal should spell that out explicitly.)


> 
>> How do I test to see if it's on a Character boundary or a
>> UnicodeScalar boundary?
> 
> as noted,
> 
>  Replacing the failable APIs listed [above](#motivation) that detect
>  whether an index represents a valid position in a given view, and
>  enhancements that explicitly round index positions to nearby boundaries
>  in a given view, are left to a later proposal.  For now, we do not
>  propose to remove the existing index conversion APIs.
> 
> That means you can use oneUnitIn.samePosition(in: str) or
> oneUnitIn.samePosition(in: str.unicodeScalars) to find out if it's on a
> Character or Unicode scalar boundary.

I’m sorry, I completely missed that. This part of the question is withdrawn.
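
For reference, a minimal sketch of the boundary test described above. It assumes the existing `samePosition(in:)` conversions keep returning an optional, as the quoted proposal text says they will for now; the Boolean names are only illustrative.

```swift
let str = "言"  // one Character, one Unicode scalar, three UTF-8 code units

// An index one code unit into the UTF-8 view, i.e. in the middle of the scalar.
let oneUnitIn = str.utf8.index(after: str.utf8.startIndex)

// Each conversion yields nil when the position is not a boundary in the target view.
let onScalarBoundary = oneUnitIn.samePosition(in: str.unicodeScalars) != nil
let onCharacterBoundary = oneUnitIn.samePosition(in: str) != nil

// startIndex, by contrast, is a boundary in every view.
let startOnScalarBoundary =
    str.utf8.startIndex.samePosition(in: str.unicodeScalars) != nil

print(onScalarBoundary, onCharacterBoundary, startOnScalarBoundary)
```

Checking for nil first is also the way to avoid handing an unaligned index to another view, whatever that view ends up doing with it.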

I’m also concerned about putting “UTF-16” in the documentation for encodedOffset. Either it’s a ‘utf16Offset’ or it isn’t; if it’s an opaque value then it should be treated as such. (It’s also a little disturbing that round-tripping through encodedOffset isn’t guaranteed to give you the same index back.)
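
To make the round-tripping concern concrete, here is a rough sketch. It assumes the `encodedOffset` property and the `String.Index(encodedOffset:)` initializer behave as the pitch describes; both were deprecated in later Swift releases, so newer toolchains will emit warnings. Whether the final comparison comes out true depends on how the string is stored, so no output is shown.

```swift
let str = "言"

// An index in the middle of the scalar, reachable only through the UTF-8 view.
let oneUnitIn = str.utf8.index(after: str.utf8.startIndex)

// Flatten the index to an integer offset, then rebuild an index from it.
// (Assumption: these are the APIs named in the pitch; deprecated in later Swift.)
let offset = oneUnitIn.encodedOffset
let roundTripped = String.Index(encodedOffset: offset)

// The rebuilt index carries the same offset, but it is not guaranteed to
// compare equal to the original if the original also recorded a position
// inside a transcoded scalar.
print(roundTripped.encodedOffset == offset)
print(roundTripped == oneUnitIn)
```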

Jordan
