[swift-evolution] [Pitch] Changing the behavior of Subsequences of String Views

Loïc Lecrenier loiclecrenier at icloud.com
Tue Jun 28 11:46:27 CDT 2016


Hi swift-evolution 😊

String’s Views have a few odd properties that have bothered me for a while. I initially did not bring them up because I thought a String redesign was coming. But since Swift 3 will be released very soon, and given the recent focus on breaking changes, I thought now might be a good time to talk about them.

## Subsequences of UTF16View and CharacterView don’t use the same indices as the original collection

One requirement of the Collection protocol is the range subscript

public subscript(bounds: Range<Self.Index>) -> Self.SubSequence { get }

whose documentation says:

/// Accesses a contiguous subrange of the collection's elements.
///
/// The accessed slice uses the same indices for the same elements as the
/// original collection uses.
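
For comparison, this is what that guarantee looks like with Array, whose slices do keep the original indices. It is only a minimal sketch using a plain array of integers; the names `numbers`, `slice` and `again` are mine.

```swift
let numbers = [10, 20, 30, 40, 50]
let slice = numbers[1 ..< 4]   // ArraySlice<Int> holding 20, 30, 40

// The slice keeps the base array's indices: it starts at 1, not 0.
assert(slice.startIndex == 1 && slice.endIndex == 4)
assert(slice[1] == numbers[1])

// So the slice's own bounds select the same elements from the base array.
let again = numbers[slice.startIndex ..< slice.endIndex]
assert(Array(again) == Array(slice))
```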

However, it appears that UTF16View and CharacterView don’t follow the documentation. For example:

let str = "Hello World!".utf16
let (start, end) = (str.index(str.startIndex, offsetBy: 2), str.index(str.startIndex, offsetBy: 9))

let sub1 = str[start ..< end]
print(sub1) // llo Wor

let sub2 = str[sub1.startIndex ..< sub1.endIndex]
print(sub2) // Hello W

Here, using `sub1`’s indices on the original collection `str` returns a completely different subsequence.
I think that, ideally, `sub2` should be equal to `sub1`, just like when using UTF8View and UnicodeScalarView.
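
The round trip above can be sketched as a generic check. `sliceSharesIndices` is a hypothetical helper, not a standard library API, and it assumes a toolchain where `Collection.Element` is available and a subsequence's index type is the same as its base collection's.

```swift
// Hypothetical helper: returns true when slicing `base` with the bounds of
// one of its own subsequences reproduces that subsequence.
func sliceSharesIndices<C: Collection>(_ base: C, from lower: Int, to upper: Int) -> Bool
    where C.Element: Equatable
{
    let start = base.index(base.startIndex, offsetBy: lower)
    let end = base.index(base.startIndex, offsetBy: upper)
    let sub1 = base[start ..< end]
    // Feed the slice's own bounds back into the base collection.
    let sub2 = base[sub1.startIndex ..< sub1.endIndex]
    return sub1.elementsEqual(sub2)
}

// Array slices keep their base's indices, so they pass the check.
assert(sliceSharesIndices([10, 20, 30, 40, 50], from: 1, to: 4))
```

With the behavior shown above, calling it on `"Hello World!".utf16` with offsets 2 and 9 would be expected to return false, since `sub2` comes back as a different range of the string.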

## Accessing elements past the end of the subsequence

Consider this piece of code:

let str = "Hello World!".utf8
let (start, end) = (str.index(str.startIndex, offsetBy: 2), str.index(str.startIndex, offsetBy: 9))

let sub1 = str[start ..< end]
print(sub1) // llo Wor

let pastEnd = sub1.index(sub1.endIndex, offsetBy: 2)

let sub2 = sub1[sub1.startIndex ..< pastEnd]
print(sub2) // llo World

I was able to access elements of the original string that should be beyond the reach of `sub1`.
Using a UnicodeScalarView gives an odd result too: indices past the end are seemingly ignored, and `sub2` is equal to `sub1`.
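
One way to keep a slice from reaching past its own bounds can be sketched as a checked subscript. `checkedSlice` is a hypothetical helper, not an existing API, and returning nil is only one option; trapping, as ArraySlice does for an out-of-range subscript, is another.

```swift
// Hypothetical helper: subscripts `c` only when `range` lies entirely
// within `c`'s own bounds, and returns nil otherwise.
func checkedSlice<C: Collection>(_ c: C, _ range: Range<C.Index>) -> C.SubSequence? {
    guard range.lowerBound >= c.startIndex, range.upperBound <= c.endIndex else {
        return nil
    }
    return c[range]
}

let numbers = [10, 20, 30, 40, 50]
let slice = numbers[1 ..< 4]                  // valid indices are 1, 2, 3

// 5 is a valid bound for `numbers` but lies past `slice.endIndex`.
assert(checkedSlice(slice, 1 ..< 5) == nil)
// A range inside the slice's bounds is returned unchanged.
assert(checkedSlice(slice, 2 ..< 4) == numbers[2 ..< 4])
```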

## Conclusion

I think String’s Views should:
1. Follow Collection’s documentation by using the same indices for their subsequences
2. Provide safe, consistent behavior when using a subscript operation with a past-the-end index

However, this means more breaking changes that won’t be easy to detect.

Thoughts?

Loïc
