[swift-users] What are these types with regular expressions?
Brent Royal-Gordon
brent at architechies.com
Sun Aug 7 02:27:12 CDT 2016
> On Aug 6, 2016, at 5:25 AM, 晓敏 褚 via swift-users <swift-users at swift.org> wrote:
>
> And when I try to use range to get a substring, I got a Range<Int>, but the substring:with: method requires a Range<Index>. But there is no way I could find any information about the type (or protocol?) Index, and passing an Int fails.
> What are they, and how can I work with them?
"The Swift Programming Language" discusses this in more detail, but briefly: String indexing is much more complicated than NSString might make you think. For instance, the character 𠀋 is spread across two "indices", because it is in the Supplementary Ideographic Plane of Unicode. Moreover, there are actually several different mechanisms that can make a single "character" actually take up multiple indices. To model this, a Swift String offers several views (`characters`, `unicodeScalars`, `utf16`, and `utf8`), each of which handles indices in a different way. In Swift 2, each of these has its own `Index` type; I believe the plan was for Swift 3 to use one Index type shared between all views, but I'm not sure if that change will make the release version.
`NSString`, on the other hand, uses bare `Int`s interpreted as UTF-16 indices. So the way to convert is to translate the `Int` into a `String.UTF16Index`, and then if you want to go from there, further translate the `UTF16Index` into `String.Index`. (This second step can fail if, for instance, the `UTF16Index` points to the second index within 𠀋.) You can do that with an extension like this one:
// Swift 3:
extension String.UTF16View {
    func convertedIndex(_ intIndex: Int) -> Index {
        return index(startIndex, offsetBy: intIndex)
    }
    func convertedRange(_ intRange: Range<Int>) -> Range<Index> {
        let lower = convertedIndex(intRange.lowerBound)
        let offset = intRange.upperBound - intRange.lowerBound
        let upper = index(lower, offsetBy: offset)
        return lower ..< upper
    }
}
extension String {
    func convertedIndex(_ intIndex: Int) -> Index? {
        let utfIndex = utf16.convertedIndex(intIndex)
        return utfIndex.samePosition(in: self)
    }
    func convertedRange(_ intRange: Range<Int>) -> Range<Index>? {
        let utfRange = utf16.convertedRange(intRange)
        guard let lower = utfRange.lowerBound.samePosition(in: self),
              let upper = utfRange.upperBound.samePosition(in: self) else {
            return nil
        }
        return lower ..< upper
    }
}
// Swift 2:
extension String.UTF16View {
    func convertedIndex(intIndex: Int) -> Index {
        return startIndex.advancedBy(intIndex)
    }
    func convertedRange(intRange: Range<Int>) -> Range<Index> {
        let lower = convertedIndex(intRange.startIndex)
        let offset = intRange.endIndex - intRange.startIndex
        let upper = lower.advancedBy(offset)
        return lower ..< upper
    }
}
extension String {
    func convertedIndex(intIndex: Int) -> Index? {
        let utfIndex = utf16.convertedIndex(intIndex)
        return utfIndex.samePositionIn(self)
    }
    func convertedRange(intRange: Range<Int>) -> Range<Index>? {
        let utfRange = utf16.convertedRange(intRange)
        guard let lower = utfRange.startIndex.samePositionIn(self),
              let upper = utfRange.endIndex.samePositionIn(self) else {
            return nil
        }
        return lower ..< upper
    }
}
Use it like this:
let range: Range<Int> = …
// If you want to use String.UTF16Index:
let convertedRange = string.utf16.convertedRange(range)
print(string.utf16[convertedRange])
// If you want to use String.Index:
if let convertedRange = string.convertedRange(range) {
    print(string[convertedRange])
}
else {
    print("[Invalid range]")
}
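The failure case mentioned earlier (an offset that lands on the second half of 𠀋) can be checked with a small self-contained sketch. This assumes Swift 5 or later, where all views share `String.Index`; `characterIndex(atUTF16Offset:in:)` is a hypothetical helper written for this example, not the extension above.

```swift
// Assumes Swift 5 or later, where every view uses String.Index.
// `characterIndex(atUTF16Offset:in:)` is a hypothetical helper for illustration.
func characterIndex(atUTF16Offset offset: Int, in string: String) -> String.Index? {
    // Step 1: walk the UTF-16 view, as NSString offsets are UTF-16 code units.
    let utf16Index = string.utf16.index(string.utf16.startIndex, offsetBy: offset)
    // Step 2: convert to a character position; nil if it is not on a boundary.
    return utf16Index.samePosition(in: string)
}

let text = "a\u{2000B}b"   // "a𠀋b": 4 UTF-16 code units, 3 characters

let atIdeograph = characterIndex(atUTF16Offset: 1, in: text)   // start of 𠀋
let insidePair = characterIndex(atUTF16Offset: 2, in: text)    // trailing surrogate
let atB = characterIndex(atUTF16Offset: 3, in: text)           // "b"

print(atIdeograph != nil, insidePair == nil, atB != nil)
```

Offset 2 is a perfectly valid position in the UTF-16 view, but it has no counterpart among the string's characters, which is why the conversion to `String.Index` has to return an optional.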
Hope this helps,
--
Brent Royal-Gordon
Architechies