[swift-evolution] [Review] SE-0065 A New Model for Collections and Indices
Brent Royal-Gordon
brent at architechies.com
Tue Apr 12 06:27:37 CDT 2016
>> (On the other hand, it might be that I'm conceiving of the purpose of `limitedBy` differently from you—I think of it as a safety measure, but you may be thinking of it specifically as an automatic truncation mechanism.)
>
> Hi Brent,
>
> Could you explain what kind of safety do you have in mind? Swift will
> guarantee memory safety even if you attempt to advance an index past
> endIndex using the non-limiting overload.
By "safety" here, I mean what I will call "index safety": not accidentally using an index which would violate the preconditions of the methods or properties you are planning to use it with. I think it's too easy to accidentally overrun the permitted range of indices, and the API should help you avoid doing that.
For instance, suppose I'm porting XCTest to Swift, and I decide to rewrite its `demangleSimpleClass` function, which extracts the identifiers from a mangled Swift symbol name. Specifically, I'm implementing `scanIdentifier`, which reads one particular identifier out of the middle of a string. (For those unfamiliar: an identifier in a mangled symbol name consists of one or more digits to represent a length, followed by that many characters.) I will assume that the mangled symbol name is in a Swift.String.
Here's a direct port:

    func scanIdentifier(partialMangled: String) -> (identifier: String, remainder: String) {
        let chars = partialMangled.characters

        // Grow an initially empty range over the leading length digits.
        var lengthRange = chars.startIndex ..< chars.startIndex
        while chars[lengthRange.endIndex].isDigit {
            lengthRange.endIndex = chars.successor(of: lengthRange.endIndex)
        }

        let lengthString = String(chars[lengthRange])
        let length = Int(lengthString)!

        // The identifier is the `length` characters following the digits.
        let identifierRange = lengthRange.endIndex ..< chars.index(length, stepsFrom: lengthRange.endIndex)
        let remainderRange = chars.suffix(from: identifierRange.endIndex)

        return (String(chars[identifierRange]), String(remainderRange))
    }

This works (note: probably, I haven't actually tested it), but it fails a precondition if the mangled symbol is invalid. Suppose we want to detect this condition so that our parent function can throw a nice error instead:

    func scanIdentifier(partialMangled: String) -> (identifier: String, remainder: String)? {
        let chars = partialMangled.characters

        var lengthRange = chars.startIndex ..< chars.startIndex
        while chars[lengthRange.endIndex].isDigit {
            lengthRange.endIndex = chars.successor(of: lengthRange.endIndex)

            // Ran off the end while still reading digits: no identifier follows.
            if lengthRange.endIndex == chars.endIndex {
                return nil
            }
        }

        let lengthString = String(chars[lengthRange])
        guard let length = Int(lengthString) else {
            return nil
        }

        let identifierRange = lengthRange.endIndex ..< chars.index(length, stepsFrom: lengthRange.endIndex)

        // The declared length overruns the string.
        if identifierRange.endIndex > chars.endIndex {
            return nil
        }

        let remainderRange = chars.suffix(from: identifierRange.endIndex)

        return (String(chars[identifierRange]), String(remainderRange))
    }

That's really not the greatest. To tell the truth, I've actually guessed what bounds-checking is needed here; I'm not 100% sure I caught all the cases. And, um, I'm not really sure that `index(length, stepsFrom: lengthRange.endIndex)` is guaranteed to return anything valid if `length` is too large. Even `limitedBy:` wouldn't help me here—I would end up silently accepting and truncating an invalid string instead of detecting the error.
Now, imagine if `successor(of:)` and `index(_:stepsFrom:)` instead had variants which performed range checks on their results and returned `nil` if they failed:

    func scanIdentifier(partialMangled: String) -> (identifier: String, remainder: String)? {
        let chars = partialMangled.characters

        var lengthRange = chars.startIndex ..< chars.startIndex
        while chars[lengthRange.endIndex].isDigit {
            // nil if the next index would be endIndex, which cannot be subscripted.
            guard let nextIndex = chars.successor(of: lengthRange.endIndex, permittingEnd: false) else {
                return nil
            }
            lengthRange.endIndex = nextIndex
        }

        let lengthString = String(chars[lengthRange])
        guard let length = Int(lengthString) else {
            return nil
        }

        // endIndex is acceptable here because it is only used as a range bound.
        guard let identifierEndIndex = chars.index(length, stepsFrom: lengthRange.endIndex, permittingEnd: true) else {
            return nil
        }
        let identifierRange = lengthRange.endIndex ..< identifierEndIndex
        let remainderRange = chars.suffix(from: identifierRange.endIndex)

        return (String(chars[identifierRange]), String(remainderRange))
    }

By using these variants of the index-manipulation operations, the Collection API itself tells me where I need to handle bounds-check violations. Just like the failable `Int(_: String)` initializer, if I forget to check bounds after manipulating an index, the code will not type-check. That's a nice victory for correct semantics.
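
For anyone who wants to experiment with the idea, here is a minimal sketch of how such variants might be implemented. It is an assumption on several counts: it is written against the current standard library spellings (`index(after:)` and `index(_:offsetBy:limitedBy:)`) rather than the names in the proposal under review, it operates on `String` directly rather than on `characters`, the `permittingEnd:` methods are the hypothetical names used in this message and not real API, and the empty-string guard and the ASCII digit test are additions for the sake of a self-contained example.

```swift
extension Collection {
    /// Hypothetical: the index after `i`, or nil if `i` is already at the end
    /// or if the result is `endIndex` and `permittingEnd` is false.
    func successor(of i: Index, permittingEnd: Bool) -> Index? {
        guard i < endIndex else { return nil }
        let next = index(after: i)
        if next == endIndex && !permittingEnd { return nil }
        return next
    }

    /// Hypothetical: `i` advanced by `n` steps (n >= 0), or nil if that would
    /// pass `endIndex`, or land on it when `permittingEnd` is false.
    func index(_ n: Int, stepsFrom i: Index, permittingEnd: Bool) -> Index? {
        guard n >= 0, let result = index(i, offsetBy: n, limitedBy: endIndex) else {
            return nil
        }
        if result == endIndex && !permittingEnd { return nil }
        return result
    }
}

func scanIdentifier(partialMangled: String) -> (identifier: String, remainder: String)? {
    let chars = partialMangled
    // Added guard: subscripting startIndex of an empty string would trap.
    guard !chars.isEmpty else { return nil }

    // Scan the leading ASCII digits that encode the identifier's length.
    var lengthEnd = chars.startIndex
    while chars[lengthEnd].isASCII && chars[lengthEnd].isNumber {
        guard let next = chars.successor(of: lengthEnd, permittingEnd: false) else {
            return nil
        }
        lengthEnd = next
    }

    // Fails for an empty digit run or a length too large for Int.
    guard let length = Int(String(chars[chars.startIndex..<lengthEnd])) else {
        return nil
    }
    guard let identifierEnd = chars.index(length, stepsFrom: lengthEnd, permittingEnd: true) else {
        return nil
    }
    return (String(chars[lengthEnd..<identifierEnd]), String(chars[identifierEnd...]))
}
```

With this sketch, `scanIdentifier(partialMangled: "3foo6barbaz")` yields the identifier "foo" and the remainder "6barbaz", while "9foo" (declared length too long) and "123" (digits with nothing after them) both yield nil instead of failing a precondition.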
* * *
Incidentally, rather than having Valid<Index>, an alternative would be to have Unchecked<Index>. This would mark an index which had *not* been checked. You could use its `uncheckedIndex` property to access the index directly, or you could pass it to `Collection.check(_: Unchecked<Index>) -> Index?` to perform the check.
This would not serve to eliminate redundant checks; it would merely get the type system to help you catch index-checking mistakes. You could, of course, perform the check and then invalidate the index with a mutation, but that's just as true today. I believe that, with aggressive enough optimization, this could be costless at runtime. *And* it would offer a way to provide the so-called "safe indexing" many people ask for: you could offer a subscript which took an Unchecked<Index> and returned an Optional<Element>.
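
To make that concrete, here is a rough sketch of what `Unchecked<Index>` could look like, again in current Swift syntax. Everything in it is hypothetical. In particular, I am assuming that `check` only accepts indices that can be subscripted (so `endIndex` is rejected), and that the check is a plain range comparison, which says nothing about whether the index actually came from this collection.

```swift
/// Hypothetical wrapper marking an index that has not been range-checked.
struct Unchecked<Index: Comparable> {
    /// Escape hatch: the raw index, with no guarantee that it is in range.
    let uncheckedIndex: Index

    init(_ index: Index) {
        uncheckedIndex = index
    }
}

extension Collection {
    /// Returns the wrapped index if it lies in startIndex ..< endIndex, else nil.
    func check(_ candidate: Unchecked<Index>) -> Index? {
        let i = candidate.uncheckedIndex
        return (startIndex <= i && i < endIndex) ? i : nil
    }

    /// "Safe indexing": nil instead of a precondition failure when out of range.
    subscript(candidate: Unchecked<Index>) -> Element? {
        return check(candidate).map { self[$0] }
    }
}

let letters = ["a", "b", "c"]
let second = letters[Unchecked(1)]   // Optional("b")
let missing = letters[Unchecked(7)]  // nil
```

Because the plain subscript still takes a bare `Index`, code holding an `Unchecked<Index>` has to go through `check`, the optional subscript, or an explicit `uncheckedIndex` before it can touch an element, which is where the type system does the reminding.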
--
Brent Royal-Gordon
Architechies