[swift-evolution] Strings in Swift 4
Karl Wagner
razielim at gmail.com
Tue Jan 24 21:15:45 CST 2017
>
>> I hope I am correct about the no-copy thing, and I would also like to
>> permit promoting C strings to Swift strings without validation. This
>> is obviously unsafe in general, but I know my strings... and I care
>> about performance. ;)
>
> We intend to support that use-case. That's part of the reason for the
> ValidUTF8 and ValidUTF16 encodings you see here:
> https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L598
> and here:
> https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L862
It seems a little strange to me that a pre-validated UTF-8 string from C would have a different type from a UTF8String (i.e. using ValidUTF8 vs UTF8). That defeats the point of having the encoding represented in the type system.
For example, if I write a generic function:
    func sendMessage<Source: Unicode>(from: Source) where Source.Encoding == UTF8
I would only be able to accept UTF-8 text which hasn’t already been validated.
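To make the problem concrete, here is a minimal sketch. Every name in it (EncodingProtocol, UTF8Encoding, ValidUTF8Encoding, UnicodeText, Text) is a hypothetical stand-in I made up for illustration, not the prototype's actual API. It only shows that once "validated" and "unvalidated" UTF-8 are distinct encoding types, a same-type constraint on one of them excludes the other.

```swift
// Hypothetical stand-ins for the prototype's encodings; not the real stdlib types.
protocol EncodingProtocol { associatedtype CodeUnit }
enum UTF8Encoding: EncodingProtocol { typealias CodeUnit = UInt8 }
enum ValidUTF8Encoding: EncodingProtocol { typealias CodeUnit = UInt8 }

// Hypothetical stand-in for the `Unicode` protocol used in the message.
protocol UnicodeText {
    associatedtype Encoding: EncodingProtocol
    var codeUnits: [Encoding.CodeUnit] { get }
}

// A text value whose encoding is part of its type.
struct Text<E: EncodingProtocol>: UnicodeText {
    typealias Encoding = E
    var codeUnits: [E.CodeUnit]
}

// Accepts only text whose encoding type is exactly UTF8Encoding.
// Returns the number of code units "sent", so there is something to observe.
func sendMessage<Source: UnicodeText>(from source: Source) -> Int
    where Source.Encoding == UTF8Encoding {
    return source.codeUnits.count
}

let unchecked = Text<UTF8Encoding>(codeUnits: Array("hi".utf8))
let prevalidated = Text<ValidUTF8Encoding>(codeUnits: Array("hi".utf8))

let sent = sendMessage(from: unchecked)   // compiles: the encoding matches
// sendMessage(from: prevalidated)        // does not compile: Encoding is ValidUTF8Encoding
```

Both values hold the same bytes, yet only one of them satisfies the constraint.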
What about if we allowed each encoding to provide multiple kinds of decoder? That would also allow us to substitute our own decoders in, if there are application-specific shortcuts we can take.
    protocol UnicodeEncoding {
        associatedtype CodeUnit
        associatedtype ValidatingDecoder: UnicodeDecoder
        associatedtype NonValidatingDecoder: UnicodeDecoder
    }

    protocol UnicodeDecoder {
        associatedtype Encoding: UnicodeEncoding
        associatedtype DecodedScalar: RandomAccessCollection where Iterator.Element == Encoding.CodeUnit
        static func parse1Forward<C>(…) -> ParseResult<DecodedScalar, C.Index>
        static func parse1Backward<C>(…) -> ParseResult<DecodedScalar, C.Index>
    }

    // Not shown: UnicodeEncoder protocol, with transcodeScalar<T> function.

    struct UTF8: UnicodeEncoding {
        typealias CodeUnit = UInt8
        typealias ValidatingDecoder = ValidatingUTF8Decoder
        typealias NonValidatingDecoder = NonValidatingUTF8Decoder
    }

    struct NonValidatingUTF8Decoder: UnicodeDecoder {
        typealias Encoding = UTF8
        struct DecodedScalar: RandomAccessCollection { … }
        // Parsing functions
    }

    struct ValidatingUTF8Decoder: UnicodeDecoder {
        typealias Encoding = UTF8
        typealias DecodedScalar = NonValidatingUTF8Decoder.DecodedScalar // newtype would be cool here
        // Parsing functions
    }

    struct String {
        init<C, Encoding, Decoder>(from: C, encodedAs: Encoding, using: Decoder = Encoding.ValidatingDecoder)
            where C: Collection, C.Iterator.Element == Encoding.CodeUnit, Decoder.Encoding == Encoding {
            // transcode to native String encoding using ‘Decoder’ we were given
        }
    }
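For anyone who wants to play with the shape of this, here is a simplified, compilable sketch of the same idea. It is not the prototype's API: the Sketch-prefixed names, the shared parseUTF8 helper, the array-based parse1Forward signature and the free function makeString are all assumptions of mine. The decoder is passed as a metatype, and the "default" decoder is modelled with a second overload rather than a default argument.

```swift
// All names are illustrative; the Sketch prefix avoids clashing with the standard library.
protocol SketchDecoder {
    associatedtype Encoding: SketchEncoding
    // Decodes one scalar starting at `index`; nil means the decoder rejected the input.
    static func parse1Forward(_ units: [Encoding.CodeUnit], at index: Int) -> (scalar: UInt32, next: Int)?
}

protocol SketchEncoding {
    associatedtype CodeUnit
    associatedtype ValidatingDecoder: SketchDecoder
    associatedtype NonValidatingDecoder: SketchDecoder
}

// Shared bit assembly; `validate` switches the well-formedness checks on or off.
func parseUTF8(_ units: [UInt8], at index: Int, validate: Bool) -> (scalar: UInt32, next: Int)? {
    let lead = units[index]
    let length: Int
    switch lead {
    case 0x00...0x7F: return (scalar: UInt32(lead), next: index + 1)
    case 0xC0...0xDF: length = 2
    case 0xE0...0xEF: length = 3
    case 0xF0...0xF7: length = 4
    default: return nil  // this byte cannot start a sequence
    }
    if validate && index + length > units.count { return nil }  // truncated sequence
    // Payload bits of the lead byte: 5, 4 or 3 bits for lengths 2, 3 and 4.
    var scalar = UInt32(lead) & (0xFF >> UInt32(length + 1))
    for offset in 1..<length {
        // Without validation a truncated sequence traps here on the array bounds check.
        let unit = units[index + offset]
        if validate && unit & 0xC0 != 0x80 { return nil }  // not a continuation byte
        scalar = scalar << 6 | UInt32(unit & 0x3F)
    }
    if validate {
        let minimum: [UInt32] = [0, 0, 0x80, 0x800, 0x10000]  // rejects overlong forms
        let isSurrogate = (0xD800...0xDFFF).contains(scalar)
        if scalar < minimum[length] || scalar > 0x10FFFF || isSurrogate { return nil }
    }
    return (scalar: scalar, next: index + length)
}

enum SketchUTF8: SketchEncoding {
    typealias CodeUnit = UInt8
    typealias ValidatingDecoder = ValidatingSketchDecoder
    typealias NonValidatingDecoder = NonValidatingSketchDecoder
}

enum ValidatingSketchDecoder: SketchDecoder {
    typealias Encoding = SketchUTF8
    static func parse1Forward(_ units: [UInt8], at index: Int) -> (scalar: UInt32, next: Int)? {
        return parseUTF8(units, at: index, validate: true)
    }
}

enum NonValidatingSketchDecoder: SketchDecoder {
    typealias Encoding = SketchUTF8
    static func parse1Forward(_ units: [UInt8], at index: Int) -> (scalar: UInt32, next: Int)? {
        return parseUTF8(units, at: index, validate: false)
    }
}

// One encoding type, with the decoder chosen per call.
func makeString<E: SketchEncoding, D: SketchDecoder>(
    from units: [E.CodeUnit], encodedAs encoding: E.Type, using decoder: D.Type
) -> String? where D.Encoding == E {
    var scalars = String.UnicodeScalarView()
    var index = 0
    while index < units.count {
        guard let parsed = D.parse1Forward(units, at: index),
              let scalar = Unicode.Scalar(parsed.scalar) else { return nil }
        scalars.append(scalar)
        index = parsed.next
    }
    return String(scalars)
}

// Stand-in for the default argument: falls back to the encoding's validating decoder.
func makeString<E: SketchEncoding>(from units: [E.CodeUnit], encodedAs encoding: E.Type) -> String?
    where E.ValidatingDecoder.Encoding == E {
    return makeString(from: units, encodedAs: encoding, using: E.ValidatingDecoder.self)
}

let bytes: [UInt8] = Array("h\u{E9}llo".utf8)
let malformed: [UInt8] = [0x68, 0xC3, 0x28]  // 0x28 is not a continuation byte

let checked = makeString(from: bytes, encodedAs: SketchUTF8.self)
let trusted = makeString(from: bytes, encodedAs: SketchUTF8.self, using: NonValidatingSketchDecoder.self)
let rejected = makeString(from: malformed, encodedAs: SketchUTF8.self)
```

The point of the exercise is that both calls name the same encoding, SketchUTF8, so a generic constraint on the encoding accepts text regardless of which decoder produced it.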
- Karl