[swift-evolution] SE-163: String Revision: Collection Conformance, C Interop, Transcoding

Ben Cohen ben_cohen at apple.com
Thu Apr 6 19:34:01 CDT 2017

> On Apr 5, 2017, at 10:32 PM, Félix Cloutier <felixcca at yahoo.ca> wrote:
> During the proposal phase, we asked how this would handle fixed-length strings with an optional NUL terminator. For instance, in a C `struct Foo { char name[8]; };`, `name` stops at the first \0, or at the eighth byte, whichever comes first. IIRC, Ben said that it would be handled, but I'd like to have it clarified.
> Is it correct to assume that a UnicodeEncoding is expected to return UnicodeParseResult.emptyInput when it sees a NUL character (thus stopping before the end of the buffer if necessary)? Is it also correct to assume that if you need this functionality, you'll be looking at code like this?
> var result = ""
> UnicodeEncoding.parseForward(bufferPointer) { result += $0 }

Hi Félix,

Having talked about it among the team, we feel we should add an initializer from a Collection of code units to this proposal.  Then, given a pointer p to some UTF-8 and a length n, you would write:

let b = UnsafeBufferPointer(start: p, count: n)
// naming opinions on the argument labels welcomed, this is probably what I’d go for...
let s = String(b, fromEncoding: UTF8.self)

Like the C string inits, this would be a repairing initializer only: ill-formed sequences would be replaced with U+FFFD rather than causing a failure.

Your request goes a little further, though. I would say that case probably doesn’t deserve a dedicated initializer. You could instead search for the NUL using index(of:), falling back to the end of the buffer when there is none:

let i = b.index(of: 0) ?? b.endIndex
let s = String(b[..<i], fromEncoding: UTF8.self)  // one-sided ranges pitch forthcoming ;)
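
For the fixed-width field case from the question, here is a rough sketch of the same idea. It does not use the initializer pitched above; it assumes the String(decoding:as:) and firstIndex(of:) spellings that, as far as I know, shipped in later Swift releases. decodeFixedField is a hypothetical helper name, and the byte arrays are made-up sample data.

```swift
// Sketch only: `decodeFixedField` is a hypothetical helper, not part of the proposal.
func decodeFixedField(_ bytes: UnsafeBufferPointer<UInt8>) -> String {
    // Stop at the first NUL, or at the end of the buffer if there is none.
    let end = bytes.firstIndex(of: 0) ?? bytes.endIndex
    // String(decoding:as:) is a repairing initializer: ill-formed UTF-8
    // is replaced with U+FFFD instead of making the call fail.
    return String(decoding: bytes[..<end], as: UTF8.self)
}

// Made-up sample data standing in for `char name[8]`.
let padded: [UInt8] = [0x46, 0x6F, 0x6F, 0, 0, 0, 0, 0]  // "Foo" followed by NUL padding
let full: [UInt8] = Array("ABCDEFGH".utf8)               // all eight bytes used, no NUL

let fromPadded = padded.withUnsafeBufferPointer { decodeFixedField($0) }
let fromFull = full.withUnsafeBufferPointer { decodeFixedField($0) }
```

Both cases go through the same path: the field ends at the first NUL or at the eighth byte, whichever comes first, so no separate initializer is needed for the unterminated case.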

Does this sound reasonable?

