[swift-evolution] Trial balloon: Ensure that String always contains valid Unicode

Dmitri Gribenko gribozavr at gmail.com
Sat Dec 19 21:59:06 CST 2015


On Fri, Dec 18, 2015 at 1:47 PM, Paul Cantrell via swift-evolution <
swift-evolution at swift.org> wrote:

> I was quite surprised to learn that it’s possible to create Swift strings
> that contain things other than valid Unicode characters. Is it
> feasible to guarantee that this cannot happen?
>
> String.init(bytes:encoding:) is failable, and does in fact validate that
> the given bytes are decodable with the given encoding in most circumstances:
>
>     // Returns nil
>     String(
>         bytes: [0xD8, 0x00] as [UInt8],
>         encoding: NSUTF8StringEncoding)
>
> However, that initializer does *not* reject invalid surrogate characters
> in UTF-16:
>
>     // Succeeds (wat?!)
>     let bogusStr = String(
>         bytes: [0xD8, 0x00] as [UInt8],
>         encoding: NSUTF16BigEndianStringEncoding)!
>

Adding this guarantee would be useful, and I support it.  The current
behavior looks inconsistent to me.  On the other hand, the current behavior
of String(bytes:encoding:) mirrors that of the corresponding NSString
method, so changing it would make the two diverge.  Still, I think the
extra guarantee is worth it.
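
To make the proposed guarantee concrete, here is a minimal sketch of what
strict decoding of big-endian UTF-16 bytes could look like.  The function
name strictStringFromUTF16BE is hypothetical and is not an existing
standard library or Foundation API.  It is written against the current
Swift standard library's UTF16 codec rather than the NSString-backed
initializer discussed above, so treat it as an illustration of the
behavior, not as the actual fix.

```swift
// Hypothetical helper (an assumption for illustration, not a real API).
// Decodes big-endian UTF-16 bytes strictly: returns nil for an odd byte
// count or for any unpaired surrogate, rather than producing a string.
func strictStringFromUTF16BE(_ bytes: [UInt8]) -> String? {
    // A UTF-16 code unit is two bytes, so an odd count cannot be valid.
    guard bytes.count % 2 == 0 else { return nil }

    // Combine each pair of bytes into one big-endian 16-bit code unit.
    var units: [UInt16] = []
    units.reserveCapacity(bytes.count / 2)
    for i in stride(from: 0, to: bytes.count, by: 2) {
        units.append((UInt16(bytes[i]) << 8) | UInt16(bytes[i + 1]))
    }

    // Decode one scalar at a time; the codec reports .error for
    // unpaired surrogates instead of silently accepting them.
    var decoder = UTF16()
    var input = units.makeIterator()
    var scalars = String.UnicodeScalarView()
    while true {
        switch decoder.decode(&input) {
        case .scalarValue(let scalar):
            scalars.append(scalar)
        case .emptyInput:
            return String(scalars)
        case .error:
            return nil
        }
    }
}
```

With this sketch, the bytes [0xD8, 0x00] from the example above are
rejected because 0xD800 is a high surrogate with no low surrogate
following it, while a well-formed surrogate pair such as
[0xD8, 0x3D, 0xDE, 0x00] decodes to a single scalar.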

Tony, what do you think?

Dmitri

-- 
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/