<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body><div style="font-family:Arial;">I'll use Karl's message as a minor jumping-off point for a semi-related train of thought… I'm excited by the content of the original manifesto, including a powerful Unicode namespace and types. But as I've continued down the thread, I've grown increasingly concerned about modeling strings breadthwise in the type system, i.e., with UTF8String and so on.<br></div>
<div style="font-family:Arial;"><br></div>
<div style="font-family:Arial;">I strongly want Swift to have world-class string processing, but I believe even more strongly in the language's spirit of progressive disclosure. Newcomers find Swift's current String API difficult (an assessment I personally disagree with, but that's neither here nor there); I don't think that difficulty is solved by aggressively use-specific type modeling. Instead, I think it gives rise to the same severe cargo-culting that gets us the scarily prevalent String.Index.init(offset:) extensions in the current model.<br></div>
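<div style="font-family:Arial;"><br></div>
<div style="font-family:Arial;">For anyone who hasn't run into one, the kind of extension I mean usually looks something like the sketch below. The name character(atOffset:) is made up for illustration and is not a standard library API, and the sketch assumes a current toolchain where String is itself a Collection. Every call walks the string from startIndex, so an innocent-looking loop over offsets goes quadratic, whereas walking the indices stays linear.<br></div>
<div style="font-family:Arial;"><br></div>
<pre style="font-family:Courier">
// Hypothetical helper of the kind that gets copy-pasted between projects.
// It is not part of the standard library.
extension String {
    /// Returns the character at an integer offset, or nil if out of range.
    /// Each call walks from `startIndex`, so it is O(n) rather than O(1).
    func character(atOffset offset: Int) -> Character? {
        guard offset >= 0,
              let i = index(startIndex, offsetBy: offset, limitedBy: endIndex),
              i != endIndex
        else { return nil }
        return self[i]
    }
}

let greeting = "héllo"

// Looping by offset re-walks the string on every iteration: quadratic overall.
var viaOffsets = ""
for offset in stride(from: 0, to: greeting.count, by: 1) {
    if let c = greeting.character(atOffset: offset) {
        viaOffsets.append(c)
    }
}

// Walking the string's own indices visits each character once: linear.
var viaIndices = ""
for i in greeting.indices {
    viaIndices.append(greeting[i])
}
</pre>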
<div style="font-family:Arial;"><br></div>
<div id="sig40804545"><div class="signature"><span class="font" style="font-family:arial, sans-serif, sans-serif">Best</span><span class="font" style="font-family:arial, sans-serif, sans-serif"></span><br></div>
<div class="signature"><span class="font" style="font-family:arial, sans-serif, sans-serif">&nbsp; Zach Waldowski</span><span class="font" style="font-family:arial, sans-serif, sans-serif"></span><br></div>
<div class="signature"><span class="font" style="font-family:arial, sans-serif, sans-serif">&nbsp;&nbsp;</span><a href="mailto:zach@waldowski.me"><span class="font" style="font-family:arial, sans-serif, sans-serif">zach@waldowski.me</span></a><br></div>
<div style="font-family:Arial;"><br></div>
</div>
<div>On Tue, Jan 24, 2017, at 10:15 PM, Karl Wagner via swift-evolution wrote:<br></div>
<blockquote type="cite"><div style="font-family:Arial;"><br></div>
<div><blockquote type="cite"><div><div style="font-family:Arial;"><br></div>
<blockquote type="cite" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;-webkit-text-stroke-width:0px;"><div style="font-family:Arial;">I hope I am correct about the no-copy thing, and I would also like to<br></div>
<div style="font-family:Arial;">permit promoting C strings to Swift strings without validation. &nbsp;This<br></div>
<div style="font-family:Arial;">is obviously unsafe in general, but I know my strings... and I care<br></div>
<div style="font-family:Arial;">about performance. ;)<br></div>
</blockquote><div style="font-family:Arial;"><br></div>
<div style="font-family:Arial;"><span class="font" style="font-family:Helvetica"><span class="size" style="font-size:12px">We intend to support that use-case. &nbsp;That's part of the reason for the</span></span><br></div>
<div style="font-family:Arial;"><span class="font" style="font-family:Helvetica"><span class="size" style="font-size:12px">ValidUTF8 and ValidUTF16 encodings you see here:</span></span><br></div>
<div style="font-family:Arial;"><a href="https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L598" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;-webkit-text-stroke-width:0px;">https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L598</a><br></div>
<div style="font-family:Arial;"><span class="font" style="font-family:Helvetica"><span class="size" style="font-size:12px">and here:</span></span><br></div>
<div style="font-family:Arial;"><a href="https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L862" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;-webkit-text-stroke-width:0px;">https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L862</a><br></div>
</div>
</blockquote></div>
<div style="font-family:Arial;"><br></div>
<div>It seems a little strange to me that a pre-validated UTF8 string from C would have different types to a UTF8String (i.e. using ValidUTF8 vs UTF8). It defeats the point of having the encoding represented in the type-system.<br></div>
<div><br></div>
<div>For example, if I write a generic function:<br></div>
<div><br></div>
<blockquote style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:40px;border-top-width:initial;border-right-width:initial;border-bottom-width:initial;border-left-width:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-top-color:initial;border-right-color:initial;border-bottom-color:initial;border-left-color:initial;border-image-source:initial;border-image-slice:initial;border-image-width:initial;border-image-outset:initial;border-image-repeat:initial;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;"><div><span class="font" style="font-family:Courier">func sendMessage&lt;Source: Unicode&gt;(from: Source) where Source.Encoding == UTF8</span><br></div>
</blockquote><div><br></div>
<div>I would only be able to accept UTF-8 text which hasn’t already been validated.&nbsp;<br></div>
<div><br></div>
<div>What about if we allowed each encoding to provide multiple kinds of decoder? That would also allow us to substitute our own decoders in, if there are application-specific shortcuts we can take.<br></div>
<div><br></div>
<blockquote style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:40px;border-top-width:initial;border-right-width:initial;border-bottom-width:initial;border-left-width:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-top-color:initial;border-right-color:initial;border-bottom-color:initial;border-left-color:initial;border-image-source:initial;border-image-slice:initial;border-image-width:initial;border-image-outset:initial;border-image-repeat:initial;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;"><div><span class="font" style="font-family:Courier">protocol UnicodeEncoding {</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; associatedtype CodeUnit</span><br></div>
<div><span class="font" style="font-family:Courier"></span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; associatedtype ValidatingDecoder: UnicodeDecoder</span><br></div>
<div style="font-family:Arial;"><span class="font" style="font-family:Courier">&nbsp; associatedtype NonValidatingDecoder: UnicodeDecoder</span><br></div>
</blockquote><blockquote style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:40px;border-top-width:initial;border-right-width:initial;border-bottom-width:initial;border-left-width:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-top-color:initial;border-right-color:initial;border-bottom-color:initial;border-left-color:initial;border-image-source:initial;border-image-slice:initial;border-image-width:initial;border-image-outset:initial;border-image-repeat:initial;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;"><span class="font" style="font-family:Courier">}</span><br></blockquote><blockquote style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:40px;border-top-width:initial;border-right-width:initial;border-bottom-width:initial;border-left-width:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-top-color:initial;border-right-color:initial;border-bottom-color:initial;border-left-color:initial;border-image-source:initial;border-image-slice:initial;border-image-width:initial;border-image-outset:initial;border-image-repeat:initial;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;"><div><span class="font" style="font-family:Courier"></span><br></div>
<div><span class="font" style="font-family:Courier">protocol UnicodeDecoder {</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; associatedtype Encoding: UnicodeEncoding</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; associatedtype DecodedScalar: RandomAccessCollection where Iterator.Element == Encoding.CodeUnit</span><br></div>
<div><span class="font" style="font-family:Courier"></span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; static func parse1Forward&lt;C&gt;(…) -&gt; ParseResult&lt;DecodedScalar, C.Index&gt;</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; static func parse1Backward&lt;C&gt;(…) -&gt; ParseResult&lt;DecodedScalar, C.Index&gt;</span><br></div>
<div><span class="font" style="font-family:Courier">}</span><br></div>
<div><span class="font" style="font-family:Courier">// Not shown: UnicodeEncoder protocol, with transcodeScalar&lt;T&gt; function.</span><br></div>
<div><span class="font" style="font-family:Courier"></span><br></div>
<div><div><span class="font" style="font-family:Courier">struct UTF8: UnicodeEncoding &nbsp;{&nbsp;</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; typealias CodeUnit &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = UInt8 &nbsp;</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; typealias ValidatingDecoder &nbsp; &nbsp;= ValidatingUTF8Decoder</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; typealias NonValidatingDecoder = NonValidatingUTF8Decoder</span><br></div>
<div><span class="font" style="font-family:Courier">}</span><br></div>
</div>
<div><span class="font" style="font-family:Courier"></span><br></div>
<div><div><span class="font" style="font-family:Courier">struct NonValidatingUTF8Decoder: UnicodeDecoder {</span><br></div>
</div>
<div><div><span class="font" style="font-family:Courier">&nbsp; &nbsp; typealias Encoding = UTF8</span><br></div>
</div>
<div><div><span class="font" style="font-family:Courier">&nbsp; &nbsp; struct DecodedScalar: RandomAccessCollection { … }</span><br></div>
</div>
<div><div><span class="font" style="font-family:Courier">&nbsp; &nbsp; // Parsing functions</span><br></div>
</div>
<div><div><span class="font" style="font-family:Courier">}</span><br></div>
</div>
<div><span class="font" style="font-family:Courier"></span><br></div>
<div><span class="font" style="font-family:Courier">struct ValidatingUTF8Decoder: UnicodeDecoder {</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; typealias Encoding = UTF8</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; typealias DecodedScalar = NonValidatingUTF8Decoder.DecodedScalar // newtype would be cool here</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; // Parsing functions</span><br></div>
<div><span class="font" style="font-family:Courier">}</span><br></div>
<div><span class="font" style="font-family:Courier"></span><br></div>
<div><span class="font" style="font-family:Courier">struct String {</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; init&lt;C, Encoding, Decoder&gt;(from: C, encodedAs: Encoding, using: Decoder = Encoding.ValidatingDecoder)&nbsp;</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; &nbsp; &nbsp; where C: Collection, C.Iterator.Element == Encoding.CodeUnit, Decoder.Encoding == Encoding {</span><br></div>
<div><span class="font" style="font-family:Courier"></span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;// transcode to native String encoding using&nbsp;‘Decoder’ we were given</span><br></div>
<div><span class="font" style="font-family:Courier">&nbsp; &nbsp; }</span><br></div>
<div><span class="font" style="font-family:Courier">}</span><br></div>
</blockquote><div><br></div>
<div>- Karl<br></div>
<div><u>_______________________________________________</u><br></div>
<div>swift-evolution mailing list<br></div>
<div><a href="mailto:swift-evolution@swift.org">swift-evolution@swift.org</a><br></div>
<div><a href="https://lists.swift.org/mailman/listinfo/swift-evolution">https://lists.swift.org/mailman/listinfo/swift-evolution</a><br></div>
</blockquote><div style="font-family:Arial;"><br></div>
</body>
</html>