<table cellspacing="0" cellpadding="0" border="0"><tr><td valign="top"><div>When I say "reinterpret", I mean taking the UTF-8 bytes and pretending that they're UTF-16. This is an extremely clear bug whenever it happens. The correct conversion between UTF-8 and UTF-16 is lossless.<br /><br />The vast majority of systems, including file systems and email, support Unicode. I'm struggling to come up with an example where a restriction isn't the result of lazy assumptions. It's not like we have to pause and check that every link on the network path is 8-bit clean anymore.<br /><br />Félix</div></td></tr></table>            <div id="_origMsg_">
                <div>
                    <br />
                    <div>
                        <div style="font-size:0.9em">
                            <hr size="1">
                            <b>
                                <span style="font-weight:bold">From:</span>
                            </b>
                            Xiaodi Wu via swift-evolution &lt;swift-evolution@swift.org&gt;;                            <br>
                            <b>
                                <span style="font-weight:bold">To:</span>
                            </b>
                            Kenny Leung &lt;kenny_leung@pobox.com&gt;;                                                     <br>
                            <b>
                                <span style="font-weight:bold">Cc:</span>
                            </b>
                            swift-evolution &lt;swift-evolution@swift.org&gt;;                                                     <br>
                            <b>
                                <span style="font-weight:bold">Subject:</span>
                            </b>
                            Re: [swift-evolution] InternalString class for easy String manipulation                            <br>
                            <b>
                                <span style="font-weight:bold">Sent:</span>
                            </b>
                            Thu, Aug 18, 2016 7:41:21 PM                            <br>
                        </div>
                            <br>
                            <table cellspacing="0" cellpadding="0" border="0">
                                <tbody>
                                    <tr>
                                        <td valign="top"><div dir="ltr">Actually, if I&#39;m not mistaken, String (or at least, CFStringRef, to which String is toll-free bridged) does not re-encode anything eagerly. If you initialize with UTF8 bytes, it&#39;s stored internally as UTF8 bytes; if you initialize with UTF16 code units, it&#39;s stored internally as UTF16 code units. Re-encoding happens only when necessary--i.e. when you ask for UTF8 bytes from a UTF16-encoded string.<div class="gmail_extra"><br clear="none"></div><div class="yqt9872608810" id="yqt64885"><div class="gmail_extra"><br clear="none"><div class="gmail_quote">On Thu, Aug 18, 2016 at 2:34 PM, Kenny Leung via swift-evolution <span dir="ltr">&lt;<a rel="nofollow" shape="rect" ymailto="mailto:swift-evolution@swift.org" target="_blank" href="javascript:return">swift-evolution@swift.org</a>&gt;</span> wrote:<br clear="none"><blockquote class="gmail_quote" style="margin:0 0 0
 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><span class=""><br clear="none">
&gt; On Aug 18, 2016, at 11:51 AM, Félix Cloutier &lt;<a rel="nofollow" shape="rect" ymailto="mailto:felixcca@yahoo.ca" target="_blank" href="javascript:return">felixcca@yahoo.ca</a>&gt; wrote:<br clear="none">
&gt;  Of course you&#39;ll have problems if you try to interpret UTF-8 as UTF-16 and vice-versa, but that&#39;ll do you regardless of whether you use international characters or not.<br clear="none">
<br clear="none">
</span>This is exactly my point. Even if the internal representation is UTF-8 (or UTF-16), you are not free from having to do conversions. You still need to convert to the encoding format that is understood by the receiver. I make a distinction between Unicode and Unicode encodings.<br clear="none">
<br clear="none">
-Kenny<br clear="none">
<div class="HOEnZb"><div class="h5"><br clear="none">
<br clear="none">
&gt; On Thursday, August 18, 2016 9:33 AM, Kenny Leung via swift-evolution &lt;<a rel="nofollow" shape="rect" ymailto="mailto:swift-evolution@swift.org" target="_blank" href="javascript:return">swift-evolution@swift.org</a>&gt; wrote:<br clear="none">
&gt;<br clear="none">
&gt;<br clear="none">
&gt; &gt;&gt; Just because you are using UTF-8 as the internal format, it does not mean that universal support is guaranteed.<br clear="none">
&gt;<br clear="none">
&gt; All I meant was this, and nothing more. If the internal format was UTF-8, and you were using a filesystem whose filenames were UTF-16, you would have the same problems.<br clear="none">
&gt;<br clear="none">
&gt; -Kenny<br clear="none">
&gt;<br clear="none">
&gt;<br clear="none">
&gt; &gt; On Aug 17, 2016, at 10:40 PM, Félix Cloutier &lt;<a rel="nofollow" shape="rect" ymailto="mailto:felixcca@yahoo.ca" target="_blank" href="javascript:return">felixcca@yahoo.ca</a>&gt; wrote:<br clear="none">
&gt; &gt;<br clear="none">
&gt; &gt;&gt; In Félix’s case, I would expect to have to ask for a mail-friendly representation of his name, just like you have to ask for a filesystem-friendly representation of a filename regardless of what the internal representation is. Just because you are using UTF-8 as the internal format, it does not mean that universal support is guaranteed.<br clear="none">
&gt; &gt;<br clear="none">
&gt; &gt; Can you imagine if &quot;n&quot; turned out to be poorly supported by systems throughout the world and dead-serious people argued that it&#39;s too hard for beginners?<br clear="none">
&gt; &gt;<br clear="none">
&gt; &gt; &quot;Filesystem-friendly&quot; and &quot;email-friendly&quot; names are not backed by modern standards. You can have essentially any character that you like in a file name save for the directory separator on almost every platform out there (except on Windows, but the constraints are implemented in a layer above NTFS), and addresses like félix@... are RFC-legal. Restrictions are merely wished into existence by programmers who don&#39;t want to complicate their mental model of text processing, to everyone else&#39;s detriment.<br clear="none">
&gt; &gt;<br clear="none">
&gt; &gt; Félix<br clear="none">
&gt;<br clear="none">
&gt; _______________________________________________<br clear="none">
&gt; swift-evolution mailing list<br clear="none">
&gt; <a rel="nofollow" shape="rect" ymailto="mailto:swift-evolution@swift.org" target="_blank" href="javascript:return">swift-evolution@swift.org</a><br clear="none">
&gt; <a rel="nofollow" shape="rect" target="_blank" href="https://lists.swift.org/mailman/listinfo/swift-evolution">https://lists.swift.org/mailman/listinfo/swift-evolution</a><br clear="none">
&gt;<br clear="none">
&gt;<br clear="none">
<br clear="none">
</div></div></blockquote></div><br clear="none"></div></div></div></td>
                                    </tr>
                                </tbody>
                            </table>
                    </div>
                </div>
            </div>
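            <div><br />The difference between "reinterpreting" and converting, as described at the top of this message, can be sketched in Swift. This is only an illustration, assuming a current Swift toolchain where <code>String(decoding:as:)</code> is available; the names <code>sample</code>, <code>utf8Bytes</code>, <code>converted</code>, <code>reinterpreted</code> and <code>garbled</code> are made up for the example and do not come from the thread.<br /><br /></div>

```swift
// Illustrative sketch only; every name here is hypothetical.
let sample = "F\u{E9}lix"                      // "Félix" with a precomposed é

// Correct conversion: decode the UTF-8 bytes, then ask for UTF-16 code units.
let utf8Bytes = Array(sample.utf8)             // 6 bytes, because é takes two
let converted = Array(String(decoding: utf8Bytes, as: UTF8.self).utf16)
let roundTrip = String(decoding: converted, as: UTF16.self)

// The bug: glue each pair of raw UTF-8 bytes together and call the result
// one little-endian UTF-16 code unit. No decoding takes place.
var reinterpreted: [UInt16] = []
for i in stride(from: 0, to: utf8Bytes.count - 1, by: 2) {
    reinterpreted.append(UInt16(utf8Bytes[i]) + UInt16(utf8Bytes[i + 1]) * 256)
}
let garbled = String(decoding: reinterpreted, as: UTF16.self)

print(roundTrip == sample)   // the real conversion round-trips
print(garbled == sample)     // the reinterpretation does not
```

            <div>Note that the reinterpretation goes wrong on the plain ASCII letters as well, not only on the accented one, which matches the earlier remark in the thread that this bug bites regardless of whether international characters are involved.<br /></div>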