<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">


<title class=""></title>

<div class=""><div class=""><span class="highlight" style="background-color:rgb(255, 255, 255)"><span class="font" style="font-family:-apple-system-body, Helvetica, arial, sans-serif"><a href="https://github.com/apple/swift-evolution/blob/master/proposals/0180-string-index-overhaul.md" class="">https://github.com/apple/swift-evolution/blob/master/proposals/0180-string-index-overhaul.md</a></span></span><br class=""></div>
<div class=""><br class=""></div>
<div class="">Overall it looks pretty good. But unfortunately the answer to "Will applications still compile but produce different behavior than they used to?" is actually "Yes", when using APIs provided by Foundation. This is because Foundation is currently able to return&nbsp;<font color="#3e1e81" style="font-family: Menlo;" class=""><span style="font-size: 11px;" class="">String</span></font><span style="font-family: Menlo; font-size: 11px;" class="">.</span><font color="#3e1e81" style="font-family: Menlo;" class=""><span style="font-size: 11px;" class="">Index</span></font>&nbsp;values that don't point to Character boundaries.</div>
<div class=""><br class=""></div>
<div class="">Specifically, in Swift 3, the following code:<br class=""></div>
<div class=""><br class=""></div>
<div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #ba2da2" class="">import</span><span style="font-variant-ligatures: no-common-ligatures" class=""> Foundation</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(209, 47, 27);" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #ba2da2" class="">let</span><span style="font-variant-ligatures: no-common-ligatures; color: #000000" class=""> str = </span><span style="font-variant-ligatures: no-common-ligatures" class="">"e\u{301}galite\u{301}"</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #ba2da2" class="">let</span><span style="font-variant-ligatures: no-common-ligatures" class=""> r = </span><span style="font-variant-ligatures: no-common-ligatures; color: #4f8187" class="">str</span><span style="font-variant-ligatures: no-common-ligatures" class="">.</span><span style="font-variant-ligatures: no-common-ligatures; color: #3e1e81" class="">rangeOfCharacter</span><span style="font-variant-ligatures: no-common-ligatures" class="">(from: [</span><span style="font-variant-ligatures: no-common-ligatures; color: #d12f1b" class="">"\u{301}"</span><span style="font-variant-ligatures: no-common-ligatures" class="">])!</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(209, 47, 27);" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #3e1e81" class="">print</span><span style="font-variant-ligatures: no-common-ligatures; color: #000000" 
class="">(</span><span style="font-variant-ligatures: no-common-ligatures; color: #4f8187" class="">str</span><span style="font-variant-ligatures: no-common-ligatures; color: #000000" class="">[</span><span style="font-variant-ligatures: no-common-ligatures; color: #4f8187" class="">r</span><span style="font-variant-ligatures: no-common-ligatures; color: #000000" class="">] </span><span style="font-variant-ligatures: no-common-ligatures; color: #3e1e81" class="">==</span><span style="font-variant-ligatures: no-common-ligatures; color: #000000" class=""> </span><span style="font-variant-ligatures: no-common-ligatures" class="">"\u{301}"</span><span style="font-variant-ligatures: no-common-ligatures; color: #000000" class="">)</span></div></div><div class=""><span style="font-variant-ligatures: no-common-ligatures; color: #000000" class=""><br class=""></span></div><div class=""><span style="font-variant-ligatures: no-common-ligatures; color: #000000" class="">will print “true”, because the returned range identifies the combining acute accent only. 
But with the proposed&nbsp;<font color="#3e1e81" style="font-family: Menlo;" class=""><span style="font-size: 11px;" class="">String</span></font><span style="font-family: Menlo; font-size: 11px;" class="">.</span><font color="#3e1e81" style="font-family: Menlo;" class=""><span style="font-size: 11px;" class="">Index</span></font>&nbsp;revisions, the <span style="font-family: Menlo; font-size: 11px; font-variant-ligatures: no-common-ligatures; color: rgb(79, 129, 135);" class="">str</span><span style="font-family: Menlo; font-size: 11px; font-variant-ligatures: no-common-ligatures;" class="">[</span><span style="font-family: Menlo; font-size: 11px; font-variant-ligatures: no-common-ligatures; color: rgb(79, 129, 135);" class="">r</span><span style="font-family: Menlo; font-size: 11px; font-variant-ligatures: no-common-ligatures;" class="">]</span> subscript will return the whole&nbsp;</span><span style="color: rgb(209, 47, 27); font-family: Menlo; font-size: 11px;" class="">"e\u{301}</span><font color="#d12f1b" face="Menlo" class=""><span style="font-size: 11px;" class="">"</span></font>&nbsp;combined character.</div><div class=""><br class=""></div><div class="">This is, of course, an edge case, but we need to consider its implications and determine whether it affects anything that’s likely to be a problem in practice.</div><div class=""><br class=""></div><div class="">There’s also the curious case where I can have two <font face="Menlo" class=""><font color="#3e1e81" class=""><span style="font-size: 11px;" class="">String</span></font><span style="font-size: 11px;" class="">.</span><font color="#3e1e81" class=""><span style="font-size: 11px;" class="">Index</span></font></font> values that compare unequal but actually return the same value when used in a subscript. 
For example, with the above string, <font face="Menlo" class=""><font color="#3e1e81" class=""><span style="font-size: 11px;" class="">String</span></font><span style="font-size: 11px;" class="">.</span><font color="#3e1e81" class=""><span style="font-size: 11px;" class="">Index</span></font><span style="font-size: 11px;" class="">(encodedOffset: 0)</span></font> and <font face="Menlo" class=""><font color="#3e1e81" class=""><span style="font-size: 11px;" class="">String</span></font><span style="font-size: 11px;" class="">.</span><font color="#3e1e81" class=""><span style="font-size: 11px;" class="">Index</span></font><span style="font-size: 11px;" class="">(encodedOffset: 1)</span></font> are two such indices. This may not be a problem in practice, but it’s something to be aware of.</div><div class=""><br class=""></div><div class="">I’m also confused by the paragraph about index comparison. It says that if two indices are valid in a single String view, comparison follows Collection semantics; otherwise, indices are compared by their encodedOffsets, which means indices aren’t totally ordered. But I’m not sure what the first part is supposed to mean. How is comparing indices that are valid within a single view any different from comparing their encodedOffsets?</div><div class=""><br class=""></div><div class="">-Kevin Ballard</div>
</div>

</body></html>