<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jan 4, 2016, at 5:39 PM, Kevin Ballard &lt;<a href="mailto:kevin@sb.org" class="">kevin@sb.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class="">
<div class=""><div class="">On Mon, Jan 4, 2016, at 03:22 PM, Paul Cantrell wrote:<br class=""></div>
<blockquote type="cite" class=""><div class=""> </div>
<div class=""><div class="">The bottom line is that not every NSString → String bridge needs to be O(n). At least in theory. Someone with more intimate knowledge of NSString can correct me if I’m wrong.<br class=""></div>
</div>
</blockquote><div class=""> </div>
<div class="">I thought it was a given that we can't modify NSString. If we can modify it, all bets are off; heck, if we can modify it, why not just make NSString reject invalid sequences to begin with?<br class=""></div></div></div></blockquote><div><br class=""></div><div>Good question. And if we can’t modify NSString, then yes, we’re up against a tough problem.</div><div><br class=""></div><div>But should NSString legacy constraints really compromise the design of Swift’s native String type?</div><div><br class=""></div><div>Félix and Dmitri’s comments suggest that there are ways to prevent that, and that there’s precedent for placing any distasteful behavior necessary for compatibility in the bridging, not in the core type.</div><br class=""><blockquote type="cite" class=""><div class="">
<blockquote type="cite" class=""><div class=""><blockquote type="cite" class=""></blockquote></div></blockquote></div></blockquote><div class=""><blockquote type="cite" class=""><div class=""><blockquote type="cite" class=""><div class=""><div class="">Keep in mind that we’re <i class="">already</i> incurring that O(n) expense right now for every Swift operation that turns an NSString-backed string into characters — that plus the API burden of having that check deferred, which is what originally motivated this thread.</div></div></blockquote></div></blockquote></div><blockquote type="cite" class=""><div class=""><blockquote type="cite" class=""><div class="">
</div>
</blockquote><div class=""> </div>
<div class="">That's true for native Strings as well. The native String storage is actually a sequence of UTF-16 code units, it's not a sequence of characters. Any time you iterate over the CharacterView, it has to calculate the grapheme cluster boundaries.</div></div></blockquote><div><br class=""></div><div>Aren’t Swift strings encoded as UTF-8, or at least designed to behave as if they are, however they might be stored under the hood?</div><div><br class=""></div><div><a href="https://github.com/apple/swift/blob/master/docs/StringDesign.rst#strings-are-encoded-as-utf-8" class="">https://github.com/apple/swift/blob/master/docs/StringDesign.rst#strings-are-encoded-as-utf-8</a></div><div><div><a href="https://github.com/apple/swift/blob/master/docs/StringDesign.rst#how-would-you-design-it" class="">https://github.com/apple/swift/blob/master/docs/StringDesign.rst#how-would-you-design-it</a></div><div class=""><br class=""></div></div><div>Given the warning at the top about this having been a planning document, I see that this may no longer be true. But at least the original design rationale strongly suggests that String’s failable initializers should fail when given invalid Unicode.</div><br class=""><blockquote type="cite" class=""><div class=""><div class="">But that's ok, because unless you call `count` on it, you're typically doing an O(N) operation _anyway_. But there's plenty of things you can do with strings that don't require iterating over the CharacterView.<br class=""></div>
</div></blockquote></div><div class=""><br class=""></div><div class="">Indeed, but per my earlier message, those things <i class="">could</i> all still be O(1) except in the case when you’re transcoding a string from something other than ASCII or UTF-8 — and those transcoding cases are O(n) already. That certainly seems like a better design for the core lib.</div><br class=""><div class="">Really hoping a core team member can weigh in on this….</div><div class=""><br class=""></div><div class="">Cheers,</div><div class=""><br class=""></div><div class="">Paul</div><div class=""><br class=""></div></body></html>