<div dir="ltr">I'm excited to see this taking shape. Thanks for all the hard work putting this together!<div><br></div><div>A few random thoughts I had while reading it:</div><div><br></div><div>* You talk about an integer `codeUnitOffset` property for indexes. Since the current String implementation can be backed by either ASCII or UTF-16 storage, depending on the content of the string and how it was obtained, presumably that integer is not necessarily the same as the byte offset into the buffer, correct? (In other words, for a UTF-16-stored string, you would have to multiply it by 2.)<br></div><div><br></div><div>* You discuss the possibility of exposing some String methods, like `uppercase()`, on Character. Since Swift abstracts away the encoding, it seems like Characters are essentially Strings that are constrained at runtime (and sometimes at compile time, in the case of initialization from literals) to contain exactly one grapheme cluster. Given that, I think it would be worthwhile for Character to support *any* String method that makes sense for a single character: case transformations (though perhaps not titlecase?), access to its UTF-8 or UTF-16 views, and so forth. I would ask whether it makes sense to have a shared protocol between Character and String that defines those methods, but I'll defer on that because it feels like it would be a "bag of methods" rather than something semantically meaningful.</div><div><br></div><div>On that same point, if I have a lightweight (&lt;= 63-bit) Character, many of those operations can currently be performed only by constructing a String from it, which incurs a time and heap-allocation penalty. (And indeed, there are TODOs in the code base to avoid doing such things internally, in the case of Character comparisons.) 
Which leads me to my next thought, since I've been doing a lot of work on Swift String performance lately...</div><div><br></div><div>* Currently, Character and String have divergent internal implementations. A Character can be "small" (&lt;= 63 bits of UTF-8 packed into an integer) or "large" (&gt; 63 bits, with a heap-allocated buffer). Strings are just backed by a heap-allocated buffer. In this write-up, you say "Many strings are short enough to store in 64 bits", which covers more than just characters. If that's the case, can those optimizations be lowered into _StringCore (or its new-world counterpart), so that both Characters *and* small Strings reap the benefits of the more efficient implementation? This would let Characters get implementations of common methods like `uppercase()` for free, and there would be a zero-cost conversion from Characters to Strings. The only real differences between the types would be the APIs they vend, the semantic concept they represent to users, and validation.</div><div><br></div><div>* The talk about implicit conversions between Substring and String bums me out, even though I see its importance in this context and know that its benefits outweigh those of the alternatives. 
Given that the Swift team seems to prefer explicit to implicit conversions in general, I would hope that if they feel it's important enough to make a special case for the standard library, it could be a language feature that you'd consider making available to anyone.</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, Jan 20, 2017 at 7:35 AM Ben Cohen via swift-evolution <<a href="mailto:swift-evolution@swift.org">swift-evolution@swift.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="gmail_msg"><br class="gmail_msg"><div class="gmail_msg"><blockquote type="cite" class="gmail_msg"><div class="gmail_msg">On Jan 19, 2017, at 10:42 PM, Jose Cheyo Jimenez <<a href="mailto:cheyo@masters3d.com" class="gmail_msg" target="_blank">cheyo@masters3d.com</a>> wrote:</div><br class="m_-5442222865958844201Apple-interchange-newline gmail_msg"><div class="gmail_msg"><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="gmail_msg">I just have one concern about the slice of a string being called Substring. Why not StringSlice? The word substring can mean so many things, specially in cocoa.<span class="m_-5442222865958844201Apple-converted-space gmail_msg"> </span></span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"></div></blockquote></div><br class="gmail_msg"></div><div style="word-wrap:break-word" class="gmail_msg"><div class="gmail_msg">This idea has a lot of merit, as does the option of not giving them a top-level name at all e.g. 
they could be String.Slice or String.SubSequence. It would underscore that they really aren’t meant to be used except as the result of a slicing operation or to efficiently pass a slice. OTOH, Substring is a term of art so can help with clarity.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div></div>
</blockquote></div>