<font size=2 face="sans-serif">Thanks for the great write-up!</font><br><br><font size=2 face="sans-serif">The manifesto recognizes the importance
of machine processing and performance.</font><br><font size=2 face="sans-serif">I am surprised that there is no mention
of any kind of "unsafe" strings or string processing.</font><br><font size=2 face="sans-serif">In general, Swift does an amazing job
at incorporating unsafe mechanisms into a safe-by-default programming paradigm.</font><br><font size=2 face="sans-serif">But for some reason, Strings seem to
be left out of the unsafe discussion.</font><br><br><font size=2 face="sans-serif">A lot of machine processing of strings
continues to deal with 8-bit quantities (even 7-bit quantities, not UTF-8).</font><br><font size=2 face="sans-serif">Swift strings are not very good at that.
I see progress in the manifesto but nothing to really close the performance
gap with C.</font><br><font size=2 face="sans-serif">That's where "unsafe" mechanisms
could come into play.</font><br><br><font size=2 face="sans-serif">To guarantee Unicode correctness, a
C string must be validated or transformed to be considered a Swift string.</font><br><font size=2 face="sans-serif">If I understand the C String interop
section correctly, in Swift 4, this should not force a copy, but traversing
the string is still required.</font><br><font size=2 face="sans-serif">I hope I am correct about the no-copy
thing, and I would also like to permit promoting C strings to Swift strings
without validation.</font><br><font size=2 face="sans-serif">This is obviously unsafe in general,
but I know my strings... and I care about performance. ;)</font><br><br><font size=2 face="sans-serif">More importantly, it is not possible
to mutate bytes in a Swift string at will.</font><br><font size=2 face="sans-serif">Again it makes sense from the point
of view of always correct Unicode sequences.</font><br><font size=2 face="sans-serif">But it does not for machine processing
of C strings with C-like performance.</font><br><font size=2 face="sans-serif">Today, I can cheat using a "_public"
API for this, i.e., </font><font size=2 face="Menlo-Regular">myString.</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!</font><font size=2 face="sans-serif">.</font><br><font size=2 face="sans-serif">This should be doable from an official
"unsafe" API.</font><br><br><font size=2 face="sans-serif">Memory safety is also at play here,
as well as ownership.</font><br><font size=2 face="sans-serif">A proper API could guarantee the backing
store is writable (for instance, that it is not shared).</font><br><font size=2 face="sans-serif">A memory-safe but not Unicode-safe API
could do bounds checks.</font><br><br><font size=2 face="sans-serif">While low-level C string processing
can be done efficiently using unsafe memory buffers, the lack of bridging
with "real" Swift strings kills the deal.</font><br><font size=2 face="sans-serif">No literal syntax (or costly coercions),
none of the many useful string APIs.</font><br><br><font size=2 face="sans-serif">To illustrate these points, here is a
simple experiment: code written to synthesize an HTTP date string from
a bunch of integers.</font><br><font size=2 face="sans-serif">There are four versions of the code
going from nice high-level Swift code to low-level C-like code.</font><br><font size=2 face="sans-serif">(Some of this code is also about avoiding
ARC overheads and string interpolation overheads, hence the four versions.)</font><br><br><font size=2 face="sans-serif">On my MacBook Pro (swiftc -O), the performance
is as follows:</font><br><br><font size=2 face="Menlo-Regular">interpolation + func: 2.303032365s</font><br><font size=2 face="Menlo-Regular">interpolation + array: 1.224858418s</font><br><font size=2 face="Menlo-Regular">append:
0.918512377s</font><br><font size=2 face="Menlo-Regular">memcpy:
0.182104674s</font><br><br><font size=2 face="sans-serif">While the benchmarking could be done
more carefully, I think the main observation is valid.</font><br><font size=2 face="sans-serif">The nice code is more than 10x slower
than the C-like code.</font><br><font size=2 face="sans-serif">Moreover, the ugly-but-still-valid-Swift
code is still about 5x slower than the C-like code.</font><br><font size=2 face="sans-serif">For some applications, e.g., web servers,
these kinds of numbers matter...</font><br><br><font size=2 face="sans-serif">Some of the proposed improvements would
help with this, e.g., the small-string optimization, and maybe changes to
the concatenation semantics.</font><br><font size=2 face="sans-serif">But it seems to me that a big performance
gap will remain.</font><br><font size=2 face="sans-serif">(Concatenation even with strncat is
significantly slower than memcpy for fixed-size strings.)</font><br><br><font size=2 face="sans-serif">I believe there is a need and an opportunity
for a fast "less safe" String API.</font><br><font size=2 face="sans-serif">I hope it will be on the roadmap soon.</font><br><br><font size=2 face="sans-serif">Best,</font><br><br><font size=2 face="sans-serif">Olivier</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">import</font><font size=2 face="Menlo-Regular"> Foundation</font><br><br><font size=2 color=#008000 face="Menlo-Regular">// get current date
as a series of integers</font><br><font size=2 color=#008000 face="Menlo-Regular">// (could be done differently...
faster... not the topic)</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">var</font><font size=2 face="Menlo-Regular"> theTime = </font><font size=2 color=#1f007f face="Menlo-Regular">time</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#a1009f face="Menlo-Regular">nil</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#a1009f face="Menlo-Regular">var</font><font size=2 face="Menlo-Regular"> timeStruct = </font><font size=2 color=#603181 face="Menlo-Regular">tm</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 color=#1f007f face="Menlo-Regular">gmtime_r</font><font size=2 face="Menlo-Regular">(&</font><font size=2 color=#3f8080 face="Menlo-Regular">theTime</font><font size=2 face="Menlo-Regular">,
&</font><font size=2 color=#3f8080 face="Menlo-Regular">timeStruct</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> wday = </font><font size=2 color=#603181 face="Menlo-Regular">Int</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">timeStruct</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">tm_wday</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> mday = </font><font size=2 color=#603181 face="Menlo-Regular">Int</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">timeStruct</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">tm_mday</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> mon = </font><font size=2 color=#603181 face="Menlo-Regular">Int</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">timeStruct</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">tm_mon</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> year = </font><font size=2 color=#603181 face="Menlo-Regular">Int</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">timeStruct</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">tm_year</font><font size=2 face="Menlo-Regular">)
+ </font><font size=2 color=#0000e0 face="Menlo-Regular">1900</font><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> hour = </font><font size=2 color=#603181 face="Menlo-Regular">Int</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">timeStruct</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">tm_hour</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> min = </font><font size=2 color=#603181 face="Menlo-Regular">Int</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">timeStruct</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">tm_min</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> sec = </font><font size=2 color=#603181 face="Menlo-Regular">Int</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">timeStruct</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">tm_sec</font><font size=2 face="Menlo-Regular">)</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> months = [</font><font size=2 color=#c21212 face="Menlo-Regular">"Jan"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Feb"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Mar"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Apr"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"May"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Jun"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"Jul"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Aug"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Sep"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Oct"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Nov"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Dec"</font><font size=2 face="Menlo-Regular">]</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> days = [</font><font size=2 color=#c21212 face="Menlo-Regular">"Sun"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Mon"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Tue"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Wed"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Thu"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Fri"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"Sat"</font><font size=2 face="Menlo-Regular">]</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">func</font><font size=2 face="Menlo-Regular"> twoDigit(</font><font size=2 color=#a1009f face="Menlo-Regular">_</font><font size=2 face="Menlo-Regular"> num: </font><font size=2 color=#603181 face="Menlo-Regular">Int</font><font size=2 face="Menlo-Regular">)
-> </font><font size=2 color=#603181 face="Menlo-Regular">String</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">return</font><font size=2 face="Menlo-Regular"> (num < </font><font size=2 color=#0000e0 face="Menlo-Regular">10</font><font size=2 face="Menlo-Regular"> ? </font><font size=2 color=#c21212 face="Menlo-Regular">"0"</font><font size=2 face="Menlo-Regular"> : </font><font size=2 color=#c21212 face="Menlo-Regular">""</font><font size=2 face="Menlo-Regular">)
</font><font size=2 color=#1f007f face="Menlo-Regular">+</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#603181 face="Menlo-Regular">String</font><font size=2 face="Menlo-Regular">(num)</font><br><font size=2 face="Menlo-Regular">}</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> twoDigit = [</font><font size=2 color=#c21212 face="Menlo-Regular">"00"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"01"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"02"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"03"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"04"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"05"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"06"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"07"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"08"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"09"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"10"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"11"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"12"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"13"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"14"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"15"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"16"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"17"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"18"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"19"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"20"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"21"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"22"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"23"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"24"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"25"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"26"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"27"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"28"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"29"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"30"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"31"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"32"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"33"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"34"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"35"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"36"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"37"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"38"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"39"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"40"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"41"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"42"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"43"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"44"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"45"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"46"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"47"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"48"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"49"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"50"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"51"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"52"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"53"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"54"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"55"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"56"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"57"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"58"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"59"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"60"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"61"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"62"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"63"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"64"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"65"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"66"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"67"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"68"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"69"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"70"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"71"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"72"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"73"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"74"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"75"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"76"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"77"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"78"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"79"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"80"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"81"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"82"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"83"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"84"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"85"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"86"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"87"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"88"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"89"</font><font size=2 face="Menlo-Regular">,</font><br><font size=2 face="Menlo-Regular">
</font><font size=2 color=#c21212 face="Menlo-Regular">"90"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"91"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"92"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"93"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"94"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"95"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"96"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"97"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"98"</font><font size=2 face="Menlo-Regular">,
</font><font size=2 color=#c21212 face="Menlo-Regular">"99"</font><font size=2 face="Menlo-Regular">]</font><br><br><font size=2 color=#008000 face="Menlo-Regular">// interpolation +
func</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">func</font><font size=2 face="Menlo-Regular"> httpDate() -> </font><font size=2 color=#603181 face="Menlo-Regular">String</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">return</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#c21212 face="Menlo-Regular">"</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">days</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">wday</font><font size=2 face="Menlo-Regular">]</font><font size=2 color=#c21212 face="Menlo-Regular">),
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#104160 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">mday</font><font size=2 face="Menlo-Regular">)</font><font size=2 color=#c21212 face="Menlo-Regular">)
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">months</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">mon</font><font size=2 face="Menlo-Regular">]</font><font size=2 color=#c21212 face="Menlo-Regular">)
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">year</font><font size=2 color=#c21212 face="Menlo-Regular">)
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#104160 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">hour</font><font size=2 face="Menlo-Regular">)</font><font size=2 color=#c21212 face="Menlo-Regular">):</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#104160 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">min</font><font size=2 face="Menlo-Regular">)</font><font size=2 color=#c21212 face="Menlo-Regular">):</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#104160 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">sec</font><font size=2 face="Menlo-Regular">)</font><font size=2 color=#c21212 face="Menlo-Regular">)
GMT"</font><br><font size=2 face="Menlo-Regular">}</font><br><br><font size=2 color=#008000 face="Menlo-Regular">// interpolation +
array</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">func</font><font size=2 face="Menlo-Regular"> httpDate1() -> </font><font size=2 color=#603181 face="Menlo-Regular">String</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">return</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#c21212 face="Menlo-Regular">"</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">days</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">wday</font><font size=2 face="Menlo-Regular">]</font><font size=2 color=#c21212 face="Menlo-Regular">),
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">mday</font><font size=2 face="Menlo-Regular">]</font><font size=2 color=#c21212 face="Menlo-Regular">)
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">months</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">mon</font><font size=2 face="Menlo-Regular">]</font><font size=2 color=#c21212 face="Menlo-Regular">)
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">year</font><font size=2 color=#c21212 face="Menlo-Regular">)
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">hour</font><font size=2 face="Menlo-Regular">]</font><font size=2 color=#c21212 face="Menlo-Regular">):</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">min</font><font size=2 face="Menlo-Regular">]</font><font size=2 color=#c21212 face="Menlo-Regular">):</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">sec</font><font size=2 face="Menlo-Regular">]</font><font size=2 color=#c21212 face="Menlo-Regular">)
GMT"</font><br><font size=2 face="Menlo-Regular">}</font><br><br><font size=2 color=#008000 face="Menlo-Regular">// append + array</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">func</font><font size=2 face="Menlo-Regular"> httpDate2() -> </font><font size=2 color=#603181 face="Menlo-Regular">String</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">var</font><font size=2 face="Menlo-Regular"> s = </font><font size=2 color=#3f8080 face="Menlo-Regular">days</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">wday</font><font size=2 face="Menlo-Regular">]</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">",
"</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">mday</font><font size=2 face="Menlo-Regular">])</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">"
"</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">months</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">mon</font><font size=2 face="Menlo-Regular">])</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">"
"</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">year</font><font size=2 face="Menlo-Regular">/</font><font size=2 color=#0000e0 face="Menlo-Regular">100</font><font size=2 face="Menlo-Regular">])</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">year</font><font size=2 face="Menlo-Regular">%</font><font size=2 color=#0000e0 face="Menlo-Regular">100</font><font size=2 face="Menlo-Regular">])</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">"
"</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">hour</font><font size=2 face="Menlo-Regular">])</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">":"</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">min</font><font size=2 face="Menlo-Regular">])</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">":"</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">sec</font><font size=2 face="Menlo-Regular">])</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">"
GMT"</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">return</font><font size=2 face="Menlo-Regular"> s</font><br><font size=2 face="Menlo-Regular">}</font><br><br><font size=2 color=#008000 face="Menlo-Regular">// memcpy + array</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">func</font><font size=2 face="Menlo-Regular"> httpDate3() -> </font><font size=2 color=#603181 face="Menlo-Regular">String</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">var</font><font size=2 face="Menlo-Regular"> s = </font><font size=2 color=#c21212 face="Menlo-Regular">"XXX, XX
XXX XXXX XX:XX:XX GMT"</font><br><font size=2 face="Menlo-Regular"> s.</font><font size=2 color=#1f007f face="Menlo-Regular">append</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">""</font><font size=2 face="Menlo-Regular">)
</font><font size=2 color=#008000 face="Menlo-Regular">// force alloc</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">let</font><font size=2 face="Menlo-Regular"> ptr = s.</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#1f007f face="Menlo-Regular">memcpy</font><font size=2 face="Menlo-Regular">(ptr,
</font><font size=2 color=#3f8080 face="Menlo-Regular">days</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">wday</font><font size=2 face="Menlo-Regular">].</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!,
</font><font size=2 color=#0000e0 face="Menlo-Regular">3</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#1f007f face="Menlo-Regular">memcpy</font><font size=2 face="Menlo-Regular">(ptr.</font><font size=2 color=#1f007f face="Menlo-Regular">advanced</font><font size=2 face="Menlo-Regular">(by:
</font><font size=2 color=#0000e0 face="Menlo-Regular">8</font><font size=2 face="Menlo-Regular">),
</font><font size=2 color=#3f8080 face="Menlo-Regular">months</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">mon</font><font size=2 face="Menlo-Regular">].</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!,
</font><font size=2 color=#0000e0 face="Menlo-Regular">3</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#1f007f face="Menlo-Regular">memcpy</font><font size=2 face="Menlo-Regular">(ptr.</font><font size=2 color=#1f007f face="Menlo-Regular">advanced</font><font size=2 face="Menlo-Regular">(by:
</font><font size=2 color=#0000e0 face="Menlo-Regular">5</font><font size=2 face="Menlo-Regular">),
</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">mday</font><font size=2 face="Menlo-Regular">].</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!,
</font><font size=2 color=#0000e0 face="Menlo-Regular">2</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#1f007f face="Menlo-Regular">memcpy</font><font size=2 face="Menlo-Regular">(ptr.</font><font size=2 color=#1f007f face="Menlo-Regular">advanced</font><font size=2 face="Menlo-Regular">(by:
</font><font size=2 color=#0000e0 face="Menlo-Regular">12</font><font size=2 face="Menlo-Regular">),
</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">year</font><font size=2 face="Menlo-Regular">/</font><font size=2 color=#0000e0 face="Menlo-Regular">100</font><font size=2 face="Menlo-Regular">].</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!,
</font><font size=2 color=#0000e0 face="Menlo-Regular">2</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#1f007f face="Menlo-Regular">memcpy</font><font size=2 face="Menlo-Regular">(ptr.</font><font size=2 color=#1f007f face="Menlo-Regular">advanced</font><font size=2 face="Menlo-Regular">(by:
</font><font size=2 color=#0000e0 face="Menlo-Regular">14</font><font size=2 face="Menlo-Regular">),
</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">year</font><font size=2 face="Menlo-Regular">%</font><font size=2 color=#0000e0 face="Menlo-Regular">100</font><font size=2 face="Menlo-Regular">].</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!,
</font><font size=2 color=#0000e0 face="Menlo-Regular">2</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#1f007f face="Menlo-Regular">memcpy</font><font size=2 face="Menlo-Regular">(ptr.</font><font size=2 color=#1f007f face="Menlo-Regular">advanced</font><font size=2 face="Menlo-Regular">(by:
</font><font size=2 color=#0000e0 face="Menlo-Regular">17</font><font size=2 face="Menlo-Regular">),
</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">hour</font><font size=2 face="Menlo-Regular">].</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!,
</font><font size=2 color=#0000e0 face="Menlo-Regular">2</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#1f007f face="Menlo-Regular">memcpy</font><font size=2 face="Menlo-Regular">(ptr.</font><font size=2 color=#1f007f face="Menlo-Regular">advanced</font><font size=2 face="Menlo-Regular">(by:
</font><font size=2 color=#0000e0 face="Menlo-Regular">20</font><font size=2 face="Menlo-Regular">),
</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">min</font><font size=2 face="Menlo-Regular">].</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!,
</font><font size=2 color=#0000e0 face="Menlo-Regular">2</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#1f007f face="Menlo-Regular">memcpy</font><font size=2 face="Menlo-Regular">(ptr.</font><font size=2 color=#1f007f face="Menlo-Regular">advanced</font><font size=2 face="Menlo-Regular">(by:
</font><font size=2 color=#0000e0 face="Menlo-Regular">23</font><font size=2 face="Menlo-Regular">),
</font><font size=2 color=#3f8080 face="Menlo-Regular">twoDigit</font><font size=2 face="Menlo-Regular">[</font><font size=2 color=#3f8080 face="Menlo-Regular">sec</font><font size=2 face="Menlo-Regular">].</font><font size=2 color=#603181 face="Menlo-Regular">_core</font><font size=2 face="Menlo-Regular">.</font><font size=2 color=#603181 face="Menlo-Regular">_baseAddress</font><font size=2 face="Menlo-Regular">!,
</font><font size=2 color=#0000e0 face="Menlo-Regular">2</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">return</font><font size=2 face="Menlo-Regular"> s</font><br><font size=2 face="Menlo-Regular">}</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">var</font><font size=2 face="Menlo-Regular"> s = </font><font size=2 color=#c21212 face="Menlo-Regular">""</font><br><br><font size=2 color=#a1009f face="Menlo-Regular">var</font><font size=2 face="Menlo-Regular"> now = </font><font size=2 color=#1f007f face="Menlo-Regular">mach_absolute_time</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 color=#a1009f face="Menlo-Regular">for</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">_</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">in</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#0000e0 face="Menlo-Regular">0</font><font size=2 face="Menlo-Regular">..<</font><font size=2 color=#0000e0 face="Menlo-Regular">1000000</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#3f8080 face="Menlo-Regular">s</font><font size=2 face="Menlo-Regular"> = </font><font size=2 color=#104160 face="Menlo-Regular">httpDate</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 face="Menlo-Regular">}</font><br><font size=2 color=#1f007f face="Menlo-Regular">print</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">s</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#1f007f face="Menlo-Regular">print</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">"interpolation
+ func: </font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#603181 face="Menlo-Regular">Double</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#1f007f face="Menlo-Regular">mach_absolute_time</font><font size=2 face="Menlo-Regular">()
- </font><font size=2 color=#3f8080 face="Menlo-Regular">now</font><font size=2 face="Menlo-Regular">)
/ </font><font size=2 color=#0000e0 face="Menlo-Regular">1e9</font><font size=2 color=#c21212 face="Menlo-Regular">)s\n"</font><font size=2 face="Menlo-Regular">)</font><br><br><font size=2 color=#3f8080 face="Menlo-Regular">now</font><font size=2 face="Menlo-Regular"> = </font><font size=2 color=#1f007f face="Menlo-Regular">mach_absolute_time</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 color=#a1009f face="Menlo-Regular">for</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">_</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">in</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#0000e0 face="Menlo-Regular">0</font><font size=2 face="Menlo-Regular">..<</font><font size=2 color=#0000e0 face="Menlo-Regular">1000000</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#3f8080 face="Menlo-Regular">s</font><font size=2 face="Menlo-Regular"> = </font><font size=2 color=#104160 face="Menlo-Regular">httpDate1</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 face="Menlo-Regular">}</font><br><font size=2 color=#1f007f face="Menlo-Regular">print</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">s</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#1f007f face="Menlo-Regular">print</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">"interpolation
+ array: </font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#603181 face="Menlo-Regular">Double</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#1f007f face="Menlo-Regular">mach_absolute_time</font><font size=2 face="Menlo-Regular">()
- </font><font size=2 color=#3f8080 face="Menlo-Regular">now</font><font size=2 face="Menlo-Regular">)
/ </font><font size=2 color=#0000e0 face="Menlo-Regular">1e9</font><font size=2 color=#c21212 face="Menlo-Regular">)s\n"</font><font size=2 face="Menlo-Regular">)</font><br><br><font size=2 color=#3f8080 face="Menlo-Regular">now</font><font size=2 face="Menlo-Regular"> = </font><font size=2 color=#1f007f face="Menlo-Regular">mach_absolute_time</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 color=#a1009f face="Menlo-Regular">for</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">_</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">in</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#0000e0 face="Menlo-Regular">0</font><font size=2 face="Menlo-Regular">..<</font><font size=2 color=#0000e0 face="Menlo-Regular">1000000</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#3f8080 face="Menlo-Regular">s</font><font size=2 face="Menlo-Regular"> = </font><font size=2 color=#104160 face="Menlo-Regular">httpDate2</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 face="Menlo-Regular">}</font><br><font size=2 color=#1f007f face="Menlo-Regular">print</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">s</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#1f007f face="Menlo-Regular">print</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">"append:
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#603181 face="Menlo-Regular">Double</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#1f007f face="Menlo-Regular">mach_absolute_time</font><font size=2 face="Menlo-Regular">()
- </font><font size=2 color=#3f8080 face="Menlo-Regular">now</font><font size=2 face="Menlo-Regular">)
/ </font><font size=2 color=#0000e0 face="Menlo-Regular">1e9</font><font size=2 color=#c21212 face="Menlo-Regular">)s\n"</font><font size=2 face="Menlo-Regular">)</font><br><br><font size=2 color=#3f8080 face="Menlo-Regular">now</font><font size=2 face="Menlo-Regular"> = </font><font size=2 color=#1f007f face="Menlo-Regular">mach_absolute_time</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 color=#a1009f face="Menlo-Regular">for</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">_</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#a1009f face="Menlo-Regular">in</font><font size=2 face="Menlo-Regular"> </font><font size=2 color=#0000e0 face="Menlo-Regular">0</font><font size=2 face="Menlo-Regular">..<</font><font size=2 color=#0000e0 face="Menlo-Regular">1000000</font><font size=2 face="Menlo-Regular"> {</font><br><font size=2 face="Menlo-Regular"> </font><font size=2 color=#3f8080 face="Menlo-Regular">s</font><font size=2 face="Menlo-Regular"> = </font><font size=2 color=#104160 face="Menlo-Regular">httpDate3</font><font size=2 face="Menlo-Regular">()</font><br><font size=2 face="Menlo-Regular">}</font><br><font size=2 color=#1f007f face="Menlo-Regular">print</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#3f8080 face="Menlo-Regular">s</font><font size=2 face="Menlo-Regular">)</font><br><font size=2 color=#1f007f face="Menlo-Regular">print</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#c21212 face="Menlo-Regular">"memcpy:
</font><font size=2 face="Menlo-Regular">\</font><font size=2 color=#c21212 face="Menlo-Regular">(</font><font size=2 color=#603181 face="Menlo-Regular">Double</font><font size=2 face="Menlo-Regular">(</font><font size=2 color=#1f007f face="Menlo-Regular">mach_absolute_time</font><font size=2 face="Menlo-Regular">()
- </font><font size=2 color=#3f8080 face="Menlo-Regular">now</font><font size=2 face="Menlo-Regular">)
/ </font><font size=2 color=#0000e0 face="Menlo-Regular">1e9</font><font size=2 color=#c21212 face="Menlo-Regular">)s\n"</font><font size=2 face="Menlo-Regular">)</font><br><br><br><br><br><font size=1 color=#5f5f5f face="sans-serif">From:
</font><font size=1 face="sans-serif">Ben Cohen via swift-evolution
<swift-evolution@swift.org></font><br><font size=1 color=#5f5f5f face="sans-serif">To:
</font><font size=1 face="sans-serif">swift-evolution <swift-evolution@swift.org></font><br><font size=1 color=#5f5f5f face="sans-serif">Cc:
</font><font size=1 face="sans-serif">Dave Abrahams <dabrahams@apple.com></font><br><font size=1 color=#5f5f5f face="sans-serif">Date:
</font><font size=1 face="sans-serif">01/19/2017 09:56 PM</font><br><font size=1 color=#5f5f5f face="sans-serif">Subject:
</font><font size=1 face="sans-serif">[swift-evolution]
Strings in Swift 4</font><br><font size=1 color=#5f5f5f face="sans-serif">Sent by:
</font><font size=1 face="sans-serif">swift-evolution-bounces@swift.org</font><br><hr noshade><br><br><br><tt><font size=2>Hi all,<br><br>Below is our take on a design manifesto for Strings in Swift 4 and beyond.<br><br>Probably best read in rendered markdown on GitHub:<br></font></tt><a href=https://github.com/apple/swift/blob/master/docs/StringManifesto.md><tt><font size=2>https://github.com/apple/swift/blob/master/docs/StringManifesto.md</font></tt></a><tt><font size=2><br><br>We’re eager to hear everyone’s thoughts.<br><br>Regards,<br>Ben and Dave<br><br><br># String Processing For Swift 4<br><br>* Authors: [Dave Abrahams](</font></tt><a href=https://github.com/dabrahams><tt><font size=2>https://github.com/dabrahams</font></tt></a><tt><font size=2>),
[Ben Cohen](</font></tt><a href=https://github.com/airspeedswift><tt><font size=2>https://github.com/airspeedswift</font></tt></a><tt><font size=2>)<br><br>The goal of re-evaluating Strings for Swift 4 has been fairly ill-defined
thus<br>far, with just this short blurb in the<br>[list of goals](</font></tt><a href="https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160725/025676.html"><tt><font size=2>https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160725/025676.html</font></tt></a><tt><font size=2>):<br><br>> **String re-evaluation**: String is one of the most important fundamental<br>> types in the language. The standard library leads have numerous
ideas of how<br>> to improve the programming model for it, without jeopardizing the
goals of<br>> providing a unicode-correct-by-default model. Our goal is to
be better at<br>> string processing than Perl!<br> <br>For Swift 4 and beyond we want to improve three dimensions of text processing:<br><br> 1. Ergonomics<br> 2. Correctness<br> 3. Performance<br><br>This document is meant to both provide a sense of the long-term vision
<br>(including undecided issues and possible approaches), and to define the
scope of<br>work that could be done in the Swift 4 timeframe.<br><br>## General Principles<br><br>### Ergonomics<br><br>It's worth noting that ergonomics and correctness are mutually-reinforcing.
An<br>API that is easy to use—but incorrectly—cannot be considered an ergonomic<br>success. Conversely, an API that's simply hard to use is also hard
to use<br>correctly. Achieving optimal performance without compromising ergonomics
or<br>correctness is a greater challenge.<br><br>Consistency with the Swift language and idioms is also important for<br>ergonomics. There are several places both in the standard library and in
the<br>Foundation additions to `String` where patterns and practices found elsewhere<br>could be applied to improve usability and familiarity.<br><br>### API Surface Area<br><br>Primary data types such as `String` should have APIs that are easily understood<br>given a signature and a one-line summary. Today, `String` fails that
test. As<br>you can see, the Standard Library and Foundation both contribute significantly
to<br>its overall complexity.<br><br>**Method Arity** | **Standard Library** | **Foundation**<br>---|:---:|:---:<br>0: `ƒ()` | 5 | 7<br>1: `ƒ(:)` | 19 | 48<br>2: `ƒ(::)` | 13 | 19<br>3: `ƒ(:::)` | 5 | 11<br>4: `ƒ(::::)` | 1 | 7<br>5: `ƒ(:::::)` | - | 2<br>6: `ƒ(::::::)` | - | 1<br><br>**API Kind** | **Standard Library** | **Foundation**<br>---|:---:|:---:<br>`init` | 41 | 18<br>`func` | 42 | 55<br>`subscript` | 9 | 0<br>`var` | 26 | 14<br><br>**Total: 205 APIs**<br><br>By contrast, `Int` has 80 APIs, none with more than two parameters.[0]
String processing is complex enough; users shouldn't have<br>to press through physical API sprawl just to get started.<br><br>Many of the choices detailed below contribute to solving this problem,<br>including:<br><br> * Restoring `Collection` conformance and dropping the `.characters`
view.<br> * Providing a more general, composable slicing syntax.<br> * Altering `Comparable` so that parameterized<br> (e.g. case-insensitive) comparison fits smoothly into the
basic syntax.<br> * Clearly separating language-dependent operations on text produced
<br> by and for humans from language-independent<br> operations on text produced by and for machine processing.<br> * Relocating APIs that fall outside the domain of basic string processing
and<br> discouraging the proliferation of ad-hoc extensions.<br><br><br>### Batteries Included<br><br>While `String` is available to all programs out-of-the-box, crucial APIs
for<br>basic string processing tasks are still inaccessible until `Foundation`
is<br>imported. While it makes sense that `Foundation` is needed for domain-specific<br>jobs such as<br>[linguistic tagging](</font></tt><a href=https://developer.apple.com/reference/foundation/nslinguistictagger><tt><font size=2>https://developer.apple.com/reference/foundation/nslinguistictagger</font></tt></a><tt><font size=2>),<br>one should not need to import anything to, for example, do case-insensitive<br>comparison.<br><br>### Unicode Compliance and Platform Support<br><br>The Unicode standard provides a crucial objective reference point for what<br>constitutes correct behavior in an extremely complex domain, so<br>Unicode-correctness is, and will remain, a fundamental design principle
behind<br>Swift's `String`. That said, the Unicode standard is an evolving
document, so<br>this objective reference-point is not fixed.[1] While<br>many of the most important operations—e.g. string hashing, equality, and<br>non-localized comparison—will be stable, the semantics<br>of others, such as grapheme breaking and localized comparison and case<br>conversion, are expected to change as platforms are updated, so programs
should<br>be written so their correctness does not depend on precise stability of
these<br>semantics across OS versions or platforms. Although it may be possible
to<br>imagine static and/or dynamic analysis tools that will help users find
such<br>errors, the only sure way to deal with this fact of life is to educate
users.<br><br>## Design Points<br><br>### Internationalization<br><br>There is strong evidence that developers cannot determine how to use<br>internationalization APIs correctly. Although documentation could
and should be<br>improved, the sheer size, complexity, and diversity of these APIs is a
major<br>contributor to the problem, causing novices to tune out, and more experienced<br>programmers to make avoidable mistakes.<br><br>The first step in improving this situation is to regularize all localized<br>operations as invocations of normal string operations with extra<br>parameters. Among other things, this means:<br><br>1. Doing away with `localizedXXX` methods <br>2. Providing a terse way to name the current locale as a parameter<br>3. Automatically adjusting defaults for options such<br> as case sensitivity based on whether the operation is localized.<br>4. Removing correctness traps like `localizedCaseInsensitiveCompare` (see<br> guidance in the<br> [Internationalization and Localization Guide](</font></tt><a href=https://developer.apple.com/library/content/documentation/MacOSX/Conceptual/BPInternational/InternationalizingYourCode/InternationalizingYourCode.html><tt><font size=2>https://developer.apple.com/library/content/documentation/MacOSX/Conceptual/BPInternational/InternationalizingYourCode/InternationalizingYourCode.html</font></tt></a><tt><font size=2>).<br><br>Along with appropriate documentation updates, these changes will make localized<br>operations more teachable, comprehensible, and approachable, thereby lowering
a<br>barrier that currently leads some developers to ignore localization issues<br>altogether.<br><br>#### The Default Behavior of `String`<br><br>Although this isn't well-known, the most accessible form of many operations
on<br>Swift `String` (and `NSString`) is really only appropriate for text that
is<br>intended to be processed for, and consumed by, machines. The semantics
of the<br>operations with the simplest spellings are always non-localized and<br>language-agnostic.<br><br>Two major factors play into this design choice:<br><br>1. Machine processing of text is important, so we should have first-class,<br> accessible functions appropriate to that use case.<br> <br>2. The most general localized operations require a locale parameter not
required<br> by their un-localized counterparts. This naturally skews
complexity towards<br> localized operations.<br><br>Reaffirming that `String`'s simplest APIs have<br>language-independent/machine-processed semantics has the benefit of clarifying<br>the proper default behavior of operations such as comparison, and allows
us to<br>make [significant optimizations](#collation-semantics) that were previously<br>thought to conflict with Unicode.<br><br>#### Future Directions<br><br>One of the most common internationalization errors is the unintentional<br>presentation to users of text that has not been localized, but regularizing
APIs<br>and improving documentation can go only so far in preventing this error.<br>Combined with the fact that `String` operations are non-localized by default,<br>the environment for processing human-readable text may still be somewhat<br>error-prone in Swift 4.<br><br>For an audience of mostly non-experts, it is especially important that
naïve<br>code is very likely to be correct if it compiles, and that more sophisticated<br>issues can be revealed progressively. For this reason, we intend
to<br>specifically and separately target localization and internationalization<br>problems in the Swift 5 timeframe.<br><br>### Operations With Options<br><br>There are three categories of common string operations that often need
to be<br>tuned in various dimensions:<br><br>**Operation**|**Applicable Options**<br>---|---<br>sort ordering | locale, case/diacritic/width-insensitivity<br>case conversion | locale<br>pattern matching | locale, case/diacritic/width-insensitivity<br><br>The defaults for case-, diacritic-, and width-insensitivity are different
for<br>localized operations than for non-localized operations, so for example
a<br>localized sort should be case-insensitive by default, and a non-localized
sort<br>should be case-sensitive by default. We propose a standard “language”
of<br>defaulted parameters to be used for these purposes, with usage roughly
like this:<br><br>```swift<br> x.compared(to: y, case: .sensitive, in: swissGerman)<br> <br> x.lowercased(in: .currentLocale)<br> <br> x.allMatches(<br> somePattern, case: .insensitive, diacritic: .insensitive)<br>```<br><br>This usage might be supported by code like this:<br><br>```swift<br>enum StringSensitivity {<br>case sensitive<br>case insensitive<br>}<br><br>extension Locale {<br> static var currentLocale: Locale { ... }<br>}<br><br>extension Unicode {<br> // An example of the option language in declaration context,<br> // with nil defaults indicating unspecified, so defaults can be<br> // driven by the presence/absence of a specific Locale<br> func frobnicated(<br> case caseSensitivity: StringSensitivity? = nil,<br> diacritic diacriticSensitivity: StringSensitivity? = nil,<br> width widthSensitivity: StringSensitivity? = nil,<br> in locale: Locale? = nil<br> ) -> Self { ... }<br>}<br>```<br><br>### Comparing and Hashing Strings<br><br>#### Collation Semantics<br><br>What Unicode says about collation—which is used in `<`, `==`, and hashing—
turns<br>out to be quite interesting, once you pick it apart. The full Unicode
Collation<br>Algorithm (UCA) works like this:<br><br>1. Fully normalize both strings<br>2. Convert each string to a sequence of numeric triples to form a collation
key<br>3. “Flatten” the key by concatenating the sequence of first elements
to the<br> sequence of second elements to the sequence of third elements<br>4. Lexicographically compare the flattened keys <br><br>While step 1 can usually<br>be [done quickly](</font></tt><a href=http://unicode.org/reports/tr15/#Description_Norm><tt><font size=2>http://unicode.org/reports/tr15/#Description_Norm</font></tt></a><tt><font size=2>)
and<br>incrementally, step 2 uses a collation table that maps matching *sequences*
of<br>Unicode scalars in the normalized string to *sequences* of triples, which
get<br>accumulated into a collation key. Predictably, this is where the
real costs<br>lie.<br><br>*However*, there are some bright spots to this story. First, as it
turns out,<br>string sorting (localized or not) should be done down to what's called<br>the<br>[“identical” level](</font></tt><a href=http://unicode.org/reports/tr10/#Multi_Level_Comparison><tt><font size=2>http://unicode.org/reports/tr10/#Multi_Level_Comparison</font></tt></a><tt><font size=2>),<br>which adds a step 3a: append the string's normalized form to the flattened<br>collation key. At first blush this just adds work, but consider what
it does<br>for equality: two strings that normalize the same, naturally, will collate
the<br>same. But also, *strings that normalize differently will always collate<br>differently*. In other words, for equality, it is sufficient to compare
the<br>strings' normalized forms and see if they are the same. We can therefore<br>entirely skip the expensive part of collation for equality comparison.<br><br>Next, naturally, anything that applies to equality also applies to hashing:
it<br>is sufficient to hash the string's normalized form, bypassing collation
keys.<br>This should provide significant speedups over the current implementation.<br>Perhaps more importantly, since comparison down to the “identical” level
applies<br>even to localized strings, it means that hashing and equality can be implemented<br>exactly the same way for localized and non-localized text, and hash tables
with<br>localized keys will remain valid across current-locale changes.<br><br>Finally, once it is agreed that the *default* role for `String` is to handle<br>machine-generated and machine-readable text, the default ordering of `String`s<br>need no longer use the UCA at all. It is sufficient to order them
in any way<br>that's consistent with equality, so `String` ordering can simply be a<br>lexicographical comparison of normalized forms,[4]<br>(which is equivalent to lexicographically comparing the sequences of grapheme<br>clusters), again bypassing step 2 and offering another speedup.<br><br>This leaves us executing the full UCA *only* for localized sorting, and
ICU's<br>implementation has apparently been very well optimized.<br><br>Following this scheme everywhere would also allow us to make sorting behavior<br>consistent across platforms. Currently, we sort `String` according
to the UCA,<br>except that—*only on Apple platforms*—pairs of ASCII characters are ordered
by<br>Unicode scalar value.<br><br>#### Syntax<br><br>Because the current `Comparable` protocol expresses all comparisons with
binary<br>operators, string comparisons—which may require<br>additional [options](#operations-with-options)—do not fit smoothly into
the<br>existing syntax. At the same time, we'd like to solve other problems
with<br>comparison, as outlined<br>in<br>[this proposal](</font></tt><a href=https://gist.github.com/CodaFi/f0347bd37f1c407bf7ea0c429ead380e><tt><font size=2>https://gist.github.com/CodaFi/f0347bd37f1c407bf7ea0c429ead380e</font></tt></a><tt><font size=2>)<br>(implemented by changes at the head<br>of<br>[this branch](</font></tt><a href="https://github.com/CodaFi/swift/commits/space-the-final-frontier"><tt><font size=2>https://github.com/CodaFi/swift/commits/space-the-final-frontier</font></tt></a><tt><font size=2>)).<br>We should adopt a modification of that proposal that uses a method rather
than<br>an operator `<=>`:<br><br>```swift<br>enum SortOrder { case before, same, after }<br><br>protocol Comparable : Equatable {<br> func compared(to: Self) -> SortOrder<br> ...<br>}<br>```<br><br>This change will give us a syntactic platform on which to implement methods
with<br>additional, defaulted arguments, thereby unifying and regularizing comparison<br>across the library.<br><br>```swift<br>extension String {<br> func compared(to: Self) -> SortOrder<br><br>}<br>```<br><br>**Note:** `SortOrder` should bridge to `NSComparisonResult`. It's
also possible<br>that the standard library simply adopts Foundation's `ComparisonResult`
as is,<br>but we believe the community should at least consider alternate naming
before<br>that happens. There will be an opportunity to discuss the choices
in detail<br>when the modified<br>[Comparison Proposal](</font></tt><a href=https://gist.github.com/CodaFi/f0347bd37f1c407bf7ea0c429ead380e><tt><font size=2>https://gist.github.com/CodaFi/f0347bd37f1c407bf7ea0c429ead380e</font></tt></a><tt><font size=2>)
comes<br>up for review.<br><br>### `String` should be a `Collection` of `Character`s Again<br><br>In Swift 2.0, `String`'s `Collection` conformance was dropped, because
we<br>convinced ourselves that its semantics differed from those of `Collection`
too<br>significantly.<br><br>It was always well understood that if strings were treated as sequences
of<br>`UnicodeScalar`s, algorithms such as `lexicographicalCompare`, `elementsEqual`,<br>and `reversed` would produce nonsense results. Thus, in Swift 1.0, `String`
was<br>a collection of `Character` (extended grapheme clusters). During 2.0<br>development, though, we realized that correct string concatenation could<br>occasionally merge distinct grapheme clusters at the start and end of combined<br>strings.<br><br>This quirk aside, every aspect of strings-as-collections-of-graphemes appears
to<br>comport perfectly with Unicode. We think the concatenation problem is tolerable,<br>because the cases where it occurs all represent partially-formed constructs.
The<br>largest class—isolated combining characters such as ◌́ (U+0301 COMBINING
ACUTE<br>ACCENT)—are explicitly called out in the Unicode standard as<br>“[degenerate](</font></tt><a href=http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries><tt><font size=2>http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries</font></tt></a><tt><font size=2>)”
or<br>“[defective](</font></tt><a href=http://www.unicode.org/versions/Unicode9.0.0/ch03.pdf><tt><font size=2>http://www.unicode.org/versions/Unicode9.0.0/ch03.pdf</font></tt></a><tt><font size=2>)”.
The other<br>cases—such as a string ending in a zero-width joiner or half of a regional<br>indicator—appear to be equally transient and unlikely outside of a text
editor.<br><br>Admitting these cases encourages exploration of grapheme composition and
is<br>consistent with what appears to be an overall Unicode philosophy that “no<br>special provisions are made to get marginally better behavior for… cases
that<br>never occur in practice.”[2] Furthermore, it seems<br>unlikely to disturb the semantics of any plausible algorithms. We can handle<br>these cases by documenting them, explicitly stating that the elements of
a<br>`String` are an emergent property based on Unicode rules.<br><br>The benefits of restoring `Collection` conformance are substantial: <br><br> * Collection-like operations encourage experimentation with strings
to<br> investigate and understand their behavior. This is useful
for teaching new<br> programmers, but also good for experienced programmers who
want to<br> understand more about strings/unicode.<br> <br> * Extended grapheme clusters form a natural element boundary for
Unicode<br> strings. For example, searching and matching operations
will always produce<br> results that line up on grapheme cluster boundaries.<br> <br> * Character-by-character processing is a legitimate thing to do
in many real<br> use-cases, including parsing, pattern matching, and language-specific<br> transformations such as transliteration.<br> <br> * `Collection` conformance makes a wide variety of powerful operations<br> available that are appropriate to `String`'s default role
as the vehicle for<br> machine processed text.<br> <br> The methods `String` would inherit from `Collection`, where
similar to<br> higher-level string algorithms, have the right semantics.
For example,<br> grapheme-wise `lexicographicalCompare`, `elementsEqual`,
and application of<br> `flatMap` with case-conversion, produce the same results
one would expect<br> from whole-string ordering comparison, equality comparison,
and<br> case-conversion, respectively. `reversed` operates correctly
on graphemes,<br> keeping diacritics moored to their base characters and leaving
emoji intact.<br> Other methods such as `indexOf` and `contains` make obvious
sense. A few<br> `Collection` methods, like `min` and `max`, may not be particularly
useful<br> on `String`, but we don't consider that to be a problem worth
solving, in<br> the same way that we wouldn't try to suppress `min` and `max`
on a<br> `Set([UInt8])` that was used to store IP addresses.<br> <br> * Many of the higher-level operations that we want to provide for
`String`s,<br> such as parsing and pattern matching, should apply to any
`Collection`, and<br> many of the benefits we want for `Collections`, such<br> as unified slicing, should accrue<br> equally to `String`. Making `String` part of the same
protocol hierarchy<br> allows us to write these operations once and not worry about
keeping the<br> benefits in sync.<br> <br> * Slicing strings into substrings is a crucial part of the vocabulary
of<br> string processing, and all other sliceable things are `Collection`s.<br> Because of its collection-like behavior, users naturally
think of `String`<br> in collection terms, but run into frustrating limitations
where it fails to<br> conform and are left to wonder where all the differences
lie. Many simply<br> “correct” this limitation by declaring a trivial conformance:<br> <br> ```swift<br> extension String : BidirectionalCollection {}<br> ```<br> <br> Even if we removed indexing-by-element from `String`, users
could still do<br> this:<br> <br> ```swift<br> extension String : BidirectionalCollection {<br> subscript(i: Index) -> Character { return
characters[i] }<br> }<br> ```<br> <br> It would be much better to legitimize the conformance to
`Collection` and<br> simply document the oddity of any concatenation corner-cases,
than to deny<br> users the benefits on the grounds that a few cases are confusing.<br><br>Note that the fact that `String` is a collection of graphemes does *not*
mean<br>that string operations will necessarily have to do grapheme boundary<br>recognition. See the Unicode protocol section for details.<br><br>### `Character` and `CharacterSet`<br><br>`Character`, which represents a<br>Unicode<br>[extended grapheme cluster](</font></tt><a href=http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries><tt><font size=2>http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries</font></tt></a><tt><font size=2>),<br>is a bit of a black box, requiring conversion to `String` in order to<br>do any introspection, including interoperation with ASCII. To fix
this, we should:<br><br> - Add a `unicodeScalars` view much like `String`'s, so that the sub-structure<br> of grapheme clusters is discoverable.<br> - Add a failable `init` from sequences of scalars (returning nil for sequences<br> that contain 0 or 2+ graphemes).<br> - (Lower priority) expose some operations, such as `func uppercase() -><br> String`, `var isASCII: Bool`, and, to the extent they can be sensibly<br> generalized, queries of unicode properties that should also be
exposed on<br> `UnicodeScalar` such as `isAlphabetic` and `isGraphemeBase` .<br><br>Despite its name, `CharacterSet` currently operates on the Swift `UnicodeScalar`<br>type. This means it is usable on `String`, but only by going through the
unicode<br>scalar view. To deal with this clash in the short term, `CharacterSet`
should be<br>renamed to `UnicodeScalarSet`. In the longer term, it may be appropriate
to<br>introduce a `CharacterSet` that provides similar functionality for extended<br>grapheme clusters.[5]<br><br>### Unification of Slicing Operations<br><br>Creating substrings is a basic part of String processing, but the slicing<br>operations that we have in Swift are inconsistent in both their spelling
and<br>their naming: <br><br> * Slices with two explicit endpoints are done with subscript, and
support<br> in-place mutation:<br> <br> ```swift<br> s[i..<j].mutate()<br> ```<br><br> * Slicing from an index to the end, or from the start to an index,
is done<br> with a method and does not support in-place mutation:<br> ```swift<br> s.prefix(upTo: i).readOnly()<br> ```<br><br>Prefix and suffix operations should be migrated to be subscripting operations<br>with one-sided ranges, i.e. `s.prefix(upTo: i)` should become `s[..<i]`,
as<br>in<br>[this proposal](</font></tt><a href="https://github.com/apple/swift-evolution/blob/9cf2685293108ea3efcbebb7ee6a8618b83d4a90/proposals/0132-sequence-end-ops.md"><tt><font size=2>https://github.com/apple/swift-evolution/blob/9cf2685293108ea3efcbebb7ee6a8618b83d4a90/proposals/0132-sequence-end-ops.md</font></tt></a><tt><font size=2>).<br>With generic subscripting in the language, that will allow us to collapse
a wide<br>variety of methods and subscript overloads into a single implementation,
and<br>give users an easy-to-use and composable way to describe subranges.<br><br>Further extending this EDSL to integrate use-cases like `s.prefix(maxLength:
5)`<br>is an ongoing research project that can be considered part of the potential<br>long-term vision of text (and collection) processing.<br><br>### Substrings<br><br>When implementing substring slicing, languages are faced with three options:<br><br>1. Make the substrings the same type as string, and share storage.<br>2. Make the substrings the same type as string, and copy storage when making
the substring.<br>3. Make substrings a different type, with a storage copy on conversion
to string.<br><br>We think number 3 is the best choice. A walk-through of the tradeoffs follows.<br><br>#### Same type, shared storage<br><br>In Swift 3.0, slicing a `String` produces a new `String` that is a view
into a<br>subrange of the original `String`'s storage. This is why `String` is 3
words in<br>size (the start, length and buffer owner), unlike the similar `Array` type<br>which is only one.<br><br>This is a simple model with big efficiency gains when chopping up strings
into<br>multiple smaller strings. But it does mean that a stored substring keeps
the<br>entire original string buffer alive even after it would normally have been<br>released.<br><br>This arrangement has proven to be problematic in other programming languages,<br>because applications sometimes extract small strings from large ones and
keep<br>those small strings long-term. That is considered a memory leak and was
enough<br>of a problem in Java that they changed from substrings sharing storage
to<br>making a copy in 1.7.<br><br>#### Same type, copied storage<br><br>Copying of substrings is also the choice made in C#, and in the default<br>`NSString` implementation. This approach avoids the memory leak issue,
but has<br>obvious performance overhead in performing the copies.<br><br>This in turn encourages trafficking in string/range pairs instead of in<br>substrings, for performance reasons, leading to API challenges. For example:<br><br>```swift<br>foo.compare(bar, range: start..<end)<br>```<br><br>Here, it is not clear whether `range` applies to `foo` or `bar`. This<br>relationship is better expressed in Swift as a slicing operation:<br><br>```swift<br>foo[start..<end].compare(bar)<br>```<br><br>Not only does this clarify to which string the range applies, it also brings<br>this sub-range capability to any API that operates on `String` "for
free". So<br>these other combinations also work equally well:<br><br>```swift<br>// apply range on argument rather than target<br>foo.compare(bar[start..<end])<br>// apply range on both<br>foo[start..<end].compare(bar[start1..<end1])<br>// compare two strings ignoring first character<br>foo.dropFirst().compare(bar.dropFirst())<br>```<br><br>In all three cases, an explicit range argument need not appear on the `compare`<br>method itself. The implementation of `compare` does not need to know anything<br>about ranges. Methods need only take range arguments when that was an<br>integral part of their purpose (for example, setting the start and end
of a<br>user's current selection in a text box).<br><br>#### Different type, shared storage<br><br>The desire to share underlying storage while preventing accidental memory
leaks<br>occurs with slices of `Array`. For this reason we have an `ArraySlice`
type.<br>The inconvenience of a separate type is mitigated by most operations used
on<br>`Array` from the standard library being generic over `Sequence` or `Collection`.<br><br>We should apply the same approach for `String` by introducing a distinct<br>`SubSequence` type, `Substring`. Similar advice given for `ArraySlice`
would apply to `Substring`:<br><br>> Important: Long-term storage of `Substring` instances is discouraged.
A<br>> substring holds a reference to the entire storage of a larger string,
not<br>> just to the portion it presents, even after the original string's
lifetime<br>> ends. Long-term storage of a `Substring` may therefore prolong the
lifetime<br>> of large strings that are no longer otherwise accessible, which can
appear<br>> to be memory leakage.<br><br>When assigning a `Substring` to a longer-lived variable (usually a stored<br>property) explicitly of type `String`, a type conversion will be performed,
and<br>at this point the substring buffer is copied and the original string's
storage<br>can be released.<br><br>A `String` that was not its own `Substring` could be one word—a single
tagged<br>pointer—without requiring additional allocations. `Substring`s would be
a view<br>onto a `String`, so are 3 words - pointer to owner, pointer to start, and
a<br>length. The small string optimization for `Substring` would take advantage
of<br>the larger size, probably with a less compressed encoding for speed.<br><br>The downside of having two types is the inconvenience of sometimes having
a<br>`Substring` when you need a `String`, and vice-versa. It is likely this
would<br>be a significantly bigger problem than with `Array` and `ArraySlice`, as<br>slicing of `String` is such a common operation. It is especially relevant
to<br>existing code that assumes `String` is the currency type. To ease the pain
of<br>type mismatches, `Substring` should be a subtype of `String` in the same
way<br>that `Int` is a subtype of `Optional<Int>`. This would give users
an implicit<br>conversion from `Substring` to `String`, as well as the usual implicit<br>conversions such as `[Substring]` to `[String]` that other subtype<br>relationships receive.<br><br>In most cases, type inference combined with the subtype relationship should<br>make the type difference a non-issue and users will not care which type
they</font></tt><br><tt><font size=2>are using. For flexibility and optimizability, most
operations from the<br>standard library will traffic in generic models of<br>[`Unicode`](#the--code-unicode--code--protocol).<br><br>##### Guidance for API Designers<br><br>In this model, **if a user is unsure about which type to use, `String`
is always<br>a reasonable default**. A `Substring` passed where `String` is expected
will be<br>implicitly copied. When compared to the “same type, copied storage” model,
we<br>have effectively deferred the cost of copying from the point where a substring<br>is created until it must be converted to `String` for use with an API.<br><br>A user who needs to optimize away copies altogether should use this guideline:<br>if for performance reasons you are tempted to add a `Range` argument to
your<br>method as well as a `String` to avoid unnecessary copies, you should instead<br>use `Substring`.<br><br>##### The “Empty Subscript”<br><br>To make it easy to call such an optimized API when you only have a `String`
(or<br>to call any API that takes a `Collection`'s `SubSequence` when all you
have is<br>the `Collection`), we propose the following “empty subscript” operation,<br><br>```swift<br>extension Collection {<br> subscript() -> SubSequence { <br> return self[startIndex..<endIndex] <br> }<br>}<br>```<br><br>which allows the following usage:<br><br>```swift<br>funcThatIsJustLooking(at: person.name[]) // pass person.name as Substring<br>```<br><br>The `[]` syntax can be offered as a fixit when needed, similar to `&`
for an<br>`inout` argument. While it doesn't help a user to convert `[String]` to<br>`[Substring]`, the need for such conversions is extremely rare, and it can be
done with<br>a simple `map` (which could also be offered by a fixit):<br><br>```swift<br>takesAnArrayOfSubstring(arrayOfString.map { $0[] })<br>```<br><br>#### Other Options Considered<br><br>As we have seen, all three options above have downsides, but it's possible<br>these downsides could be eliminated/mitigated by the compiler. We are proposing<br>one such mitigation—implicit conversion—as part of the the "different
type,<br>shared storage" option, to help avoid the cognitive load on developers
of<br>having to deal with a separate `Substring` type.<br><br>To avoid the memory leak issues of a "same type, shared storage"
substring<br>option, we considered whether the compiler could perform an implicit copy
of<br>the underlying storage when it detects the string is being "stored"
for long<br>term usage, say when it is assigned to a stored property. The trouble with
this<br>approach is it is very difficult for the compiler to distinguish between<br>long-term storage versus short-term in the case of abstractions that rely
on<br>stored properties. For example, should the storing of a substring inside
an<br>`Optional` be considered long-term? Or the storing of multiple substrings<br>inside an array? The latter would not work well in the case of a<br>`components(separatedBy:)` implementation that intended to return an array
of<br>substrings. It would also be difficult to distinguish intentional medium-term<br>storage of substrings, say by a lexer. There does not appear to be an effective<br>consistent rule that could be applied in the general case for detecting
when a<br>substring is truly being stored long-term.<br><br>To avoid the cost of copying substrings under "same type, copied storage",
the<br>optimizer could be enhanced to reduce the impact of some of those copies.<br>For example, this code could be optimized to pull the invariant substring
out<br>of the loop:<br><br>```swift<br>for _ in 0..<lots { <br> someFunc(takingString: bigString[bigRange]) <br>}<br>```<br><br>It's worth noting that a similar optimization is needed to avoid an equivalent<br>problem with implicit conversion in the "different type, shared storage"
case:<br><br>```swift<br>let substring = bigString[bigRange]<br>for _ in 0..<lots { someFunc(takingString: substring) }<br>```<br><br>However, in the case of "same type, copied storage" there are
many use cases<br>that cannot be optimized as easily. Consider the following simple definition
of<br>a recursive `contains` algorithm, which when substring slicing is linear
makes<br>the overall algorithm quadratic:<br><br>```swift<br>extension String {<br> func containsChar(_ x: Character) -> Bool {<br> return !isEmpty && (first == x || dropFirst().containsChar(x))<br> }<br>}<br>```<br><br>For the optimizer to eliminate this problem is unrealistic, forcing the
user to<br>remember to optimize the code to not use string slicing if they want it
to be<br>efficient (assuming they remember):<br><br>```swift<br>extension String {<br> // add optional argument tracking progress through the string<br> func containsCharacter(_ x: Character, atOrAfter idx: Index?
= nil) -> Bool {<br> let idx = idx ?? startIndex<br> return idx != endIndex<br> && (self[idx] == x ||
containsCharacter(x, atOrAfter: index(after: idx)))<br> }<br>}<br>```<br><br>#### Substrings, Ranges and Objective-C Interop<br><br>The pattern of passing a string/range pair is common in several Objective-C<br>APIs, and is made especially awkward in Swift by the non-interchangeability
of<br>`Range<String.Index>` and `NSRange`. <br><br>```swift<br>s2.find(s2, sourceRange: NSRange(j..<s2.endIndex, in: s2))<br>```<br><br>In general, however, the Swift idiom for operating on a sub-range of a<br>`Collection` is to *slice* the collection and operate on that:<br><br>```swift<br>s2.find(s2[j..<s2.endIndex])<br>```<br><br>Therefore, APIs that operate on an `NSString`/`NSRange` pair should be
imported<br>without the `NSRange` argument. The Objective-C importer should be
changed to<br>give these APIs special treatment so that when a `Substring` is passed,
instead<br>of being converted to a `String`, the full `NSString` and range are passed
to<br>the Objective-C method, thereby avoiding a copy.<br><br>As a result, you would never need to pass an `NSRange` to these APIs, which<br>solves the impedance problem by eliminating the argument, resulting in
more<br>idiomatic Swift code while retaining the performance benefit. To
help users<br>manually handle any cases that remain, Foundation should be augmented to
allow<br>the following syntax for converting to and from `NSRange`:<br><br>```swift<br>let nsr = NSRange(i..<j, in: s) // An NSRange corresponding to s[i..<j]<br>let iToJ = Range(nsr, in: s) // Equivalent to i..<j<br>```<br><br>### The `Unicode` protocol<br><br>With `Substring` and `String` being distinct types and sharing almost all<br>interface and semantics, and with the highest-performance string processing<br>requiring knowledge of encoding and layout that the currency types can't<br>provide, it becomes important to capture the common “string API” in a
protocol.<br>Since Unicode conformance is a key feature of string processing in Swift,
we<br>call that protocol `Unicode`:<br><br>**Note:** The following assumes several features that are planned but not
yet implemented in<br> Swift, and should be considered a sketch rather than a final design.<br> <br>```swift<br>protocol Unicode <br> : Comparable, BidirectionalCollection where Element == Character
{<br> <br> associatedtype Encoding : UnicodeEncoding<br> var encoding: Encoding { get }<br> <br> associatedtype CodeUnits <br> : RandomAccessCollection where Element == Encoding.CodeUnit<br> var codeUnits: CodeUnits { get }<br> <br> associatedtype UnicodeScalars <br> : BidirectionalCollection where Element == UnicodeScalar<br> var unicodeScalars: UnicodeScalars { get }<br><br> associatedtype ExtendedASCII <br> : BidirectionalCollection where Element == UInt32<br> var extendedASCII: ExtendedASCII { get }<br>}<br><br>extension Unicode {<br> // ... define high-level non-mutating string operations, e.g. search
...<br><br> func compared<Other: Unicode>(<br> to rhs: Other,<br> case caseSensitivity: StringSensitivity? = nil,<br> diacritic diacriticSensitivity: StringSensitivity? = nil,<br> width widthSensitivity: StringSensitivity? = nil,<br> in locale: Locale? = nil<br> ) -> SortOrder { ... }<br>}<br><br>extension Unicode : RangeReplaceableCollection where CodeUnits :<br> RangeReplaceableCollection {<br> // Satisfy protocol requirement<br> mutating func replaceSubrange<C : Collection>(_: Range<Index>,
with: C) <br> where C.Element == Element<br> <br> // ... define high-level mutating string operations, e.g. replace
...<br>}<br><br>```<br><br>The goal is that `Unicode` exposes the underlying encoding and code units
in<br>such a way that for types with a known representation (e.g. a high-performance<br>`UTF8String`) that information can be known at compile-time and can be
used to<br>generate a single path, while still allowing types like `String` that admit<br>multiple representations to use runtime queries and branches to fast path<br>specializations.<br><br>**Note:** `Unicode` would make a fantastic namespace for much of<br>what's in this proposal if we could get the ability to nest types and<br>protocols in protocols.<br><br><br>### Scanning, Matching, and Tokenization<br><br>#### Low-Level Textual Analysis<br><br>We should provide convenient APIs for processing strings by character. For
example,<br>it should be easy to cleanly express, “if this string starts with `"f"`,
process<br>the rest of the string as follows…” Swift is well-suited to expressing
this<br>common pattern beautifully, but we need to add the APIs. Here are
two examples<br>of the sort of code that might be possible given such APIs:<br><br>```swift<br>if let firstLetter = input.droppingPrefix(alphabeticCharacter) {<br> somethingWith(input) // process the rest of input<br>}<br><br>if let (number, restOfInput) = input.parsingPrefix(Int.self) {<br> ...<br>}<br>```<br><br>The specific spelling and functionality of APIs like this are TBD. The
larger<br>point is to make sure matching-and-consuming jobs are well-supported.<br><br>#### Unified Pattern Matcher Protocol<br><br>Many of the current methods that do matching are overloaded to do the same<br>logical operations in different ways, with the following axes:<br><br>- Logical Operation: `find`, `split`, `replace`, match at start<br>- Kind of pattern: `CharacterSet`, `String`, a regex, a closure<br>- Options, e.g. case/diacritic sensitivity, locale. Sometimes a part
of<br> the method name, and sometimes an argument<br>- Whole string or subrange.<br><br>We should represent these aspects as orthogonal, composable components,<br>abstracting pattern matchers into a protocol like<br>[this one](</font></tt><a href=https://github.com/apple/swift/blob/master/test/Prototypes/PatternMatching.swift#L33><tt><font size=2>https://github.com/apple/swift/blob/master/test/Prototypes/PatternMatching.swift#L33</font></tt></a><tt><font size=2>),<br>that can allow us to define logical operations once, without introducing<br>overloads, and massively reducing API surface area.<br><br>For example, using the strawman prefix `%` syntax to turn string literals
into<br>patterns, the following pairs would all invoke the same generic methods:<br><br>```swift<br>if let found = s.firstMatch(%"searchString") { ... }<br>if let found = s.firstMatch(someRegex) { ... }<br><br>for m in s.allMatches((%"searchString"), case: .insensitive)
{ ... }<br>for m in s.allMatches(someRegex) { ... }<br><br>let items = s.split(separatedBy: ", ")<br>let tokens = s.split(separatedBy: CharacterSet.whitespace)<br>```<br><br>Note that, because Swift requires the indices of a slice to match the indices
of<br>the range from which it was sliced, operations like `firstMatch` can return
a<br>`Substring?` in lieu of a `Range<String.Index>?`: the indices of
the match in<br>the string being searched, if needed, can easily be recovered as the<br>`startIndex` and `endIndex` of the `Substring`.<br><br>Note also that matching operations are useful for collections in general,
and<br>would fall out of this proposal:<br><br>```swift<br>// replace subsequences of contiguous NaNs with zero<br>forces.replace(oneOrMore([Float.nan]), [0.0])<br>```<br><br>#### Regular Expressions<br><br>Addressing regular expressions is out of scope for this proposal.<br>That said, it is important to note that the pattern matching protocol mentioned<br>above provides a suitable foundation for regular expressions, and types
such as<br>`NSRegularExpression` can easily be retrofitted to conform to it. In
the<br>future, support for regular expression literals in the compiler could allow
for<br>compile-time syntax checking and optimization.<br><br>### String Indices<br><br>`String` currently has four views—`characters`, `unicodeScalars`, `utf8`,
and<br>`utf16`—each with its own opaque index type. The APIs used to translate
indices<br>between views add needless complexity, and the opacity of indices makes
them<br>difficult to serialize.<br><br>The index translation problem has two aspects:<br><br> 1. `String` views cannot consume one another's indices without a
cumbersome<br> conversion step. An index into a `String`'s `characters`
must be translated<br> before it can be used as a position in its `unicodeScalars`.
Although these<br> translations are rarely needed, they add conceptual and API
complexity.<br> 2. Many APIs in the core libraries and other frameworks still expose
`String`<br> positions as `Int`s and regions as `NSRange`s, which can
only reference a<br> `utf16` view and interoperate poorly with `String` itself.<br><br>#### Index Interchange Among Views<br><br>String's need for flexible backing storage and reasonably-efficient indexing<br>(i.e. without dynamically allocating and reference-counting the indices<br>themselves) means indices need an efficient underlying storage type. Although<br>we do not wish to expose `String`'s indices *as* integers, `Int` offsets
into<br>underlying code unit storage make a good underlying storage type, provided<br>`String`'s underlying storage supports random-access. We think random-access<br>*code-unit storage* is a reasonable requirement to impose on all `String`<br>instances.<br><br>Making these `Int` code unit offsets conveniently accessible and constructible<br>solves the serialization problem:<br><br>```swift<br>clipboard.write(s.endIndex.codeUnitOffset)<br>let offset = clipboard.read(Int.self)<br>let i = String.Index(codeUnitOffset: offset)<br>```<br><br>Index interchange between `String` and its `unicodeScalars`, `codeUnits`,<br>and [`extendedASCII`](#parsing-ascii-structure) views can be made entirely<br>seamless by having them share an index type (semantics of indexing a `String`<br>between grapheme cluster boundaries are TBD—it can either trap or be forgiving).<br>Having a common index allows easy traversal into the interior of graphemes,<br>something that is often needed, without making it likely that someone will
do it<br>by accident.<br><br> - `String.index(after:)` should advance to the next grapheme, even when
the<br> index points partway through a grapheme.<br> <br> - `String.index(before:)` should move to the start of the grapheme before<br> the current position.<br><br>Seamless index interchange between `String` and its UTF-8 or UTF-16 views
is not<br>crucial, as the specifics of encoding should not be a concern for most
use<br>cases, and would impose needless costs on the indices of other views. That<br>said, we can make translation much more straightforward by exposing simple<br>bidirectional converting `init`s on both index types:<br><br>```swift<br>let u8Position = String.UTF8.Index(someStringIndex)<br>let originalPosition = String.Index(u8Position)<br>```<br><br>#### Index Interchange with Cocoa<br><br>We intend to address `NSRange`s that denote substrings in Cocoa APIs as<br>described [later in this document](#substrings--ranges-and-objective-c-interop).<br>That leaves the interchange of bare indices with Cocoa APIs trafficking
in<br>`Int`. Hopefully such APIs will be rare, but when needed, the following<br>extension, which would be useful for all `Collection`s, can help:<br><br>```swift<br>extension Collection {<br> func index(offset: IndexDistance) -> Index {<br> return index(startIndex, offsetBy: offset)<br> }<br> func offset(of i: Index) -> IndexDistance {<br> return distance(from: startIndex, to: i)<br> }<br>}<br>```<br><br>Then indices can easily be translated to and from integer offsets into a `String`'s `utf16`<br>view for consumption by Cocoa:<br><br>```swift<br>let cocoaIndex = s.utf16.offset(of: String.UTF16Index(i))<br>let swiftIndex = s.utf16.index(offset: cocoaIndex)<br>```<br><br>### Formatting<br><br>A full treatment of formatting is out of scope of this proposal, but<br>we believe it's crucial for completing the text processing picture. This<br>section details some of the existing issues and thinking that may guide
future<br>development.<br><br>#### Printf-Style Formatting<br><br>`String.format` is designed on the `printf` model: it takes a format string
with<br>textual placeholders for substitution, and an arbitrary list of other arguments.<br>The syntax and meaning of these placeholders has a long history in<br>C, but for anyone who doesn't use them regularly they are cryptic and complex,<br>as the `printf (3)` man page attests.<br><br>Aside from complexity, this style of API has two major problems: First,
the<br>spelling of these placeholders must match up to the types of the arguments,
in<br>the right order, or the behavior is undefined. Some limited support
for<br>compile-time checking of this correspondence could be implemented, but
only for<br>the cases where the format string is a literal. Second, there's no reasonable<br>way to extend the formatting vocabulary to cover the needs of new types:
you are<br>stuck with what's in the box.<br><br>#### Foundation Formatters<br><br>The formatters supplied by Foundation are highly capable and versatile,
offering<br>both formatting and parsing services. When used for formatting, though,
the<br>design pattern demands more from users than it should:<br><br> * Matching the type of data being formatted to a formatter type<br> * Creating an instance of that type<br> * Setting stateful options (`currency`, `dateStyle`) on the type.
Note: the<br> need for this step prevents the instance from being used
and discarded in<br> the same expression where it is created.<br> * Overall, introduction of needless verbosity into source<br><br>These may seem like small issues, but the experience of Apple localization<br>experts is that the total drag of these factors on programmers is such
that they<br>tend to reach for `String.format` instead.<br><br>#### String Interpolation<br><br>Swift string interpolation provides a user-friendly alternative to printf's<br>domain-specific language (just write ordinary Swift code!) and its type
safety<br>problems (put the data right where it belongs!) but the following issues
prevent<br>it from being useful for localized formatting (among other jobs):<br><br> * [SR-2303](</font></tt><a href="https://bugs.swift.org/browse/SR-2303"><tt><font size=2>https://bugs.swift.org/browse/SR-2303</font></tt></a><tt><font size=2>)
We are unable to restrict<br> types used in string interpolation.<br> * [SR-1260](</font></tt><a href="https://bugs.swift.org/browse/SR-1260"><tt><font size=2>https://bugs.swift.org/browse/SR-1260</font></tt></a><tt><font size=2>)
String interpolation can't<br> distinguish (fragments of) the base string from the string
substitutions.<br><br>In the long run, we should improve Swift string interpolation to the point
where<br>it can participate in most any formatting job. Mostly this centers
around<br>fixing the interpolation protocols per the previous item, and supporting<br>localization.<br><br>To be able to use formatting effectively inside interpolations, it needs
to be<br>both lightweight (because it all happens in-situ) and discoverable. One
<br>approach would be to standardize on `format` methods, e.g.:<br><br>```swift<br>"Column 1: \(n.format(radix:16, width:8)) *** \(message)"<br><br>"Something with leading zeroes: \(x.format(fill: zero, width:8))"<br>```<br><br>### C String Interop<br><br>Our support for interoperation with nul-terminated C strings is scattered
and<br>incoherent, with six ways to transform a C string into a `String` and four
ways to<br>do the inverse. These APIs should be replaced with the following:<br><br>```swift<br>extension String {<br> /// Constructs a `String` having the same contents as `nulTerminatedUTF8`.<br> ///<br> /// - Parameter nulTerminatedUTF8: a sequence of contiguous UTF-8
encoded <br> /// bytes ending just before the first zero byte (NUL character).<br> init(cString nulTerminatedUTF8: UnsafePointer<CChar>)<br> <br> /// Constructs a `String` having the same contents as `nulTerminatedCodeUnits`.<br> ///<br> /// - Parameter nulTerminatedCodeUnits: a sequence of contiguous
code units in<br> /// the given `encoding`, ending just before the first zero
code unit.<br> /// - Parameter encoding: describes the encoding in which the code
units<br> /// should be interpreted.<br> init<Encoding: UnicodeEncoding>(<br> cString nulTerminatedCodeUnits: UnsafePointer<Encoding.CodeUnit>,<br> encoding: Encoding)<br> <br> /// Invokes the given closure on the contents of the string, represented
as a<br> /// pointer to a nul-terminated sequence of UTF-8 code units.<br> func withCString<Result>(<br> _ body: (UnsafePointer<CChar>) throws -> Result)
rethrows -> Result<br>}<br>```<br><br>In both of the construction APIs, any invalid encoding sequence detected
will<br>have its longest valid prefix replaced by U+FFFD, the Unicode replacement<br>character, per the Unicode specification. This covers the common case.
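<br><br>As a rough sketch of what this kind of repair looks like in practice, the snippet below uses `String(decoding:as:)` and `String(cString:)` from the shipping standard library, not the initializers proposed above, and the byte values are made up for illustration. An invalid byte shows up as U+FFFD in the result:<br><br>

```swift
// Bytes for "a", one invalid UTF-8 byte (0xFF), then "b".
// These values are illustrative only.
let bytes: [UInt8] = [0x61, 0xFF, 0x62]

// Decoding repairs ill-formed input instead of failing:
// the invalid byte is replaced by U+FFFD.
let repaired = String(decoding: bytes, as: UTF8.self)

// The same bytes, nul-terminated, passed through the C string initializer.
let cBytes: [UInt8] = bytes + [0]
let fromC = cBytes.withUnsafeBufferPointer { buffer in
    String(cString: buffer.baseAddress!)
}

print(repaired == "a\u{FFFD}b")
print(fromC == repaired)
```

<br><br>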
The<br>replacement is done *physically* in the underlying storage and the validity
of<br>the result is recorded in the `String`'s `encoding` so that future accesses<br>need not be slowed down by separate error repair.<br><br>Construction that is aborted when encoding errors are detected can be<br>accomplished using APIs on the `encoding`. String types that retain
their<br>physical encoding even in the presence of errors and are repaired on-the-fly
can<br>be built as different instances of the `Unicode` protocol.<br><br>### Unicode 9 Conformance<br><br>Unicode 9 (and MacOS 10.11) brought us support for family emoji, which
changes<br>the process of properly identifying `Character` boundaries. We need
to update<br>`String` to account for this change.<br><br>### High-Performance String Processing<br><br>Many strings are short enough to store in 64 bits, many can be stored using
only<br>8 bits per Unicode scalar, others are best encoded in UTF-16, and some
come to<br>us already in some other encoding, such as UTF-8, that would be costly
to<br>translate. Supporting these formats while maintaining usability for<br>general-purpose APIs demands that a single `String` type can be backed
by many<br>different representations.<br><br>That said, the highest performance code always requires static knowledge
of the<br>data structures on which it operates, and for this code, dynamic selection
of<br>representation comes at too high a cost. Heavy-duty text processing
demands a<br>way to opt out of dynamism and directly use known encodings. Having
this<br>ability can also make it easy to cleanly specialize code that handles dynamic<br>cases for maximal efficiency on the most common representations.<br><br>To address this need, we can build models of the `Unicode` protocol that
encode<br>representation information into the type, such as `NFCNormalizedUTF16String`.<br><br>### Parsing ASCII Structure<br><br>Although many machine-readable formats support the inclusion of arbitrary<br>Unicode text, it is also common that their fundamental structure lies entirely<br>within the ASCII subset (JSON, YAML, many XML formats). These formats
are often<br>processed most efficiently by recognizing ASCII structural elements as
ASCII,<br>and capturing the arbitrary sections between them in more-general strings.
The<br>current String API offers no way to efficiently recognize ASCII and skip
past<br>everything else without the overhead of full decoding into Unicode scalars.<br><br>For these purposes, strings should supply an `extendedASCII` view that
is a<br>collection of `UInt32`, where values less than `0x80` represent the<br>corresponding ASCII character, and other values represent data that is
specific<br>to the underlying encoding of the string.<br><br>## Language Support<br><br>This proposal depends on two new features in the Swift language:<br><br>1. **Generic subscripts**, to<br> enable unified slicing syntax.<br><br>2. **A subtype relationship** between<br> `Substring` and `String`, enabling framework APIs to traffic solely
in<br> `String` while still making it possible to avoid copies by handling<br> `Substring`s where necessary.<br><br>Additionally, **the ability to nest types and protocols inside<br>protocols** could significantly shrink the footprint of this proposal<br>on the top-level Swift namespace.<br><br><br>## Open Questions<br><br>### Must `String` be limited to storing UTF-16 subset encodings?<br><br>- The ability to handle `UTF-8`-encoded strings (models of `Unicode`) is
not in<br> question here; this is about what encodings must be storable, without<br> transcoding, in the common currency type called “`String`”.<br>- ASCII, Latin-1, UCS-2, and UTF-16 are UTF-16 subsets. UTF-8 is
not.<br>- If we have a way to get at a `String`'s code units, we need a concrete
type in<br> which to express them in the API of `String`, which is a concrete
type.<br>- If `String` needs to be able to represent UTF-32, presumably the code units
need<br> to be `UInt32`.<br>- Not supporting UTF-32-encoded text seems like one reasonable design choice.<br>- Maybe we can allow UTF-8 storage in `String` and expose its code units
as<br> `UInt16`, just as we would for Latin-1.<br>- Supporting only UTF-16-subset encodings would imply that `String` indices
can<br> be serialized without recording the `String`'s underlying encoding.<br><br>### Do we need a type-erasable base protocol for UnicodeEncoding?<br><br>UnicodeEncoding has an associated type, but it may be important to be able
to<br>traffic in completely dynamic encoding values, e.g. for “tell me the most<br>efficient encoding for this string.”<br><br>### Should there be a string “facade?”<br><br>One possible design alternative makes `Unicode` a vehicle for expressing<br>the storage and encoding of code units, but does not attempt to give it
an API<br>appropriate for `String`. Instead, string APIs would be provided
by a generic<br>wrapper around an instance of `Unicode`:<br><br>```swift<br>struct StringFacade<U: Unicode> : BidirectionalCollection {<br><br> // ...APIs for high-level string processing here...<br> <br> var unicode: U // access to lower-level unicode details<br>}<br><br>typealias String = StringFacade<StringStorage><br>typealias Substring = StringFacade<StringStorage.SubSequence><br>```<br><br>This design would allow us to de-emphasize lower-level `String` APIs such
as<br>access to the specific encoding, by putting them behind a `.unicode` property.<br>A similar effect in a facade-less design would require a new top-level<br>`StringProtocol` playing the role of the facade with an `associatedtype<br>Storage : Unicode`.</font></tt><br><tt><font size=2><br>An interesting variation on this design is possible if defaulted generic<br>parameters are introduced to the language:<br><br>```swift<br>struct String<U: Unicode = StringStorage> <br> : BidirectionalCollection {<br><br> // ...APIs for high-level string processing here...<br> <br> var unicode: U // access to lower-level unicode details<br>}<br><br>typealias Substring = String<StringStorage.SubSequence><br>```<br><br>One advantage of such a design is that naïve users will always extend “the
right<br>type” (`String`) without thinking, and the new APIs will show up on `Substring`,<br>`MyUTF8String`, etc. That said, it also has downsides that should
not be<br>overlooked, not least of which is the confusability of the meaning of the
word<br>“string.” Is it referring to the generic or the concrete type?<br><br>### `TextOutputStream` and `TextOutputStreamable`<br><br>`TextOutputStreamable` is intended to provide a vehicle for<br>efficiently transporting formatted representations to an output stream<br>without forcing the allocation of storage. Its use of `String`, a<br>type with multiple representations, at the lowest-level unit of<br>communication, conflicts with this goal. It might be sufficient to<br>change `TextOutputStream` and `TextOutputStreamable` to traffic in an<br>associated type conforming to `Unicode`, but that is not yet clear.<br>This area will require some design work.<br><br>### `description` and `debugDescription`<br><br>* Should these be creating localized or non-localized representations?<br>* Is returning a `String` efficient enough?<br>* Is `debugDescription` pulling the weight of the API surface area it adds?<br><br>### `StaticString`<br><br>`StaticString` was added as a byproduct of standard library development and
kept<br>around because it seemed useful, but it was never truly *designed* for
client<br>programmers. We need to decide what happens with it. Presumably
*something*<br>should fill its role, and that should conform to `Unicode`.<br><br>## Footnotes<br><br><b id="f0">0</b> The integers rewrite currently underway
is expected to<br> substantially reduce the scope of `Int`'s API by using more<br> generics. [↩](#a0)<br><br><b id="f1">1</b> In practice, these semantics will
usually be tied to the<br>version of the installed [ICU](</font></tt><a href="http://icu-project.org/"><tt><font size=2>http://icu-project.org</font></tt></a><tt><font size=2>)
library, which<br>programmatically encodes the most complex rules of the Unicode Standard
and its<br>de-facto extension, CLDR.[↩](#a1)<br><br><b id="f2">2</b><br>See<br>[</font></tt><a href=http://unicode.org/reports/tr29/#Notation><tt><font size=2>http://unicode.org/reports/tr29/#Notation</font></tt></a><tt><font size=2>](</font></tt><a href=http://unicode.org/reports/tr29/#Notation><tt><font size=2>http://unicode.org/reports/tr29/#Notation</font></tt></a><tt><font size=2>).
Note<br>that inserting Unicode scalar values to prevent merging of grapheme clusters
would<br>also constitute a kind of misbehavior (one of the clusters at the boundary
would<br>not be found in the result), so would be relatively costly to implement,
with<br>little benefit. [↩](#a2)<br><br><b id="f4">4</b> The use of non-UCA-compliant ordering
is fully sanctioned by<br> the Unicode standard for this purpose. In fact there's<br> a [whole chapter](</font></tt><a href=http://www.unicode.org/versions/Unicode9.0.0/ch05.pdf><tt><font size=2>http://www.unicode.org/versions/Unicode9.0.0/ch05.pdf</font></tt></a><tt><font size=2>)<br> dedicated to it. In particular, §5.17 says:<br><br> > When comparing text that is visible to end users, a correct
linguistic sort<br> > should be used, as described in _Section 5.16, Sorting and<br> > Searching_. However, in many circumstances the only requirement
is for a<br> > fast, well-defined ordering. In such cases, a binary ordering
can be used.<br><br> [↩](#a4)<br><br><br><b id="f5">5</b> The queries supported by `NSCharacterSet`
map directly onto<br>properties in a table that's indexed by Unicode scalar value. This
table is<br>part of the Unicode standard. Some of these queries (e.g., “is this
an<br>uppercase character?”) may have fairly obvious generalizations to grapheme<br>clusters, but exactly how to do it is a research topic and *ideally* we'd
either<br>establish the existing practice that the Unicode committee would standardize,
or<br>the Unicode committee would do the research and we'd implement their<br>result.[↩](#a5)<br></font></tt><br><br><BR>