[swift-users] Splitting a string into "natural/visual character" components?

Jens Persson jens at bitcycle.com
Fri May 12 03:43:31 CDT 2017


I want a function f such that:

f("abc") == ["a", "b", "c"]

f("café") == ["c", "a", "f", "é"]

f("👨‍👩‍👧‍👦👷🏾‍♀️") == ["👨‍👩‍👧‍👦", "👷🏾‍♀️"]

I'm not sure if the last example renders correctly by mail for everyone but
the input String contains these _two_ "natural/visual characters":
(1) A family emoji
(2) a construction worker (woman, with skin tone modifier) emoji.
and the result is an Array of two strings (one for each emoji).

The first two examples are easy, the third example is the tricky one.

Is there a (practical) way to do this (in Swift 3)?

/Jens



PS

It's OK if the function has to depend on e.g. a graphics context etc.
(I tried writing a function so that it extracts the glyphs, using
NSTextStorage, NSLayoutManager and the AppleColorEmoji font, but it says
that "👨‍👩‍👧‍👦👷🏾‍♀️" contains 18(!) glyphs, whereas e.g. "café" contains
4 as expected.)
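For reference, here is a small sketch that counts the Unicode scalars and UTF-16 code units of the third example's input, written with escapes so it does not depend on mail rendering. The names `thirdExample`, `scalarCount` and `utf16Count` are made up for illustration. The input has 12 scalars and 18 UTF-16 code units, so the 18 reported above may be a count of UTF-16 code units rather than of visual glyphs; that reading is only a guess.

```swift
// Hypothetical names; the string is the third example's input, spelled with escapes.
let thirdExample =
    "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}" +
    "\u{1F477}\u{1F3FE}\u{200D}\u{2640}\u{FE0F}"

// Number of Unicode scalars (code points) in the string.
let scalarCount = thirdExample.unicodeScalars.count

// Number of UTF-16 code units; each emoji above U+FFFF takes two units.
let utf16Count = thirdExample.utf16.count

print(scalarCount) // 12
print(utf16Count)  // 18
```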

If the emojis of the third example don't look like they should in this
mail, here is another way to write the exact same example using only simple
text:

let inputOfThirdExample =
"\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}\u{1F477}\u{1F3FE}\u{200D}\u{2640}\u{FE0F}"

let result = f(inputOfThirdExample)

let expectedResult =
["\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}",
"\u{1F477}\u{1F3FE}\u{200D}\u{2640}\u{FE0F}"]

print(result.elementsEqual(expectedResult)) // Should print true
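One possible sketch of such an f, under an assumption the question does not make: it targets Swift 5 or later rather than Swift 3. There, a `Character` is meant to correspond to a Unicode extended grapheme cluster, including emoji ZWJ sequences and skin tone modifiers. The exact segmentation depends on the Unicode version the toolchain implements, so treat this as an illustration rather than a guaranteed answer. The names `sample` and `expected` are made up for the example.

```swift
// Hypothetical implementation of `f`, assuming Swift 5+ grapheme breaking.
// Each Character of the string becomes its own one-element String.
func f(_ s: String) -> [String] {
    return s.map { String($0) }
}

// Third example from above, written with escapes only.
let sample =
    "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}" +
    "\u{1F477}\u{1F3FE}\u{200D}\u{2640}\u{FE0F}"

let expected = [
    "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}",
    "\u{1F477}\u{1F3FE}\u{200D}\u{2640}\u{FE0F}",
]

// Compare the split result against the expected two-element array.
print(f(sample) == expected)
```

Older toolchains, Swift 3 included, may split these sequences into more pieces, so the comparison is worth running on the toolchain you actually target.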