[swift-evolution] Pitch: Renaming CharacterSet to UnicodeScalarSet

Xiaodi Wu xiaodi.wu at gmail.com
Thu Sep 29 00:30:39 CDT 2016


CharacterSet is a Foundation value type. It was a subject of the following
proposal:

https://github.com/apple/swift-evolution/blob/master/proposals/0069-swift-mutability-for-foundation.md

We might be able to improve on the implementation, but I don't think
re-arguing the name is an option.


On Wed, Sep 28, 2016 at 11:59 PM Jay Abbott <jay at abbott.me.uk> wrote:

>
> Yes - this is totally confusing. CharacterSet and Set<Character> are
> completely different things with different semantics.
>
> I don't know the history, but is CharacterSet simply to have a Swift
> equivalent of NSCharacterSet? That seems to be what it is, but since Swift
> redefined characters in a better way, this should be removed or called
> something else to avoid confusion. You shouldn't have to qualify what you
> mean by 'character' in a type name because it diverges from the definition
> in the rest of the language.
>
> On Thu, 29 Sep 2016 at 04:48 Xiaodi Wu via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>> On Wed, Sep 28, 2016 at 10:34 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>
>>> On Wed, Sep 28, 2016 at 10:23 PM, Charles Srstka via swift-evolution <
>>> swift-evolution at swift.org> wrote:
>>>
>>>> On Sep 28, 2016, at 9:57 PM, Erica Sadun via swift-evolution <
>>>> swift-evolution at swift.org> wrote:
>>>>
>>>>
>>>> D'erp. I missed that. And that's an unambiguous answer.
>>>>
>>>> So let me move on to part B of the pitch: I think CharacterSets are
>>>> broken.
>>>>
>>>> Xiaodi Wu: "isn't the problem you're presenting really an argument that
>>>> the type should be fleshed out to handle characters (grapheme clusters)
>>>> containing more than one Unicode scalar?"
>>>>
>>>>
>>>> It seems that it already does handle such characters:
>>>>
>>>> (done in Objective-C so we can log the length of the range as a count
>>>> of UTF-16 code units)
>>>>
>>>> #import <Foundation/Foundation.h>
>>>>
>>>> int main(int argc, char *argv[]) {
>>>>     @autoreleasepool {
>>>>         NSCharacterSet *bikeSet =
>>>>             [NSCharacterSet characterSetWithCharactersInString:@"🚲"];
>>>>         NSString *str = @"foo🚲bar";
>>>>
>>>>
>>>>         NSRange range = [str rangeOfCharacterFromSet:bikeSet];
>>>>
>>>>
>>>>         NSLog(@"location: %lu length: %lu",
>>>>               range.location, range.length);
>>>>     }
>>>> }
>>>>
>>>> - - - - - - -
>>>>
>>>> *2016-09-28 22:20:00.622471 test[15577:2433912] location: 3 length: 2*
>>>> *Program ended with exit code: 0*
>>>>
>>>> - - - - - - -
>>>>
>>>> As we can see, the character from the set is recognized as consisting
>>>> of two code units. There are a few bugs in the system, though. See the
>>>> cocoa-dev thread “Where is my bicycle?” from about a year ago:
>>>> http://prod.lists.apple.com/archives/cocoa-dev/2015/Apr/msg00074.html
>>>>
>>>
>>> The bike emoji might be two code units, but it is one Unicode scalar
>>> (U+1F6B2). However, the Canadian flag emoji, for instance, is two Unicode
>>> scalars (U+1F1E8 U+1F1E6) but nonetheless one character.
>>>
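
The distinction between code units, Unicode scalars, and characters can be checked directly. The following is only a minimal sketch using the standard library (no Foundation), assuming a Swift 5 or later toolchain; the helper name `describe` is made up for illustration.

```swift
// Sketch: count the same string at three levels of abstraction.
// Assumes Swift 5+, where String is a collection of Characters.
let bike = "\u{1F6B2}"            // 🚲: one scalar outside the BMP
let flag = "\u{1F1E8}\u{1F1E6}"   // 🇨🇦: two regional-indicator scalars

// Hypothetical helper: returns the UTF-16, scalar, and character counts.
func describe(_ s: String) -> (utf16: Int, scalars: Int, characters: Int) {
    return (s.utf16.count, s.unicodeScalars.count, s.count)
}

let bikeCounts = describe(bike)
let flagCounts = describe(flag)

print(bikeCounts) // 2 UTF-16 code units, 1 scalar, 1 character
print(flagCounts) // 4 UTF-16 code units, 2 scalars, 1 character
```

The bike needs a surrogate pair in UTF-16 but is still a single scalar, whereas the flag is a single character built from two scalars.
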
>>
>> To illustrate in code how CharacterSet doesn't actually handle characters
>> made up of multiple Unicode scalars:
>>
>> ```
>> import Foundation
>>
>> let str1 = "🇦🇩"
>> // this actually crashes corelibs-foundation
>> let first = CharacterSet(charactersIn: str1)
>> let str2 = "🇦🇺"
>> let second = CharacterSet(charactersIn: str2)
>> let intersection = first.intersection(second)
>> print(intersection.isEmpty)
>> // actual output: false
>> // obviously, if we were really dealing with characters, the intersection
>> // should be empty
>> ```
>>
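
One way to see where the non-empty intersection comes from, without touching Foundation at all, is to compare a set of Unicode scalars with a set of Characters built from the same two flags. This is only a sketch under the assumption of a Swift 5 or later toolchain; it models the behaviour with plain `Set` values and does not claim to show how `CharacterSet` is implemented.

```swift
// Sketch: scalar-level vs. character-level set semantics (Swift 5+ assumed).
let andorra = "\u{1F1E6}\u{1F1E9}"    // 🇦🇩: regional indicators A + D
let australia = "\u{1F1E6}\u{1F1FA}"  // 🇦🇺: regional indicators A + U

// Sets of Unicode scalars: both flags contain U+1F1E6.
let scalarsA = Set(andorra.unicodeScalars)
let scalarsB = Set(australia.unicodeScalars)
let sharedScalars = scalarsA.intersection(scalarsB)

// Sets of Characters (grapheme clusters): the two flags are distinct.
let charactersA = Set(andorra)
let charactersB = Set(australia)
let sharedCharacters = charactersA.intersection(charactersB)

print(sharedScalars.isEmpty)     // false: U+1F1E6 is in both
print(sharedCharacters.isEmpty)  // true: no flag is in both
```

The scalar-level intersection contains the shared regional indicator A, which matches the `false` reported above; the character-level intersection is empty.
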
>>
>>> Charles
>>>>
>>>>
>>>> _______________________________________________
>>>> swift-evolution mailing list
>>>> swift-evolution at swift.org
>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>
>>>>
>>
>