[swift-evolution] Pitch: Renaming CharacterSet to UnicodeScalarSet

Charles Srstka cocoadev at charlessoft.com
Wed Sep 28 22:23:21 CDT 2016


> On Sep 28, 2016, at 9:57 PM, Erica Sadun via swift-evolution <swift-evolution at swift.org> wrote:
> 
> D'erp. I missed that. And that's an unambiguous answer.
> 
> So let me move on to part B of the pitch: I think CharacterSets are broken.
> 
>> Xiaodi Wu: "isn't the problem you're presenting really an argument that the type should be fleshed out to handle characters (grapheme clusters) containing more than one Unicode scalar?"

It seems that it already does handle such characters:

(Done in Objective-C so that we can log the length of the range as a count of UTF-16 code units.)

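#import <Foundation/Foundation.h>

int main(int argc, char *argv[]) {
    @autoreleasepool {
        // A set containing only the bicycle emoji (U+1F6B2), which lies outside the BMP.
        NSCharacterSet *bikeSet = [NSCharacterSet characterSetWithCharactersInString:@"🚲"];
        NSString *str = @"foo🚲bar";
        
        // NSRange location and length are measured in UTF-16 code units.
        NSRange range = [str rangeOfCharacterFromSet:bikeSet];
        
        // Cast NSUInteger so the arguments match the %lu format specifier on every platform.
        NSLog(@"location: %lu length: %lu", (unsigned long)range.location, (unsigned long)range.length);
    }
    return 0;
}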

- - - - - - -

2016-09-28 22:20:00.622471 test[15577:2433912] location: 3 length: 2
Program ended with exit code: 0

- - - - - - -

As we can see, the character from the set is recognized as spanning two UTF-16 code units (a surrogate pair), so the set does match a character outside the Basic Multilingual Plane. There are a few bugs in the system, though. See the cocoa-dev thread “Where is my bicycle?” from about a year ago: http://prod.lists.apple.com/archives/cocoa-dev/2015/Apr/msg00074.html
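
For reference, the same counts can be checked from Swift without Foundation. This is only a minimal sketch, assuming a recent Swift toolchain (Swift 5 or later); the variable names are mine and purely illustrative. It shows that the bicycle is one Unicode scalar and one Character but two UTF-16 code units, which is where the length of 2 above comes from, and that it sits at UTF-16 offset 3 in the test string.

```swift
// Sketch only: standard-library views on the same strings as the Objective-C test.
let bike = "🚲"        // U+1F6B2 BICYCLE, outside the Basic Multilingual Plane
let str = "foo🚲bar"

let scalarCount = bike.unicodeScalars.count   // number of Unicode scalars
let utf16Count = bike.utf16.count             // number of UTF-16 code units (what NSRange counts)
let characterCount = bike.count               // number of Characters (grapheme clusters)

// Offset of the emoji within str, measured in UTF-16 code units.
var utf16Location = -1                        // -1 means "not found" in this sketch
if let index = str.firstIndex(of: "🚲") {
    utf16Location = str.utf16.distance(from: str.utf16.startIndex, to: index)
}

print("scalars: \(scalarCount), UTF-16 units: \(utf16Count), characters: \(characterCount)")
print("location: \(utf16Location) length: \(utf16Count)")
```

The last line should mirror the NSLog output above (location 3, length 2), since NSString ranges are expressed in UTF-16 code units.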

Charles
