[swift-evolution] Pitch: Renaming CharacterSet to UnicodeScalarSet
David Sweeris
davesweeris at mac.com
Thu Sep 29 07:49:09 CDT 2016
IIUC, Jay wasn't arguing for renaming CharacterSet, but for replacing it with Swift's existing Set mechanism. If/when generics get to the point that we can say 'extension Set<Character> {...}', I think the transition could be as simple as putting 'typealias CharacterSet = Set<Character>' somewhere in the framework (although I don't know how Obj-C interop would be affected by such a change).
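A rough sketch of how the semantics would differ under such a typealias, using only the standard library and not Foundation's CharacterSet. It assumes Swift 4 or later (where String is a collection of Character); the variable names are made up for illustration. It reuses the two flags from Xiaodi's example further down the thread and compares a set of Characters with a set of Unicode scalars:

```swift
// Sketch only: standard library types, no Foundation CharacterSet involved.
let andorra = "\u{1F1E6}\u{1F1E9}"    // flag of Andorra: regional indicators A + D
let australia = "\u{1F1E6}\u{1F1FA}"  // flag of Australia: regional indicators A + U

// Set<Character>: each flag is one Character (one grapheme cluster),
// so the two sets have nothing in common.
let characterOverlap = Set(andorra).intersection(Set(australia))

// Set<Unicode.Scalar>: both flags contain the scalar U+1F1E6,
// so the scalar-level sets do overlap.
let scalarOverlap = Set(andorra.unicodeScalars)
    .intersection(Set(australia.unicodeScalars))

print(characterOverlap.isEmpty)  // true
print(scalarOverlap.isEmpty)     // false
```

So a Set<Character> would treat each flag as a single element, whereas a scalar-based set sees the shared regional indicator, which matches the non-empty intersection reported in the quoted example.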
- Dave Sweeris
> On Sep 29, 2016, at 00:30, Xiaodi Wu via swift-evolution <swift-evolution at swift.org> wrote:
>
> CharacterSet is a Foundation value type. It was a subject of the following proposal:
>
> https://github.com/apple/swift-evolution/blob/master/proposals/0069-swift-mutability-for-foundation.md
>
> We might be able to improve on the implementation, but I don't think re-arguing the name is an option.
>
>
>> On Wed, Sep 28, 2016 at 11:59 PM Jay Abbott <jay at abbott.me.uk> wrote:
>>
>> Yes - this is totally confusing. CharacterSet and Set<Character> are completely different things with different semantics.
>>
>> I don't know the history, but is CharacterSet simply to have a Swift equivalent of NSCharacterSet? That seems to be what it is, but since Swift redefined characters in a better way, this should be removed or called something else to avoid confusion. You shouldn't have to qualify what you mean by 'character' in a type name because it diverges from the definition in the rest of the language.
>>
>>> On Thu, 29 Sep 2016 at 04:48 Xiaodi Wu via swift-evolution <swift-evolution at swift.org> wrote:
>>>> On Wed, Sep 28, 2016 at 10:34 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>
>>>> On Wed, Sep 28, 2016 at 10:23 PM, Charles Srstka via swift-evolution <swift-evolution at swift.org> wrote:
>>>>>> On Sep 28, 2016, at 9:57 PM, Erica Sadun via swift-evolution <swift-evolution at swift.org> wrote:
>>>>>>
>>>>>> D'erp. I missed that. And that's an unambiguous answer.
>>>>>>
>>>>>> So let me move on to part B of the pitch: I think CharacterSets are broken.
>>>>>>
>>>>>>> Xiaodi Wu: "isn't the problem you're presenting really an argument that the type should be fleshed out to handle characters (grapheme clusters) containing more than one Unicode scalar?"
>>>>>
>>>>> It seems that it already does handle such characters:
>>>>>
>>>>> (done in Objective-C so we can log the length of the range as a count of UTF-16 code units)
>>>>>
>>>>> #import <Foundation/Foundation.h>
>>>>>
>>>>> int main(int argc, char *argv[]) {
>>>>>     @autoreleasepool {
>>>>>         NSCharacterSet *bikeSet = [NSCharacterSet characterSetWithCharactersInString:@"🚲"];
>>>>>         NSString *str = @"foo🚲bar";
>>>>>
>>>>>         NSRange range = [str rangeOfCharacterFromSet:bikeSet];
>>>>>
>>>>>         NSLog(@"location: %lu length: %lu", range.location, range.length);
>>>>>     }
>>>>> }
>>>>>
>>>>> - - - - - - -
>>>>>
>>>>> 2016-09-28 22:20:00.622471 test[15577:2433912] location: 3 length: 2
>>>>> Program ended with exit code: 0
>>>>>
>>>>> - - - - - - -
>>>>>
>>>>> As we can see, the character from the set is recognized as consisting of two code units. There are a few bugs in the system, though. See the cocoa-dev thread “Where is my bicycle?” from about a year ago: http://prod.lists.apple.com/archives/cocoa-dev/2015/Apr/msg00074.html
>>>>
>>>> The bike emoji might be two code units, but it is one Unicode scalar (U+1F6B2). However, the Canadian flag emoji, for instance, is two Unicode scalars (U+1F1E8 U+1F1E6) but nonetheless one character.
>>>
>>> To illustrate in code how CharacterSet doesn't actually handle characters made up of multiple Unicode scalars:
>>>
>>> ```
>>> import Foundation
>>>
>>> let str1 = "🇦🇩"
>>> let first = CharacterSet(charactersIn: str1) // this actually crashes corelibs-foundation
>>> let str2 = "🇦🇺"
>>> let second = CharacterSet(charactersIn: str2)
>>> let intersection = first.intersection(second)
>>> print(intersection.isEmpty)
>>> // actual output: false
>>> // obviously, if we were really dealing with characters, the intersection should be empty
>>> ```
>>>
>>>>
>>>>> Charles
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> swift-evolution mailing list
>>>>> swift-evolution at swift.org
>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>>
>>>>