[swift-evolution] Unicode identifiers & operators
Chris Lattner
clattner at apple.com
Sun Sep 18 23:29:53 CDT 2016
> On Sep 18, 2016, at 6:24 PM, Xiaodi Wu via swift-evolution <swift-evolution at swift.org> wrote:
>
> On Sun, Sep 18, 2016 at 9:19 PM, Erica Sadun via swift-evolution <swift-evolution at swift.org> wrote:
> Let me tl;dr'er this even more: ☹️ is an operator, but 🙂 is an identifier.
>
> -- E, succinct, who thinks there's room for improvement
>
> Ha, yes. Let's see if I can be as succinct in my contribution to the discussion:
>
> 1) Agree that current situation not ideal, for reasons above
+1, totally agreed. We really need to improve this. Aiming for Swift 3.1 or Swift 4 seems like a really good idea, because the appetite for this sort of change will probably be very low after Swift 4.
> 2) The solution might best be not one but several proposals:
>
> 2a) Unicode normalization: invisible characters, Greek tonos, etc. (cf. previous message about previously proposed solution, which reflects Unicode recommendations in UTR #31)--low hanging fruit: there's an established Unicode recommendation with clear wins for security and consistency
>
> 2b) Legal and illegal characters for identifiers *or* operators: UTR #31 makes recommendations regarding rarely used scripts; probably best to follow the letter and spirit of these recommendations (which would probably mean ancient Greek musical symbols and Egyptian hieroglyphics shouldn't be identifier or operator characters)
>
> 2c) Decisions as to which characters are identifier characters or operator characters: for instance, emoji should probably never be operator characters; if an emoji has a non-emoji counterpart that is an operator (βοΈβββββοΈ, etc.) it might be best simply to make these illegal rather than operator characters
>
> 2d) Confusables: I think the last time we had this discussion, it was apparent that it'd be difficult to decide which confusables to allow or disallow after some of the low-hanging fruit is taken care of by Unicode normalization (see item 2a); the Unicode Consortium-provided list seems too quick to call two things "confusable" for our purposes (with criteria that might be relevant for URLs or other use cases, but casting too wide a net perhaps for Swift identifiers)
These all seem like good points. I agree that we should default to following an existing Unicode standard unless there is a really good reason to deviate.
I don't have an opinion about the specific direction of the proposal though.
-Chris
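
For readers unfamiliar with the normalization issue in item 2a, here is a minimal sketch of the idea. It is written in Python rather than Swift only because the standard library exposes normalization directly. The helper name same_identifier is hypothetical, and the choice of NFC is an assumption made for illustration, not something the thread settles on.

```python
import unicodedata


def same_identifier(a: str, b: str) -> bool:
    """Hypothetical helper: compare two spellings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Precomposed e-acute (U+00E9) vs. "e" followed by a combining acute (U+0301).
composed = "caf\u00e9"
decomposed = "cafe\u0301"

# Greek small alpha with tonos (U+03AC) vs. alpha with oxia (U+1F71).
tonos = "\u03ac"
oxia = "\u1f71"

# The raw code point sequences differ even though the text renders the same.
print(composed == decomposed)

# After normalization both pairs compare equal.
print(same_identifier(composed, decomposed))
print(same_identifier(tonos, oxia))
```

A compiler that compares identifiers by raw code points would treat each pair as two distinct names; normalizing before comparison removes that class of ambiguity. It does not address confusables between different characters (item 2d).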