[swift-evolution] Prohibit invisible characters in identifier names

Mon Jun 20 23:37:14 CDT 2016

> On Jun 21, 2016, at 2:23 AM, Brent Royal-Gordon via swift-evolution <swift-evolution at swift.org> wrote:
> 
>> Perhaps stupid but: why was Swift designed to accept most Unicode characters in identifier names? Wouldn’t it be simpler to go back to a model where only standard ascii characters are accepted in identifier names?
> 
> I assume it has something to do with the fact that 94.6% of the world's population speak a first language which is not English. That outweighs the inconvenience for Anglo developers, IMHO.

Yes, but the SDKs (frameworks, system libraries) are all in English, including the Swift standard library. I remember a few languages attempting localized versions so that kids could learn more easily, and they failed terribly because what you learned had very limited use.

When it comes to maintaining code, using localized identifier names is a bad practice, since anyone from outside that country who comes to the code can't really use it. I personally can't imagine having to maintain Swift code with identifiers in Chinese, Japanese, Arabic, ...

While allowing non-ASCII characters in identifiers (a feature Apple held up high with its emoji examples) may seem cool, I can only see it being helpful in the future, given a different keyboard layout (as someone pointed out here some time ago), for introducing one-character operators that would otherwise be impossible. But if someone came to me with code in which a variable was an emoji of a dog, he'd get fired on the spot.

I'd personally vote to keep zero-width-joiner characters forbidden in code outside of string literals (where they may make sense). I agree that this can easily be solved by linters, but I think this particular set of characters should be restricted by the language itself: it's something easily overlooked during code review and, given the upcoming package manager, it can lead to hard-to-find malware being distributed among developers who include these packages in their projects, since you usually do not run a linter on third-party code.
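
To make the concern concrete, here is a minimal sketch of the kind of check a linter (or the compiler) could apply to an identifier. The helper name `invisibleScalarOffsets(in:)` and the set of scalars are my own illustration, not an existing API, and the list is only a sample rather than a complete inventory of invisible characters.

```swift
// Hypothetical helper: not part of the Swift compiler or of any existing linter.
// The set below is an illustrative sample of invisible scalars, not a complete list.
let invisibleScalars: Set<UInt32> = [
    0x200B, // ZERO WIDTH SPACE
    0x200C, // ZERO WIDTH NON-JOINER
    0x200D, // ZERO WIDTH JOINER
    0x2060, // WORD JOINER
    0xFEFF, // ZERO WIDTH NO-BREAK SPACE
]

/// Returns the offsets (counted in Unicode scalars) of invisible scalars in `identifier`.
func invisibleScalarOffsets(in identifier: String) -> [Int] {
    var offsets: [Int] = []
    for (offset, scalar) in identifier.unicodeScalars.enumerated() {
        if invisibleScalars.contains(scalar.value) {
            offsets.append(offset)
        }
    }
    return offsets
}

let plain = "isAdmin"
let tampered = "is\u{200D}Admin" // may render the same as `plain` in many editors

print(invisibleScalarOffsets(in: plain))    // []
print(invisibleScalarOffsets(in: tampered)) // [2]
print(plain == tampered)                    // false
```

The two names compare as different strings even though they may look identical on screen, which is exactly what makes this hard to catch in review.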

As for the confusables - this depends a lot on the rendering and on which font you have set. I've tried 𝛎 → v with the current Xcode and the two look really different, especially when you use a fixed-width font: such fonts usually don't contain non-ASCII characters, which are then rendered using a different font, making the distinction easy to spot.
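
Where the font does not give it away, the two characters can still be told apart by inspecting their Unicode scalars. A small sketch, assuming the look-alike above is U+1D6CE (MATHEMATICAL BOLD SMALL NU); the helper `scalarDescription(of:)` is just a name I made up for the example.

```swift
// Assumption: the look-alike in question is U+1D6CE (MATHEMATICAL BOLD SMALL NU).
let lookAlike = "\u{1D6CE}"
let latinV = "v"

/// Formats each Unicode scalar of `text` as "U+XXXX" (hypothetical helper).
func scalarDescription(of text: String) -> [String] {
    return text.unicodeScalars.map { "U+" + String($0.value, radix: 16, uppercase: true) }
}

print(scalarDescription(of: lookAlike)) // ["U+1D6CE"]
print(scalarDescription(of: latinV))    // ["U+76"]
print(lookAlike == latinV)              // false

// A crude check a tool could use to flag identifiers that need a closer look.
print(lookAlike.unicodeScalars.allSatisfy { $0.isASCII }) // false
```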

> 
> Honestly, this seems to me like a concern for linters and security auditing tools, not for the compiler. Swift identifiers are case-sensitive; I see no reason they shouldn't be script-sensitive or zero-width-joiner-sensitive. (Though basic Unicode normalization seems like a good idea, since differently-normalized strings are `==` anyway.)
> 
> -- 
> Brent Royal-Gordon
> Architechies