[swift-evolution] Prohibit invisible characters in identifier names

João Pinheiro joao at joaopinheiro.org
Mon Jun 20 16:17:19 CDT 2016

On 20 Jun 2016, at 21:07, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
> On Mon, Jun 20, 2016 at 2:42 PM, João Pinheiro <joao at joaopinheiro.org> wrote:
>> I agree that treating zero-width spaces as non-existent would be a possible solution, but I think it would make more sense to consider it as white space and thus not admissible in identifier names.
>
> If you treat it like whitespace, then you get interesting behaviors that I don't think you would want. For example, something that looks like `if letter...` could be parsed as conditional binding `if let ter...` if I put in a zero-width space in the right place.

I hadn't thought of that possibility. Ignoring them, though, has the problem of creating multiple valid representations of the same identifier. Not allowing invisible characters in identifiers at all sounds like the best solution to me.
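
To make the idea concrete, here is a minimal sketch of the kind of check a "no invisible characters" rule implies. The helper name `containsInvisibleScalar` and the list of code points are assumptions for illustration only; this is not an actual compiler rule, and the list is not meant to be exhaustive.

```swift
// Hypothetical helper, not part of the Swift compiler or standard library.
// The scalar list below is an illustrative assumption, not a complete set.
let invisibleScalars: Set<Unicode.Scalar> = [
    "\u{200B}", // ZERO WIDTH SPACE
    "\u{200C}", // ZERO WIDTH NON-JOINER
    "\u{200D}", // ZERO WIDTH JOINER
    "\u{200E}", // LEFT-TO-RIGHT MARK
    "\u{200F}", // RIGHT-TO-LEFT MARK
    "\u{2060}", // WORD JOINER
    "\u{FEFF}", // ZERO WIDTH NO-BREAK SPACE
]

/// Returns true if any scalar in `identifier` is in the invisible set.
func containsInvisibleScalar(_ identifier: String) -> Bool {
    return identifier.unicodeScalars.contains { invisibleScalars.contains($0) }
}

let plain = "letter"
let sneaky = "let\u{200B}ter" // renders like "letter" in most editors

print(plain == sneaky)                  // false: two distinct spellings
print(containsInvisibleScalar(plain))   // false
print(containsInvisibleScalar(sneaky))  // true
```

Rejecting `sneaky` outright avoids both problems discussed above: it is never split into `let` and `ter`, and it never becomes a second spelling of `letter`.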

>> I'm not sure of what the best way to handle left-to-right and right-to-left markers would be. Does it make sense to allow mixed text orientation in identifiers?
>
> How do other languages that support Unicode handle these markers in identifiers? I'd be interested to know.

Me too.

>> Removing ambiguity between Unicode confusables is a much more complicated issue which implies defining a canonical Unicode representation for identifiers and a way to resolve them. It would also make it impractical to use certain valid mathematical symbols as identifiers.
>
> Most interesting mathematical symbols are reserved for operators anyway. As a result, `x` and the multiplication symbol are not readily confusable in most contexts in Swift, and confusable resolution could be built in such a way that identifier characters are not regarded as confusable with operator characters.

That would require maintaining a large list of exception characters, though. Like ignoring invisible characters, resolving confusables means several different spellings map to the same identifier, which could become quite confusing and cause additional problems of its own. I think it would probably be best to avoid any situation where different representations of an identifier have to be resolved.
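
As a small illustration of what "multiple representations" means in practice, the sketch below compares a pair of confusable letters and a pair of canonically equivalent spellings. The variable names are made up for the example, and it only shows how Swift's `String` comparison behaves at runtime; it says nothing about how the compiler treats identifiers.

```swift
// Illustrative values only; this does not model compiler behaviour.
let latinA = "a"              // U+0061 LATIN SMALL LETTER A
let cyrillicA = "\u{0430}"    // U+0430 CYRILLIC SMALL LETTER A

// Confusables: they look alike in most fonts but are distinct strings.
print(latinA == cyrillicA)    // false

let precomposed = "\u{00E9}"  // "é" as a single scalar
let decomposed = "e\u{0301}"  // "e" followed by COMBINING ACUTE ACCENT

// String equality uses canonical equivalence, so these compare equal...
print(precomposed == decomposed)  // true

// ...even though the underlying scalar sequences differ.
print(precomposed.unicodeScalars.elementsEqual(decomposed.unicodeScalars))  // false
```

Any scheme that resolves confusables would have to extend the second kind of equivalence to pairs like the first, which is where the exception list comes in.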

João Pinheiro