[swift-evolution] Prohibit invisible characters in identifier names

João Pinheiro joao at joaopinheiro.org
Mon Jun 20 14:42:32 CDT 2016


I agree that treating zero-width spaces as non-existent would be a possible solution, but I think it would make more sense to treat them as whitespace and thus not admissible in identifier names. I'm not sure what the best way to handle left-to-right and right-to-left markers would be. Does it make sense to allow mixed text orientation in identifiers?
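
As a rough illustration of the "not admissible" approach, the sketch below flags any scalar in a candidate identifier that is whitespace or belongs to the Unicode "format" general category (Cf), which includes zero-width spaces and the directional markers. The function name invisibleScalars(in:) is hypothetical and this is not how the Swift compiler validates identifiers; it only assumes the Unicode.Scalar.Properties API from the standard library (Swift 5 or later). Note that a blanket ban on Cf would also reject zero-width joiners, which some scripts and emoji sequences legitimately need.

```swift
// Hypothetical helper, not part of the Swift compiler or standard library.
// Returns every scalar in `identifier` that is whitespace or has the
// Unicode general category "Format" (Cf), e.g. U+200B ZERO WIDTH SPACE
// or U+200F RIGHT-TO-LEFT MARK.
func invisibleScalars(in identifier: String) -> [Unicode.Scalar] {
    var found: [Unicode.Scalar] = []
    for scalar in identifier.unicodeScalars {
        let properties = scalar.properties
        if properties.isWhitespace || properties.generalCategory == .format {
            found.append(scalar)
        }
    }
    return found
}

// A name would be rejected whenever the returned list is non-empty.
let suspicious = invisibleScalars(in: "t\u{200B}est")
for scalar in suspicious {
    print("U+" + String(scalar.value, radix: 16, uppercase: true))
}
```

A check like this could equally live in a linter or an editor plug-in, which would not require a language change.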

Removing ambiguity between Unicode confusables is a much more complicated issue, since it implies defining a canonical Unicode representation for identifiers and a way to resolve them. It would also make it impractical to use certain valid mathematical symbols as identifiers.
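
To make the idea of a canonical representation concrete, here is a much-simplified sketch: each scalar is folded through a small lookup table to a "prototype" scalar, and two identifiers are considered the same name when their folded keys match. The three-entry table and the names prototypes and foldedKey are assumptions made for illustration. A real implementation would load the full confusables.txt data and also apply normalization as described in TR39, which this sketch omits.

```swift
// Hypothetical, hand-picked table for illustration only; the real data
// set is Unicode's confusables.txt and is far larger.
let prototypes: [Unicode.Scalar: Unicode.Scalar] = [
    "\u{0430}": "a",  // CYRILLIC SMALL LETTER A
    "\u{0435}": "e",  // CYRILLIC SMALL LETTER IE
    "\u{03BF}": "o",  // GREEK SMALL LETTER OMICRON
]

// Replaces every scalar that has a table entry with its prototype and
// leaves all other scalars unchanged. No normalization is performed.
func foldedKey(_ identifier: String) -> String {
    var folded = String.UnicodeScalarView()
    for scalar in identifier.unicodeScalars {
        folded.append(prototypes[scalar] ?? scalar)
    }
    return String(folded)
}

// "t\u{0435}st" is spelled with a Cyrillic "е" but folds to the same key
// as the all-Latin "test".
print(foldedKey("t\u{0435}st") == foldedKey("test"))
```

The sketch also shows the cost mentioned above: any symbol that appears in the table can no longer be used as a name distinct from its prototype.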

João Pinheiro


> On 20 Jun 2016, at 20:23, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
> 
> On Mon, Jun 20, 2016 at 2:17 PM, João Pinheiro <swift-evolution at swift.org> wrote:
> Nice feature in the IBM Swift Sandbox. Xcode doesn't display zero-width spaces either so the identifier names look exactly the same.
> 
> The issue with left-to-right and right-to-left markers is interesting and has previously been exploited in email phishing attacks.
> 
> It would be possible to highlight invisible characters in Xcode as a stopgap measure, but that doesn't solve the problem for developers using other editors or working on other platforms. I think it would be a better idea to sanitise the set of allowed (or prohibited) characters for identifiers at the language level.
> 
> This is a potential security problem, but there is no need to try to invent an ad-hoc solution here, particularly one as drastic as prohibiting characters. The same security considerations are applicable elsewhere and there's a lot of existing work on Unicode security. See here: http://www.unicode.org/reports/tr39/
> 
> Unicode maintains a list of "confusable" characters. See here: http://www.unicode.org/Public/security/latest/confusables.txt
> 
> It should be sufficient to regard confusables as the same glyph for the purpose of identifier names; zero-width and invisible marks would then be regarded as non-existent, so that `test` and `t[invisible glyph]est` would refer to the same variable.
> 
> 
> Sincerely,
> João Pinheiro
> 
> 
> > On 20 Jun 2016, at 19:26, Vladimir.S <svabox at gmail.com> wrote:
> >
> > Very interesting.
> >
> > Btw, IBM Swift Sandbox shows these spaces:
> > https://swiftlang.ng.bluemix.net/
> > But my mail client does not, i.e. I saw exactly the same "test" and "abc".
> >
> > Also, I have read about issues with left-to-right and right-to-left markers that change how the source text is displayed: you see one text, but when it compiles it does not work as expected. That is, the viewer/editor processes these special codes and shows you one text, but the compiler treats the text in another way.
> >
> > I believe it is a potential security problem that all Unicode characters are allowed in variable and function names in Swift. IMO we definitely should limit the allowed character set for identifiers in source code.
> >
> > On 20.06.2016 20:51, João Pinheiro via swift-evolution wrote:
> >> Recently there has been a screenshot going around Twitter about C++ allowing zero-width spaces in variable names. Swift also suffers from this problem, which can be abused to create ambiguous and misleading code, and potentially to obfuscate nefarious code.
> >>
> >> I would like to propose a change to prohibit the use of invisible characters in identifier names.
> >>
> >> I'm including an example of problematic code at the bottom of this email.
> >>
> >> Sincerely,
> >> João Pinheiro
> >>
> >>
> >> /* The output for this code is:
> >> A
> >> B
> >> C
> >> 1
> >> 2
> >> 3
> >> */
> >>
> >> func test() { print("A") }
> >> func t​est() { print("B") }
> >> func te​st() { print("C") }
> >>
> >> let abc = 1
> >> let a​bc = 2
> >> let ab​c = 3
> >>
> >> test()
> >> t​est()
> >> te​st()
> >>
> >> print(abc)
> >> print(a​bc)
> >> print(ab​c)
> >> _______________________________________________
> >> swift-evolution mailing list
> >> swift-evolution at swift.org
> >> https://lists.swift.org/mailman/listinfo/swift-evolution
> >>
