[swift-evolution] Prohibit invisible characters in identifier names

Xiaodi Wu xiaodi.wu at gmail.com
Mon Jun 20 15:07:49 CDT 2016


On Mon, Jun 20, 2016 at 2:42 PM, João Pinheiro <joao at joaopinheiro.org>
wrote:

> I agree that treating zero-width spaces as non-existent would be a
> possible solution, but I think it would make more sense to consider them
> whitespace and thus not admissible in identifier names.
>

If you treat it like whitespace, then you get interesting behaviors that I
don't think you would want. For example, something that looks like `if
letter...` could be parsed as conditional binding `if let ter...` if I put
in a zero-width space in the right place.
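
To make that concrete, here is a rough sketch. It is a toy splitter I made up
for illustration, not how the Swift lexer actually works; the function name
`tokenize` and its parameter are hypothetical.

```swift
// Toy tokenizer (hypothetical, NOT the Swift compiler's lexer): splits on
// ASCII space, and optionally also on U+200B ZERO WIDTH SPACE.
func tokenize(_ source: String, zeroWidthSpaceIsWhitespace: Bool) -> [String] {
    var tokens: [String] = []
    var current = ""
    for scalar in source.unicodeScalars {
        let isSeparator = scalar == " "
            || (zeroWidthSpaceIsWhitespace && scalar == "\u{200B}")
        if isSeparator {
            // A separator ends the token collected so far, if any.
            if !current.isEmpty {
                tokens.append(current)
                current = ""
            }
        } else {
            current.unicodeScalars.append(scalar)
        }
    }
    if !current.isEmpty { tokens.append(current) }
    return tokens
}

// Displays as "if letter = value", but hides U+200B between "let" and "ter".
let source = "if let\u{200B}ter = value"

let asWhitespace = tokenize(source, zeroWidthSpaceIsWhitespace: true)
// ["if", "let", "ter", "=", "value"]: the keyword `let` has appeared.

let asIdentifierPart = tokenize(source, zeroWidthSpaceIsWhitespace: false)
// Four tokens; the second is one identifier that still contains U+200B.
```

Under the whitespace rule, the reader and the parser disagree about where the
token boundary is, which is exactly the behavior I would want to avoid.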


> I'm not sure of what the best way to handle left-to-right and
> right-to-left markers would be. Does it make sense to allow mixed text
> orientation in identifiers?
>

How do other languages that support Unicode handle these markers in
identifiers? I'd be interested to know.


> Removing ambiguity between unicode confusables is a much more complicated
> issue which implies defining a canonical unicode representation for
> identifiers and a way to resolve them. It would also make it impractical to
> use certain valid mathematical symbols as identifiers.
>

Most interesting mathematical symbols are reserved for operators anyway. As
a result, `x` and the multiplication symbol are not readily confusable in
most contexts in Swift, and confusable resolution could be built in such a
way that identifier characters are not regarded as confusable with operator
characters.
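
As a minimal sketch of what I have in mind, assuming a tiny hand-picked
mapping rather than the real confusables.txt data (the names `prototypes`,
`invisibles`, and `skeleton(of:)` are hypothetical and only for illustration):

```swift
// Hypothetical, hand-picked table for illustration only. A real
// implementation would load Unicode's confusables.txt instead.
let prototypes: [Unicode.Scalar: Unicode.Scalar] = [
    "\u{0430}": "a",  // CYRILLIC SMALL LETTER A looks like Latin "a"
    "\u{0435}": "e",  // CYRILLIC SMALL LETTER IE looks like Latin "e"
    "\u{043E}": "o",  // CYRILLIC SMALL LETTER O looks like Latin "o"
]

// Invisible scalars treated as non-existent (assumed set, not exhaustive):
// zero-width space, left-to-right mark, right-to-left mark.
let invisibles: Set<Unicode.Scalar> = ["\u{200B}", "\u{200E}", "\u{200F}"]

// Maps an identifier to a "skeleton": invisible scalars are dropped and each
// listed confusable is replaced by its prototype. Operator characters such as
// U+00D7 MULTIPLICATION SIGN are deliberately left out of the table, so they
// are never folded into identifier characters.
func skeleton(of identifier: String) -> String {
    var result = ""
    for scalar in identifier.unicodeScalars where !invisibles.contains(scalar) {
        result.unicodeScalars.append(prototypes[scalar] ?? scalar)
    }
    return result
}

// Two identifiers would name the same entity when their skeletons match.
let sameAsTest = skeleton(of: "t\u{200B}est") == skeleton(of: "test")  // true
let sameAsAbc = skeleton(of: "\u{0430}bc") == skeleton(of: "abc")      // true
let timesIsX = skeleton(of: "\u{00D7}") == skeleton(of: "x")           // false
```

The design choice here is that comparison happens on skeletons only, so
nothing has to be prohibited outright and the operator set stays separate.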


> João Pinheiro
>
>
> On 20 Jun 2016, at 20:23, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>
> On Mon, Jun 20, 2016 at 2:17 PM, João Pinheiro <swift-evolution at swift.org>
> wrote:
>
>> Nice feature in the IBM Swift Sandbox. Xcode doesn't display zero-width
>> spaces either so the identifier names look exactly the same.
>>
>> The issue with left-to-right and right-to-left markers is interesting and
>> has previously been exploited in email phishing attacks.
>>
>> It would be possible to highlight invisible characters in Xcode as a
>> stopgap measure, but that doesn't solve the problem for developers using
>> other editors or in other platforms. I think it would be a better idea to
>> sanitise the set of allowed (or prohibited) characters for identifiers at
>> the language level.
>>
>
> This is a potential security problem, but there is no need to invent an
> ad hoc solution here, particularly one as drastic as prohibiting
> characters. The same security considerations apply elsewhere, and there
> is a lot of existing work on Unicode security. See here:
> http://www.unicode.org/reports/tr39/
>
> Unicode maintains a list of "confusable" characters. See here:
> http://www.unicode.org/Public/security/latest/confusables.txt
>
> It should be sufficient to regard confusables as the same glyph for the
> purpose of identifier names; zero-width and invisible marks would then be
> regarded as non-existent, so that `test` and `t[invisible glyph]est` would
> refer to the same variable.
>
>
>> Sincerely,
>> João Pinheiro
>>
>>
>> > On 20 Jun 2016, at 19:26, Vladimir.S <svabox at gmail.com> wrote:
>> >
>> > Very interesting.
>> >
>> > Btw, IBM Swift Sandbox shows these spaces:
>> > https://swiftlang.ng.bluemix.net/
>> > But my mail client does not, i.e. I saw identical-looking "test" and "abc" names.
>> >
>> > Also, I have read about issues with left-to-right and right-to-left
>> > markers that change how the source text is displayed: the viewer or
>> > editor processes these special codes and shows you one text, but the
>> > compiler treats the text another way, so the code does not behave as
>> > it appears to.
>> >
>> > I believe it is a potential security problem that all Unicode characters
>> > are allowed in variable and function names in Swift. IMO we definitely
>> > should limit the allowed character set for identifiers in source code.
>> >
>> > On 20.06.2016 20:51, João Pinheiro via swift-evolution wrote:
>> >> Recently there has been a screenshot going around Twitter about C++
>> >> allowing zero-width spaces in variable names. Swift also suffers from
>> >> this problem, which can be abused to create ambiguous or misleading
>> >> code and potentially to obfuscate nefarious code.
>> >>
>> >> I would like to propose a change to prohibit the use of invisible
>> >> characters in identifier names.
>> >>
>> >> I'm including an example of problematic code at the bottom of this
>> >> email.
>> >>
>> >> Sincerely,
>> >> João Pinheiro
>> >>
>> >>
>> >> /* The output for this code is:
>> >> A
>> >> B
>> >> C
>> >> 1
>> >> 2
>> >> 3
>> >> */
>> >>
>> >> func test() { print("A") }
>> >> func t​est() { print("B") }
>> >> func te​st() { print("C") }
>> >>
>> >> let abc = 1
>> >> let a​bc = 2
>> >> let ab​c = 3
>> >>
>> >> test()
>> >> t​est()
>> >> te​st()
>> >>
>> >> print(abc)
>> >> print(a​bc)
>> >> print(ab​c)
>> >> _______________________________________________
>> >> swift-evolution mailing list
>> >> swift-evolution at swift.org
>> >> https://lists.swift.org/mailman/listinfo/swift-evolution
>> >>
>>
>>
>
>
>