<div dir="ltr">On Mon, Jun 20, 2016 at 2:42 PM, João Pinheiro <span dir="ltr"><<a href="mailto:joao@joaopinheiro.org" target="_blank">joao@joaopinheiro.org</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">I agree that treating zero-width spaces as non-existent would be a possible solution, but I think it would make more sense to consider it as white space and thus not admissible in identifier names.</div></blockquote><div><br></div><div>If you treat it like whitespace, then you get interesting behaviors that I don't think you would want. For example, something that looks like `if letter...` could be parsed as conditional binding `if let ter...` if I put in a zero-width space in the right place.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">I'm not sure of what the best way to handle left-to-right and right-to-left markers would be. Does it make sense to allow mixed text orientation in identifiers?</div></blockquote><div><br></div><div>How do other languages that support Unicode handle these markers in identifiers? I'd be interested to know.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>Removing ambiguity between unicode confusables is a much more complicated issue which implies defining a canonical unicode representation for identifiers and a way to resolve them. It would also make it impractical to use certain valid mathematical symbols as identifiers.<br></div></div></blockquote><div><br></div><div>Most interesting mathematical symbols are reserved for operators anyway. 
As a result, `x` and the multiplication symbol are not readily confusable in most contexts in Swift, and confusable resolution could be built in such a way that identifier characters are not regarded as confusable with operator characters.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div></div><span class="HOEnZb"><font color="#888888"><div><br></div><div>João Pinheiro</div></font></span><div><div class="h5"><div><br></div><div><br><div><blockquote type="cite"><div>On 20 Jun 2016, at 20:23, Xiaodi Wu <<a href="mailto:xiaodi.wu@gmail.com" target="_blank">xiaodi.wu@gmail.com</a>> wrote:</div><br><div><div dir="ltr">On Mon, Jun 20, 2016 at 2:17 PM, João Pinheiro <span dir="ltr"><<a href="mailto:swift-evolution@swift.org" target="_blank">swift-evolution@swift.org</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Nice feature in the IBM Swift Sandbox. Xcode doesn't display zero-width spaces either so the identifier names look exactly the same.<br>
<br>
The issue with left-to-right and right-to-left markers is interesting and has previously been exploited in email phishing attacks.<br>
<br>
It would be possible to highlight invisible characters in Xcode as a stopgap measure, but that doesn't solve the problem for developers using other editors or on other platforms. I think it would be a better idea to sanitise the set of allowed (or prohibited) characters for identifiers at the language level.<br></blockquote><div><br></div><div>This is a potential security problem, but there is no need to invent an ad-hoc solution here, particularly one as drastic as prohibiting characters. The same security considerations apply elsewhere, and there is a substantial body of existing work on Unicode security. See here: <a href="http://www.unicode.org/reports/tr39/" target="_blank">http://www.unicode.org/reports/tr39/</a></div><div><br></div><div>Unicode maintains a list of "confusable" characters. See here: <a href="http://www.unicode.org/Public/security/latest/confusables.txt" target="_blank">http://www.unicode.org/Public/security/latest/confusables.txt</a></div><div><br></div><div>It should be sufficient to regard confusables as the same glyph for the purpose of identifier names; zero-width and invisible marks would then be regarded as non-existent, so that `test` and `t[invisible glyph]est` would refer to the same variable.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
Sincerely,<br>
João Pinheiro<br>
<div><div><br>
<br>
> On 20 Jun 2016, at 19:26, Vladimir.S <<a href="mailto:svabox@gmail.com" target="_blank">svabox@gmail.com</a>> wrote:<br>
><br>
> Very interesting.<br>
><br>
> Btw, IBM Swift Sandbox shows these spaces:<br>
> <a href="https://swiftlang.ng.bluemix.net/" rel="noreferrer" target="_blank">https://swiftlang.ng.bluemix.net/</a><br>
> But my mail client does not - i.e. I saw exactly the same "test"&"abc"<br>
><br>
> Also, I have read about issues with left-to-right and right-to-left markers that change how the source text is displayed: you see one text, but the compiled code does not behave as expected. That is, the viewer/editor processes these special codes and shows you one text, but the compiler treats the text differently.<br>
><br>
> I believe it is a potential security problem that all Unicode characters are allowed in variable/function names in Swift. IMO we should definitely limit the allowed character set for identifiers in source files.<br>
><br>
> On 20.06.2016 20:51, João Pinheiro via swift-evolution wrote:<br>
>> Recently there has been a screenshot going around Twitter about C++ allowing zero-width spaces in variable names. Swift also suffers from this problem, which can be abused to create ambiguous or misleading code and potentially to obfuscate nefarious code.<br>
>><br>
>> I would like to propose a change to prohibit the use of invisible characters in identifier names.<br>
>><br>
>> I'm including an example of problematic code at the bottom of this email.<br>
>><br>
>> Sincerely,<br>
>> João Pinheiro<br>
>><br>
>><br>
>> /* The output for this code is:<br>
>> A<br>
>> B<br>
>> C<br>
>> 1<br>
>> 2<br>
>> 3<br>
>> */<br>
>><br>
>> func test() { print("A") }<br>
>> func test() { print("B") }<br>
>> func test() { print("C") }<br>
>><br>
>> let abc = 1<br>
>> let abc = 2<br>
>> let abc = 3<br>
>><br>
>> test()<br>
>> test()<br>
>> test()<br>
>><br>
>> print(abc)<br>
>> print(abc)<br>
>> print(abc)<br>
>> _______________________________________________<br>
>> swift-evolution mailing list<br>
>> <a href="mailto:swift-evolution@swift.org" target="_blank">swift-evolution@swift.org</a><br>
>> <a href="https://lists.swift.org/mailman/listinfo/swift-evolution" rel="noreferrer" target="_blank">https://lists.swift.org/mailman/listinfo/swift-evolution</a><br>
>><br>
<br>
</div></div></blockquote></div><br></div></div>
</div></blockquote></div><br></div></div></div></div></blockquote></div><br></div></div>
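<div>The thread suggests that zero-width and invisible marks could be regarded as non-existent when comparing identifier names, so that `test` and `t[invisible glyph]est` refer to the same variable. Below is a minimal sketch of that one idea, under stated assumptions: the names `isInvisible` and `identifierKey` are hypothetical, the list of code points is a small illustrative sample rather than a complete list, and nothing here implements the confusable matching described in TR39 or reflects how the Swift compiler actually behaves.</div>

```swift
// Hypothetical helper: reports whether a scalar is one of a few
// well-known invisible or directional format characters.
// ASSUMPTION: this list is illustrative only, not exhaustive.
func isInvisible(_ scalar: Unicode.Scalar) -> Bool {
    switch scalar.value {
    case 0x200B, // ZERO WIDTH SPACE
         0x200C, // ZERO WIDTH NON-JOINER
         0x200D, // ZERO WIDTH JOINER
         0x200E, // LEFT-TO-RIGHT MARK
         0x200F, // RIGHT-TO-LEFT MARK
         0x2060, // WORD JOINER
         0xFEFF: // ZERO WIDTH NO-BREAK SPACE
        return true
    default:
        return false
    }
}

// Hypothetical helper: builds a comparison key for an identifier by
// dropping the invisible scalars, so names that differ only by such
// scalars map to the same key.
func identifierKey(_ name: String) -> String {
    var key = String.UnicodeScalarView()
    for scalar in name.unicodeScalars where !isInvisible(scalar) {
        key.append(scalar)
    }
    return String(key)
}

let plain = "test"
let disguised = "t\u{200B}est" // contains a zero-width space

// The raw strings differ, but their keys agree.
print(plain == disguised)
print(identifierKey(plain) == identifierKey(disguised))
```

<div>Some scripts use the zero-width joiner and non-joiner meaningfully, so a real rule would likely need to be more careful than simply dropping them everywhere.</div>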