[swift-evolution] A path forward on rationalizing unicode identifiers and operators

Ethan Tira-Thompson etirathompson at apple.com
Mon Oct 2 19:28:27 CDT 2017


I’m all for fixing the pressing issues requested by Xiaodi, but beyond that I request we give a little more thought to the long-term direction.

My 2¢: I’ve become convinced that very few characters are “obviously” either an operator or an identifier across all the contexts where they might be used.  That relegates the vast majority of the thousands of ambiguous characters to a committee, which has to decide on a single global usage for each.  That is both a huge time sink and a fundamentally flawed approach, because the right classification depends on who is using the characters and in what context.

For example, if a developer finds a set of symbols which perfectly denote some niche concept, do you really expect the developer to submit a proposal and wait months/years to get the characters classified and then a new compiler version to be distributed, all so that developer can adopt his/her own notation?

And then after that is done, now say a member of some distant tribe complains they wanted to use one of those characters to write identifiers using their native language.  Even though there may be zero intersection between these two user groups, this path forces Swift itself to pick a side of one vs. the other.

Surely there is some way to enable the local developer to resolve these choices rather than putting the Swift language definition on the critical path?

The goals I know of:
1. Performance: don’t require parsing all imports to get the operator set
2. Security: don’t let imports do surprising/obfuscated stuff
3. Functionality: do let users write what they want, or import/share libraries for niche domains
4. Well defined: resolve conflicts, e.g. between libraries

I’m a little out of my league, but let’s say we want to use operator ᵀ from some matrixlib, how about:
	import matrixlib (operator: ᵀ)

Or if you want several operators:
	import matrixlib (operators: [ᵀ,·,⊗])

Ideally, any local operator definitions “just work” across their own module, but if it requires an “import (operator: ×)” in each file for performance, so be it.

A whitelist of “standard” operators would be imported automatically (i.e. used to initialize the operator character list) to maintain compatibility with current usage.  But you can imagine additional arguments to the import call, such as “standardOperators: false”, to import only the explicitly listed operators and reduce potential surprises.
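
To make that a bit more concrete, here is a rough sketch of how a file's operator character set might be computed from nothing but its own import lines. This is only a model of the idea, not compiler code: ImportClause, standardOperators, operatorCharacters(for:) and role(of:in:) are all names I made up, the whitelist is abbreviated, and treating “standardOperators: false” on any one import as dropping the whitelist for the whole file is just one possible reading.

```swift
// Hypothetical model of the proposal; none of these names exist in the Swift compiler.

/// Operators available without being listed (an assumed, abbreviated whitelist).
let standardOperators: Set<Character> = ["+", "-", "*", "/", "%", "<", ">", "=", "!", "&", "|", "^", "~", "?"]

/// One import line, e.g. `import matrixlib (operators: [ᵀ,·,⊗])`.
struct ImportClause {
    var module: String
    var operators: Set<Character> = []
    var includeStandardOperators = true   // models "standardOperators: false"
}

enum CharacterRole { case operatorCharacter, identifierCharacter }

/// Builds the file's operator character set from its own import lines only,
/// so no imported module has to be parsed before the file can be tokenized.
func operatorCharacters(for imports: [ImportClause]) -> Set<Character> {
    // Assumption: the whitelist is dropped if any import opts out of it.
    let keepStandard = imports.allSatisfy { $0.includeStandardOperators }
    var result: Set<Character> = keepStandard ? standardOperators : []
    for clause in imports {
        result.formUnion(clause.operators)
    }
    return result
}

/// Anything not explicitly opted in as an operator is treated as an identifier character.
func role(of character: Character, in operators: Set<Character>) -> CharacterRole {
    return operators.contains(character) ? .operatorCharacter : .identifierCharacter
}

// Usage: this file opts in to ᵀ and ⊗ from matrixlib, but not to ·
let fileImports = [ImportClause(module: "matrixlib", operators: ["ᵀ", "⊗"])]
let fileOperators = operatorCharacters(for: fileImports)
let roleOfTranspose = role(of: "ᵀ", in: fileOperators)   // operator in this file
let roleOfDot = role(of: "·", in: fileOperators)         // still an identifier character here
```

The point of the sketch is that the same character can be an operator in one file and an identifier character in another, depending only on what that file imports.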

My rationale vs. the goals:
1. Performance: the operator character set vs. identifiers (everything else) can be determined within the file itself
2. Security: developer explicitly opts in to the special operators they want to use, and readers can see where an operator comes from
3. Functionality: user is able to define their operators without getting a committee involved
4. Well defined: potential conflict between libraries resolved by client’s choice to import or exclude the operator
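
For point 4, here is a minimal sketch of what “resolved by client’s choice” could look like, assuming a conflict simply means the same operator character was requested from two different modules. OperatorImport, providers(for:), conflicts(in:) and the module name “vectorlib” are hypothetical and only there for illustration.

```swift
// Hypothetical model; these names are illustrative and not part of Swift.

/// A client's request to take specific operators from one module.
struct OperatorImport {
    var module: String
    var operators: Set<Character>
}

/// Maps each requested operator to the modules the client asked to take it from.
func providers(for imports: [OperatorImport]) -> [Character: [String]] {
    var table: [Character: [String]] = [:]
    for item in imports {
        for op in item.operators {
            table[op, default: []].append(item.module)
        }
    }
    return table
}

/// Operators requested from more than one module. The client would resolve
/// these by dropping the operator from one of the import lines.
func conflicts(in imports: [OperatorImport]) -> [Character: [String]] {
    return providers(for: imports).filter { $0.value.count > 1 }
}

// Both libraries offer ×, and the client asked for it twice: a conflict.
let clashing = [
    OperatorImport(module: "matrixlib", operators: ["×", "⊗"]),
    OperatorImport(module: "vectorlib", operators: ["×"]),
]
let clashes = conflicts(in: clashing)

// The client resolves it locally by excluding × from the matrixlib import.
let resolved = [
    OperatorImport(module: "matrixlib", operators: ["⊗"]),
    OperatorImport(module: "vectorlib", operators: ["×"]),
]
let remaining = conflicts(in: resolved)
```

Nothing outside the importing file needs to change to settle the conflict, which is the property I’m after.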

Does this have potential?

-Ethan


> On Oct 2, 2017, at 10:59 AM, David Sweeris via swift-evolution <swift-evolution at swift.org> wrote:
> 
> 
> On Oct 2, 2017, at 09:14, Xiaodi Wu <xiaodi.wu at gmail.com <mailto:xiaodi.wu at gmail.com>> wrote:
> 
>> What is your use case for this?
>> 
>> On Mon, Oct 2, 2017 at 10:56 David Sweeris via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>> 
>> On Oct 1, 2017, at 22:01, Chris Lattner via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>> 
>>> 
>>>> On Oct 1, 2017, at 9:26 PM, Kenny Leung via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>>>> 
>>>> Hi All.
>>>> 
>>>> I’d like to help as well. I have fun with operators.
>>>> 
>>>> There is also the issue of code security with invisible unicode characters and characters that look exactly alike.
>>> 
>>> Unless there is a compelling reason to add them, I think we should ban invisible characters.  What is the harm of characters that look alike?
>> 
>> Especially if people want to use the character in question as both an identifier and an operator: We can make the character an identifier and its lookalike an operator (or the other way around).
> 
> Off the top of my head...
> In calculus, “𝖽” (MATHEMATICAL SANS-SERIF SMALL D) would be a fine substitute for "d" in “𝖽y/𝖽x” ("the derivative of y(x) with respect to x").
> In statistics, we could use "𝖢" (MATHEMATICAL SANS-SERIF CAPITAL C), as in "5𝖢3" to mimic the "5C3" notation ("5 choose 3"). And although not strictly an issue of identifiers vs operators, “！” (FULLWIDTH EXCLAMATION MARK) would be an ok substitution (that extra space on the right looks funny) for "!" in “4!” ("4 factorial").
> 
> I'm sure there are other examples from math/science/<insert any "symbology"-heavy DSL here>, but “d” in particular is one that I’ve wanted for a while since Swift classifies "∂" (the partial derivative operator) as an operator rather than an identifier, making it impossible to use a consistent syntax between normal derivatives and partial derivatives (normal derivatives are "d(y)/d(x)", whereas partial derivatives get to drop the parens "∂y/∂x")
> 
> - Dave Sweeris