[swift-evolution] A path forward on rationalizing unicode identifiers and operators

Xiaodi Wu xiaodi.wu at gmail.com
Mon Oct 2 19:45:35 CDT 2017


On Mon, Oct 2, 2017 at 19:28 Ethan Tira-Thompson via swift-evolution <
swift-evolution at swift.org> wrote:

> I’m all for fixing pressing issues requested by Xiaodi, but beyond that I
> request we give a little more thought to the long term direction.
>
> My 2¢ is I’ve been convinced that very few characters are “obviously”
> either an operator or identifier across all contexts where they might be
> used.  Thus relegating the vast majority of thousands of ambiguous
> characters to committee to decide a single global usage.  But that is both
> a huge time sink and fundamentally flawed in approach due to the contextual
> dependency of who is using them.
>
> For example, if a developer finds a set of symbols which perfectly denote
> some niche concept, do you really expect the developer to submit a proposal
> and wait months/years to get the characters classified and then a new
> compiler version to be distributed, all so that developer can adopt his/her
> own notation?
>

The Unicode Consortium already has a document describing which Unicode
characters are suitable identifiers in programming languages, with guidance
as to how to customize that list around the edges. This is already adopted
by other programming languages. So, with little design effort, that task is
not only doable but largely done.
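
For illustration, the Unicode guidance referred to here is presumably UAX #31,
whose default identifier syntax is built on the XID_Start and XID_Continue
properties. The sketch below assumes Swift 5 or later, where the standard
library exposes these properties on Unicode.Scalar; `isDefaultIdentifier` is a
hypothetical helper name, and allowing a leading underscore is one of the
"around the edges" customizations, not something the thread specifies.

```swift
// Sketch only: checks a string against a UAX #31-style default identifier
// rule (one XID_Start scalar, then zero or more XID_Continue scalars).
// `isDefaultIdentifier` is a hypothetical name; the leading "_" allowance
// is an assumed customization of the default profile.
func isDefaultIdentifier(_ text: String) -> Bool {
    var scalars = text.unicodeScalars.makeIterator()
    // The first scalar must be able to start an identifier.
    guard let first = scalars.next(),
          first.properties.isXIDStart || first == "_" else {
        return false
    }
    // Every remaining scalar must be able to continue an identifier.
    while let scalar = scalars.next() {
        if !scalar.properties.isXIDContinue {
            return false
        }
    }
    return true
}
```

Under this rule, a symbol such as ∂ is rejected as an identifier start because
it is a math symbol rather than a letter, which matches how the thread
describes its current classification as an operator.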

As to operators, again, I am of the strong opinion that making it possible
for developers to adopt any preferred notation for any purpose (a) is
fundamentally incompatible with the division between operators and
identifiers, as I believe you’re saying here; and (b) should be a non-goal
from the outset. The only task, so far as I can tell, left to do is to
identify what pragmatic set of (mostly mathematical) symbols are used as
operators in the wider world and are likely to be already used in Swift
code or part of common use cases where an operator is clearly superior to
alternative spellings. In my view, the set of valid operator characters not
only shouldn’t require parsing or import directives, but should be small
enough to be knowable by memory.
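
As a rough sketch of what a small, fixed operator set buys: classification
becomes a constant table lookup that needs no imports to be parsed. The
membership of `operatorScalars` below is purely hypothetical (a few ASCII
operators plus symbols mentioned in this thread); which characters would
actually belong in the set is exactly the open question.

```swift
// Sketch only: a deliberately tiny, hard-coded operator character set.
// The contents are hypothetical and chosen for illustration; they are not
// a proposed list.
let operatorScalars: Set<Unicode.Scalar> = [
    "+", "-", "*", "/", "<", ">", "=", "!",
    "×", "·", "⊗", "∂",
]

enum CharacterKind {
    case operatorCharacter
    case other
}

// Classifies a scalar using only the fixed table above, so the result never
// depends on which modules a file imports.
func classify(_ scalar: Unicode.Scalar) -> CharacterKind {
    if operatorScalars.contains(scalar) {
        return .operatorCharacter
    }
    return .other
}
```

Anything outside the table, such as ᵀ, is simply not an operator character in
this sketch, regardless of what any library declares.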

> And then after that is done, now say a member of some distant tribe
> complains they wanted to use one of those characters to write identifiers
> using their native language.  Even though there may be zero intersection
> between these two user groups, this path forces Swift itself to pick a side
> of one vs. the other.
>
> Surely there is some way to enable the local developer to resolve these
> choices rather than putting the swift language definition on the critical
> path?
>
> The goals I know of:
> 1. Performance: don’t require parsing all imports to get the operator set
> 2. Security: don’t let imports do surprising/obfuscated stuff
> 3. Functionality: do let users write what they want, or import/share
> libraries for niche domains
> 4. Well defined: resolve conflicts, e.g. between libraries
>
> I’m a little out of my league, but let’s say we want to use operator ᵀ
> from some matrixlib, how about:
> import matrixlib (operator: ᵀ)
>
> Or if you want several operators:
> import matrixlib (operators: [ᵀ,·,⊗])
>
> Ideally, any local operator definitions “just work” across their own
> module, but if it requires a “import (operator: ×)” in each file for
> performance, so be it.
>
> A whitelist of “standard” operators would automatically import (i.e.
> initialize the operator character list) to maintain compatibility with
> current usage.  But you can imagine additional arguments to the import
> call, such as “standardOperators: false” to import only the explicitly
> listed operators and reduce potential surprises.
>
> My rationale vs. the goals:
> 1. Performance: the operator character set vs. identifiers (everything
> else) can be determined within the file itself
> 2. Security: developer explicitly opts-in to the special operators they
> want to use, and readers can see where an operator comes from
> 3. Functionality: user is able to define their operators without getting
> committee involved
> 4. Well defined: potential conflict between libraries resolved by client’s
> choice to import or exclude the operator
>
> Does this have potential?
>
> -Ethan
>
> On Oct 2, 2017, at 10:59 AM, David Sweeris via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>
> On Oct 2, 2017, at 09:14, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>
> What is your use case for this?
>
> On Mon, Oct 2, 2017 at 10:56 David Sweeris via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>>
>> On Oct 1, 2017, at 22:01, Chris Lattner via swift-evolution <
>> swift-evolution at swift.org> wrote:
>>
>>
>> On Oct 1, 2017, at 9:26 PM, Kenny Leung via swift-evolution <
>> swift-evolution at swift.org> wrote:
>>
>> Hi All.
>>
>> I’d like to help as well. I have fun with operators.
>>
>> There is also the issue of code security with invisible unicode
>> characters and characters that look exactly alike.
>>
>>
>> Unless there is a compelling reason to add them, I think we should ban
>> invisible characters.  What is the harm of characters that look alike?
>>
>>
>> Especially if people want to use the character in question as both an
>> identifier and an operator: We can make the character an identifier and its
>> lookalike an operator (or the other way around).
>>
>
> Off the top of my head...
> In calculus, “𝖽” (MATHEMATICAL SANS-SERIF SMALL D) would be a fine
> substitute for "d" in “𝖽y/𝖽x” ("the derivative of y(x) with respect to
> x").
> In statistics, we could use "𝖢" (MATHEMATICAL SANS-SERIF CAPITAL C), as
> in "5𝖢3" to mimic the "5C3" notation ("5 choose 3"). And although not
> strictly an issue of identifiers vs operators, “！” (FULLWIDTH EXCLAMATION
> MARK) would be an ok substitution (that extra space on the right looks
> funny) for "!" in “4!” ("4 factorial").
>
> I'm sure there are other examples from math/science/<insert any
> "symbology"-heavy DSL here>, but “d” in particular is one that I’ve wanted
> for a while since Swift classifies "∂" (the partial derivative operator) as
> an operator rather than an identifier, making it impossible to use a
> consistent syntax between normal derivatives and partial derivatives
> (normal derivatives are "d(y)/d(x)", whereas partial derivatives get to
> drop the parens "∂y/∂x")
>
> - Dave Sweeris
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>