[swift-evolution] [Draft] Refining identifier and operator symbology (take 2)

Xiaodi Wu xiaodi.wu at gmail.com
Mon Feb 27 21:01:17 CST 2017


On Sun, Feb 26, 2017 at 11:50 AM, Nevin Brackett-Rozinsky via
swift-evolution <swift-evolution at swift.org> wrote:

> This looks very good Xiaodi, and I have a few thoughts about it.
>
> First, is the intent that Swift will follow future changes to Unicode
> operator recommendations, or that Swift will choose a “frozen in time” set
> of Unicode recommendations to adopt? If the former, then we will likely see
> source-breaking changes as Unicode evolves. And if the latter, then Swift’s
> choices are apt to diverge even more from Unicode’s over time.
>

Great question. I guess the text leaves the mechanics of forward
compatibility unsaid. The answer is: both.

With respect to Unicode identifiers, UAX#31 guarantees future compatibility
for ID_Start and ID_Continue. That is, anything that is currently valid in
ID_Start will be valid in ID_Start for all time. It is reasonable to expect
that the same experts will adopt that approach for their operator
recommendations in the future. Indeed, they have set themselves up fairly
well for this already: UAX#31 also guarantees that Pattern_Syntax
characters will never be moved into ID_Start or ID_Continue. Therefore, we
also have a guarantee that the approach for Swift's operators proposed here
will *never* overlap with Swift's identifier characters even as Unicode
evolves.
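
As a rough illustration (not part of the proposal text), the disjointness
can be spot-checked with the standard library's Unicode.Scalar.Properties,
available in Swift 5 and later. The helper name
`overlapsIdentifierAndSyntax` and the sample list are my own assumptions,
chosen only from characters mentioned in this thread:

```swift
// Sketch only: relies on Unicode.Scalar.Properties from the Swift 5+ standard library.
// `overlapsIdentifierAndSyntax` is a hypothetical helper, not something the proposal defines.
func overlapsIdentifierAndSyntax(_ scalar: Unicode.Scalar) -> Bool {
    let p = scalar.properties
    // True only if a scalar is both an identifier character and a syntax character.
    return (p.isIDStart || p.isIDContinue) && p.isPatternSyntax
}

// Letters, an underscore, ASCII operators, and symbols discussed in this thread.
let samples: [Unicode.Scalar] = ["a", "π", "θ", "_", "+", "\\", "∞", "∅", "⸘"]

// Collect any scalar that falls into both sets; the guarantee says there should be none.
let overlapping = samples.filter(overlapsIdentifierAndSyntax)
print(overlapping.isEmpty)
```

A spot check over a handful of scalars proves nothing about future Unicode
versions, of course; the guarantee itself comes from UAX#31.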

> Second, it is well-established that programming operators do not have to be
> mathematical. For example, Swift uses the punctuation marks ‘!’, ‘?’, and
> ‘&’ as operators in its standard library. The approach described in your
> proposal does an excellent job at covering the core mathematical operator
> characters in Unicode, however it does not appear to make such an effort
> toward non-mathematical operators.
>
> Of particular note, given that ‘?’, ‘¿’, and ‘‽’ are operator characters,
> it seems inconsistent to omit ‘⸘’. Similarly, with ‘&’ an operator, one
> would expect ‘⅋’ to be as well. I see that “expanding the set of operator
> characters” is listed as a non-goal, however that does not make it an
> anti-goal, and the proposal indeed expands the set by adding ‘\’. Likewise
> “rectifying Unicode shortcomings” is listed as a non-goal, although the
> proposal incorporates some 16 characters for Swift 3 compatibility.
>

Expanding the set of valid operator characters by adding `\` is not a goal
for this proposal. However, it so happens that UTR#25 explicitly mentions
`\` as an operator. In fact, UTR#25 lists every one of Swift's ASCII
operators as mathematical operators not classified as [:Math:], minus `?`
but plus `\`. Therefore, if we agree that the alignment of Swift to Unicode
recommendations as closely as possible is a desirable goal, the most
intellectually honest set of ASCII operators would include `\`. Now, if
Swift-specific implementation concerns preclude its inclusion, then I
personally wouldn't fight it.

The proposal makes no attempt to define a "non-mathematical operator"
because, again, Unicode has no such definition--yet. I am not aware of any
way to reach consensus on that topic, short of either (a) waiting for more
expert hands over at the Unicode Consortium; or (b) a
character-by-character survey of all symbols in Unicode by non-experts (I
count myself here) on this list, which is an explicit anti-goal of this
proposal. In anticipation of Unicode completing its work, this proposal
advances a design that (as I write above) makes possible the adoption of
future Unicode recommendations in a source-compatible way. The chief
mechanism by which this is guaranteed is by not assigning non-[:Math:]
Pattern_Syntax characters (emoji excepted) to either identifiers or
operators. This design addresses the most common concern of those
responding to an earlier version of this proposal. They argued against
restricting operators in the interim to only ASCII characters (which would
also be a source-compatible approach that makes room for future Unicode
recommendations) because there is a set of non-ASCII characters that
unambiguously have the characteristics of "operatorlikeness" and are useful
for enabling a more math-like syntax. The proposal here makes no effort to expand our
understanding of what an operator is beyond what's required for the Swift
standard library plus Unicode's somewhat imperfect classification of
mathematical symbols. Indeed, the proposal makes explicit the expectation
that Unicode experts will undertake that task.
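
To make the mechanism concrete, here is a simplified sketch of the
three-way split described above. It is an assumption-laden approximation,
not the proposal's actual grammar: the `SymbolRole` enum and `role(of:)`
function are hypothetical names, and the sketch ignores the ASCII operator
list, the Swift 3 compatibility characters, and the emoji exception.

```swift
// Hypothetical classification, simplified from the mechanism described in this message.
// Not the proposal's real rules: ASCII operators, compatibility characters,
// and emoji are all ignored here.
enum SymbolRole {
    case identifierCandidate   // ID_Start or ID_Continue
    case operatorCandidate     // Pattern_Syntax and [:Math:]
    case reserved              // Pattern_Syntax but not [:Math:]; left unassigned for now
    case other                 // everything else (whitespace, controls, ...)
}

func role(of scalar: Unicode.Scalar) -> SymbolRole {
    let p = scalar.properties
    if p.isIDStart || p.isIDContinue { return .identifierCandidate }
    if p.isPatternSyntax {
        // Math symbols become operator candidates; the rest stay reserved
        // until Unicode publishes a recommendation for them.
        return p.isMath ? .operatorCandidate : .reserved
    }
    return .other
}

for scalar in ["π", "+", "∞", "⸘", "‽"] as [Unicode.Scalar] {
    print(scalar, role(of: scalar))
}
```

Under this simplified rule, `⸘` and `‽` both land in the reserved bucket,
which is why `‽` can only appear in the proposal as a compatibility
character.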

The 20 characters included for Swift 3 compatibility have as their
objective only the preservation of Swift 3 source compatibility. They
represent an educated guess (based on public code samples and messages to
this list) as to what symbols are most likely to be used in real, shipping
Swift code, absent arguments against inclusion on other grounds. They are
not intended to represent any attempt at rationalization in alignment with
some Unicode-recommended criterion. As I mentioned, I'm eager to hear
feedback to the effect that some real, shipping code would be broken by the
proposal. I'm sensitive to the unsatisfying nature of apparent
inconsistency. However, if the omission of `⸘` is to be regarded as a grave
shortcoming on the grounds of inconsistency, then it would be more in
alignment with the stated goals to drop `‽` as a compatibility character
than to include `⸘`. There is no evidence that either is in use. Again, the
purpose of including `¿` is really as stated: it has been mentioned on this
list that people use it as an operator in existing Swift code, and thus it
is included for compatibility.

> Another point that may be worth considering, are the two specific
> characters ‘∅’ and ‘∞’ which, although strongly mathematical, are
> definitely not operators. They are names for things—objects, quantities—and
> thus by the principle of least surprise they should be available for use in
> identifier names. Just as one might write “let π = Double.pi” at the top of
> a file, so too might one write “let ∞ = Double.infinity” or “let ∅ =
> Set<Int>()” for use later on:
>
> let y = sin(π * x)
> if tan(θ) == ∞ { … }
> var s = ∅
>
> Thus, for the purpose of consistency, I think it makes sense to classify
> ‘∅’ and ‘∞’ as identifiers, as well as ‘⸘’ and ‘⅋’ as operators.
> Alternatively, ‘∞’ could be a floating-point literal, in which case it
> still would not be an operator.
>

There are more than just two such characters. For example, there is U+29DE
INFINITY NEGATED WITH VERTICAL BAR. There are also a slew of other characters
classified as "operators" by Unicode which have shades of
"identifierlikeness." See, for example, how tiny and miny (which I think
you'll agree pass the "operatorlikeness" smell test, being as they are tiny
versions of plus and minus) are used in math to denote values. As you will
see from previous discussions, this can prompt extensive
character-by-character debate: again, an anti-goal.

Now, I will grant you that however fuzzy the line between
"identifierlikeness" and "operatorlikeness," null set and infinity are
likely to fall on the "identifier" side of it. However, the fact remains
that Unicode has classified these characters as syntax characters (i.e.
Pattern_Syntax), and it is untenable for a community not made up of Unicode
experts to try to "fix" that classification character by character. There
are similar issues with identifier characters detailed in UAX#31, not to
mention likely issues not currently known to us. This proposal deliberately
omits any mention of specific characters outside the ASCII range, other
than 20 characters for source compatibility. As I mention above, I am not
convinced it is a good idea to include even those 20 absent evidence of
actual source breakage. In this particular case, since infinity and null
set are currently valid Swift 3 operators, it is their omission that would
increase source incompatibility.
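
For anyone who wants to see how Unicode currently classifies the characters
under discussion, a small sketch along these lines prints the relevant
properties. It assumes Swift 5 or later for Unicode.Scalar.Properties, and
the `disputed` list is simply my selection of characters named in this
thread (null set, infinity, U+29DE, tiny, and miny):

```swift
// Sketch: inspect Unicode properties for the characters debated in this thread.
// The `disputed` list is an illustrative selection, not part of the proposal.
let disputed: [Unicode.Scalar] = ["∅", "∞", "\u{29DE}", "\u{29FE}", "\u{29FF}"]

for scalar in disputed {
    let p = scalar.properties
    let codePoint = String(scalar.value, radix: 16, uppercase: true)
    // Columns: code point, Unicode name, Pattern_Syntax, Math, ID_Start.
    print("U+\(codePoint)", p.name ?? "(unnamed)",
          p.isPatternSyntax, p.isMath, p.isIDStart)
}
```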

> I understand that you described this type of feedback (on particular
> characters) as “less helpful”, however it appears that the “most helpful”
> types of feedback are unnecessary: the proposal is well thought out, with a
> strong core approach. It is only in the fine details that a few
> improvements can be made, “lesser” though they may be.
>
> Nevin