[swift-evolution] [Proposal] Refining Identifier and Operator Symbology

Nevin Brackett-Rozinsky nevin.brackettrozinsky at gmail.com
Sun Oct 23 18:59:42 CDT 2016


All right, I have gone through and formulated a set of characters to serve
as the core of our operator symbols. I started with [:Sm:] and removed
blocks and subheaders that are not clearly useful as operators (though they
may be reincorporated selectively in the future). Then I added the rest of
the Arrows block, as well as punctuation symbols that are “operator-like”.

In particular, I kept Swift’s existing ASCII operators, and all of Swift’s
Latin-1 operators except for currency signs and the copyright and
registered trademark symbols. I also kept most of Swift’s existing General
Punctuation operators.

The end result is a set of 1,020 operator characters
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%5B%3ASm%3A%5D%0D%0A%0D%0A-%5Cp%7BBlock%3DSuperscripts+And+Subscripts%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Technical%7D%0D%0A-%5Cp%7BBlock%3DGeometric+Shapes%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Symbols%7D%0D%0A-%5Cp%7BBlock%3DAlphabetic+Presentation+Forms%7D%0D%0A-%5Cp%7BBlock%3DSmall+Form+Variants%7D%0D%0A-%5Cp%7BBlock%3DHalfwidth+And+Fullwidth+Forms%7D%0D%0A-%5Cp%7BBlock%3DMathematical+Alphanumeric+Symbols%7D%0D%0A-%5Cp%7BBlock%3DArabic+Mathematical+Alphabetic+Symbols%7D%0D%0A-%5Cp%7Bsubhead%3DVariant+letterforms+and+symbols%7D%0D%0A-%5Cp%7Bsubhead%3DLetterlike+symbol%7D%0D%0A%0D%0A%5Cp%7BBlock%3DArrows%7D%0D%0A%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%5D%0D%0A%5B%C2%A1+%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A6+%C2%A7+%C2%A9+%C2%AB+%C2%AC+%C2%AE+%C2%B0+%C2%B1+%C2%B6+%C2%BB+%C2%BF%5D+-+%5B%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A9+%C2%AE%5D%0D%0A%5Cp%7Bsubhead%3DGeneral+punctuation%7D+-+%5BU%2B203F+U%2B2040+U%2B2045+U%2B2046+U%2B2054%5D%0D%0A%5Cp%7Bsubhead%3DDouble+punctuation+for+vertical+text%7D%0D%0A%5Cp%7Bsubhead%3DArchaic+punctuation%7D+-+%5BU%2B2E31+U%2B2E33+U%2B2E34+U%2B2E3F%5D%0D%0AU%2B214B%5D&g=&i=>,
which removes 1,628 symbols
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%0D%0AU%2B00A1+-+U%2B00A7%0D%0AU%2B00A9+U%2B00AB+U%2B00AC+U%2B00AE%0D%0AU%2B00B0+-+U%2B00B1%0D%0AU%2B00B6+U%2B00BB+U%2B00BF+U%2B00D7+U%2B00F7%0D%0AU%2B2016+-+U%2B2017%0D%0AU%2B2020+-+U%2B2027%0D%0AU%2B2030+-+U%2B203E%0D%0AU%2B2041+-+U%2B2053%0D%0AU%2B2055+-+U%2B205E%0D%0AU%2B2190+-+U%2B23FF%0D%0AU%2B2500+-+U%2B2775%0D%0AU%2B2794+-+U%2B2BFF%0D%0AU%2B2E00+-+U%2B2E7F%0D%0AU%2B3001+-+U%2B3003%0D%0AU%2B3008+-+U%2B3030%5D%0D%0A%0D%0A-%5B%5B%3ASm%3A%5D%0D%0A-%5Cp%7BBlock%3DSuperscripts+And+Subscripts%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Technical%7D%0D%0A-%5Cp%7BBlock%3DGeometric+Shapes%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Symbols%7D%0D%0A-%5Cp%7BBlock%3DAlphabetic+Presentation+Forms%7D%0D%0A-%5Cp%7BBlock%3DSmall+Form+Variants%7D%0D%0A-%5Cp%7BBlock%3DHalfwidth+And+Fullwidth+Forms%7D%0D%0A-%5Cp%7BBlock%3DMathematical+Alphanumeric+Symbols%7D%0D%0A-%5Cp%7BBlock%3DArabic+Mathematical+Alphabetic+Symbols%7D%0D%0A-%5Cp%7Bsubhead%3DVariant+letterforms+and+symbols%7D%0D%0A-%5Cp%7Bsubhead%3DLetterlike+symbol%7D%0D%0A%5Cp%7BBlock%3DArrows%7D%0D%0A%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%5D%0D%0A%5B%C2%A1+%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A6+%C2%A7+%C2%A9+%C2%AB+%C2%AC+%C2%AE+%C2%B0+%C2%B1+%C2%B6+%C2%BB+%C2%BF%5D+-+%5B%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A9+%C2%AE%5D%0D%0A%5Cp%7Bsubhead%3DGeneral+punctuation%7D+-+%5BU%2B203F+U%2B2040+U%2B2045+U%2B2046+U%2B2054%5D%0D%0A%5Cp%7Bsubhead%3DDouble+punctuation+for+vertical+text%7D%0D%0A%5Cp%7Bsubhead%3DArchaic+punctuation%7D+-+%5BU%2B2E31+U%2B2E33+U%2B2E34+U%2B2E3F%5D%0D%0AU%2B214B%5D&g=&i=>
from Swift’s existing operator set
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%0D%0AU%2B00A1+-+U%2B00A7%0D%0AU%2B00A9+U%2B00AB+U%2B00AC+U%2B00AE%0D%0AU%2B00B0+-+U%2B00B1%0D%0AU%2B00B6+U%2B00BB+U%2B00BF+U%2B00D7+U%2B00F7%0D%0AU%2B2016+-+U%2B2017%0D%0AU%2B2020+-+U%2B2027%0D%0AU%2B2030+-+U%2B203E%0D%0AU%2B2041+-+U%2B2053%0D%0AU%2B2055+-+U%2B205E%0D%0AU%2B2190+-+U%2B23FF%0D%0AU%2B2500+-+U%2B2775%0D%0AU%2B2794+-+U%2B2BFF%0D%0AU%2B2E00+-+U%2B2E7F%0D%0AU%2B3001+-+U%2B3003%0D%0AU%2B3008+-+U%2B3030%5D&g=&i=>
and adds just 4 new ones
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%5B%3ASm%3A%5D%0D%0A-%5Cp%7BBlock%3DSuperscripts+And+Subscripts%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Technical%7D%0D%0A-%5Cp%7BBlock%3DGeometric+Shapes%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Symbols%7D%0D%0A-%5Cp%7BBlock%3DAlphabetic+Presentation+Forms%7D%0D%0A-%5Cp%7BBlock%3DSmall+Form+Variants%7D%0D%0A-%5Cp%7BBlock%3DHalfwidth+And+Fullwidth+Forms%7D%0D%0A-%5Cp%7BBlock%3DMathematical+Alphanumeric+Symbols%7D%0D%0A-%5Cp%7BBlock%3DArabic+Mathematical+Alphabetic+Symbols%7D%0D%0A-%5Cp%7Bsubhead%3DVariant+letterforms+and+symbols%7D%0D%0A-%5Cp%7Bsubhead%3DLetterlike+symbol%7D%0D%0A%5Cp%7BBlock%3DArrows%7D%0D%0A%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%5D%0D%0A%5B%C2%A1+%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A6+%C2%A7+%C2%A9+%C2%AB+%C2%AC+%C2%AE+%C2%B0+%C2%B1+%C2%B6+%C2%BB+%C2%BF%5D+-+%5B%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A9+%C2%AE%5D%0D%0A%5Cp%7Bsubhead%3DGeneral+punctuation%7D+-+%5BU%2B203F+U%2B2040+U%2B2045+U%2B2046+U%2B2054%5D%0D%0A%5Cp%7Bsubhead%3DDouble+punctuation+for+vertical+text%7D%0D%0A%5Cp%7Bsubhead%3DArchaic+punctuation%7D+-+%5BU%2B2E31+U%2B2E33+U%2B2E34+U%2B2E3F%5D%0D%0AU%2B214B%5D%0D%0A%0D%0A-%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%0D%0AU%2B00A1+-+U%2B00A7%0D%0AU%2B00A9+U%2B00AB+U%2B00AC+U%2B00AE%0D%0AU%2B00B0+-+U%2B00B1%0D%0AU%2B00B6+U%2B00BB+U%2B00BF+U%2B00D7+U%2B00F7%0D%0AU%2B2016+-+U%2B2017%0D%0AU%2B2020+-+U%2B2027%0D%0AU%2B2030+-+U%2B203E%0D%0AU%2B2041+-+U%2B2053%0D%0AU%2B2055+-+U%2B205E%0D%0AU%2B2190+-+U%2B23FF%0D%0AU%2B2500+-+U%2B2775%0D%0AU%2B2794+-+U%2B2BFF%0D%0AU%2B2E00+-+U%2B2E7F%0D%0AU%2B3001+-+U%2B3003%0D%0AU%2B3008+-+U%2B3030%5D&g=&i=>
(⅀ ؆ ؇ ⅋). I left out the “Full Stop” character, to be dealt with by
whatever rules we decide upon for dots in operators.
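
As a sanity check on those counts, here is a rough sketch that sums the code
point ranges from the "existing operator set" link and applies the removals
and additions. The ranges are decoded from that link; the variable names are
only illustrative, and the sum counts every code point in each range, whether
or not it is assigned.

```swift
// The 14 ASCII operator characters: / = - + ! * % < > & | ^ ~ ?
let asciiCount = 14

// Ranges decoded from the "existing operator set" link.
// Single code points are written as one-element ranges.
let existingRanges: [ClosedRange<Int>] = [
    0x00A1...0x00A7,
    0x00A9...0x00A9, 0x00AB...0x00AB, 0x00AC...0x00AC, 0x00AE...0x00AE,
    0x00B0...0x00B1,
    0x00B6...0x00B6, 0x00BB...0x00BB, 0x00BF...0x00BF,
    0x00D7...0x00D7, 0x00F7...0x00F7,
    0x2016...0x2017,
    0x2020...0x2027,
    0x2030...0x203E,
    0x2041...0x2053,
    0x2055...0x205E,
    0x2190...0x23FF,
    0x2500...0x2775,
    0x2794...0x2BFF,
    0x2E00...0x2E7F,
    0x3001...0x3003,
    0x3008...0x3030,
]

// Size of the existing set, counting every code point in each range.
let existingCount = asciiCount + existingRanges.reduce(0) { $0 + $1.count }

// Figures quoted in this message.
let removedCount = 1_628
let addedCount = 4
let proposedCount = existingCount - removedCount + addedCount

print(existingCount)  // prints 2644
print(proposedCount)  // prints 1020
```

So the three figures are mutually consistent: 2,644 existing, minus 1,628
removed, plus 4 added, gives the 1,020 quoted above.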

Here is the classification of the 1,020 characters I have identified as
operators:

[[:Sm:]
-\p{Block=Superscripts And Subscripts}
-\p{Block=Miscellaneous Technical}
-\p{Block=Geometric Shapes}
-\p{Block=Miscellaneous Symbols}
-\p{Block=Alphabetic Presentation Forms}
-\p{Block=Small Form Variants}
-\p{Block=Halfwidth And Fullwidth Forms}
-\p{Block=Mathematical Alphanumeric Symbols}
-\p{Block=Arabic Mathematical Alphabetic Symbols}
-\p{subhead=Variant letterforms and symbols}
-\p{subhead=Letterlike symbol}
\p{Block=Arrows}
[/ = \- + ! * % < > \& | \^ ~ ?]
[¡ ¢ £ ¤ ¥ ¦ § © « ¬ ® ° ± ¶ » ¿] - [¢ £ ¤ ¥ © ®]
\p{subhead=General punctuation} - [U+203F U+2040 U+2045 U+2046 U+2054]
\p{subhead=Double punctuation for vertical text}
\p{subhead=Archaic punctuation} - [U+2E31 U+2E33 U+2E34 U+2E3F]
U+214B]
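
To make the shape of that expression concrete, here is a minimal Swift sketch
of a membership test for the parts that can be written out by hand: the ASCII
operators, the Latin-1 list minus the excluded symbols, the Arrows block
(U+2190 to U+21FF), and U+214B. It deliberately omits the [:Sm:] portion and
the subhead-based punctuation, since as far as I know the standard library
exposes neither block nor subhead data. The function and variable names are
made up for illustration.

```swift
// [/ = \- + ! * % < > \& | \^ ~ ?]
let asciiOperators: Set<Unicode.Scalar> = [
    "/", "=", "-", "+", "!", "*", "%", "<", ">", "&", "|", "^", "~", "?",
]

// [¡ ¢ £ ¤ ¥ ¦ § © « ¬ ® ° ± ¶ » ¿] - [¢ £ ¤ ¥ © ®]
let latin1Candidates: Set<Unicode.Scalar> = [
    "¡", "¢", "£", "¤", "¥", "¦", "§", "©",
    "«", "¬", "®", "°", "±", "¶", "»", "¿",
]
let latin1Excluded: Set<Unicode.Scalar> = ["¢", "£", "¤", "¥", "©", "®"]
let latin1Operators = latin1Candidates.subtracting(latin1Excluded)

// Hypothetical helper: covers only the hand-enumerated slice of the set,
// NOT the full 1,020 characters.
func isInSketchedOperatorSubset(_ scalar: Unicode.Scalar) -> Bool {
    if asciiOperators.contains(scalar) || latin1Operators.contains(scalar) {
        return true
    }
    // \p{Block=Arrows}
    if (0x2190...0x21FF).contains(scalar.value) {
        return true
    }
    // U+214B TURNED AMPERSAND, listed explicitly at the end of the expression
    return scalar.value == 0x214B
}

print(isInSketchedOperatorSubset("→"))  // prints true
print(isInSketchedOperatorSubset("©"))  // prints false
```

A real implementation would presumably work from a generated table of code
points rather than evaluating the set expression at run time.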

Additionally, I think it is worthwhile to consider including the “Drafting
symbols” subheader and most of the “Miscellaneous technical” subheader.
This would add 34 more operator characters
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5Cp%7Bsubhead%3DDrafting+symbols%7D%0D%0A%5Cp%7Bsubhead%3DMiscellaneous+technical%7D%0D%0A-%5BU%2B23E8%5D%5D&g=&i=>
.

I did not consider non-head operator characters
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=U%2B0300+-+U%2B036F%0D%0AU%2B1DC0+-+U%2B1DFF%0D%0AU%2B20D0+-+U%2B20FF%0D%0AU%2BFE00+-+U%2BFE0F%0D%0AU%2BFE20+-+U%2BFE2F%0D%0AU%2BE0100+-+U%2BE01EF&g=&i=>,
which are predominantly combining marks and variation selectors, and should
probably stay essentially as they are. Also, I kept the empty set and
infinity sign as operators, though we may want to change that.
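
For reference, the non-head ranges in that link can be sketched as a simple
range check. The ranges are decoded from the link and the block names in the
comments are the Unicode block names for those ranges; the function name is
hypothetical.

```swift
// Ranges decoded from the "non-head operator characters" link.
let nonHeadRanges: [ClosedRange<UInt32>] = [
    0x0300...0x036F,    // Combining Diacritical Marks
    0x1DC0...0x1DFF,    // Combining Diacritical Marks Supplement
    0x20D0...0x20FF,    // Combining Diacritical Marks for Symbols
    0xFE00...0xFE0F,    // Variation Selectors
    0xFE20...0xFE2F,    // Combining Half Marks
    0xE0100...0xE01EF,  // Variation Selectors Supplement
]

// Hypothetical helper: true if the scalar may appear in an operator,
// but only after the first (head) character.
func isNonHeadOperatorCharacter(_ scalar: Unicode.Scalar) -> Bool {
    return nonHeadRanges.contains { $0.contains(scalar.value) }
}

print(isNonHeadOperatorCharacter("\u{0301}"))  // prints true
print(isNonHeadOperatorCharacter("+"))         // prints false
```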

There are a lot more symbols that could potentially become operators (e.g.
shapes, currency signs, APL, etc.). However, in light of the prevailing view
that we should start conservatively and add more in the future, I believe
this set of 1,020 characters is a good place to begin.

Nevin