[swift-evolution] [Proposal] Refining Identifier and Operator Symbology

Jonathan S. Shapiro jonathan.s.shapiro at gmail.com
Thu Oct 20 10:14:31 CDT 2016


Operators, Nouns, and Verbs


There's an issue that I think is worth bringing out into the open so that
we all know it is present. Solutions are possible, but they go beyond the
scope of the identifier proposal. Here's a brief statement of the problem:

   1. Operators are verbs. They *operate* on their arguments.
   2. Math symbols are not always verbs. ∑ is a verb (and an operator). ∞
   is usually understood to be a noun.
   3. Operator *symbols* (that is: identifiers) are just names. They are
   neither inherently verbs nor inherently nouns.

At first glance, we tend to want nouns to be ordinary identifiers and verbs
to be operators. "Operator identifiers" confuse the issue precisely because
we call them *operator* identifiers. A better name might be "math symbol
identifiers", because it doesn't carry the same association with verbs.
Unfortunately there is no Unicode category for "math symbols that are
verbs". This is true, in part, because there isn't general agreement about
how symbols are used in math. Once you get past the basic stuff, a symbol
means whatever you define it to mean in the current publication, and math
authors grab symbols entirely for their own convenience: hopefully, but not
always, in a way that reflects or suggests a generally recognized
intuition. Often no general agreement exists.

If we actually want to solve the noun/verb issue, we need to acknowledge
that being a noun (verb) is orthogonal to being a conventional identifier
(math symbol identifier). Here is one way to separate the concepts in Swift:

   1. Make it true that *any* identifier can be either a conventional
   identifier or a math symbol identifier. We already do this in several
   places.
   2. Make it true that *any* identifier (including a conventional
   identifier) can be treated like a reserved word (that is: like an operator)
   for parse purposes.

From a parse perspective, the thing that makes an identifier into an
operator is that (a) it has been given some status as a reserved
identifier, and (b) it has a defined precedence rule. It would be possible
to re-imagine the meaning of Swift's operator declaration syntax to mean
"this identifier is now being given reserved-word status, and should be
treated for parse purposes as an operator while this declaration is
lexically in scope". No change is required to the current language. This
re-interpretation would allow us to say (for example):

infix operator LazyAnd : *somePrecedence*


which would introduce "LazyAnd" as an operator token *even though the
identifier does not use math symbols as its characters.* Simultaneously, it
would allow us to bind ∞ and use that identifier without forcing a noun (∞)
to be a verb simply because it has symbols in the name.
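
For contrast, here is roughly what the operator declaration looks like in
Swift today. This is only a sketch: `&&&`, `expensiveCheck`, and
`evaluations` are names I made up for illustration, and the precedence group
is one I believe the standard library provides. The lazy "and" has to be
spelled with symbols, because `LazyAnd` is not accepted as an operator name.

```swift
// Current Swift: an operator name must be built from operator characters,
// so the lazy "and" is spelled `&&&` here rather than `LazyAnd`.
infix operator &&& : LogicalConjunctionPrecedence

func &&& (lhs: Bool, rhs: @autoclosure () -> Bool) -> Bool {
    // The right-hand side is evaluated only when the left-hand side is true.
    return lhs ? rhs() : false
}

// Hypothetical helper that counts how often it is actually evaluated.
var evaluations = 0
func expensiveCheck() -> Bool {
    evaluations += 1
    return true
}

let skipped = false &&& expensiveCheck()  // right-hand side is not evaluated
let taken = true &&& expensiveCheck()     // right-hand side is evaluated once
```

Under the re-interpretation described above, the same declaration could name
the operator `LazyAnd`, and ∞ (which, as far as I can tell, currently falls
in the operator-character range) could be bound as an ordinary value name.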

I personally believe that this would resolve some of the confusion about
operators, because it would separate the "how do we tokenize?" question
from the "what behaves like an operator?" question. It would also allow us
to preserve the existing mathematical use of many math symbols that are (by
convention) nouns. From a lexer/parser perspective, the concrete change is
that we go from "it's an operator because it's made up of math symbols" to
"it's an operator because it's an identifier and it's in the list of things
that are in scope as operators" (effectively a look-up table). That's the
entire change.
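
To make the look-up-table idea concrete, here is a tiny model of it. It is
not how the Swift compiler works; `TokenKind`, `OperatorScope`, `declare`,
and `classify` are hypothetical names, and the precedence numbers are
arbitrary placeholders.

```swift
// Whether a name behaves as an operator is decided by scope, not spelling.
enum TokenKind { case identifier, operatorName }

struct OperatorScope {
    // Names currently declared as operators, with a placeholder precedence.
    var precedences: [String: Int] = [:]

    // Models an `infix operator Name : Precedence` declaration coming into scope.
    mutating func declare(_ name: String, precedence: Int) {
        precedences[name] = precedence
    }

    // A name is an operator only if it is in the table; the characters
    // it is made of play no part in the decision.
    func classify(_ name: String) -> TokenKind {
        return precedences[name] == nil ? .identifier : .operatorName
    }
}

var scope = OperatorScope()
scope.declare("+", precedence: 140)
scope.declare("LazyAnd", precedence: 120)

let kinds = ["a", "LazyAnd", "∞", "+"].map { scope.classify($0) }
```

In this model `LazyAnd` and `+` come back as operators, while `a` and `∞`
stay ordinary identifiers because nothing declared them otherwise.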

Unfortunately every change comes at a cost, and the cost of this one is
that we would once again have to be thoughtful about white space. Why?
Because:

a.!  // selection of a field named "!" in object a
a.!+ // selection of a field named "!+" in object a
a.! + // selection of field named "!" in object a followed by operator (?) +


You can build comparable examples without field names:

! b // two identifiers
!+ b // two identifiers
! + b // three identifiers
a+b // three identifiers
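
The tokenization rule these examples assume can be sketched as follows. It
is a toy under my own assumptions, not Swift's lexer: a token is a maximal
run of conventional characters or a maximal run of symbol characters, a dot
always stands alone, and white space only separates. `CharClass`,
`charClass`, and `tokenize` are hypothetical names.

```swift
enum CharClass { case conventional, symbol, dot, space }

func charClass(_ c: Character) -> CharClass {
    if c == "." { return .dot }
    if c.isWhitespace { return .space }
    if c.isLetter || c.isNumber || c == "_" { return .conventional }
    return .symbol  // everything else is treated as a math/operator symbol
}

func tokenize(_ source: String) -> [String] {
    var tokens: [String] = []
    var current = ""
    var currentClass = CharClass.space
    for c in source {
        let cls = charClass(c)
        // A token ends when the character class changes; dots stand alone.
        if cls != currentClass || cls == .dot {
            if !current.isEmpty { tokens.append(current) }
            current = ""
        }
        if cls != .space { current.append(c) }
        currentClass = cls
    }
    if !current.isEmpty { tokens.append(current) }
    return tokens
}

let dense = tokenize("a+b")     // ["a", "+", "b"]
let joined = tokenize("!+ b")   // ["!+", "b"]
let spaced = tokenize("! + b")  // ["!", "+", "b"]
let field = tokenize("a.! +")   // ["a", ".", "!", "+"]
```

The only thing separating `!+` from `!` followed by `+` is the white space,
which is the cost being described here.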


How confusing would this become? We have some experience, though only
limited, from BitC. BitC allowed operator definitions to use conventional
identifiers in the way I sketched above (actually, we did full-up mixfix,
but that's another topic), and it worked very well. BitC did *not* allow
operators to be used as general-purpose identifiers, but in hindsight I
believe that we probably should have done so.

Keep in mind that this is exactly the same "think about white space" issue
that we already know from conventional identifiers.

From a "but would this be too weird?" standpoint, all of the *current*
minglings of identifiers without white space would be preserved, so "a+b"
would continue to behave as you expect. But just like

a.b__and c
a.b __and c


mean two very different things in C++, it would now be true that

a.!&&c  // ident dot ident ident
a.! &&c // ident dot ident ident ident


would mean different things.


I don't know if I'm being helpful or just confusing the issue further, but
I hope this helps people think about this stuff better.


Jonathan