[swift-evolution] [Review] SE-0155: Normalize Enum Case Representation

Xiaodi Wu xiaodi.wu at gmail.com
Sat Apr 1 16:54:13 CDT 2017


On Sat, Apr 1, 2017 at 3:38 PM, Daniel Duan <daniel at duan.org> wrote:

> Thanks again for a detailed review. I have a few comments inline.
>
> On Apr 1, 2017, at 9:50 AM, Xiaodi Wu via swift-evolution <
> swift-evolution at swift.org> wrote:
>
> • Does this proposal fit well with the feel and direction of Swift?
>>
>
> The "Pattern consistency" section does not align well with the feel and
> direction of Swift. Specifically, it does not explore some of the
> difficulties that arise from the proposed rules, adopts some of the same
> shortcomings that required revision for SE-0111, and deviates from some of
> the anticipated fixes for those shortcomings outlined in the core team's
> "update and commentary" to SE-0111.
>
> It is not the case that the design proposed is "a consequence of no longer
> relying on tuple patterns," in that it is not the inevitable result that
> falls out of that decision.
>
>
> The text in this revision may be poorly phrased. The connection, as I
> pointed out in a previous thread, is that we need to define syntax for
> enum pattern matching because the one we’ve been using in Swift 3 is the
> tuple pattern’s syntax, which is now distinct and separate.
>

What I'm saying here is that, although _some_ change becomes necessary, the
particular changes proposed here are not themselves "a consequence of no
longer relying on tuple patterns."

Put another way, given `enum E { case foo(bar: Int, baz: Int) }`, not being
allowed to write `switch e { case foo(let a, let b): break }` is *not* an
inevitable consequence of moving away from tuple patterns. Since the
particular proposed changes break more existing source code than is
strictly necessary for moving away from tuple-based pattern matching, those
choices require stringent justification.
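
To make that concrete, here is a minimal sketch of the spelling I have in
mind, next to the fully labeled one. The names `Payload`, `sumUnlabeled` and
`sumLabeled` are made up for illustration; only the shape of the case comes
from the example above.

```swift
// Hypothetical enum mirroring `enum E { case foo(bar: Int, baz: Int) }`.
enum Payload {
    case foo(bar: Int, baz: Int)
}

// Labels omitted, local names chosen freely by the user.
func sumUnlabeled(_ value: Payload) -> Int {
    switch value {
    case .foo(let a, let b):
        return a + b
    }
}

// Labels written out, local names still chosen freely.
func sumLabeled(_ value: Payload) -> Int {
    switch value {
    case .foo(bar: let a, baz: let b):
        return a + b
    }
}
```

Both functions bind the same two payload values; the question for the
proposal is which of these spellings should stop compiling, and why.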

> I will detail the alternative design that requires the fewest deviations or
> special rules, and breaks the least code extant today, later on. First, the
> shortcomings:
>
> 1.
> The proposed rules for pattern matching are a source-breaking change, and
> are *not* the most minimal such change given the abandoning of tuples (see
> alternative below). However, the proposal does not engage with the core
> team's Swift 4 criteria for source-breaking changes with respect to the
> proposed "stricter rules" for pattern matching. There is no text at all
> about why specifically having the compiler encourage local _variable_ names
> to match argument labels resolves an active harm that outweighs the goal of
> preserving the greatest possible source compatibility.
>
>
> With this proposal, users can still use local variable names. It is true
> that if there are many ways to achieve the same thing, the compiler would
> be encouraging users to do that thing. But that puts a cost on the compiler,
> new users and experienced readers in unfamiliar codebases. This is (albeit
> not to a satisfactory degree, it seems) pointed out in the motivation
> section.
>
> As for source compatibility, Swift 3 code should continue to work with
> warnings. Swift 4 mode would issue errors along with fix-its, which the
> migrator can leverage. Depending on the core team's and community’s
> implementor resources, there’s even a chance that this change would roll
> out one version later (warning in 4.X, error in 5.Y). In theory, the
> migration hurdle can be minimized.
>

Many syntactic changes can be migrated in this way, but for Swift 4, that
would only be justified when the existing syntax meets a high bar for being
harmful. Again, the overarching theme of my response is that I don't think
the proposed "stricter rules" offer much more harm mitigation than
significantly less source-breaking designs for pattern matching, and I
don't see anything in the proposal text that discusses the issue or
justifies the particular design over less source-breaking alternatives.


> OTOH, the proposal does outline a major use case for a local variable name
> that does not match the argument label: `param` vs `parameter`.
> Widely-respected style guides in various languages encourage unabbreviated
> and descriptive API names but much more concise local variable names. This
> is a legitimate and good practice being actively discouraged by the sugared
> rules.
>
>
> This is not a counterpoint, but I personally think using shortened names is
> not something to be encouraged. An (admittedly quirky) practice some of us
> inherited from the Cocoa style guideline is to use real, complete words for
> variable names. I’d like to think that the Swift API Design Guidelines are
> aligned in spirit on this matter - “clarity is more important than
> brevity”. (Incidentally, the guidelines’ code samples don’t contain
> partial-word variables anywhere.)
>

We're talking _local_ variables: local variables aren't API. There are
many, many examples of single-letter variables in the design guidelines.
For example, `x = y.union(z)` has three of them.


>
> This would be merely annoying and not harmful if we could guarantee that
> it only means the API user will have to use longer local names, but the
> natural impulse on the part of thoughtful API authors would be to limit the
> expressiveness of their labels to help out their users.
>
> This puts API authors in an impossible bind: they need to choose labels
> that are not too short lest it collide frequently with existing local
> variable names (`x` and `y` would be suboptimal, for example, but there are
> good reasons why an associated value might have arguments labeled `x` and
> `y`),
>
>
> API authors are already in this impossible bind: whenever they export a
> type name, a method signature in an open class or a protocol, the risk of
> collision comes up.
>

Again, local variables aren't API. API authors have never been in this bind
with respect to local variables. Nothing in the language has ever caused
API to restrict the consumer's choice of local variable names. I think this
is a highly, highly unusual rule.


> When a local variable does collide with a payload label, it would be bad
> if the user accidentally used the variable _instead of_ the actual payload
> value. Forcing users to proactively rebind the variable would make them
> more mindful of this type of mistake.
>

What mistake do you have in mind? Currently, labels have nothing to do with
variable names. How does a user accidentally use a label name instead of a
variable name?


> but they also need to choose labels that are not too verbose. The safest
> bet in this case would be not to label at all, but then they lose the
> communicative aspect of argument labels (see point 2 below).
>
>
> A more realistic version of the story: API authors choose labels that make
> the most sense for the declaration, and users accept the risk of collision
> as they use the API. Most of those who choose to skip labels would not have
> given this much thought to their effect at all.
>
>
> 2.
> In the "update and commentary" revising SE-0111, it was acknowledged that
> "cosmetic" labels have a significant use case. Thus, the rules were changed
> to allow `(_ foo: Int, _ bar: Int) -> ()` to communicate to the reader of
> code that the first argument serves some purpose "foo" without forcing that
> name to be part of the API, pending further revisions.
>
> Because enum cases are currently tuples, labels can be dropped freely, and
> therefore these labels are effectively "optional" parts of the API that can
> be seen by the user but, at their discretion, not used. That fulfills the
> use case of "cosmetic" labels. In this revised proposal, by requiring the
> argument label to be actually _written_ somewhere by the API user, it puts
> a dent into the legitimate use case of "cosmetic" labels.
>
> That is to say, an API author who wishes to communicate something about a
> parameter by using a label must now also consider if that label is also
> appropriate as a variable name and must forgo its use if the label is not
> so appropriate. This is a very different decision-making process and it is
> being applied retroactively to previously designed APIs whose labels would
> have been (hopefully thoughtfully) chosen under very different
> circumstances.
>
>
> This is something we never agreed on: SE-0111 is about functions. In some
> languages, patterns do resemble constructor functions, but that’s as much
> similarity as one can get anywhere. I still think applying every decision
> we made about functions to pattern matching is weird.
>

I have to admit, I still don't understand your reticence. The first part of
your proposal aligns enum cases with functions. If we are to look for
patterns in something that is spelled like a function, then it is natural
for the pattern itself to be spelled like a function, no? Currently, in
Swift 3, since we're trying to use pattern matching for a tuple, the
pattern is spelled like a tuple. In my simplistic mind, if we're trying to
use pattern matching for a $foo, the pattern should be spelled like a $foo.
Far from being weird, to me that is the only possible intuitive syntax.

> But here’s my analysis anyways: the “cosmetic label” comment is about
> paving a way to restore expressivity of closures. It talks about the
> *interaction* between a function/closure’s declaration and use site — if
> parameter names are provided in a closure’s declaration, they should be
> required at invocation, similar to pre-SE-0111. IMO this proposal makes
> enum case and patterns closer to this goal.
>

I agree that your proposal does indeed get us closer to SE-0111. By
requiring argument labels chosen by the API author to be written out by the
user, we get closer to the goals of SE-0111. But SE-0111 also had a large
drawback that required post-approval modification, which was that there
ended up being no way to write "cosmetic labels," which both the community
and core team agreed was an important use case.

With functions, that role can be filled with internal parameter names. This
is what the "update and commentary" restored to SE-0111. With tuples, that
role is filled by the labels themselves, because they can be ergonomically
erased. With enum cases, you have not provided a parallel facility for
cosmetic labels, because in your proposal labels can no longer be easily
erased, but nor are there internal parameter names or some other
substitute. I'm saying that we should learn from the problems discovered
after SE-0111 was approved and fix that shortcoming for enum cases before
this proposal is adopted.
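
A rough sketch of the two existing facilities, for comparison. All names
below (`callCombine`, `erasePointLabels`, `lhs`, `rhs`, `point`) are
hypothetical; the function type has the same shape as the
`(_ foo: Int, _ bar: Int) -> ()` form mentioned above, but returns a value
so that the result can be inspected.

```swift
// Functions: cosmetic labels live in the function type. `lhs` and `rhs`
// tell the reader what each parameter is for, but the call site does not
// write them.
func callCombine() -> Int {
    let combine: (_ lhs: Int, _ rhs: Int) -> Int = { lhs, rhs in
        lhs * 10 + rhs
    }
    return combine(1, 2)
}

// Tuples: the labels `x` and `y` are visible in the declaration, but the
// user can erase them when destructuring and pick any local names.
func erasePointLabels() -> Int {
    let point: (x: Int, y: Int) = (x: 3, y: 4)
    let (a, b) = point
    return a * 10 + b
}
```

In both cases the author gets to communicate intent without forcing the
user to spell the label out. That is the facility I don't see a
counterpart for in the proposed pattern matching rules.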

> 3.
> The first part of the proposal aligns enum case syntax with functions.
> Functions often take prepositions as argument labels, and indeed previous
> SE proposals have extended the rules to allow most words. However, `case
> foo(index: Int, in: T)` would have a disastrous label, as `in` would be a
> very annoying variable name whose use would be actively encouraged by the
> proposed sugared pattern matching rules.
>
> The proposed rules for the sugared pattern would also require (well,
> greatly encourage) unique labels for each argument. This again is
> inconsistent with the naming conventions encouraged by the first part of
> the proposal aligning enum case syntax with functions, which have no such
> restrictions. If a user names something `case foo(point: T, point: T)`,
> then the matching rules would actively encourage an invalid redefinition of
> a variable named `point`.
>
> (On the other hand, the API author does not have the luxury of naming the
> same case `foo(from point: T, to point: T)`, and even if they did,
> prepositions can make lousy local variable names--see first paragraph.)
>
>
> I don’t see this as a problem for enum case authors. It just means the
> poor pattern writer needs to provide the positional information to
> disambiguate.
>

What do you mean by "positional information" here?

> 4.
> The proposal does not explore what happens when the proposed prohibition
> on "mixing and matching" the proposed sugared and unsugared pattern
> matching runs up against associated values that have a mix of labeled and
> unlabeled parameters, and pattern matching use cases where the user does
> not wish to bind all of the arguments.
>
> Given `case foo(a: Int, String, b: Int, String)`, the only sensible
> interpretation of the rules for sugared syntax would allow the user to
> choose any name for some but not all of the labels. If the user wishes to
> bind only `b`, however, he or she will need to navigate a puzzling set of
> rules that are not spelled out in the proposal:
>
> ```
> case foo(a: _, _, b: let b, _)
> // this is definitely allowed
>
> case foo(a: _, _, b: let myVar, _)
> // this is also definitely allowed
>
> // but...
> case foo(_, _, b: let myVar, _)
> // is this allowed, or must the user explicitly state and not bind `a`?
>
> // ...and with respect to the sugared version...
> case foo(_, _, let b, _)
> // is this allowed, or must the user explicitly state and not bind `a`?
> ```
>
>
> Good point. To make up for this: `_` can substitute for any sub-pattern,
> which is something that this proposal doesn’t change but is definitely
> worth spelling out.
>
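
For reference, a sketch of the two ends of that spectrum, which as far as I
know are both accepted today: every label that exists written out, and no
labels at all. The names `Mixed`, `bindLabeled` and `bindUnlabeled` are
made up for illustration. The partially labeled spellings in between are
left out deliberately, since those are the ones whose status the proposal
needs to spell out.

```swift
// Hypothetical enum with a mix of labeled and unlabeled payload values,
// mirroring `case foo(a: Int, String, b: Int, String)`.
enum Mixed {
    case foo(a: Int, String, b: Int, String)
}

// Every existing label is written; only `b` is bound, the rest are `_`.
func bindLabeled(_ value: Mixed) -> Int {
    switch value {
    case .foo(a: _, _, b: let b, _):
        return b
    }
}

// No labels are written; the third position is bound to a local name of
// the user's choosing.
func bindUnlabeled(_ value: Mixed) -> Int {
    switch value {
    case .foo(_, _, let myVar, _):
        return myVar
    }
}
```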
> 5.
> In the "update and commentary" revising SE-0111, the core team outlined a
> preferred path to restoring the full use of argument labels for functions
> without giving them type system significance. They gave a non-sugared form
> and a sugared form, both of which have met with approval from the community.
>
> Briefly, the non-sugared form allows compound names to be used in variable
> names: `func foo(opToUse op(lhs:rhs:) : (Int, Int) -> Int)`. The first
> part of this proposal is consistent in that it removes the type system
> significance of argument labels from the associated values of enum cases,
> and considers them as part of the enum case name. It also stands to reason
> that, if a user were to match a case _without_ trying to bind any
> variables, the same syntax would have to be used if the base name is
> ambiguous: `case elet(locals:body:): break`.
>
> However, the proposal makes no provision for using that same compound name
> in pattern matching. There appears to be no particular reason for its
> isolated omission here, as `case elet(locals:body:)(let a, let b): return a
> * b` is readable and presents no syntactic difficulties. (Moreover, it is
> consistent with the syntax permitted in this proposal for initializing a
> variable: `let foo = Expr.elet(locals:body:)([], anExpr)`.)
>
>
> Another good point. We can handle this in the purely additional proposal
> for compound variable names. I consider this not the 5th item in the list,
> but a separate suggestion, however :P
>
>
> ---
>
> In light of these shortcomings, I would argue that the following
> alternative scheme is the most intuitive and consistent for pattern
> matching given the general agreement that enum case representation should
> be "normalized":
>
> Given:
>
> ```
> enum S {
>   case foo(bar: Int, baz: Int)
>   case foo(boo: String)
>   case bar(boo: String)
> }
> ```
>
> a. As in functions after SE-0111, enum cases can be identified
> unambiguously, regardless of whether one is initializing a variable or
> matching a case, by their compound name, e.g. `bar(boo:)`. Where a case can
> be unambiguously identified with only the base name, that is an alternative
> spelling, e.g. `bar`. Where a case cannot be identified uniquely with the
> base name, then it is an error to try to use the base name alone: `case
> foo: break // error: ambiguous`.
>
> b. As in functions after SE-0111, arguments can be passed in either a
> sugared form or an unsugared form, and they can be bound in a pattern
> matching statement in the same way. That is, `case foo(bar: let a, baz: let
> b): break` and `case foo(bar:baz:)(let a, let b): break` are equivalent.
>
> c. As in functions, one cannot supply different or incorrect argument
> labels. That is, `case foo(baz: let a, bar: let b)` and `case
> foo(baz:bar:)(let a, let b)` are both forbidden. _This recovers the vast
> majority of the additional syntactic safety that is outlined in the revised
> proposal, but without the use of any special rules for pattern matching._
>
> d. By composing rules (a) and (b), `case bar(let a)` is allowed as it is
> today, preserving source compatibility. However `case foo(let b, let c)` is
> not allowed, and _not_ because different local variable names are chosen,
> but because the enum has two cases named foo.
>
>
> From a user’s point of view, there’s enough positional information in this
> pattern for the compiler to figure out which case it should match. This
> would be very unintuitive IMO.
>

Wait, the key point of your proposal, with its "stricter rules," is that
labels shouldn't be optional even with sufficient positional information!
That's also the whole thing above about getting us closer to aligning with
SE-0111, etc.
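
For comparison, here is a sketch of how functions already behave with
respect to rules (a) and (b). It uses free functions rather than enum
cases, because I am assuming an enum cannot yet declare two cases with the
same base name. The function names mirror the cases of `S` above; the
bodies and the variables `sugared`, `unsugared`, `unique` and `viaBase`
are made up for illustration.

```swift
// Two functions sharing the base name `foo`, plus one unique base name.
func foo(bar: Int, baz: Int) -> String { return "foo(bar:baz:) \(bar) \(baz)" }
func foo(boo: String) -> String { return "foo(boo:) \(boo)" }
func bar(boo: String) -> String { return "bar(boo:) \(boo)" }

// Rule (b): sugared and unsugared calls reach the same function.
let sugared = foo(bar: 1, baz: 2)
let unsugared = foo(bar:baz:)(1, 2)

// Rule (a): a unique base name is enough to identify the function.
let unique = bar
let viaBase = unique("x")

// Rule (a), continued: `foo` alone names two functions, so a bare
// reference such as `let ambiguous = foo` is expected to be rejected
// as ambiguous and is left out here.
```

The alternative I outlined asks for nothing more than the same behavior
when the thing being named is an enum case in a pattern.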