[swift-evolution] [Review] SE-0155: Normalize Enum Case Representation

Daniel Duan daniel at duan.org
Sat Apr 1 15:38:44 CDT 2017


Thanks again for a detailed review. I have a few comments inline.

> On Apr 1, 2017, at 9:50 AM, Xiaodi Wu via swift-evolution <swift-evolution at swift.org> wrote:
> 
> 	• Does this proposal fit well with the feel and direction of Swift?
> 
> The "Pattern consistency" section does not align well with the feel and direction of Swift. Specifically, it does not explore some of the difficulties that arise from the proposed rules, adopts some of the same shortcomings that required revision for SE-0111, and deviates from some of the anticipated fixes for those shortcomings outlined in the core team's "update and commentary" to SE-0111.
> 
> It is not the case that the design proposed is "a consequence of no longer relying on tuple patterns," in that it is not the inevitable result that falls out of that decision.

The text in this revision may be poorly phrased. The connection, as I pointed out in a previous thread, is that we need to define syntax for enum pattern matching because the one we’ve been using in Swift 3 is the tuple pattern’s syntax, which is now distinct and separate.
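A minimal sketch of what that looks like in practice. The `Shape` enum and `area` function are made up for illustration and do not come from the proposal. Both spellings below are accepted today, because the payload pattern has been treated like a tuple pattern whose labels may be dropped:

```swift
// Hypothetical enum, for illustration only.
enum Shape {
    case rectangle(width: Int, height: Int)
}

func area(_ shape: Shape) -> Int {
    switch shape {
    // Labels written out, mirroring the declaration.
    case .rectangle(width: let w, height: let h) where w == h:
        return w * w
    // Labels dropped: position alone decides which value each name binds.
    case .rectangle(let w, let h):
        return w * h
    }
}
```

Once the payload is no longer a tuple, the second spelling has no inherited rule to fall back on, which is why the proposal has to say what is and is not allowed.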

> I will detail the alternative design that requires the fewest deviations or special rules, and breaks the least code extant today, later on. First, the shortcomings:
> 
> 1.
> The proposed rules for pattern matching are a source-breaking change, and are *not* the most minimal such change given the abandoning of tuples (see alternative below). However, the proposal does not engage with the core team's Swift 4 criteria for source-breaking changes with respect to the proposed "stricter rules" for pattern matching. There is no text at all about why specifically having the compiler encourage local _variable_ names to match argument labels resolves an active harm that outweighs the goal of preserving the greatest possible source compatibility.

With this proposal, users can still use their own local variable names. It is true that, where there are many ways to achieve the same thing, the compiler would be encouraging users toward one of them. But offering many ways puts a cost on the compiler, new users and experienced readers in unfamiliar codebases. This is (albeit not to a satisfactory degree, it seems) pointed out in the motivation section.

As for source compatibility, Swift 3 code should continue to work with warnings. Swift 4 mode would issue errors along with fix-its, which the migrator can leverage. Depending on the implementation resources of the core team and community, there’s even a chance that this change would roll out one version later (warning in 4.X, error in 5.Y). In theory, the migration hurdle can be minimized.

> 
> OTOH, the proposal does outline a major use case for a local variable name that does not match the argument label: `param` vs `parameter`. Widely-respected style guides in various languages encourage unabbreviated and descriptive API names but much more concise local variable names. This is a legitimate and good practice being actively discouraged by the sugared rules.

This is not a counterpoint, but I personally think using shortened names is not something to be encouraged. An (admittedly quirky) practice some of us inherited from the Cocoa style guidelines is to use real, complete words for variable names. I’d like to think that the Swift API Design Guidelines are aligned in spirit on this matter: “clarity is more important than brevity”. (Incidentally, the guidelines’ code samples don’t contain partial-word variables anywhere.)

> 
> This would be merely annoying and not harmful if we could guarantee that it only means the API user will have to use longer local names, but the natural impulse on the part of thoughtful API authors would be to limit the expressiveness of their labels to help out their users.
> 
> This puts API authors in an impossible bind: they need to choose labels that are not too short lest it collide frequently with existing local variable names (`x` and `y` would be suboptimal, for example, but there are good reasons why an associated value might have arguments labeled `x` and `y`),

API authors are already in this impossible bind: whenever they export a type name, or a method signature in an open class or a protocol, the risk of collision comes up.

When a local variable does collide with a payload label, it would be bad if the user accidentally used the variable _instead of_ the actual payload value. Forcing users to proactively rebind the variable would make them more mindful of this type of mistake.
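A small sketch of the collision case, using a hypothetical `Location` enum and `offset(of:from:)` function (neither appears in the proposal). The parameter `x` shares its name with the payload label, so the pattern rebinds the payload under a different name and both values stay visible:

```swift
// Hypothetical enum, for illustration only.
enum Location {
    case point(x: Int, y: Int)
}

// `x` is an unrelated parameter that happens to share a name with the label.
func offset(of location: Location, from x: Int) -> Int {
    switch location {
    // Explicit rebinding: the payload is `pointX`, the parameter is still `x`.
    case .point(x: let pointX, y: _):
        return pointX - x
    }
}
```

Writing `let x` in the pattern would shadow the parameter inside the case body, which is exactly the situation where the wrong value can be picked up by accident.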

> but they also need to choose labels that are not too verbose. The safest bet in this case would be not to label at all, but then they lose the communicative aspect of argument labels (see point 2 below).

A more realistic version of the story: API authors choose labels that make the most sense for the declaration, and users accept the risk of collision as they use the API. Most of those who choose to skip labels would not have given this much thought to their effect at all.

> 
> 2.
> In the "update and commentary" revising SE-0111, it was acknowledged that "cosmetic" labels have a significant use case. Thus, the rules were changed to allow `(_ foo: Int, _ bar: Int) -> ()` to communicate to the reader of code that the first argument serves some purpose "foo" without forcing that name to be part of the API, pending further revisions.
> 
> Because enum cases are currently tuples, labels can be dropped freely, and therefore these labels are effectively "optional" parts of the API that can be seen by the user but, at their discretion, not used. That fulfills the use case of "cosmetic" labels. In this revised proposal, by requiring the argument label to be actually _written_ somewhere by the API user, it puts a dent into the legitimate use case of "cosmetic" labels.
> 
> That is to say, an API author who wishes to communicate something about a parameter by using a label must now also consider if that label is also appropriate as a variable name and must forgo its use if the label is not so appropriate. This is a very different decision-making process and it is being applied retroactively to previously designed APIs whose labels would have been (hopefully thoughtfully) chosen under very different circumstances.

This is something we never agreed on: SE-0111 is about functions. In some languages, patterns do resemble constructor functions, but that’s as much similarity as one can get anywhere. I still think applying every decision we made about functions to pattern matching is weird. But here’s my analysis anyway: the “cosmetic label” comment is about paving a way to restore the expressivity of closures. It talks about the *interaction* between a function/closure’s declaration and use site: if parameter names are provided in a closure’s declaration, they should be required at invocation, similar to pre-SE-0111. IMO this proposal brings enum cases and patterns closer to this goal.
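For reference, a minimal sketch of the “cosmetic label” form as it stands for function types after the SE-0111 revision. The `BinaryOperation` alias and `subtract` closure are hypothetical names chosen for this example:

```swift
// `lhs` and `rhs` document the closure type but are not part of the type,
// so callers do not (and cannot) write them at the call site.
typealias BinaryOperation = (_ lhs: Int, _ rhs: Int) -> Int

let subtract: BinaryOperation = { lhs, rhs in lhs - rhs }

// Invoked positionally: the cosmetic names are visible only in the declaration.
let result = subtract(5, 3)
```

The names help the reader of the declaration, yet nothing at the use site ties back to them. That gap between declaration and use is what the commentary hoped to close later.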

> 
> 3.
> The first part of the proposal aligns enum case syntax with functions. Functions often take prepositions as argument labels, and indeed previous SE proposals have extended the rules to allow most words. However, `case foo(index: Int, in: T)` would have a disastrous label, as `in` would be a very annoying variable name whose use would be actively encouraged by the proposed sugared pattern matching rules.
> 
> The proposed rules for the sugared pattern would also require (well, greatly encourage) unique labels for each argument. This again is inconsistent with the naming conventions encouraged by the first part of the proposal aligning enum case syntax with functions, which have no such restrictions. If a user names something `case foo(point: T, point: T)`, then the matching rules would actively encourage an invalid redefinition of a variable named `point`.
> 
> (On the other hand, the API author does not have the luxury of naming the same case `foo(from point: T, to point: T)`, and even if they did, prepositions can make lousy local variable names--see first paragraph.)

I don’t see this as a problem for enum case authors. It just means the poor pattern writer needs to provide the positional information to disambiguate.

> 
> 4.
> The proposal does not explore what happens when the proposed prohibition on "mixing and matching" the proposed sugared and unsugared pattern matching runs up against associated values that have a mix of labeled and unlabeled parameters, and pattern matching use cases where the user does not wish to bind all of the arguments.
> 
> Given `case foo(a: Int, String, b: Int, String)`, the only sensible interpretation of the rules for sugared syntax would allow the user to choose any name for some but not all of the labels. If the user wishes to bind only `b`, however, he or she will need to navigate a puzzling set of rules that are not spelled out in the proposal:
> 
> ```
> case foo(a: _, _, b: let b, _)
> // this is definitely allowed
> 
> case foo(a: _, _, b: let myVar, _)
> // this is also definitely allowed
> 
> // but...
> case foo(_, _, b: let myVar, _)
> // is this allowed, or must the user explicitly state and not bind `a`?
> 
> // ...and with respect to the sugared version...
> case foo(_, _, let b, _)
> // is this allowed, or must the user explicitly state and not bind `a`?
> ```
> 

Good point. To make up for this: `_` can substitute for any subpattern, which is something that this proposal doesn’t change but is definitely worth spelling out.
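A short sketch of `_` standing in for subpatterns, using the case from the question above wrapped in a hypothetical `Sample` enum (the helper functions are also made up). It shows only what compiles today and does not settle which of the partially labeled forms the proposal should accept:

```swift
// Hypothetical enum wrapping the case from the example above.
enum Sample {
    case foo(a: Int, String, b: Int, String)
}

// Fully labeled pattern: every position other than `b` is skipped with `_`.
func secondNumber(_ value: Sample) -> Int {
    switch value {
    case .foo(a: _, _, b: let b, _):
        return b
    }
}

// Labels dropped entirely: `_` still fills each position that is not bound.
func firstString(_ value: Sample) -> String {
    switch value {
    case .foo(_, let text, _, _):
        return text
    }
}
```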

> 5.
> In the "update and commentary" revising SE-0111, the core team outlined a preferred path to restoring the full use of argument labels for functions without giving them type system significance. They gave a non-sugared form and a sugared form, both of which have met with approval from the community.
> 
> Briefly, the non-sugared form allows compound names to be used in variable names: `func foo(opToUse op(lhs:rhs:) : (Int, Int) -> Int)`. The first part of this proposal is consistent in that it removes the type system significance of argument labels from the associated values of enum cases, and considers them as part of the enum case name. It also stands to reason that, if a user were to match a case _without_ trying to bind any variables, the same syntax would have to be used if the base name is ambiguous: `case elet(locals:body:): break`.
> 
> However, the proposal makes no provision for using that same compound name in pattern matching. There appears to be no particular reason for its isolated omission here, as `case elet(locals:body:)(let a, let b): return a * b` is readable and presents no syntactic difficulties. (Moreover, it is consistent with the syntax permitted in this proposal for initializing a variable: `let foo = Expr.elet(locals:body:)([], anExpr)`.)

Another good point. We can handle this in the purely additive proposal for compound variable names. I consider this not the 5th item in the list, but a separate suggestion, however :P
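To show which half already exists, here is a minimal sketch of the compound name on the expression side. The `Expr` enum is modeled on the proposal's `elet` example, with a `literal` case and a `localCount` helper added as assumptions for this example. The pattern spelling `case elet(locals:body:)(let a, let b)` is left out because, to my knowledge, it is not accepted:

```swift
// Modeled on the proposal's example; `literal` is an assumed extra case.
indirect enum Expr {
    case literal(Int)
    case elet(locals: [String], body: Expr)
}

// The compound name refers to the case constructor as a function value.
let makeLet = Expr.elet(locals:body:)
let expr = makeLet(["x"], .literal(1))

// Matching still uses the labeled payload pattern, not the compound name.
func localCount(_ e: Expr) -> Int {
    switch e {
    case .elet(locals: let names, body: _):
        return names.count
    case .literal:
        return 0
    }
}
```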

> 
> --- 
> 
> In light of these shortcomings, I would argue that the following alternative scheme is the most intuitive and consistent for pattern matching given the general agreement that enum case representation should be "normalized":
> 
> Given:
> 
> ```
> enum S {
>   case foo(bar: Int, baz: Int)
>   case foo(boo: String)
>   case bar(boo: String)
> }
> ```
> 
> a. As in functions after SE-0111, enum cases can be identified unambiguously, regardless of whether one is initializing a variable or matching a case, by their compound name, e.g. `bar(boo:)`. Where a case can be unambiguously identified with only the base name, that is an alternative spelling, e.g. `bar`. Where a case cannot be identified uniquely with the base name, then it is an error to try to use the base name alone: `case foo: break // error: ambiguous`.
> 
> b. As in functions after SE-0111, arguments can be passed in either a sugared form or an unsugared form, and they can be bound in a pattern matching statement in the same way. That is, `case foo(bar: let a, baz: let b): break` and `case foo(bar:baz:)(let a, let b): break` are equivalent.
> 
> c. As in functions, one cannot supply different or incorrect argument labels. That is, `case foo(baz: let a, bar: let b)` and `case foo(baz:bar:)(let a, let b)` are both forbidden. _This recovers the vast majority of the additional syntactic safety that is outlined in the revised proposal, but without the use of any special rules for pattern matching._
> 
> d. By composing rules (a) and (b), `case bar(let a)` is allowed as it is today, preserving source compatibility. However `case foo(let b, let c)` is not allowed, and _not_ because different local variable names are chosen, but because the enum has two cases named foo.

From a user’s point of view, there’s enough positional information in this pattern for the compiler to figure out which case it should match. This would be very unintuitive IMO.

> I believe that this alternative achieves the goals of normalizing enum case representation, simplifying the rules around pattern matching and adding safety to pattern matching by preventing reordering/mismatching of labels that were sometimes permitted with tuples, and preserving almost all source compatibility, without the use of ad-hoc rules.
> 
> 
> 	• If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
> 
> N/A
> 
> 	• How much effort did you put into your review? A glance, a quick reading, or an in-depth study?
> 
> In-depth study
> 
> 