[swift-evolution] Normalize Enum Case Representation (rev. 2)

Daniel Duan daniel at duan.org
Thu Mar 9 13:48:19 CST 2017


> On Mar 9, 2017, at 12:31 AM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
> 
> On Thu, Mar 9, 2017 at 1:07 AM, Daniel Duan <daniel at duan.org> wrote:
> Thanks for the thoughtful feedback, Xiaodi! Replies are inline. I'm going to incorporate some of the responses into the proposal.
> 
> On Mar 8, 2017, at 9:56 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
> 
>> The rendered version differs from the text appended to your message. I'll assume the more fully fleshed out version is what you intend to submit. Three comments/questions:
>> 
>> Enum Case "Overloading"
>> 
>> An enum may contain cases with the same full name but with associated values of different types. For example:
>> 
>> enum Expr {
>>     case literal(Bool)
>>     case literal(Int)
>> }
>> The above cases have overloaded constructors, which follow the same rules as functions at call site for disambiguation:
>> 
>> // It's clear which case is being constructed in the following.
>> let aBool: Expr = .literal(false)
>> let anInt: Expr = .literal(42)
>> User must specify an as expression in sub-patterns in pattern matching, in order to match with such cases:
>> 
>> case .literal(let value) // this is ambiguous
>> case .literal(let value as Bool) // matches `case literal(Bool)`
>> 
>> Comment/question 1: Here, why aren't you proposing to allow `case .literal(let value: Bool)`? For one, it would seem to be more consistent.
> 
> The example in the proposal doesn't include any labels. Are you suggesting two colons for sub-patterns with labels? Like `case .literal(value: let value: Bool)`? This looks jarring. But I'm definitely open to other suggestions.
> 
> That does look jarring. But hmm.
>  
>> Second, since we still have some use cases where there's Obj-C bridging magic with `as`, using `as` in this way may run into ambiguity issues if (for example) you have two cases, one with associated value of type `String` and the other of type `NSString`.
> 
> Either this should be rejected at declaration, or we need a way to accept a "pre-magic" resolution at pattern matching when this scenario is at hand.
> 
> Or we align pattern matching to function syntax and have such cases disambiguated in that way (see below).
>  
> I'm on the phone so I can't verify. Wouldn't function overloading face a similar problem?
> 
>> Also, since enum cases are to be like functions, I assume that the more verbose `as` version would work for free: `case .literal(let value) as (Bool) -> Expr`?
>> 
> 
> This is not being proposed. When a user sees/authors a case, their expectation for the declared case constructor should resemble that of a function. Pattern matching was considered separately since it's not relatable syntactically.
> 
> This requires justification. If enum cases are to be like functions, then the logical expectation is that pattern matching should work in that way too. I see no rationale to undergird your claim that pattern matching is "not relatable syntactically." Allowing `case .literal(let value) as (Bool) -> Expr` would solve the issue above, as well as provide more flexibility with the issues below.

I have concerns about the verbosity this syntax introduces. Example:

enum A { case v(Int) }
enum B { case v(A); case v(Int) }

To disambiguate a value of type B, it would be

case .v(A.v(let xValue)) as ((Int -> A) -> B)

This scales poorly for cases with deeper recursions and/or more associated values.

Disambiguating at the sub-pattern level doesn’t have this scalability problem.
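
To make "sub-pattern level" concrete, here is a minimal sketch using the `as` sub-pattern that Swift already has. It rests on a loud assumption: shipping Swift has no overloaded cases, so the hypothetical `B.v` below carries an `Any` payload as a stand-in, and `as` performs a runtime cast rather than the static disambiguation the proposal describes. The `describe` function is only for illustration.

```swift
// Hypothetical stand-ins for the enums above. `B.v` holds `Any` because
// overloaded cases (`case v(A); case v(Int)`) are not available today.
enum A { case v(Int) }
enum B { case v(Any) }

func describe(_ b: B) -> String {
    switch b {
    case .v(let inner as A):
        // The type is settled at the sub-pattern; nesting continues as usual.
        switch inner {
        case .v(let x): return "A.v(\(x))"
        }
    case .v(let x as Int):
        return "Int(\(x))"
    case .v:
        return "something else"
    }
}
```

Each annotation sits next to the binding it concerns, so adding another level of nesting adds one local `as` instead of growing a function type that wraps the whole pattern.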

—

We’ve encountered a bigger question than it initially seems. Let’s zoom out.

There are two popular kinds of patterns for value deconstruction in programming languages: patterns for trees and patterns for sequences. The former deconstruct values that are prominently recursive (enums, structs, tuples); the latter deal with list-like things that grow indefinitely in one direction. We are now investigating a syntax that could potentially be used for all tree patterns. Where the “shape” alone isn’t enough information, the user must supplement the pattern with a type for the match to succeed. If we introduce patterns for structs in the future, whatever we come up with here for type disambiguation should work there too.
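
A rough sketch of the two kinds, in today's Swift. The `Node` enum and both functions are hypothetical names made up for this example. Swift has no dedicated sequence patterns, so the second half emulates the head/tail style with `first` and `dropFirst()`.

```swift
// Tree pattern: the pattern mirrors the recursive shape of the value.
indirect enum Node {
    case literal(Int)
    case add(Node, Node)
}

// Returns the right operand of `0 + n`, or nil for any other shape.
func rightOperandOfZeroPlus(_ node: Node) -> Int? {
    if case .add(.literal(0), .literal(let rhs)) = node {
        return rhs
    }
    return nil
}

// Sequence-style deconstruction: peel off the head, recurse on the tail.
func sum(_ xs: ArraySlice<Int>) -> Int {
    guard let head = xs.first else { return 0 }
    return head + sum(xs.dropFirst())
}
```

In the tree case the shape `.add(.literal, .literal)` is enough to select the match. The question in this thread only arises when two cases share a shape and a name, and the type has to break the tie.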
>> Alternative Payload-less Case Declaration <https://github.com/dduan/swift-evolution/blob/SE0155-rev2/proposals/0155-normalize-enum-case-representation.md#alternative-payload-less-case-declaration>
>> 
>> In Swift 3, the following syntax is valid:
>> 
>> enum Tree {
>>     case leaf() // the type of this constructor is confusing!
>> }
>> Tree.leaf has a very unexpected type to most Swift users: (()) -> Tree
>> 
>> We propose this syntax declare the "bare" case instead. So it's going to be the equivalent of
>> 
>> enum Tree {
>>     case leaf // `()` is optional and does the same thing.
>> }
>> 
>> 
>> Comment/question 2: First, if associated values are not to be modeled as tuples, for backwards compatibility the rare uses of `case leaf()` should be migrated to `case leaf(())`.
>> 
> 
> Yes,
> 
> Cool.
>  
> and when the user uses an arbitrary name where they should have used a label, or when labels are misspelled, the compiler should suggest the correct labels.
> 
> As below, I disagree with this restriction very strongly.
>  
> I wasn't sure how much migrator-related detail should go into a proposal. Perhaps there should be more.
>> Second, to be clear, you are _not_ proposing additional sugar so that a case without an associated value be equivalent to a case that has an associated value of type `Void`, correct? You are saying that, with your proposal, both `case leaf()` and `case leaf` would be regarded as being of type `() -> Tree` instead of the current `(()) -> Tree`?
>> 
> 
> Correct. I'm _not_ proposing implicit `Void`.
>> [The latter (i.e. `() -> Tree`) seems entirely fine. The former (i.e. additional sugar for `(()) -> Tree`) seems mostly fine, except that it would introduce an inconsistency with raw values that IMO is awkward. That is, if I have `enum Foo { case bar }`, it would make case `bar` have implied associated type `Void`; but, if I have `enum Foo: Int { case bar }`, would case `bar` have raw value `0` of type `Int` as well as associated value `()` of type `Void`?]
>> 
>> 
>> Pattern Consistency <https://github.com/dduan/swift-evolution/blob/SE0155-rev2/proposals/0155-normalize-enum-case-representation.md#pattern-consistency>
>> 
>> (The following enum will be used throughout code snippets in this section).
>> 
>> indirect enum Expr {
>>     case variable(name: String)
>>     case lambda(parameters: [String], body: Expr)
>> }
>> Compared to patterns in Swift 3, matching against enum cases will follow stricter rules. This is a consequence of no longer relying on tuple patterns.
>> 
>> When an associated value has a label, the sub-pattern must include the label exactly as declared. There are two variants that should look familiar to Swift 3 users. Variant 1 allows user to bind the associated value to arbitrary name in the pattern by requiring the label:
>> 
>> case .variable(name: let x) // okay
>> case .variable(x: let x) // compile error; there's no label `x`
>> case .lambda(parameters: let params, body: let body) // Okay
>> case .lambda(params: let params, body: let body) // error: 1st label mismatches
>> User may choose not to use binding names that differ from labels. In this variant, the corresponding value will bind to the label, resulting in this shorter form:
>> 
>> case .variable(let name) // okay, because the name is the same as the label
>> case .lambda(let parameters, let body) // this is okay too, same reason.
>> case .variable(let x) // compiler error. label must appear one way or another.
>> case .lambda(let params, let body) // compiler error, same reason as above.
>> Comment/question 3: Being a source-breaking change, that requires extreme justification, and I just don't think there is one for this rule. The perceived problem being addressed (that one might try to bind `parameters` to `body` and `body` to `parameters`) is unchanged whether enum cases are modeled as tuples or functions, so aligning enum cases to functions is not in and of itself justification to revisit the issue of whether to try to prohibit this. 
>> 
> 
> To reiterate, here patterns are changed not for any kind of "alignment" with function syntax. It changed because we dropped the tuple pattern (which remains available for matching with tuple values, btw), therefore we need to consider what a first-class syntax for enum case would look like.
> 
> Since the rationale for this proposal is to "normalize" enum cases by making them more function-like, again you will need to justify why pattern matching should break from that overarching goal.
> 
> This is a source-breaking change, so it's not enough that a "first-class syntax" from the ground up would be different from the status quo (which was the Swift 3 evolution standard--if we were to do it again from scratch, would we still do it this way?). The Swift 4 evolution expectation is that a source-breaking change should require "extreme" justification.

Fair enough. I think the Swift 3 criterion is met. As for Swift 4, I used the word “deprecated” in the source compatibility section. I imagine this means that only deprecation warnings and fix-its are issued in Swift 4, and the warning becomes an error in Swift 5. Obviously, that’s not a justification…

What do you think, Joe?

> 
> The justification for this breaking change is this: with tuples, labels in patterns are not well enforced. Users can skip them, bind values to totally arbitrary names, etc. I personally think emulating such rules prevents us from making pattern matching easier to read for experienced devs and easier to learn for newcomers.
> 
> Perhaps, but this is an argument that tuple pattern binding is inferior. It has nothing to do with enum cases in particular. In fact, several threads have touched on this topic holistically. The conclusions there have been that allowing (a: Int, b: Int) to bind (Int, Int) or vice versa is healthy and useful, but allowing (a: Int, b: Int) to bind (b: Int, c: Int) is not so good, and (a: Int, b: Int) binding (b: Int, a: Int) is counterintuitive and should be removed.

Fantastic! Really appreciate this summary.
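
For reference, a minimal sketch of the conversions in the first group (labeled to unlabeled and back), which is the part described as healthy and useful. The constant names are made up for illustration. The relabeling and reordering conversions are deliberately left out, since those are the ones the earlier threads considered questionable.

```swift
let labeled: (a: Int, b: Int) = (a: 1, b: 2)

// (a: Int, b: Int) binds to (Int, Int): the labels are simply dropped.
let unlabeled: (Int, Int) = labeled

// (Int, Int) binds to (a: Int, b: Int): the labels are added back.
let relabeled: (a: Int, b: Int) = unlabeled
```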

> Emulating tuple rules is the backwards-compatible way. Therefore, it ought to be the default unless the source-breaking alternative is not merely easier but _overwhelmingly_ easier.

I assume by “easier” you mean impacting existing codebases less, whether by design or through tooling?

> If tuples are dangerously broken, then that calls for a proposal to change tuples.
> 
> It's reasonable to expect existing patterns in the wild either already use the labels found in declaration or they are matching against label-less cases.
> 
> I don't think that is reasonable to expect.
>  
> In other words, existing code with good style won't be affected much.
> 
> You're defining your preference as "good style" and saying that a proposal to enforce your preference by the compiler won't affect code with "good style" (i.e. adhering to your preference), which is tautologically true and also not a valid argument.

This is a combination of personal experience and some wishful thinking on my part. I work on a few large Swift projects and have observed that, most of the time, the labels match the names chosen in the pattern. The selection bias here is strong, of course. I wish I could qualify it as “most of the code in the wild” in place of “good style” :P
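
To show the two styles being compared, here is a small sketch against the `Expr` enum from the proposal. Both functions are hypothetical and exist only for illustration. As far as I know, both compile with the current compiler; under the proposed rule, the second would need to use the labels or bind to names that match them.

```swift
indirect enum Expr {
    case variable(name: String)
    case lambda(parameters: [String], body: Expr)
}

// Style 1: labels spelled out, bindings named after them.
func renderWithLabels(_ e: Expr) -> String {
    switch e {
    case .variable(name: let name):
        return name
    case .lambda(parameters: let parameters, body: let body):
        let list = parameters.joined(separator: ", ")
        return "(\(list)) -> \(renderWithLabels(body))"
    }
}

// Style 2: labels omitted, arbitrary binding names.
func renderWithoutLabels(_ e: Expr) -> String {
    switch e {
    case .variable(let x):
        return x
    case .lambda(let ps, let b):
        let list = ps.joined(separator: ", ")
        return "(\(list)) -> \(renderWithoutLabels(b))"
    }
}
```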

>  
> For the code that actually would break, I think the migrator and the compiler can provide sufficient help in form of migration/fixits/warnings.
> 
> Ultimately, I think requiring labels to appear one way or another in patterns will improve the readability of the pattern matching site and force the author of a case declaration to consider the use site more.
>> In fact, I think the proposed solution suffers from two great weaknesses. First, it seems ad-hoc. Consider this: if enum cases are to be modeled as functions, then I should be able to write something intermediate between the options above; namely: `case .variable(name:)(let x)`. Since `.variable` unambiguously refers to `.variable(name:)`, I should also be allowed to write `.variable(let x)` just as I am now.
>> 
> Again, patterns are not to be modeled after functions. Only the declaration and usage of case constructors are.
> 
> This requires justification. Why are they not to be? IMO, they ought to be.
> 
>> Second, it seems unduly restrictive. If, in the containing scope, I have a variable named `body` that I don't want to shadow, this rule would force me to either write the more verbose form or deal with shadowing `body`. If a person opts for the shorter form, they are choosing not to use the label.
>> 
> 
> In fact this (to avoid label conflict in nesting) is the only reason the longer form allows rebinding to other names at all! You say "unduly restrictive", I say "necessarily flexible" :)
>  
> The great majority of Swift users don't read this list, nor care about the starting point from which a proposal added "flexibility." They compare Swift N to Swift N + 1, and in this case, there is a new restriction. That is the only comparison which matters. So make no mistake, you are proposing a source-breaking restriction. For the arguments I gave in my earlier reply, I believe it is _unduly_ restrictive.
>> 
>> Only one of these variants may appear in a single pattern. Swift compiler will raise a compile error for mixed usage.
>> 
>> case .lambda(parameters: let params, let body) // error, can not mix the two.
>> Some patterns will no longer match enum cases. For example, all associated values can bind as a tuple in Swift 3, this will no longer work after this proposal:
>> 
>> // deprecated: matching all associated values as a tuple
>> if case let .lambda(f) = anLambdaExpr {
>>     evaluateLambda(parameters: f.parameters, body: f.body)
>> }
