[swift-dev] [swift-evolution] Re-pitch: Deriving collections of enum cases

Xiaodi Wu xiaodi.wu at gmail.com
Sat Nov 11 12:28:24 CST 2017


On Sat, Nov 11, 2017 at 11:23 AM, Tony Allevato <tony.allevato at gmail.com>
wrote:

>
>
> On Fri, Nov 10, 2017 at 11:01 PM Xiaodi Wu via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>> On Sat, Nov 11, 2017 at 12:15 AM, Brent Royal-Gordon via swift-evolution
>> <swift-evolution at swift.org> wrote:
>>
>>> > Personally I like the flexibility provided by the associatedtype, but
>>> I also recognize it won't be incredibly useful for enums — more so if we
>>> wanted to provide e.g. UInt8.allValues, whose ideal implementation might be
>>> "return 0...UInt8.max". So I could see allowing allValues to be any
>>> sequence or collection, but flexibility for allCases might be less
>>> important. Others should weigh in here.
>>>
>>>
>>> I think we should allow any `Collection` (option 3), and I think we
>>> should have a `DefaultCaseCollection<Enum>` type in the standard library
>>> which encapsulates the interaction with the runtime.
>>>
>>> A `Collection` is good because `Sequence` doesn't provide all of the
>>> guarantees we want—`allValues` should never be single-pass and it should
>>> always be possible to resume iteration at an earlier point, which
>>> `Sequence` doesn't guarantee. (It probably makes sense to use
>>> `BidirectionalCollection`, actually; I'm less sure about
>>> `RandomAccessCollection`.) At the same time, an `Array` ties us to all
>>> sorts of things that are unnecessary at best and harmful at worst, like a
>>> heap allocation. It also forces us into integer indices, which may be
>>> suboptimal for certain use cases (particularly if we later want to support
>>> associated values). And it prevents us from making types like integers
>>> conform, which I think would be a good idea.
>>>
>>> Meanwhile, encapsulating the runtime machinery in
>>> `DefaultCaseCollection` gives users an escape hatch: If the enum you want
>>> to use doesn't conform to `ValueEnumerable`, but you're certain it's
>>> compatible, you can construct a `DefaultCaseCollection` for it.
>>> `DefaultCaseCollection` can be a `RandomAccessCollection` with `Int`
>>> indices, making it convenient to use, but at the same time, it's *not* an
>>> array, so it doesn't have to allocate storage or think about `NSArray`
>>> bridging. And it minimizes the complexity of what the compiler needs to
>>> synthesize.
>>>
>>>         public protocol ValueEnumerable {
>>>                 associatedtype AllValues: BidirectionalCollection where
>>> AllValues.Element == Self
>>>                 static var allValues: AllValues { get }
>>>         }
>>>
>>>         // The compiler automatically does `typealias AllValues =
>>> DefaultCaseCollection<Self>` if the
>>>         // conformance is on the original declaration, the type is
>>> compatible (e.g. no associated values),
>>>         // and a different type is neither explicitly specified nor
>>> inferred. That will cause this default
>>>         // implementation to be used:
>>>         extension ValueEnumerable where AllValues ==
>>> DefaultCaseCollection<Self> {
>>>                 public static var allValues: DefaultCaseCollection<Self>
>>> {
>>>                         return DefaultCaseCollection(unsafeForEnum:
>>> Self.self)
>>>                 }
>>>         }
>>>
>>>         public struct DefaultCaseCollection<Enum>:
>>> RandomAccessCollection {
>>>                 public var startIndex: Int { return 0 }
>>>                 public let endIndex: Int
>>>
>>>                 public init(unsafeForEnum _: Enum.Type) {
>>>                         endIndex = _countCaseValues(Enum.self)
>>>                 }
>>>
>>>                 public subscript(i: Int) -> Enum {
>>>                         precondition(indices.contains(i), "Case index
>>> out of range")
>>>                         return Builtin.reinterpretCast(i) as Enum
>>>                 }
>>>         }
>>>
>>
>> Nit: if you want to call it `ValueEnumerable`, then this should be
>> `DefaultValueCollection`.
>>
>> More generally though, I can see the appeal of allowing `Int` to conform
>> to `ValueEnumerable`, but I've got a feeling that this is headed rapidly in
>> the direction of overengineering a feature without a good rationale for the
>> additional complexity. The stated use case is to enumerate the cases of an
>> enum, and any additional complexity above that should stand on its own
>> merit. I disagree quite vehemently that protocols should be "as general as
>> possible"; rather, they exist to enable useful generic algorithms and
>> should be as _useful_ as possible. There is a happy medium beyond which
>> overengineering the design makes a protocol markedly less
>> usable/approachable for the sake of enabling rare functionality.
>>
>
> Perhaps "no more specific than they need to be" would have been a better
> choice of words on my part than "as general as possible".
>
> I'm not sure why you think this decision makes the protocol "markedly less
> usable/approachable". and I think you're seeing a lot of complexity that
> isn't there. Regarding good rationale, I've given that in this thread above
> and in the previous discussion thread, but I'll summarize it again here:
>
> 1) "The stated use case is to enumerate the cases of an enum" isn't
> sufficient for designing a protocol because Swift does not have protocols
> that are restricted only to enums. Just as anyone can conform their own
> types to RawRepresentable, anyone should be able to conform their own types
> to ValueEnumerable.
>

This is the part of the argument that I am questioning. If the desired use
case is to enumerate the cases of an enum, is it much of a gain to allow
anyone to be able to conform their own non-enum types to this protocol?
Yes, if you buy that, then much of the rest follows. But make no mistake
that it is not necessary to serve the desired use case and dramatically
increases the complexity of the design (see below for just some
considerations).


> And once that's allowed, they should be able to do so without imposing an
> overly restrictive API. What if a custom type has thousands of elements
> that are easily computed sequentially or from an index? Forcing an Array on
> the implementor is an unnecessary restriction and a burden. (More on this
> below.)
>
> 2) There is *no decrease in usability for the common use case *by
> choosing to make the protocol associated type a Sequence instead of a
> Collection or Array because the concrete type synthesized by the compiler *would
> be a more refined concrete type anyway.* In other words, anyone who
> writes "MyEnum.allValues" would get back an integer-indexable random access
> collection. I can imagine useful algorithms (more of an
> algebraic/set-theoretical nature) that could be written for types with
> countably infinite value sets. Why require breaking source/ABI
> compatibility later if we can avoid it now, at no cost for the common user?
>
> 2.1) Anyone who writes a generic algorithm constrained over
> ValueEnumerable *would* have to add an additional constraint of "where T:
> RandomAccessCollection, T.Index == Int". This is a slight increase in
> complexity, but I would argue that if someone is writing generic algorithms
> over collections (and that's what this is), then this is something they
> should already familiar with. I argue that that trade-off is
> acceptable—others might disagree.
>
> 3) There is *no good reason* for the compiler to perform slow memory
> allocation operations to shuffle around information that it *already
> knows* when it can just as easily provide a specialized compact value
> type that is far more time/space efficient. I should finish the prototype I
> was working on—the code synthesis is not markedly more difficult than it
> would be to write code that constructs an array.
>
> 4) Most users *think* they need an array when what they really want is to
> be able to get the count of elements, iterate over them, and possibly
> integer-index into them. The specialized collection type can do all of
> those things more efficiently.
>
> 4.1) If a user *truly* needs an array, they just write
> "Array(MyEnum.allValues)".
>
>
>
>> Along the lines of user ergonomics, I would advocate for as many enums as
>> possible to conform without explicit opt-in. It's true that we are moving
>> away from such magical designs, and for good reason, but the gain here of
>> enums Just Working(TM) for such a long-demanded feature has, I would argue,
>> more benefits than drawbacks. To my mind, the feature is a lot like
>> `RawRepresentable` in several ways, and it would be defensible for an equal
>> amount of magic to be enabled for it.
>>
>
> I tend to agree with this philosophically, but the same argument about a
> long-demanded feature Just Working™ could be/was made for
> Equatable/Hashable being implicit and we decided to make it opt-in instead.
> I'm worried that we're going down a road where "this set of protocols is
> opt-in and this set of protocols is always-on" and users have to remember
> which are which. (And in some cases, it's even more complex! Enums are
> always Eq/Hash if they have raw values or no associated values, for
> historical reasons but they're opt-in with associated values.)
>

There are key differences here. Equatable and Hashable have extensive
semantic guarantees, and it is certainly not the case that a type with
members that are all Equatable fulfills those guarantees itself. It would
be undesirable to have implicit Equatable conformance because the compiler
cannot guarantee semantics and must not make such assumptions. (The
implicit conformance of certain enums to Equatable and Hashable is
interesting yet defensible because it relies on an aspect of those enums
not shared with other types, an aspect that can be profitably relied
upon--see below.) All enums that have one or more cases with no associated
values, by contrast, have an enumerable set of cases. This is something
knowable to the compiler.

This goes to the issue above of whether it is desirable in the first place
to conflate enums that have enumerable cases with a more general
"ValueEnumerable"; the requested feature is to enumerate enum cases, and
having a set of unique cases is a feature of enums not shared with other
types. Inevitably, when you generalize, then you must ask what it means for
a non-enum type to have a set of enumerable case-like values. For example,
note that there is no particular need for an ordered collection if we are
considering enums (cf. previous conversations about supporting resilient
reordering of enum cases); by contrast, such a design would be highly
unintuitive for a range, which has an implicitly ordered set of values.
Since each enum case is guaranteed to be unique, and since Equatable and
Hashable conformance is implicitly synthesized in such a way that all cases
compare not equal to other cases, we could even use a Set for an enum's
`allCases`. Of course, to eliminate the potential need for allocating
memory, this could be a custom type with set-like semantics, but
nonetheless it allows us to guarantee that `Set(T.allCases).count ==
T.allCases.count`, which would be a nontrivial semantic requirement in the
more general case of custom `ValueEnumerable` conformance that the author
of the conforming type would need to consider--if we want to insist on that
semantic requirement at all.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-dev/attachments/20171111/d969af47/attachment.html>


More information about the swift-dev mailing list