[swift-evolution] Pitch: Automatically deriving Equatable/Hashable for more value types

Tony Allevato tony.allevato at gmail.com
Thu May 4 15:37:03 CDT 2017


Hi all,

A conversation on Twitter last night brought up some interest in this
feature and I was encouraged to revive this proposal.

Jordan Rose mentioned
<https://twitter.com/UINT_MIN/status/859922619578986496> on Twitter that it
could possibly make it in by the Swift 4 deadline if others contributed—I
have a WIP branch (albeit not currently working because I rebased after a
couple months of it being idle) that does the work for enums but I got
stuck on the mutually recursive cases. If this got approved, I'd love to
collaborate with other interested folks to finish up the implementation.

Link: https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad


Deriving Equatable and Hashable for value types

   - Proposal: SE-0000
   <https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md>
   - Author(s): Tony Allevato <https://github.com/allevato>
   - Status: Awaiting review
   <https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#rationale>
   - Review manager: TBD

<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#introduction>
Introduction

Value types are prevalent throughout the Swift language, and we encourage
developers to think in those terms when writing their own types.
Frequently, developers have to write large amounts of boilerplate code to
support equatability and hashability of value types. This proposal offers a
way for the compiler to automatically derive conformance to Equatable and
Hashable to reduce this boilerplate, in a subset of scenarios where
generating the correct implementation is known to be possible.

Swift-evolution thread: Universal Equatability, Hashability, and
Comparability <http://thread.gmane.org/gmane.comp.lang.swift.evolution/8919>
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#motivation>
Motivation

Building robust value types in Swift can involve writing significant
boilerplate code to support hashability and equatability. Equality is
pervasive across many value types, and for each one users must implement the
 == operator such that it performs a fairly rote memberwise equality test.
As an example, an equality test for a struct looks fairly uninteresting:

struct Foo: Equatable {
  static func == (lhs: Foo, rhs: Foo) -> Bool {
    return lhs.property1 == rhs.property1 &&
           lhs.property2 == rhs.property2 &&
           lhs.property3 == rhs.property3 &&
           ...
  }
}

What's worse is that this operator must be updated if any properties are
added, removed, or changed, and since it must be manually written, it's
possible to get it wrong, either by omission or typographical error.

Likewise, hashability is necessary when one wishes to store a value type in
a Set or use one as a multi-valued Dictionary key. Writing high-quality,
well-distributed hash functions is not trivial so developers may not put a
great deal of thought into them – especially as the number of properties
increases – not realizing that their performance could potentially suffer
as a result. And as with equality, writing it manually means there is the
potential to get it wrong.

In particular, the code that must be written to implement equality for
enums is quite verbose:

enum Token: Equatable {
  case string(String)
  case number(Int)
  case lparen
  case rparen

  static func == (lhs: Token, rhs: Token) -> Bool {
    switch (lhs, rhs) {
    case (.string(let lhsString), .string(let rhsString)):
      return lhsString == rhsString
    case (.number(let lhsNumber), .number(let lhsNumber)):
      return lhsNumber == rhsNumber
    case (.lparen, .lparen), (.rparen, .rparen):
      return true
    default:
      return false
    }
  }
}

Crafting a high-quality hash function for this enum would be similarly
inconvenient to write.

Swift already derives Equatable and Hashable conformance for a small subset
of enums: those for which the cases have no associated values (including
enums with raw types). Two instances of such an enum are equal if they are
the same case, and an instance's hash value is its ordinal:

enum Foo  {
  case zero, one, two
}
let x = (Foo.one == Foo.two)  // evaluates to falselet y =
Foo.one.hashValue     // evaluates to 1

Likewise, conformance to RawRepresentable is automatically derived for
enums with a raw type. Since there is precedent for derived conformances in
Swift, we propose extending this support to more value types.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#proposed-solution>Proposed
solution

In general, we propose that value types derive conformance to Equatable/
Hashable if all of its members are Equatable/Hashable. We describe the
specific conditions under which these conformances are derived below,
followed by the details of how the conformance requirements are implemented.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#protocol-derivability-conditions>Protocol
derivability conditions

For brevity, let P represent either the protocol Equatable or Hashable in
the descriptions below.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#derived-conformances-for-enums>Derived
conformances for enums

For an enum, derivability of P is based on the conformances of its cases'
associated values. Computed properties are not considered.

The following rules determine whether conformance to P can be derived for an
 enum:

   -

   An enum with no cases does not derive conformance to P, since it is not
   possible to create instances of such types.
   -

   An enum with one or more cases derives conformance to P if all of the
   associated values of all of its cases conform to P (either explicitly or
   derived).

<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#derived-conformances-for-structs>Derived
conformances for structs

For a struct, derivability of P is based on the conformances of its stored
instance properties *only*. Neither static properties nor computed instance
properties (those with custom getters) are considered.

The following rules determine whether conformance to P can be derived for a
struct:

   -

   A struct with *no* stored properties does *not* derive conformance to P.
   (Even though it is vacuously true that all instances of a struct with no
   stored properties could be considered equal and hash to the same value, the
   reality is that such structs are more often used for grouping/nesting of
   other entities and not for their singular value, and we don't consider it
   worthwhile to generate extra code in this case.)
   -

   A struct with one or more stored properties derives conformance to P if
   all if the types of all of its stored properties conform to P (either
   explicitly or derived).

<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#considerations-for-recursive-types>Considerations
for recursive types

For brevity in the discussion below, the term *members* refers only to
those members that are checked for deriving conformances: *stored
properties* for structs and *associated values* for enums.

Recursive value types require a bit more care when determining whether a
conformance can be derived. Consider the following enum with an indirect
case:

enum TreeNode {
  case empty
  case leaf(value: Int)
  case internal(left: TreeNode, right: TreeNode)
}

When examining the internal case, an application of the rules above implies
that "TreeNode derives P if TreeNode conforms to P"—a recursive condition.
In this situation, we note that any instance of this type—or of any
recursive type—forms a finite tree structure because the recursion must be
terminated eventually by using one of the other base cases. Therefore,
without changing the outcome, we can assume for the purposes of determining
whether T derives P that any members of type T already conform to P. If any
members of different types prohibit deriving P, then we know that the whole
type cannot derive it; likewise, if all of the other members permit deriving
 P, then we know that T can derive it by recursively applying the derived
operation.

This property can be extended to *mutually* recursive types as well.
Consider this contrived example:

enum A {
  case value(Int)
  case b(B)
}
enum B {
  case value(String)
  case c(C)
}
enum C {
  case value(Double)
  case a(A)
}

The rules state that—ignoring the trivial cases—"A derives P if B conforms
to P," and "B derives P if Cconforms to P," and "C derives P if A conforms
to P." The same observation about recursion and the finiteness of instances
from above holds here, so we can generalize the rule above as follows:

A type T can derive P if all members of T conform to P or are of types
found in cycles that lead back to Tsuch that the members of those other
types along the cycle also all conform to P or are themselves along such a
cycle.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#other-considerations>Other
considerations

When conditional conformances are supported in Swift, generic types should
conditionally derive Equatable and Hashable for type argument substitutions
where the rules above are satisfied.

A notable side effect of this is that Optional<Wrapped> would derive
Equatable and Hashable conformance when Wrapped conforms to those
protocols, because it is an enum that would satisfy the rules described
above. Its implementation would not need to be in the standard library (but
there is also nothing preventing it from being there).

Conditional conformances will also significantly improve derivability
coverage over other payload/member types. For example, consider a
struct containing
a stored property that is an array of Equatable types:

struct Foo {
  var values: [String]
}

Today, Array<String> does not conform to Equatable, so its presence would
prohibit Foo from deriving Equatable. However, once Swift can express the
conformance Array<Element>: Equatable where Element: Equatable, Foo would
automatically derive Equatable as well. This makes derived conformances
significantly more powerful.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#implementation-details>Implementation
details

An enum T that derives Equatable will receive a compiler-generated
implementation of static == (lhs: T, rhs: T) -> Bool that returns true if
and only if lhs and rhs are the same case and have payloads that are
memberwise-equal.

An enum T that derives Hashable will receive a compiler-generated
implementation of var hashValue: Int { get }that uses an unspecified hash
function† to compute the hash value by incorporating the case's ordinal
(i.e., definition order) followed by the hash values of its associated
values as its terms, also in definition order.

A struct T that derives Equatable will receive a compiler-generated
implementation of static == (lhs: T, rhs: T) -> Bool that returns true if
and only if lhs.x == rhs.x for all stored properties in T.

A struct T that derives Hashable will receive a compiler-generated
implementation of var hashValue: Int { get } that uses an unspecified hash
function† to compute the hash value by incorporating the hash values of the
fields as its terms, in definition order.

† We intentionally leave the exact definition of the hash function
unspecified here. A multiplicative hash function with good distribution is
the likely candidate, but we do not rule out other possibilities. Users
should not depend on the nature of the generated implementation or rely on
particular outputs; we reserve the right to change it in the future.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#overriding-derived-conformances>Overriding
derived conformances

Any user-provided implementations of == or hashValue will override the
default implementations that would be provided by the compiler. This is
already the case possible today with raw-value enums so the same behavior
should be extended to other value types that are made to implicitly conform
to these protocols.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#open-questions>Open
questions
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#omission-of-fields-from-generated-computations>Omission
of fields from generated computations

Some commenters have expressed a desire to tag certain properties of a
struct from being included in automatically generated equality tests or
hash value computations. This could be valuable, for example, if a property
is merely used as an internal cache and does not actually contribute to the
"value" of the instance. Under the rules above, if this cached value was
equatable, a user would have to override == and hashValue and provide their
own implementations to ignore it.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#explicit-or-implicit-derivation>Explicit
or implicit derivation

As with raw-value enums today, should the derived conformance be completely
implicit, or should users have to explicitly list conformance with Equatable
 and Hashable in order for the compiler to generate the derived
implementation?

If derived conformances were made explicit, it could be argued that this
should also be done for consistency across raw-value enums as well. This
would be a source-breaking change, which should be avoided at this stage.
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#impact-on-existing-code>Impact
on existing code

This change would make types that satisfy the rules above Equatable and
Hashable when they previously were not. It is not expected that there would
be any *behavioral* changes because of this; since Equatable and Hashable have
associated type requirements, users cannot be dynamically testing for
conformance to them at runtime.

Value types that already provide custom implementations of Equatable and
Hashable would keep the custom implementation because it would override the
compiler-provided default.

This change would potentially increase binary size when it generates
conformances that did not exist before, at least for types where it is not
possible to know that the conformances are unused and thus cannot be
dead-stripped (i.e., public types).
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#alternatives-considered>Alternatives
considered

The original discussion thread also included Comparable as a candidate for
automatic generation. Unlike equatability and hashability, however,
comparability requires an ordering among the members being compared.
Automatically using the definition order here might be too surprising for
users, but worse, it also means that reordering properties in the source
code changes the code's behavior at runtime. (This is true for hashability
as well if a multiplicative hash function is used, but hash values are not
intended to be persistent and reordering the terms does not produce a
significant *behavioral* change.)
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#acknowledgments>
Acknowledgments

Thanks to Joe Groff for spinning off the original discussion thread, Jose
Cheyo Jimenez for providing great real-world examples of boilerplate needed
to support equatability for some value types, and to Mark Sands for
necromancing the swift-evolution thread that convinced me to write this up.
------------------------------
<https://gist.github.com/allevato/2fd10290bfa84accfbe977d8ac07daad#rationale>
Rationale

On [Date], the core team decided to (TBD) this proposal. When the core team
makes a decision regarding this proposal, their rationale for the decision
will be written here.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20170504/8b76de27/attachment.html>


More information about the swift-evolution mailing list