[swift-evolution] [Draft] Hasher & HashVisitable

David Sweeris davesweeris at mac.com
Mon Mar 13 20:03:23 CDT 2017


> On Mar 13, 2017, at 8:38 AM, Vincent Esche via swift-evolution <swift-evolution at swift.org> wrote:
> 
> Hi there,
> 
> I've written up a proposal for an overhaul/replacement of the Hashable protocol and would love to hear your feedback on it!
> 
> Rendered <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25> | Blog Post <https://blog.definiteloops.com/ha-r-sh-visitors-8c0c3686a46f>
> 
> Cheers,
> Vincent
> 
> Ps: I'd like to thank David Hart (@hartbit) for his great editorial feedback on this proposal. đź‘Ť
> 
> HashVisitable
> 
> Proposal: SE-NNNN <https://gist.github.com/regexident/NNNN-filename.md>
> Authors: Vincent Esche <https://github.com/regexident>
> Review Manager: TBD
> Status: Awaiting review
>  <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#introduction>Introduction
> 
> Replace the Hashable protocol by two new procotols (Hasher and HashVisitable) to improve safety, versatility and learnability.
> 
>  <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#motivation>Motivation
> 
> Implementing Hashable is difficult and the consequences if not done well have dire performance and safety repercussions.
> 
> The documentation of Hashable lists a sample implementation <https://developer.apple.com/reference/swift/hashable> of var hashValue:
> 
> /// A point in an x-y coordinate system.
> struct GridPoint {
>     var x: Int
>     var y: Int
> }
> 
> extension GridPoint: Hashable {
>     var hashValue: Int {
>         return x.hashValue ^ y.hashValue
>     }
> 
>     static func == (lhs: GridPoint, rhs: GridPoint) -> Bool {
>         return lhs.x == rhs.x && lhs.y == rhs.y
>     }
> }
> Calculating the hashes of all GridPoints (given the above implementation) on a 1000 × 1000 grid …
> 
> let (width, height) = (1000, 1000)
> let total = width * height
> var hashes = Set<Int>()
> for x in 0..<width {
>     for y in 0..<height {
>         hashes.insert(GridPoint(x: x, y: y).hashValue)
>     }
> }
> print("\(hashes.count) unique hashes out of a total of \(total).")
> … results in just 1024 unique hash values for 1_000_000 unique values.
> 
> In other words: The recommended implementation causes 99.9% of values to trigger a hash collision.
> 

And that’s why I tend to look at Dictionary<> and Set<> kinda like they’re that kid in The Exorcist whenever I start thinking about how they work…

I understand hashing in theory, but can’t for the life of me come up with an algorithm for creating them the wouldn't almost immediately result in collisions for the kinds of use-cases I envision. I haven’t finished reading the proposal yet, but I wanted to get a firm and enthusiastic +1 in for its intent.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20170313/6cc0615c/attachment.html>


More information about the swift-evolution mailing list