[swift-evolution] Custom equality/hash for Sets

Nicola Salmoria nicola.salmoria at gmail.com
Fri Feb 19 03:08:10 CST 2016


Would it make sense to abstract the hash functions into a protocol?

The limit of the Hashable protocol is that it forces a type to commit to a specific ‘canonical’ hash implementation, while there might be more efficient implementations depending on the application.

To truly support composition patterns, it would seem appropriate for the standard library to support decoupling the hashing functionality from the type of the container element.

E.g.

protocol Hash {
    typealias Element

    static func equal(lhs: Element, _ rhs: Element) -> Bool
    static func hashValue(x: Element) -> Int
}

// now we can define a generic box
struct HashableBox<T, H: Hash where H.Element == T> : Hashable {
    let value: T
    
    var hashValue: Int {
        return H.hashValue(value)
    }
}
func ==<T, H>(lhs: HashableBox<T, H>, rhs: HashableBox<T, H>) -> Bool {
    return H.equal(lhs.value, rhs.value)
}

// now we can define a wrapper around Set, using an explicit Hash
struct CustomHashSet<Element, H: Hash where H.Element == Element> {
    // this is an implementation detail; if CustomHashSet was part of the standard library,
    // the .gyb could produce a more efficient implementation
    private var set: Set<HashableBox<Element, H>>
    
    init() {}
    init<S: SequenceType where S.Generator.Element == Element>(_ sequence: S) {
        set = Set(sequence.lazy.map { HashableBox<Element, H>(value: $0) })
    }
    // etc.
}


// Example usage
struct NaiveCollectionHash<C: CollectionType where C.Generator.Element: Hashable>: Hash {
    typealias Element = C
    
    static func equal(lhs: Element, _ rhs: Element) -> Bool {
        guard lhs.count == rhs.count else { return false }
        return zip(lhs, rhs).lazy.reduce(true) { $0 && ($1.0 == $1.1) }
    }
    static func hashValue(x: Element) -> Int {
        return x.reduce(0) { $0 ^ $1.hashValue }    // a bad hash just as an example
    }
}

let arrays = [[1],[2]]

// This doesn't work because Array is not Hashable (yet)
var s1 = Set(arrays) // Error

// This works but we need to explicitly specify all the types
var s2 = CustomHashSet<Array<Int>, NaiveCollectionHash<Array<Int>>>(arrays) // ok

// Generic typealiases would help reducing the boilerplate, e.g.
// it would be useful if we could do
typealias NaiveHashSet<T> = CustomHashSet<T, NaiveCollectionHash<T>>    // Error
// then Swift's type inference should be able to automatically handle this:
var s3 = NaiveHashSet(arrays)   // Error


Nicola

> Cc:swift-evolution<swift-evolution at swift.org>
> Subject:[swift-evolution] Custom equality/hash for Sets
> Date:19 February 2016 at 05:04:51 GMT+1
> 
> 
> 
> > On Feb 18, 2016, at 16:09 , Dave Abrahams via swift-evolution<swift-evolution at swift.org(mailto:swift-evolution at swift.org)>wrote:
> > 
> > on Thu Feb 18 2016, Jacob Bandes-Storch<swift-evolution at swift.org(mailto:swift-evolution at swift.org)>wrote:
> > 
> > > Would it make sense for the standard library Set to provide variants (or
> > > parallel versions of the same data structure) that take custom hashValue/==
> > > implementations at init time (functions taking in Elements), rather than
> > > relying on Hashable/Comparable protocols?
> > > 
> > > Use case: I want a set of objects that are compared for equality using ===
> > > rather than ==. This doesn't seem possible today, using Set, without
> > > creating some sort of wrapper object.
> > > 
> > > This particular case would be analogous to using NSHashTable with
> > > NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
> > > Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
> > > generics instead of UnsafePointer<Void>, etc.)
> > > 
> > > Similarly, C++'s unordered_map
> > > <http://en.cppreference.com/w/cpp/container/unordered_map>and friends have
> > > template parameters specifying the hash function and equality comparator,
> > > which use std::hash and == by default.
> > 
> > It might make sense.How bad is the wrapper solution for user code?
> > struct CustomHashableFoo: Hashable {
> > var value: Foo
> > func hash() ->Int {
> > // custom hash function here
> > }
> > }
> > func ==(a: CustomHashableWrapped, b: CustomHashableWrapped) {
> > // custom == here
> > }
> Really not that bad, although you do have to get 'value' in and out of the box. It's also not reusable code—you have to rewrite the box for every type.
> 
> I'd say you usuallydon'twant to allow custom hash/== closures because (a) then you have to store them somewhere, and (b) the compiler won't usually be able to inline them away. You also end up with a Set<Foo>that doesn't behave like a normal Set<Foo>—maybe you can insert equal-but-not-identical elements—which is bad if anyone's relying on normal Set-like guarantees.
> 
> -1 from me until we can put functions in generics, if ever.
> 
> Jordan_______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
> 
> 
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160219/24fa4be0/attachment.html>


More information about the swift-evolution mailing list