[swift-evolution] Refining SE-0185: Should providing a custom == suppress the default hashValue?

Tony Allevato tony.allevato at gmail.com
Sat Dec 16 00:41:58 CST 2017


On Fri, Dec 15, 2017 at 10:27 PM Kevin Nattinger <swift at nattinger.net>
wrote:

> I’m skeptical that your use case is common enough to justify leaving open
> the glaring bug magnet that started this thread.
>

The use case you're replying to is not the bug described in the original
post.

The original post described "what happens if you implement ==, but keep
synthesized hashValue?" That has a high probability of being wrong,
especially if == touches a subset of the fields but hashValue continues to
use all of them. As I stated in my original reply, I fully support plugging
that hole.
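
For reference, a minimal sketch of what a consistent pairing looks like, reusing `Foo` from Joe's original post. The hand-written `hashValue` is an illustrative assumption added here, not part of the original post or the proposal:

```swift
struct Foo: Hashable {
    // This property is "part of the value"
    var involvedInEquality: Int
    // This property isn't; maybe it's a cache or something like that
    var notInvolvedInEquality: Int

    static func ==(a: Foo, b: Foo) -> Bool {
        return a.involvedInEquality == b.involvedInEquality
    }

    // Illustrative addition: hash only what == compares, so that equal
    // values always produce the same hash.
    var hashValue: Int {
        return involvedInEquality.hashValue
    }
}

let x = Foo(involvedInEquality: 1, notInvolvedInEquality: 10)
let y = Foo(involvedInEquality: 1, notInvolvedInEquality: 99)

// x and y differ only in the field that == ignores.
assert(x == y)
assert(x.hashValue == y.hashValue)
```

With the synthesized hashValue, the second assertion would have no such guarantee, because both properties would feed the hash.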

The use case I mentioned later in response to Howard is "what happens if
you implement hashValue, but keep synthesized ==?" That's perfectly valid.
The required relationship between Equatable and Hashable is that any two
values that are equal must hash to the same value, but two values that have
the same hash value need not be equal.
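
A rough sketch of that direction, assuming a hypothetical `Record` type (the type and its field names are invented for illustration). Only `hashValue` is written by hand; `==` is left to the compiler. Toolchains newer than this thread spell the requirement `hash(into:)` and may warn about implementing `hashValue` directly:

```swift
// Hypothetical type for illustration; not from the thread.
struct Record: Hashable {
    var id: Int
    var name: String
    var notes: String

    // Explicit hash over a subset of the fields. == is still synthesized
    // and compares id, name, and notes.
    var hashValue: Int {
        return id.hashValue
    }
}

let a = Record(id: 1, name: "A", notes: "first")
let b = Record(id: 1, name: "A", notes: "first")
let c = Record(id: 1, name: "A", notes: "second")

// Equal values hash to the same value, as Hashable requires.
assert(a == b && a.hashValue == b.hashValue)
// The same hash value does not imply equality; this is only a collision.
assert(a != c && a.hashValue == c.hashValue)
```

A `Set` built from `a`, `b`, and `c` ends up with two elements: the collision between `a` and `c` costs an extra `==` check but does not affect correctness.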



> Could you give an example of a common occurrence where it would be a
> significant improvement to explicitly write a *less precise* hash function
> that’s only “good enough” and still want the more precise full equality?
>

Though I did mention subsets of fields earlier, that's not the only valid
case—it could just be a *different* hash function. If someone wants to
implement a different hash algorithm than the default, that should *not*
require them to also write out all the boilerplate to test the pairwise
fields for ==.
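
As a sketch of the "different hash function" case, assume a hypothetical `Point3D` type whose author wants a particular way of combining the fields. The multiply-by-31 scheme below is only a stand-in for whatever algorithm someone might prefer; every field still participates, and `==` remains synthesized:

```swift
// Hypothetical type for illustration; not from the thread.
struct Point3D: Hashable {
    var x: Int
    var y: Int
    var z: Int

    // Hand-picked combining scheme (an assumption for this example).
    // &* and &+ wrap on overflow instead of trapping.
    var hashValue: Int {
        var h = 17
        h = h &* 31 &+ x.hashValue
        h = h &* 31 &+ y.hashValue
        h = h &* 31 &+ z.hashValue
        return h
    }
}

let p = Point3D(x: 1, y: 2, z: 3)
let q = Point3D(x: 1, y: 2, z: 3)
let r = Point3D(x: 3, y: 2, z: 1)

// Synthesized == compares all three fields pairwise.
assert(p == q && p.hashValue == q.hashValue)
assert(p != r)
```

Nothing about the custom hash forces the author to restate the three field comparisons by hand.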


>
> TBH, I think contributors here are often too quick to demand padding the
> walls to protect the most incompetent of engineers from themselves, but I
> feel like the root proposal here is a good idea.
>

I agree, that's why I support it :)


>
>
> On Dec 15, 2017, at 9:59 PM, Tony Allevato via swift-evolution <
> swift-evolution at swift.org> wrote:
>
> Those are valid concerns for hashing algorithms in general, but there's no
> connection between that and a statement that an explicitly implemented
> hashValue should also require an explicitly implemented ==. Requiring that
> certainly doesn't make it less likely that people will run into the problem
> you've described if they implement their own hashValue—if they implement it
> poorly, it just means that they could also shoot themselves in the foot by
> then being forced to also implement == and possibly doing it poorly.
>
>
> IMO, it’s far easier to implement hashValue poorly, so I think reminding
> the dev they need to think about `==` too is more helpful than not. I’m not
> often in favor of the padded cell, but I would even consider a proposal to
> emit a warning if fields read in `==` is a strict subset of fields read in
> `hashValue`.
>
>
>
>
> On Fri, Dec 15, 2017 at 9:53 PM Howard Lovatt <howard.lovatt at gmail.com>
> wrote:
>
>> I would say it is an advanced use because it is an optimisation and in
>> addition an optimisation that requires a lot of knowledge of the fields to
>> be certain that a reduced hash is going to be good enough.
>>
>> The optimisation doesn’t have a great history, for example in Java they
>> used to hash only the 1st 6 characters of a string. However this was
>> exploited in denial of service attacks that generated a vast number of
>> strings with the same hash value, i.e same 1st 6 characters, that then
>> overwhelmed the dictionary (map in Java) used in the web server software to
>> store logins.
>>
>> So it wouldn’t be something I would encourage people to do or even worse
>> do by accident.
>>
>>
>> -- Howard.
>>
>> On 16 Dec 2017, at 3:36 pm, Tony Allevato <tony.allevato at gmail.com>
>> wrote:
>>
>>
>>
>> On Fri, Dec 15, 2017 at 6:41 PM Howard Lovatt <howard.lovatt at gmail.com>
>> wrote:
>>
>>> I think that is an advanced use, rather than a common use. I would
>>> prefer that to be something you manually code.
>>>
>>
>> But why? Why should implementing a subset of fields for hashValue require
>> a developer to also manually implement == when the default synthesized
>> version would be perfectly fine? The relationship between Equatable and
>> Hashable does not go both ways.
>>
>> In fact, requiring that they do so is *more* error prone because now
>> they're being forced to implement something that the compiler would have
>> otherwise generated for them.
>>
>>
>>
>>>
>>>
>>> -- Howard.
>>>
>>> On 16 Dec 2017, at 7:08 am, Tony Allevato <tony.allevato at gmail.com>
>>> wrote:
>>>
>>>
>>>
>>> On Fri, Dec 15, 2017 at 11:39 AM Howard Lovatt via swift-evolution <
>>> swift-evolution at swift.org> wrote:
>>>
>>>> +1
>>>> I think the simple solution of if you provide either == or hashValue
>>>> you have to provide both is the best approach. Good catch of this bug.
>>>> -- Howard.
>>>>
>>>
>>> That would be a significant usability hit to a common use case. There
>>> are times where a value is composed of N fields where N is large-ish, and
>>> equality is dependent on the values of all N fields but the hash value only
>>> needs to be "good enough" by considering some subset of those fields (to
>>> make computing it more efficient).
>>>
>>> That still satisfies the required relationship between == and hashValue,
>>> but a user wanting to explicitly implement a more efficient hashValue
>>> should *not* necessarily be required to explicitly write the same == that
>>> would be synthesized for them in that case.
>>>
>>>
>>>
>>>>
>>>> > On 16 Dec 2017, at 6:24 am, Daniel Duan via swift-evolution <
>>>> swift-evolution at swift.org> wrote:
>>>> >
>>>> > +1. The proposal wasn’t explicit enough to have either supported or
>>>> be against this IMO. It’s a sensible thing to spell out.
>>>> >
>>>> > Daniel Duan
>>>> > Sent from my iPhone
>>>> >
>>>> >> On Dec 15, 2017, at 9:58 AM, Joe Groff via swift-evolution <
>>>> swift-evolution at swift.org> wrote:
>>>> >>
>>>> >> SE-0185 is awesome, and brings the long-awaited ability for the
>>>> compiler to provide a default implementation of `==` and `hashValue` when
>>>> you don't provide one yourself. Doug and I were talking the other day and
>>>> thought of a potential pitfall: what should happen if you provide a manual
>>>> implementation of `==` without also manually writing your own `hashValue`?
>>>> It's highly likely that the default implementation of `hashValue` will be
>>>> inconsistent with `==` and therefore invalid in a situation like this:
>>>> >>
>>>> >> struct Foo: Hashable {
>>>> >>   // This property is "part of the value"
>>>> >>   var involvedInEquality: Int
>>>> >>   // This property isn't; maybe it's a cache or something like that
>>>> >>   var notInvolvedInEquality: Int
>>>> >>
>>>> >>   static func ==(a: Foo, b: Foo) -> Bool {
>>>> >>     return a.involvedInEquality == b.involvedInEquality
>>>> >>   }
>>>> >> }
>>>> >>
>>>> >> As currently implemented, the compiler will still give `Foo` the
>>>> default hashValue implementation, which will use both of `Foo`'s properties
>>>> to compute the hash, even though `==` only tests one. This could be
>>>> potentially dangerous. Should we suppress the default hashValue derivation
>>>> when an explicit == implementation is provided?
>>>> >>
>>>> >> -Joe
>>>> >> _______________________________________________
>>>> >> swift-evolution mailing list
>>>> >> swift-evolution at swift.org
>>>> >> https://lists.swift.org/mailman/listinfo/swift-evolution