-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
Description
julia> isequal(BitSet(1:100000), Set(1:100000))
trueso
julia> hash(BitSet(1:100000))
0x0bbb2afc9d6e69a0
julia> hash(Set(1:100000))
0x0bbb2afc9d6e69a0hash needs to be defined so that isequal(x,y) implies hash(x)==hash(y), and it is good that two sets with the same members are isequal, even if their internal representation differs.
But the performance...
julia> @time hash(BitSet(1:100000))
0.000281 seconds (4 allocations: 12.469 KiB)
0x0bbb2afc9d6e69a0
julia> @time hash(BitSet(1:100000).bits)
0.000009 seconds (4 allocations: 12.469 KiB)
0x8f1d49c388957d15It would be nice if we could have the best of both worlds. What about a hash function like hash2(domain, x, h) with a fallback
hash2(::Type{<:Any}, x, h) = hash(x, h), but then e.g. hash2(::Type{BitSet}, x::BitSet, h) = hash(x.bits, h)
Then code like
Lines 280 to 300 in 30fe8cc
| function ht_keyindex(h::Dict{K,V}, key) where V where K | |
| sz = length(h.keys) | |
| iter = 0 | |
| maxprobe = h.maxprobe | |
| index = hashindex(key, sz) | |
| keys = h.keys | |
| @inbounds while true | |
| if isslotempty(h,index) | |
| break | |
| end | |
| if !isslotmissing(h,index) && (key === keys[index] || isequal(key,keys[index])) | |
| return index | |
| end | |
| index = (index & (sz-1)) + 1 | |
| iter += 1 | |
| iter > maxprobe && break | |
| end | |
| return -1 | |
| end |
would make sure that the type parameter K gets to the 3-argument hash function.
The original property that isequal(x,y) implies hash(x)==hash(y) would still hold.
A new property emerges:
isequal(x,y) implies hash2(T, x) == hash2(T, y) where T = promote_foo(typeof(x), typeof(y)) and promote_foo could be something simple like promote_foo(S, T) = ifelse(S == T, S, Any)