Commit fb64914

Add README for HashSet with open-addressing
1 parent 006dda0 commit fb64914

1 file changed (+47, -0): src/dataStructures/hashSet/openAddressing
@@ -0,0 +1,47 @@
# HashSet (Open-addressing)
Open-addressing is another approach to resolving collisions in hash tables.

A hash collision is resolved by <b>probing</b>, that is, searching through alternative locations in
the array (the probe sequence) until either the target element is found or an unused array slot is reached,
which indicates that there is no such key in the table.
## Probing Strategies
### Linear Probing
This is the probing strategy used in our implementation.

Linear probing is the simplest form of probing: upon a collision, the hash table is searched slot by slot,
wrapping around at the end of the array, until an empty slot is found.

However, this method of probing can result in a phenomenon called (primary) clustering, where a long run of
occupied slots builds up, which can drastically degrade the performance of add, remove and contains operations.

h(k, i) = (h'(k) + i) mod m, where h'(k) is an ordinary hash function and i = 0, 1, ..., m - 1 is the probe number
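The formula above can be sketched as a toy set. This is only an illustration, not the implementation in this repository: the class name `LinearProbingSet` is hypothetical, Python's built-in `hash` stands in for h'(k), the capacity is fixed, `None` is reserved to mark an unused slot, and removal is left out.

```python
class LinearProbingSet:
    """Toy fixed-capacity hash set using linear probing (no resizing, no removal)."""

    def __init__(self, capacity=8):
        self.m = capacity
        self.slots = [None] * capacity  # None marks an unused slot (assumption)

    def _probe(self, key):
        """Yield h(k, i) = (h'(k) + i) mod m for i = 0, 1, ..., m - 1."""
        home = hash(key) % self.m  # built-in hash stands in for h'(k)
        for i in range(self.m):
            yield (home + i) % self.m

    def add(self, key):
        """Insert key; return False if it was already present."""
        for idx in self._probe(key):
            if self.slots[idx] is None:
                self.slots[idx] = key
                return True
            if self.slots[idx] == key:
                return False
        raise RuntimeError("table is full")

    def contains(self, key):
        """Follow the probe sequence until the key or an unused slot is found."""
        for idx in self._probe(key):
            if self.slots[idx] is None:
                return False
            if self.slots[idx] == key:
                return True
        return False


s = LinearProbingSet(capacity=8)
for key in (1, 9, 17):  # all three keys hash to index 1 when m = 8
    s.add(key)
print(s.slots)  # [None, 1, 9, 17, None, None, None, None]
```

The three colliding keys end up in adjacent slots, which is a small instance of the run of occupied slots that primary clustering describes.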
### Quadratic Probing
This method of probing takes the original hash index and adds successive values of an arbitrary quadratic
polynomial until an open slot is found.

This helps to avoid primary clustering of entries (as in Linear Probing), but can still result in secondary
clustering, where keys that hash to the same initial index follow the same probe sequence when a collision occurs.

h(k, i) = (h'(k) + c1 * i + c2 * i^2) mod m, where c1 and c2 are arbitrary constants with c2 ≠ 0
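As a rough illustration of the formula, the sketch below lists the full probe sequence for a key. The function name `quadratic_probe_sequence` and the choice c1 = c2 = 1 are assumptions made for the example, and Python's built-in `hash` stands in for h'(k).

```python
def quadratic_probe_sequence(key, m, c1=1, c2=1):
    """Return [h(k, 0), ..., h(k, m - 1)] for h(k, i) = (h'(k) + c1*i + c2*i^2) mod m."""
    home = hash(key) % m  # built-in hash stands in for h'(k)
    return [(home + c1 * i + c2 * i * i) % m for i in range(m)]


seq_a = quadratic_probe_sequence(3, 8)
seq_b = quadratic_probe_sequence(11, 8)  # 11 mod 8 == 3, same home slot as key 3
print(seq_a)           # [3, 5, 1, 7, 7, 1, 5, 3]
print(seq_a == seq_b)  # True: identical probe sequences (secondary clustering)
```

With these particular constants the sequence reaches only four of the eight slots, so in practice c1, c2 and m have to be chosen together if every slot should be reachable.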
### Double Hashing
This is a method of probing where a secondary hash function determines the step between probes whenever a collision occurs.

If h2(k) is relatively prime to m for all k, the probe sequence of every key visits all m slots, so it is a
permutation of the table indices. Under that condition double hashing comes close to the Uniform Hashing Assumption,
in which every permutation is equally likely to be a key's probe sequence, although it can generate at most m^2
distinct sequences.

h(k, i) = (h1(k) + i * h2(k)) mod m, where h1(k) and h2(k) are two ordinary hash functions

*Source: https://courses.csail.mit.edu/6.006/fall11/lectures/lecture10.pdf*
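The double hashing formula can be sketched as follows. The two hash functions here are assumptions chosen for the example, not taken from the source above: m is a power of two and h2(k) is forced to be odd, which makes it relatively prime to m.

```python
from math import gcd


def double_hash_sequence(key, m):
    """Return [h(k, 0), ..., h(k, m - 1)] for h(k, i) = (h1(k) + i * h2(k)) mod m.

    Assumes m is a power of two and m >= 2, so any odd h2(k) is relatively prime to m.
    """
    h = hash(key)                       # built-in hash as the raw hash value
    h1 = h % m                          # first hash: the home slot
    h2 = 1 + 2 * ((h // m) % (m // 2))  # second hash: an odd step in [1, m - 1]
    assert gcd(h2, m) == 1
    return [(h1 + i * h2) % m for i in range(m)]


print(double_hash_sequence(3, 8))   # [3, 4, 5, 6, 7, 0, 1, 2]
print(double_hash_sequence(11, 8))  # [3, 6, 1, 4, 7, 2, 5, 0]
```

Keys 3 and 11 share the home slot 3 but use different steps, so their probe sequences diverge after the first probe, and each sequence still covers all eight slots.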
## Analysis
Let α = n / m, where α is the load factor of the table.

For n items in a table of size m (with n < m), assuming uniform hashing, the expected cost of an operation, measured in probes, is at most:

<div style="text-align: center;">1 / (1 - α)</div>

e.g. if α = 90%, then E[#probes] ≤ 1 / (1 - 0.9) = 10.
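To get a feel for how quickly this bound grows as the table fills up, it can be tabulated for a few load factors. The helper name `expected_probes_bound` is hypothetical, and the expression is rewritten as m / (m - n), which is algebraically equal to 1 / (1 - α), to avoid floating-point rounding.

```python
def expected_probes_bound(n, m):
    """Return 1 / (1 - alpha) for alpha = n / m, computed as m / (m - n)."""
    if not 0 <= n < m:
        raise ValueError("requires 0 <= n < m so that the load factor is below 1")
    return m / (m - n)


for n in (50, 75, 90, 99):
    print(f"alpha = {n}%: at most {expected_probes_bound(n, 100)} probes expected")
# alpha = 50%: at most 2.0 probes expected
# alpha = 75%: at most 4.0 probes expected
# alpha = 90%: at most 10.0 probes expected
# alpha = 99%: at most 100.0 probes expected
```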