| 1 | +# HashSet (Open-addressing) |
| 2 | +Open-addressing is another approach to resolving collisions in hash tables. |
| 3 | + |
| 4 | +A hash collision is resolved by <b>probing</b>, or searching through alternative locations in |
| 5 | +the array (the probe sequence) until either the target element is found, or an unused array slot is found, |
| 6 | +which indicates that there is no such key in the table. |
| 7 | + |
| 8 | +## Probing Strategies |
| 9 | + |
| 10 | +### Linear Probing |
| 11 | +Linear probing is the probing strategy used in our implementation. |
| 12 | + |
| 13 | +It is the simplest form of probing: upon a collision, the table is searched linearly, one slot at a time, until an empty slot is found. |
| 14 | + |
| 15 | +However, this method of probing can result in a phenomenon called (primary) clustering where a large run of |
| 16 | +occupied slots builds up, which can drastically degrade the performance of add, remove and contains operations. |
| 17 | + |
| 18 | +h(k, i) = (h'(k) + i) mod m, where h'(k) is an ordinary hash function, i = 0, 1, ..., m - 1 is the probe number and m is the table size |
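The formula above can be sketched as a pair of `add` / `contains` helpers. This is a minimal illustration, not the implementation referred to in this README: the names `EMPTY`, `probe`, `add` and `contains` are hypothetical, Python's built-in `hash()` stands in for h'(k), and removal is left out.

```python
EMPTY = None  # hypothetical marker for a slot that has never held a key

def probe(key, i, m):
    # h(k, i) = (h'(k) + i) mod m, with the built-in hash() standing in for h'
    return (hash(key) + i) % m

def add(table, key):
    """Insert key; return False if it is already present."""
    m = len(table)
    for i in range(m):
        slot = probe(key, i, m)
        if table[slot] is EMPTY:
            table[slot] = key
            return True
        if table[slot] == key:
            return False
    raise RuntimeError("table is full")

def contains(table, key):
    """Follow the probe sequence until the key or an unused slot is found."""
    m = len(table)
    for i in range(m):
        slot = probe(key, i, m)
        if table[slot] is EMPTY:
            return False  # unused slot: the key cannot be further along
        if table[slot] == key:
            return True
    return False

table = [EMPTY] * 8
for key in (3, 11, 19):  # all three keys hash to slot 3 when m = 8
    add(table, key)
print(table)  # [None, None, None, 3, 11, 19, None, None]
```

The three colliding keys end up in one contiguous run (slots 3 to 5), which is a small instance of the primary clustering described above.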
| 19 | + |
| 20 | +### Quadratic Probing |
| 21 | +This method of probing involves taking the original hash index, and adding successive values of an arbitrary quadratic |
| 22 | +polynomial until an open slot is found. |
| 23 | + |
| 24 | +This helps to avoid primary clustering of entries (like in Linear Probing), but might still result in secondary |
| 25 | +clustering where keys that hash to the same value probe the same alternative cells when a collision occurs. |
| 26 | + |
| 27 | +h(k, i) = ( h'(k) + c1 * i + c2 * (i^2) ) mod m where c1 and c2 are constants with c2 ≠ 0 |
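To make the formula concrete, the sketch below lists the slots a key would probe. The function name `quadratic_probe_sequence` and the choices c1 = c2 = 1 and m = 7 are assumptions made for illustration only; Python's built-in `hash()` stands in for h'(k).

```python
def quadratic_probe_sequence(key, m, c1=1, c2=1):
    """Slots visited for probe numbers i = 0 .. m-1 under quadratic probing."""
    home = hash(key) % m  # h'(k) reduced to a table index
    return [(home + c1 * i + c2 * i * i) % m for i in range(m)]

print(quadratic_probe_sequence(3, 7))   # [3, 5, 2, 1, 2, 5, 3]
print(quadratic_probe_sequence(10, 7))  # [3, 5, 2, 1, 2, 5, 3]
```

Keys 3 and 10 share the same home slot and therefore the same probe sequence, which is the secondary clustering mentioned above. With these particular constants the sequence also revisits slots and never reaches slots 0, 4 and 6, so the constants and table size need to be chosen with care.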
| 28 | + |
| 29 | +### Double Hashing |
| 30 | +This is a method of probing where a secondary hash function is used for probing whenever a collision occurs. |
| 31 | + |
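| 32 | +If h2(k) is relatively prime to m for all k, every probe sequence visits all m slots, i.e. it is a permutation of the |
| 33 | +table's slots. Double hashing cannot generate all m! permutations, but in practice it approximates the Uniform Hashing Assumption closely. |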
| 34 | + |
| 35 | +h(k, i) = (h1(k) + i * h2(k)) mod m where h1(k) and h2(k) are two ordinary hash functions |
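A small sketch of the formula, under stated assumptions: m is prime, h1(k) = hash(k) mod m, and h2(k) = 1 + (hash(k) mod (m - 1)), so h2(k) always lies in 1..m-1 and is relatively prime to m. These two hash functions and the name `double_hash_sequence` are illustrative choices, not taken from the implementation.

```python
def double_hash_sequence(key, m):
    """Slots visited for probe numbers i = 0 .. m-1; m is assumed to be prime."""
    h1 = hash(key) % m              # starting slot
    h2 = 1 + (hash(key) % (m - 1))  # step size in 1..m-1, coprime to a prime m
    return [(h1 + i * h2) % m for i in range(m)]

print(double_hash_sequence(3, 7))   # [3, 0, 4, 1, 5, 2, 6]
print(double_hash_sequence(10, 7))  # [3, 1, 6, 4, 2, 0, 5]
```

Both keys start at slot 3, yet they follow different probe sequences, and each sequence visits every slot exactly once.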
| 36 | + |
| 37 | +*Source: https://courses.csail.mit.edu/6.006/fall11/lectures/lecture10.pdf* |
| 38 | + |
| 39 | +## Analysis |
| 40 | +Let α = n / m be the load factor of the table, where n is the number of items and m is the table size. |
| 41 | + |
| 42 | +For n items in a table of size m (with n < m, so α < 1), assuming uniform hashing, the expected number of probes per operation is at most: |
| 43 | + |
| 44 | +<div style="text-align: center;">1 / (1 - α)</div> |
| 45 | + |
| 46 | +e.g. if α = 90%, then E[#probes] ≤ 1 / (1 - 0.9) = 10. |
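The bound can be tabulated for a few load factors. This sketch uses a hypothetical helper `expected_probes(n, m)` that computes m / (m - n), which is algebraically the same as 1 / (1 - α) with α = n / m.

```python
def expected_probes(n, m):
    """Bound on expected probes under uniform hashing: 1 / (1 - n/m) = m / (m - n)."""
    if not 0 <= n < m:
        raise ValueError("requires 0 <= n < m so that the load factor is below 1")
    return m / (m - n)

print(expected_probes(5, 10))    # 2.0   (load factor 50%)
print(expected_probes(9, 10))    # 10.0  (load factor 90%)
print(expected_probes(99, 100))  # 100.0 (load factor 99%)
```

The bound grows quickly as the table fills up, which is why open-addressing tables are typically resized before the load factor gets close to 1.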
| 47 | + |