| description |
|---|
Hash tables are a corner stone of data structures for most problems, they are useful when there are unique keys with given values |
- Insert:
$$O(1)$$ - Search:
$$O(1)$$ - Delete:
$$O(1)$$ - Update:
$$O(1)$$
- Do we need a custom hashing function?
This technique is particularly when trying to detect all anagrams and store them in buckets of same anagrams
- Multiplicative hash with prime numbers (3 onwards)
- Assign each character a prime number and calculate the hash
- Hash will be unique per anagram
- Does not scale well if more than 52 characters possible or
$$n > 5000$$
primes = [3, 5, 7, ...]
def hash(string):
value = 1
for ch in string:
value *= primes[ord(ch) - ord('a')]
return valueI would recommend using these over hash tables where possible as they can represent the same information while reducing the complexity of the code
freq = [0] * 26
# is the same as
freq = {}
for ch in s:
if ch not in freq:
freq[ch] = 0This is a slight optimization over using hash tables/sets if the only thing that needs to be done is duplicate detection
- Not feasible if the numbers can be very large as size grows by powers of 2
acc = 0
for ch in string:
mask = 1 << (ord(ch) - ord('a'))
if acc ^ mask < acc: # duplicate found
return False
acc |= mask- Where
$$k$$ is the key to hash,$$\phi$$ is the golden ratio =$$\sqrt{5} - \frac{1}{2}$$
Often used to improve performance at the cost of using more space to store the hash table
- Use linked-lists to optimize to quickly delete/insert nodes in a certain order
- Hash table has pointers to the nodes of the linked list
- Chaining (linked list per bucket)
- Open addressing (probing for a blank spot if current bucket is filled)