Commit 72e6987

feat: rust hashing cheat sheet
1 parent b104ca4 commit 72e6987

1 file changed: 148 additions, 0 deletions

@@ -0,0 +1,148 @@
+++
title = "Rust Hashing Cheat Sheet"
description = "Several examples of how to use Rust's hashing traits and types"

[taxonomies]
tags = ["rust"]
+++

Hashing is the process of transforming arbitrary data into a fixed-size number. Several useful programming concepts arise out of hash codes:

- Hash sets and maps
- Data digests
- Cheap identifiers / inequality checks
- Storing passwords

Recently I tried writing a hash set from scratch in [Rust](https://rust-lang.org/) for educational purposes, but was awfully confused by the collection of traits and types provided by [`std::hash`](https://doc.rust-lang.org/stable/std/hash/index.html). In this post I hope to share some common patterns related to hashing in Rust, while explaining `std::hash` as I go.

## Hashing a Single Value

Hashing a value is as simple as creating a `Hasher`, calling `value.hash(&mut hasher)`, and then calling `hasher.finish()`.

```rust
use std::hash::{DefaultHasher, Hash, Hasher};

let mut hasher = DefaultHasher::new();
"Hello, world!".hash(&mut hasher);
let hash: u64 = hasher.finish();

println!("Hash: {hash}"); // Hash: 7092736762612737980
```

## Hashing Several Values into One Code

You can call `value.hash(&mut hasher)` several times to create a hash code composed of multiple data sources. This is useful when hashing structs or arrays.

```rust
use std::hash::{DefaultHasher, Hash, Hasher};

let mut hasher = DefaultHasher::new();

"Hello".hash(&mut hasher);
13u64.hash(&mut hasher);
false.hash(&mut hasher);

let hash: u64 = hasher.finish();

println!("Hash: {hash}"); // Hash: 3402450879032501501
```
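
As a rough illustration of what "composed of multiple data sources" means in practice, the sketch below uses a hypothetical `hash_pair` helper (my own name, not part of `std`) to feed two values into one `DefaultHasher`. It suggests that both the order of the values and the boundary between them influence the result. The boundary behavior relies on how the standard library currently hashes `str`, so treat it as an observation rather than a guarantee.

```rust
use std::hash::{DefaultHasher, Hash, Hasher};

/// Hypothetical helper (not part of `std`): feeds two values into one
/// fresh `DefaultHasher` and returns the combined hash code.
fn hash_pair<A: Hash, B: Hash>(a: A, b: B) -> u64 {
    let mut hasher = DefaultHasher::new();
    a.hash(&mut hasher);
    b.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Same values, different order: the two codes should not match.
    let in_order = hash_pair("Hello", 13u64);
    let swapped = hash_pair(13u64, "Hello");
    println!("order matters: {}", in_order != swapped);

    // Same concatenated text, split at a different point: the codes should
    // not match either, because each `str` is hashed as its own unit.
    let split_late = hash_pair("ab", "c");
    let split_early = hash_pair("a", "bc");
    println!("boundaries matter: {}", split_late != split_early);
}
```

A collision between two such inputs is possible in principle for any 64-bit hash, just extremely unlikely.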

## `Hash`, `Hasher`, and `DefaultHasher`

- [`Hash`](https://doc.rust-lang.org/stable/std/hash/trait.Hash.html): a trait for types that can be hashed (`str`, `u64`, `bool`, etc.)
- [`Hasher`](https://doc.rust-lang.org/stable/std/hash/trait.Hasher.html): a trait for hashing algorithms (`DefaultHasher`, 3rd-party implementations)
- [`DefaultHasher`](https://doc.rust-lang.org/stable/std/hash/struct.DefaultHasher.html): Rust's default hashing algorithm[^siphash]

A `Hasher` is normally not re-used to make several hash codes. Calling `finish()` does not reset its internal state, so any further writes continue from the data that was already hashed. If you want to compute a new hash code, you discard the current `Hasher` and create a new one.
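
To make the point about re-use concrete, here is a small sketch. The function name `reuse_vs_fresh` is made up for this example. It keeps writing to a `DefaultHasher` after calling `finish()` and compares the result with a fresh hasher that is fed both values from the start.

```rust
use std::hash::{DefaultHasher, Hash, Hasher};

/// Hypothetical demo function: returns the code from a re-used hasher and
/// the code from a fresh hasher that was fed the same two values.
fn reuse_vs_fresh() -> (u64, u64) {
    let mut reused = DefaultHasher::new();
    "Hello".hash(&mut reused);
    let _first = reused.finish(); // state is NOT reset here

    // This write continues from "Hello" instead of starting over.
    "world".hash(&mut reused);

    let mut fresh = DefaultHasher::new();
    "Hello".hash(&mut fresh);
    "world".hash(&mut fresh);

    (reused.finish(), fresh.finish())
}

fn main() {
    let (reused, fresh) = reuse_vs_fresh();
    // The re-used hasher behaves as if it had hashed both values.
    assert_eq!(reused, fresh);
    println!("reused: {reused}, fresh: {fresh}");
}
```

In other words, the second `finish()` yields the code for `"Hello"` followed by `"world"`, not the code for `"world"` alone.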

[^siphash]: In Rust 1.91.0 the default hashing algorithm is [SipHash 1-3](https://en.wikipedia.org/wiki/SipHash), but this is an internal detail that may change in the future.

## Hashing with a Random Seed

To make a hash resilient to [hash flooding](https://en.wikipedia.org/wiki/Collision_attack#Hash_flooding), you can create a `Hasher` with a random seed using `RandomState`.

```rust
use std::hash::{BuildHasher, Hash, Hasher, RandomState};

let state = RandomState::new();

let mut hasher = state.build_hasher();
"Hello, world!".hash(&mut hasher);
let hash = hasher.finish();

println!("Hash: {hash}"); // e.g. Hash: 1905042730872565693 (changes on every run)
```

There's also a shorthand for this pattern using [`BuildHasher::hash_one()`](https://doc.rust-lang.org/stable/std/hash/trait.BuildHasher.html#method.hash_one).

```rust
use std::hash::{BuildHasher, RandomState};

let state = RandomState::new();

let hash = state.hash_one("Hello, world!");

println!("Hash: {hash}"); // e.g. Hash: 11506452463443521132 (changes on every run)
```

Note that the two examples print different hash codes even though both hash `"Hello, world!"`. This is because `RandomState::new()` creates a new random seed each time it is called.

## `BuildHasher` and `RandomState`

- [`BuildHasher`](https://doc.rust-lang.org/stable/std/hash/trait.BuildHasher.html): a trait for types that can create a new `Hasher` with a seed
- [`RandomState`](https://doc.rust-lang.org/stable/std/hash/struct.RandomState.html): generates a random seed when constructed, then builds `Hasher`s using that seed

If you want to hash two separate values and compare their hash codes for equality, you would typically create one `RandomState`, then use it to build two `DefaultHasher`s with the same seed.[^default-hasher-new]

[^default-hasher-new]: Technically, you could just create the two hashers by calling `DefaultHasher::new()`, which initializes them with a seed of 0. This is vulnerable to [hash flooding](https://en.wikipedia.org/wiki/Collision_attack#Hash_flooding) attacks, however, so I don't recommend it!
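
The comparison pattern described above might look like the following sketch. `hash_with` is a hypothetical helper name of my own choosing. It builds a new hasher from a shared `RandomState` for each value, so equal values produce equal codes for as long as the same state is used.

```rust
use std::hash::{BuildHasher, Hash, Hasher, RandomState};

/// Hypothetical helper: hashes `value` with a new hasher built from `state`.
fn hash_with<T: Hash>(state: &RandomState, value: T) -> u64 {
    let mut hasher = state.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // One shared state means one shared seed.
    let state = RandomState::new();

    let left = hash_with(&state, "Hello, world!");
    let right = hash_with(&state, "Hello, world!");

    // Both hashers were built from the same seed, so the codes match.
    assert_eq!(left, right);

    // `hash_one` is shorthand for the same build/hash/finish sequence.
    assert_eq!(left, state.hash_one("Hello, world!"));

    println!("equal: {}", left == right);
}
```

The codes are only comparable within the lifetime of that one `RandomState`. A second `RandomState::new()` would seed its hashers differently.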

## Deriving `Hash`

The easiest way to make a custom type hashable is by deriving `Hash`.

```rust
use std::hash::{DefaultHasher, Hash, Hasher};

#[derive(Hash)]
struct Foo {
    a: &'static str,
    b: u64,
    c: bool,
}

let mut hasher = DefaultHasher::new();

Foo { a: "Hello", b: 13, c: false }.hash(&mut hasher);

let hash: u64 = hasher.finish();

println!("Hash: {hash}"); // Hash: 3402450879032501501
```
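
Since hash sets were the original motivation, here is a minimal sketch of a derived `Hash` being put to use in the standard library's `HashSet`. It assumes you also derive `PartialEq` and `Eq`, which `HashSet` requires in addition to `Hash`. The `distinct_count` function is a made-up name for this example.

```rust
use std::collections::HashSet;

// `HashSet` needs `Eq` as well as `Hash` to detect duplicates.
#[derive(Hash, PartialEq, Eq)]
struct Foo {
    a: &'static str,
    b: u64,
    c: bool,
}

/// Hypothetical demo function: inserts three values, two of them equal,
/// and returns how many the set actually kept.
fn distinct_count() -> usize {
    let mut set = HashSet::new();
    set.insert(Foo { a: "Hello", b: 13, c: false });
    set.insert(Foo { a: "Hello", b: 13, c: false }); // duplicate, ignored
    set.insert(Foo { a: "Hello", b: 14, c: false });
    set.len()
}

fn main() {
    println!("Distinct values: {}", distinct_count()); // Distinct values: 2
}
```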
121+
122+
## Implementing `Hash` Manually
123+
124+
If you look closely, you'll notice that the hash codes from [Hashing Several Values into One Code](#hashing-several-values-into-one-code) and [Deriving `Hash`](#deriving-hash) are equal! This is because they're hashing the same data in the same order with the same seed. To prove this, we can expand the `Hash` derivation:
125+
126+
```rust
127+
use std::hash::{Hash, Hasher};
128+
129+
struct Foo {
130+
a: &'static str,
131+
b: u64,
132+
c: bool,
133+
}
134+
135+
impl Hash for Foo {
136+
fn hash<H: Hasher>(&self, state: &mut H) {
137+
self.a.hash(state);
138+
self.b.hash(state);
139+
self.c.hash(state);
140+
}
141+
}
142+
```
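
One way to check this claim is to hash a derived and a hand-written version side by side. The sketch below does that with two illustrative types, `Derived` and `Manual`, and a hypothetical `hash_of` helper. None of these names come from `std`.

```rust
use std::hash::{DefaultHasher, Hash, Hasher};

#[derive(Hash)]
struct Derived {
    a: &'static str,
    b: u64,
    c: bool,
}

struct Manual {
    a: &'static str,
    b: u64,
    c: bool,
}

// Hand-written equivalent: hash each field in declaration order.
impl Hash for Manual {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.a.hash(state);
        self.b.hash(state);
        self.c.hash(state);
    }
}

/// Hypothetical helper: hashes any `Hash` value with a fresh `DefaultHasher`.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let derived = hash_of(&Derived { a: "Hello", b: 13, c: false });
    let manual = hash_of(&Manual { a: "Hello", b: 13, c: false });

    // Same fields, same order, same seed: the codes agree.
    assert_eq!(derived, manual);
    println!("derived == manual: {}", derived == manual);
}
```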

## Conclusion

I hope these examples help you wrap your head around Rust's hashing support! While I didn't cover them in this article, you may also be interested in [`Hasher`](https://doc.rust-lang.org/stable/std/hash/trait.Hasher.html)'s methods and how primitives like [`bool`](https://doc.rust-lang.org/stable/std/hash/trait.Hash.html#impl-Hash-for-bool), [`char`](https://doc.rust-lang.org/stable/std/hash/trait.Hash.html#impl-Hash-for-char), and [tuples](https://doc.rust-lang.org/stable/std/hash/trait.Hash.html#impl-Hash-for-(T,)) implement `Hash`. You may also enjoy looking at [`rustc-hash`](https://lib.rs/crates/rustc-hash) (previously `fxhash`), [`fnv`](https://lib.rs/crates/fnv), [`sha2`](https://lib.rs/crates/sha2), and [`blake2`](https://lib.rs/crates/blake2).

Happy hacking!