Commit 5878f35
Merge bitcoin/bitcoin#31144: [IBD] multi-byte block obfuscation
248b6a2 optimization: peel align-head and unroll body to 64 bytes (Lőrinc)
e7114fc optimization: migrate fixed-size obfuscation from `std::vector<std::byte>` to `uint64_t` (Lőrinc)
478d40a refactor: encapsulate `vector`/`array` keys into `Obfuscation` (Lőrinc)
377aab8 refactor: move `util::Xor` to `Obfuscation().Xor` (Lőrinc)
fa5d296 refactor: prepare mempool_persist for obfuscation key change (Lőrinc)
6bbf2d9 refactor: prepare `DBWrapper` for obfuscation key change (Lőrinc)
0b8bec8 scripted-diff: unify xor-vs-obfuscation nomenclature (Lőrinc)
9726979 bench: make ObfuscationBench more representative (Lőrinc)
618a30e test: compare util::Xor with randomized inputs against simple impl (Lőrinc)
a5141cd test: make sure dbwrapper obfuscation key is never obfuscated (Lőrinc)
54ab0bd refactor: commit to 8 byte obfuscation keys (Lőrinc)
7aa557a random: add fixed-size `std::array` generation (Lőrinc)
Pull request description:
This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](bitcoin/bitcoin#32043)
### Summary
Block obfuscation is currently done byte by byte; this PR batches it into 64-bit primitives to speed up obfuscating larger memory chunks.
This is especially relevant now that bitcoin/bitcoin#31551 has been merged, which produces bigger obfuscatable chunks.
Since this obfuscation is optional, the speedup measured here depends on whether the key is a [random value](bitcoin/bitcoin#31144 (comment)) or obfuscation is [completely turned off](bitcoin/bitcoin#31144 (comment)) (i.e. XOR-ing with 0).
### Changes in testing, benchmarking and implementation
* Added new tests comparing randomized inputs against a trivial implementation and performing roundtrip checks with random chunks.
* Migrated `std::vector<std::byte>(8)` keys to plain `uint64_t`.
* Process unaligned bytes separately and unroll body to 64 bytes.
### Assembly
Memory alignment is enforced by a small peel loop (`std::memcpy` is optimized out on the tested platform), followed by a `std::assume_aligned<8>` hint; see the Godbolt listing at https://godbolt.org/z/59EMv7h6Y for details.
<details>
<summary>Details</summary>
Target & Compiler | Stride (per hot-loop iter) | Main operation(s) in loop | Effective XORs / iter
-- | -- | -- | --
Clang x86-64 (trunk) | 64 bytes | 4 × movdqu → pxor → store | 8 × 64-bit
GCC x86-64 (trunk) | 64 bytes | 4 × movdqu/pxor sequence, enabled by 8-way unroll | 8 × 64-bit
GCC RV32 (trunk) | 8 bytes | copy 8 B to temp → 2 × 32-bit XOR → copy back | 1 × 64-bit (as 2 × 32-bit)
GCC s390x (big-endian 14.2) | 64 bytes | 8 × XC (mem-mem 8-B XOR) with key cached on stack | 8 × 64-bit
</details>
### Endianness
The only endianness issue was with bit rotation, intended to realign the key if obfuscation halted before full key consumption.
Elsewhere, memory is read, processed, and written back in the same endianness, preserving byte order.
Since CI lacks a big-endian machine, testing was done locally via Docker.
<details>
<summary>Details</summary>
```bash
brew install podman pigz
softwareupdate --install-rosetta
podman machine init
podman machine start
docker run --platform linux/s390x -it ubuntu:latest /bin/bash
apt update && apt install -y git build-essential cmake ccache pkg-config libevent-dev libboost-dev libssl-dev libsqlite3-dev python3 && \
cd /mnt && git clone --depth=1 https://github.com/bitcoin/bitcoin.git && cd bitcoin && git remote add l0rinc https://github.com/l0rinc/bitcoin.git && git fetch --all && git checkout l0rinc/optimize-xor && \
cmake -B build && cmake --build build --target test_bitcoin -j$(nproc) && \
./build/bin/test_bitcoin --run_test=streams_tests
```
</details>
### Measurements (micro benchmarks and full IBDs)
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc/clang -DCMAKE_CXX_COMPILER=g++/clang++ && \
cmake --build build -j$(nproc) && \
build/bin/bench_bitcoin -filter='ObfuscationBench' -min-time=5000
<details>
<summary>GNU 14.2.0</summary>
> Before:
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 0.84 | 1,184,138,235.64 | 0.0% | 9.01 | 3.03 | 2.971 | 1.00 | 0.1% | 5.50 | `ObfuscationBench`
> After (first optimizing commit):
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 0.04 | 28,365,698,819.44 | 0.0% | 0.34 | 0.13 | 2.714 | 0.07 | 0.0% | 5.33 | `ObfuscationBench`
> and (second optimizing commit):
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 0.03 | 32,464,658,919.11 | 0.0% | 0.50 | 0.11 | 4.474 | 0.08 | 0.0% | 5.29 | `ObfuscationBench`
</details>
<details>
<summary>Clang 20.1.7</summary>
> Before:
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 0.89 | 1,124,087,330.23 | 0.1% | 6.52 | 3.20 | 2.041 | 0.50 | 0.2% | 5.50 | `ObfuscationBench`
> After (first optimizing commit):
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 0.08 | 13,012,464,203.00 | 0.0% | 0.65 | 0.28 | 2.338 | 0.13 | 0.8% | 5.50 | `ObfuscationBench`
> and (second optimizing commit):
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 0.02 | 41,231,547,045.17 | 0.0% | 0.30 | 0.09 | 3.463 | 0.02 | 0.0% | 5.47 | `ObfuscationBench`
</details>
i.e. 27.4x faster obfuscation with GCC, 36.7x faster with Clang
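These factors are consistent with dividing the `byte/s` figure of the second optimizing commit by the baseline in each table:

$$
\frac{32{,}464{,}658{,}919.11}{1{,}184{,}138{,}235.64} \approx 27.4 \ \text{(GCC)}, \qquad
\frac{41{,}231{,}547{,}045.17}{1{,}124{,}087{,}330.23} \approx 36.7 \ \text{(Clang)}
$$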
For other benchmark speedups see https://corecheck.dev/bitcoin/bitcoin/pulls/31144
------
Running an IBD until 888888 blocks reveals a 4% speedup.
<details>
<summary>Details</summary>
SSD:
```bash
COMMITS="8324a00bd4a6a5291c841f2d01162d8a014ddb02 5ddfd31b4158a89b0007cfb2be970c03d9278525"; \
STOP_HEIGHT=888888; DBCACHE=1000; \
CC=gcc; CXX=g++; \
BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
(for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
hyperfine \
--sort 'command' \
--runs 1 \
--export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$CC.json" \
--parameter-list COMMIT ${COMMITS// /,} \
--prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF && \
cmake --build build -j$(nproc) --target bitcoind && \
./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
--cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
"COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
```
> 8324a00bd4 test: Compare util::Xor with randomized inputs against simple impl
> 5ddfd31b41 optimization: Xor 64 bits together instead of byte-by-byte
```python
Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 8324a00bd4a6a5291c841f2d01162d8a014ddb02)
Time (abs ≡): 25033.413 s [User: 33953.984 s, System: 2613.604 s]
Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 5ddfd31b4158a89b0007cfb2be970c03d9278525)
Time (abs ≡): 24110.710 s [User: 33389.536 s, System: 2660.292 s]
Relative speed comparison
1.04 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 8324a00bd4a6a5291c841f2d01162d8a014ddb02)
1.00 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 5ddfd31b4158a89b0007cfb2be970c03d9278525)
```
> HDD:
```bash
COMMITS="71eb6eaa740ad0b28737e90e59b89a8e951d90d9 46854038e7984b599d25640de26d4680e62caba7"; \
STOP_HEIGHT=888888; DBCACHE=4500; \
CC=gcc; CXX=g++; \
BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
(for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
hyperfine \
--sort 'command' \
--runs 2 \
--export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$CC.json" \
--parameter-list COMMIT ${COMMITS// /,} \
--prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF && cmake --build build -j$(nproc) --target bitcoind && \
./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
--cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
"COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
```
> 71eb6eaa74 test: compare util::Xor with randomized inputs against simple impl
> 46854038e7 optimization: migrate fixed-size obfuscation from `std::vector<std::byte>` to `uint64_t`
```python
Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=4500 -blocksonly -printtoconsole=0 (COMMIT = 71eb6eaa740ad0b28737e90e59b89a8e951d90d9)
Time (mean ± σ): 37676.293 s ± 83.100 s [User: 36900.535 s, System: 2220.382 s]
Range (min … max): 37617.533 s … 37735.053 s 2 runs
Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=4500 -blocksonly -printtoconsole=0 (COMMIT = 46854038e7984b599d25640de26d4680e62caba7)
Time (mean ± σ): 36181.287 s ± 195.248 s [User: 34962.822 s, System: 1988.614 s]
Range (min … max): 36043.226 s … 36319.349 s 2 runs
Relative speed comparison
1.04 ± 0.01 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=4500 -blocksonly -printtoconsole=0 (COMMIT = 71eb6eaa740ad0b28737e90e59b89a8e951d90d9)
1.00 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=4500 -blocksonly -printtoconsole=0 (COMMIT = 46854038e7984b599d25640de26d4680e62caba7)
```
</details>
ACKs for top commit:
achow101:
ACK 248b6a2
maflcko:
review ACK 248b6a2 🎻
ryanofsky:
Code review ACK 248b6a2. Looks good! Thanks for adapting this and considering all the suggestions. I did leave more comments below but none are important and this looks good as-is
Tree-SHA512: ef541cd8a1f1dc504613c4eaa708202e32ae5ac86f9c875e03bcdd6357121f6af0860ef83d513c473efa5445b701e59439d416effae1085a559716b0fd45ecd6
16 files changed (+363, -187 lines) under `src`, including the `bench`, `node`, `test`, `fuzz` and `util` directories.