Commit 80067ac
committed
Merge bitcoin/bitcoin#31829: p2p: improve TxOrphanage denial of service bounds
5002462 [bench] worst case LimitOrphans and EraseForBlock (glozow)
45c7a4b [functional test] orphan resolution works in the presence of DoSy peers (glozow)
835f5c7 [prep/test] restart instead of bumpmocktime between p2p_orphan_handling subtests (glozow)
b113877 [fuzz] Add simulation fuzz test for TxOrphanage (Pieter Wuille)
03aaaed [prep] Return the made-reconsiderable announcements in AddChildrenToWorkSet (Pieter Wuille)
ea29c43 [p2p] bump DEFAULT_MAX_ORPHANAGE_LATENCY_SCORE to 3,000 (glozow)
24afee8 [fuzz] TxOrphanage protects peers that don't go over limit (glozow)
a2878cf [unit test] strengthen GetChildrenFromSamePeer tests: results are in recency order (glozow)
7ce3b7e [unit test] basic TxOrphanage eviction and protection (glozow)
4d23d1d [cleanup] remove unused rng param from LimitOrphans (glozow)
067365d [p2p] overhaul TxOrphanage with smarter limits (glozow)
1a41e79 [refactor] create aliases for TxOrphanage Count and Usage (glozow)
b50bd72 [prep] change return type of EraseTx to bool (glozow)
3da6d7f [prep/refactor] make TxOrphanage a virtual class implemented by TxOrphanageImpl (glozow)
77ebe8f [prep/test] have TxOrphanage remember its own limits in LimitOrphans (glozow)
d0af423 [prep/refactor] move DEFAULT_MAX_ORPHAN_TRANSACTIONS to txorphanage.h (glozow)
5136522 [prep/config] remove -maxorphantx (glozow)
8dd24c2 [prep/test] modify test to not access TxOrphanage internals (glozow)
44f5327 [fuzz] add SeedRandomStateForTest(SeedRand::ZEROS) to txorphan (glozow)
15a4ec9 [prep/rpc] remove entry and expiry time from getorphantxs (glozow)
08e58fa [prep/refactor] move txorphanage to node namespace and directory (glozow)
bb91d23 [txorphanage] change type of usage to int64_t (glozow)
Pull request description:
This PR is part of the orphan resolution project, see #27463.
This design came from collaboration with sipa - thanks.
We want to limit the CPU work and memory used by `TxOrphanage` to avoid denial of service attacks. On master, this is achieved by limiting the number of transactions in this data structure to 100, and the weight of each transaction to 400KWu (the largest standard tx) [0]. We always allow new orphans, but if the addition causes us to exceed 100, we evict one randomly. This is dead simple, but has problems:
- It makes the orphanage trivially churnable: any one peer can render it useless by spamming us with lots of orphans. It's possible this is happening: "Looking at data from node alice on 2024-09-14 shows that we’re sometimes removing more than 100k orphans per minute. This feels like someone flooding us with orphans." [1]
- Effectively, opportunistic 1p1c is useless in the presence of adversaries: it is *opportunistic* and pairs a low feerate tx with a child that happens to be in the orphanage. So if nothing is able to stay in orphanages, we can't expect 1p1cs to propagate.
- This number is also often insufficient for the volume of orphans we handle: historical data show that overflows are pretty common, and there are times where "it seems like [the node] forgot about the orphans and re-requested them multiple times." [1]
Just jacking up the `-maxorphantxs` number is not a good enough solution, because it doesn't solve the churnability problem, and the effective resource bounds scale poorly.
This PR introduces numbers for {global, per-peer} {memory usage, announcements + number of inputs}, representing resource limits:
- The (constant) **global latency score limit** is the number of unique (wtxid, peer) pairs in the orphanage + the number of inputs spent by those (deduplicated) transactions floor-divided by 10 [2]. This represents a cap on CPU or latency for any given operation, and does not change with the number of peers we have. Evictions must happen whenever this limit is reached. The primary goal of this limit is to ensure we do not spend more than a few ms on any call to `LimitOrphans` or `EraseForBlock`.
- The (variable) **per-peer latency score limit** is the global latency score limit divided by the number of peers. Peers are allowed to exceed this limit provided the global announcement limit has not been reached. The per-peer announcement limit decreases with more peers.
- The (constant) **per-peer memory usage reservation** is the amount of orphan weight [3] reserved per peer [4]. Reservation means that peers are effectively guaranteed this amount of space. Peers are allowed to exceed this limit provided the global usage limit is not reached. The primary goal of this limit is to ensure we don't oom.
- The (variable) **global memory usage limit** is the number of peers multiplied by the per-peer reservation [5]. As such, the global memory usage limit scales up with the number of peers we have. Evictions must happen whenever this limit is reached.
- We introduce a "Peer DoS Score" which is the maximum between its "CPU Score" and "Memory Score." The CPU score is the ratio between the number of orphans announced by this peer / peer announcement limit. The memory score is the total usage of all orphans announced by this peer / peer usage reservation.
Eviction changes in a few ways:
- It is triggered if either limit is exceeded.
- On each iteration of the loop, instead of selecting a random orphan, we select a peer and delete 1 of its announcements. Specifically, we select the peer with the highest DoS score, which is the maximum between its CPU DoS score (based on announcements) and Memory DoS score (based on tx weight). After the peer has been selected, we evict the oldest orphan (non-reconsiderable sorted before reconsiderable).
- Instead of evicting orphans, we evict announcements. An orphan is still in the orphanage as long as there is 1 peer announcer. Of course, over the course of several iteration loops, we may erase all announcers, thus erasing the orphan itself. The purpose of this change is to prevent a peer from being able to trigger eviction of another peer's orphans.
This PR also:
- Reimplements `TxOrphanage` as single multi-index container.
- Effectively bounds the number of transactions that can be in a peer's work set by ensuring it is a subset of the peer's announcements.
- Removes the `-maxorphantxs` config option, as the orphanage no longer limits by unique orphans.
This means we can receive 1p1c packages in the presence of spammy peers. It also makes the orphanage more useful and increases our download capacity without drastically increasing orphanage resource usage.
[0]: This means the effective memory limit in orphan weight is 100 * 400KWu = 40MWu
[1]: https://delvingbitcoin.org/t/stats-on-orphanage-overflows/1421
[2]: Limit is 3000, which is equivalent to one max size ancestor package (24 transactions can be missing inputs) for each peer (default max connections is 125).
[3]: Orphan weight is used in place of actual memory usage because something like "one maximally sized standard tx" is easier to reason about than "considering the bytes allocated for vin and vout vectors, it needs to be within N bytes..." etc. We can also consider a different formula to encapsulate more the memory overhead but still have an interface that is easy to reason about.
[4]: The limit is 404KWu, which is the maximum size of an ancestor package.
[5]: With 125 peers, this is 50.5MWu, which is a small increase from the existing limit of 40MWu. While the actual memory usage limit is higher (this number does not include the other memory used by `TxOrphanage` to store the outpoints map, etc.), this is within the same ballpark as the old limit.
ACKs for top commit:
marcofleon:
ReACK 5002462
achow101:
light ACK 5002462
instagibbs:
ACK 5002462
theStack:
Code-review ACK 5002462
Tree-SHA512: 270c11a2d116a1bf222358a1b4e25ffd1f01e24da958284fa8c4678bee5547f9e0554e87da7b7d5d5d172ca11da147f54a69b3436cc8f382debb6a45a90647fdFile tree
24 files changed
+2657
-880
lines changed- src
- bench
- node
- rpc
- test
- fuzz
- test/functional
- test_framework
24 files changed
+2657
-880
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
277 | 277 | | |
278 | 278 | | |
279 | 279 | | |
| 280 | + | |
280 | 281 | | |
281 | 282 | | |
282 | 283 | | |
| |||
308 | 309 | | |
309 | 310 | | |
310 | 311 | | |
311 | | - | |
312 | 312 | | |
313 | 313 | | |
314 | 314 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
| 52 | + | |
52 | 53 | | |
53 | 54 | | |
54 | 55 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
490 | 490 | | |
491 | 491 | | |
492 | 492 | | |
493 | | - | |
494 | 493 | | |
495 | 494 | | |
496 | 495 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
| 38 | + | |
38 | 39 | | |
39 | 40 | | |
40 | 41 | | |
| |||
53 | 54 | | |
54 | 55 | | |
55 | 56 | | |
56 | | - | |
57 | 57 | | |
58 | 58 | | |
59 | 59 | | |
| |||
533 | 533 | | |
534 | 534 | | |
535 | 535 | | |
536 | | - | |
| 536 | + | |
537 | 537 | | |
538 | 538 | | |
539 | 539 | | |
| |||
1754 | 1754 | | |
1755 | 1755 | | |
1756 | 1756 | | |
1757 | | - | |
| 1757 | + | |
1758 | 1758 | | |
1759 | 1759 | | |
1760 | 1760 | | |
| |||
1925 | 1925 | | |
1926 | 1926 | | |
1927 | 1927 | | |
1928 | | - | |
| 1928 | + | |
1929 | 1929 | | |
1930 | 1930 | | |
1931 | 1931 | | |
| |||
0 commit comments