Commit f1d129d

Merge bitcoin/bitcoin#31363: cluster mempool: introduce TxGraph
b2ea365 txgraph: Add Get{Ancestors,Descendants}Union functions (feature) (Pieter Wuille)
54bcedd txgraph: Multiple inputs to Get{Ancestors,Descendant}Refs (preparation) (Pieter Wuille)
aded047 txgraph: Add CountDistinctClusters function (feature) (Pieter Wuille)
b685d32 txgraph: Add DoWork function (feature) (Pieter Wuille)
295a1ca txgraph: Expose ability to compare transactions (feature) (Pieter Wuille)
22c68cd txgraph: Allow Refs to outlive the TxGraph (feature) (Pieter Wuille)
82fa357 txgraph: Destroying Ref means removing transaction (feature) (Pieter Wuille)
6b037ce txgraph: Cache oversizedness of graphs (optimization) (Pieter Wuille)
8c70688 txgraph: Add staging support (feature) (Pieter Wuille)
c99c730 txgraph: Abstract out ClearLocator (refactor) (Pieter Wuille)
34aa3da txgraph: Group per-graph data in ClusterSet (refactor) (Pieter Wuille)
36dd5ed txgraph: Special-case removal of tail of cluster (Optimization) (Pieter Wuille)
5801e0f txgraph: Delay chunking while sub-acceptable (optimization) (Pieter Wuille)
57f5499 txgraph: Avoid looking up the same child cluster repeatedly (optimization) (Pieter Wuille)
1171953 txgraph: Avoid representative lookup for each dependency (optimization) (Pieter Wuille)
64f69ec txgraph: Make max cluster count configurable and "oversize" state (feature) (Pieter Wuille)
1d27b74 txgraph: Add GetChunkFeerate function (feature) (Pieter Wuille)
c80aecc txgraph: Avoid per-group vectors for clusters & dependencies (optimization) (Pieter Wuille)
ee57e93 txgraph: Add internal sanity check function (tests) (Pieter Wuille)
05abf33 txgraph: Add simulation fuzz test (tests) (Pieter Wuille)
8ad3ed2 txgraph: Add initial version (feature) (Pieter Wuille)
6eab3b2 feefrac: Introduce tagged wrappers to distinguish vsize/WU rates (Pieter Wuille)
d449773 scripted-diff: (refactor) ClusterIndex -> DepGraphIndex (Pieter Wuille)
bfeb69f clusterlin: Make IsAcyclic() a DepGraph member function (Pieter Wuille)
0aa874a clusterlin: Add FixLinearization function + fuzz test (Pieter Wuille)

Pull request description:

Part of cluster mempool: #30289.

### 1. Overview

This introduces the `TxGraph` class, which encapsulates knowledge about the (effective) fees, sizes, and dependencies between all mempool transactions, but nothing else. In particular, it lacks knowledge about `CTransaction`, inputs, outputs, txids, wtxids, prioritization, validity, policy rules, and a lot more. Being restricted to just those aspects of the mempool makes the behavior very easy to fully specify (ignoring the actual linearizations produced), and to write simulation-based tests for (which are included in this PR).

### 2. Interface

The interface can be largely categorized into:

* Mutation functions:
  * `AddTransaction` (add a new transaction with specified feerate, and get a `Ref` object back to identify it).
  * `RemoveTransaction` (given a `Ref` object, remove the transaction).
  * `AddDependency` (given two `Ref` objects, add a dependency between them).
  * `SetTransactionFee` (modify the fee associated with a `Ref` object).
* Inspector functions:
  * `GetAncestors` (get the ancestor set in the form of `Ref*` pointers)
  * `GetAncestorsUnion` (like above, but for the union of ancestors of multiple `Ref*` pointers)
  * `GetDescendants` (get the descendant set in the form of `Ref*` pointers)
  * `GetDescendantsUnion` (like above, but for the union of descendants of multiple `Ref*` pointers)
  * `GetCluster` (get the connected component set in the form of `Ref*` pointers, in the order they would be mined).
  * `GetIndividualFeerate` (get the feerate of a transaction)
  * `GetChunkFeerate` (get the mining score of a transaction)
  * `CountDistinctClusters` (count the number of distinct clusters a list of `Ref`s belong to)
* Staging functions:
  * `StartStaging` (make all future mutations operate on a proposed transaction graph)
  * `CommitStaging` (apply all the changes that are staged)
  * `AbortStaging` (discard all the changes that are staged)
* Miscellaneous functions:
  * `DoWork` (do queued-up computations now, so that future operations are fast)

The `TxGraph::Ref` type, used as a "handle" on transactions in the graph, can be inherited from. The idea is that in the full cluster mempool implementation (#28676, after it is rebased on this), `CTxMempoolEntry` will inherit from it, and all actually used `Ref` objects will be `CTxMempoolEntry`s. With that, the mempool code can just cast any `Ref*` returned by txgraph to `CTxMempoolEntry*`.

### 3. Implementation

Internally the graph data is kept in clustered form (partitioned into connected components), for which linearizations are maintained and updated as needed using the `cluster_linearize.h` algorithms under the hood, but this is hidden from the users of this class. Implementation-wise, mutations are generally applied lazily, appending to queues of to-be-removed transactions and to-be-added dependencies, so they can be batched for higher performance. Inspectors will generally only evaluate as much as is needed to answer queries, with roughly 5 levels of processing to get to fully instantiated and acceptable cluster linearizations, in order:

1. `ApplyRemovals` (take batches of to-be-removed transactions and translate them to "holes" in the corresponding Clusters/DepGraphs).
2. `SplitAll` (creating holes in Clusters may cause them to break apart into smaller connected components, so turn them into separate Clusters/linearizations).
3. `GroupClusters` (figure out which Clusters will need to be combined in order to add requested to-be-added dependencies, as these may span clusters).
4. `ApplyDependencies` (actually merge Clusters as precomputed by `GroupClusters`, and add the dependencies between them).
5. `MakeAcceptable` (perform the LIMO linearization algorithm on Clusters to make sure their linearizations are acceptable).

### 4. Future work

This is only an initial version of TxGraph, and some functionality is missing before #28676 can be rebased on top of it:

* The ability to get comparative feerate diagrams before/after for the set of staged changes (to evaluate RBF incentive-compatibility).
* Mining interface (ability to iterate transactions quickly in mining score order) (see #31444).
* Eviction interface (reverse of mining order, plus memory usage accounting) (see #31444).
* Ability to fix oversizedness of clusters (before or after committing) - this is needed for reorgs where aborting/rejecting the change just is not an option (see #31553).
* Interface for controlling how much effort is spent on LIMO. In this PR it is hardcoded.

Then there are further improvements possible which would not block other work:

* Making Cluster a virtual class with different implementations based on transaction count (which could dramatically reduce memory usage, as most Clusters are just a single transaction, for which the current implementation is overkill).
* The ability to have background thread(s) for improving cluster linearizations.

ACKs for top commit:
  instagibbs: reACK b2ea365
  ajtowns: reACK b2ea365
  ismaelsadeeq: reACK b2ea365 🚀
  glozow: ACK b2ea365

Tree-SHA512: 0f86f73d37651fe47d469db1384503bbd1237b4556e5d50b1d0a3dd27754792d6fc3481f77a201cf2ed36c6ca76e0e44c30e175d112aacb53dfdb9e11d8abc6b
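
The handle pattern from section 2 of the pull request description can be illustrated with a small, self-contained sketch. This is not the `TxGraph` implementation and does not use its real signatures: `ToyRef`, `ToyEntry`, and `ToyGraph` are hypothetical names, the ancestor walk is a plain depth-first search rather than the cluster-based machinery the PR describes, and the choice to include the queried transaction in its own ancestor set is an assumption of this toy. It only shows how a caller-defined type can inherit from a handle type so that pointers returned by the graph can be cast back.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for TxGraph::Ref: an opaque handle callers may inherit from.
struct ToyRef {
    virtual ~ToyRef() = default;
};

// Hypothetical stand-in for a mempool entry carrying data the graph knows nothing about.
struct ToyEntry : ToyRef {
    std::string label;
    explicit ToyEntry(std::string l) : label(std::move(l)) {}
};

// Toy graph that stores only fee, size, and parent links per handle.
class ToyGraph {
    struct Node {
        int64_t fee;
        int32_t size;
        std::vector<ToyRef*> parents;
    };
    std::unordered_map<ToyRef*, Node> m_nodes;

public:
    void AddTransaction(ToyRef& ref, int64_t fee, int32_t size)
    {
        m_nodes[&ref] = Node{fee, size, {}};
    }

    void AddDependency(ToyRef& parent, ToyRef& child)
    {
        m_nodes.at(&child).parents.push_back(&parent);
    }

    // Returns ref itself followed by everything it transitively depends on (toy assumption).
    std::vector<ToyRef*> GetAncestors(ToyRef& ref) const
    {
        std::vector<ToyRef*> out;
        std::vector<ToyRef*> todo{&ref};
        std::unordered_set<ToyRef*> seen{&ref};
        while (!todo.empty()) {
            ToyRef* cur = todo.back();
            todo.pop_back();
            out.push_back(cur);
            for (ToyRef* parent : m_nodes.at(cur).parents) {
                if (seen.insert(parent).second) todo.push_back(parent);
            }
        }
        return out;
    }
};
```

Because every handle registered with the toy graph is a `ToyEntry`, a caller can `static_cast<ToyEntry*>` each returned `ToyRef*` to reach its own data, which mirrors the cast to `CTxMempoolEntry*` that the description anticipates.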
2 parents c0b7159 + b2ea365 commit f1d129d

11 files changed: +3360 -150 lines changed

src/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -301,6 +301,7 @@ add_library(bitcoin_node STATIC EXCLUDE_FROM_ALL
   signet.cpp
   torcontrol.cpp
   txdb.cpp
+  txgraph.cpp
   txmempool.cpp
   txorphanage.cpp
   txrequest.cpp

src/bench/cluster_linearize.cpp

Lines changed: 22 additions & 22 deletions
@@ -23,10 +23,10 @@ namespace {
  * remaining transaction, whose removal requires updating all remaining transactions' ancestor
  * set feerates. */
 template<typename SetType>
-DepGraph<SetType> MakeLinearGraph(ClusterIndex ntx)
+DepGraph<SetType> MakeLinearGraph(DepGraphIndex ntx)
 {
     DepGraph<SetType> depgraph;
-    for (ClusterIndex i = 0; i < ntx; ++i) {
+    for (DepGraphIndex i = 0; i < ntx; ++i) {
         depgraph.AddTransaction({-int32_t(i), 1});
         if (i > 0) depgraph.AddDependencies(SetType::Singleton(i - 1), i);
     }
@@ -38,10 +38,10 @@ DepGraph<SetType> MakeLinearGraph(ClusterIndex ntx)
  * rechunking is needed after every candidate (the last transaction gets picked every time).
  */
 template<typename SetType>
-DepGraph<SetType> MakeWideGraph(ClusterIndex ntx)
+DepGraph<SetType> MakeWideGraph(DepGraphIndex ntx)
 {
     DepGraph<SetType> depgraph;
-    for (ClusterIndex i = 0; i < ntx; ++i) {
+    for (DepGraphIndex i = 0; i < ntx; ++i) {
         depgraph.AddTransaction({int32_t(i) + 1, 1});
         if (i > 0) depgraph.AddDependencies(SetType::Singleton(0), i);
     }
@@ -51,10 +51,10 @@ DepGraph<SetType> MakeWideGraph(ClusterIndex ntx)
 // Construct a difficult graph. These need at least sqrt(2^(n-1)) iterations in the implemented
 // algorithm (purely empirically determined).
 template<typename SetType>
-DepGraph<SetType> MakeHardGraph(ClusterIndex ntx)
+DepGraph<SetType> MakeHardGraph(DepGraphIndex ntx)
 {
     DepGraph<SetType> depgraph;
-    for (ClusterIndex i = 0; i < ntx; ++i) {
+    for (DepGraphIndex i = 0; i < ntx; ++i) {
         if (ntx & 1) {
             // Odd cluster size.
             //
@@ -121,7 +121,7 @@ DepGraph<SetType> MakeHardGraph(ClusterIndex ntx)
  * iterations difference.
  */
 template<typename SetType>
-void BenchLinearizeWorstCase(ClusterIndex ntx, benchmark::Bench& bench, uint64_t iter_limit)
+void BenchLinearizeWorstCase(DepGraphIndex ntx, benchmark::Bench& bench, uint64_t iter_limit)
 {
     const auto depgraph = MakeHardGraph<SetType>(ntx);
     uint64_t rng_seed = 0;
@@ -147,12 +147,12 @@ void BenchLinearizeWorstCase(ClusterIndex ntx, benchmark::Bench& bench, uint64_t
  * cheap.
  */
 template<typename SetType>
-void BenchLinearizeNoItersWorstCaseAnc(ClusterIndex ntx, benchmark::Bench& bench)
+void BenchLinearizeNoItersWorstCaseAnc(DepGraphIndex ntx, benchmark::Bench& bench)
 {
     const auto depgraph = MakeLinearGraph<SetType>(ntx);
     uint64_t rng_seed = 0;
-    std::vector<ClusterIndex> old_lin(ntx);
-    for (ClusterIndex i = 0; i < ntx; ++i) old_lin[i] = i;
+    std::vector<DepGraphIndex> old_lin(ntx);
+    for (DepGraphIndex i = 0; i < ntx; ++i) old_lin[i] = i;
     bench.run([&] {
         Linearize(depgraph, /*max_iterations=*/0, rng_seed++, old_lin);
     });
@@ -167,41 +167,41 @@ void BenchLinearizeNoItersWorstCaseAnc(ClusterIndex ntx, benchmark::Bench& bench
  * AncestorCandidateFinder is cheap.
  */
 template<typename SetType>
-void BenchLinearizeNoItersWorstCaseLIMO(ClusterIndex ntx, benchmark::Bench& bench)
+void BenchLinearizeNoItersWorstCaseLIMO(DepGraphIndex ntx, benchmark::Bench& bench)
 {
     const auto depgraph = MakeWideGraph<SetType>(ntx);
     uint64_t rng_seed = 0;
-    std::vector<ClusterIndex> old_lin(ntx);
-    for (ClusterIndex i = 0; i < ntx; ++i) old_lin[i] = i;
+    std::vector<DepGraphIndex> old_lin(ntx);
+    for (DepGraphIndex i = 0; i < ntx; ++i) old_lin[i] = i;
     bench.run([&] {
         Linearize(depgraph, /*max_iterations=*/0, rng_seed++, old_lin);
     });
 }
 
 template<typename SetType>
-void BenchPostLinearizeWorstCase(ClusterIndex ntx, benchmark::Bench& bench)
+void BenchPostLinearizeWorstCase(DepGraphIndex ntx, benchmark::Bench& bench)
 {
     DepGraph<SetType> depgraph = MakeWideGraph<SetType>(ntx);
-    std::vector<ClusterIndex> lin(ntx);
+    std::vector<DepGraphIndex> lin(ntx);
     bench.run([&] {
-        for (ClusterIndex i = 0; i < ntx; ++i) lin[i] = i;
+        for (DepGraphIndex i = 0; i < ntx; ++i) lin[i] = i;
         PostLinearize(depgraph, lin);
     });
 }
 
 template<typename SetType>
-void BenchMergeLinearizationsWorstCase(ClusterIndex ntx, benchmark::Bench& bench)
+void BenchMergeLinearizationsWorstCase(DepGraphIndex ntx, benchmark::Bench& bench)
 {
     DepGraph<SetType> depgraph;
-    for (ClusterIndex i = 0; i < ntx; ++i) {
+    for (DepGraphIndex i = 0; i < ntx; ++i) {
         depgraph.AddTransaction({i, 1});
         if (i) depgraph.AddDependencies(SetType::Singleton(0), i);
     }
-    std::vector<ClusterIndex> lin1;
-    std::vector<ClusterIndex> lin2;
+    std::vector<DepGraphIndex> lin1;
+    std::vector<DepGraphIndex> lin2;
     lin1.push_back(0);
     lin2.push_back(0);
-    for (ClusterIndex i = 1; i < ntx; ++i) {
+    for (DepGraphIndex i = 1; i < ntx; ++i) {
         lin1.push_back(i);
         lin2.push_back(ntx - i);
     }
@@ -214,7 +214,7 @@ template<size_t N>
 void BenchLinearizeOptimally(benchmark::Bench& bench, const std::array<uint8_t, N>& serialized)
 {
     // Determine how many transactions the serialized cluster has.
-    ClusterIndex num_tx{0};
+    DepGraphIndex num_tx{0};
     {
         SpanReader reader{serialized};
         DepGraph<BitSet<128>> depgraph;
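
The benchmark helpers touched by this rename build two simple dependency shapes: `MakeLinearGraph` makes each transaction depend on its predecessor (a chain), while `MakeWideGraph` makes every transaction after the first depend on transaction 0 (a fan-out). The sketch below mirrors those shapes with plain parent lists instead of `DepGraph<SetType>`. The names `ToyIndex`, `LinearParents`, `WideParents`, and `CountAncestors` are hypothetical, and `ToyIndex` is assumed to be a 32-bit unsigned integer purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical index type; it plays the role DepGraphIndex plays in the diff above.
using ToyIndex = uint32_t;
using ParentLists = std::vector<std::vector<ToyIndex>>;

// Chain shape: transaction i depends on transaction i - 1.
ParentLists LinearParents(ToyIndex ntx)
{
    ParentLists parents(ntx);
    for (ToyIndex i = 1; i < ntx; ++i) parents[i].push_back(i - 1);
    return parents;
}

// Fan-out shape: every transaction except 0 depends on transaction 0.
ParentLists WideParents(ToyIndex ntx)
{
    ParentLists parents(ntx);
    for (ToyIndex i = 1; i < ntx; ++i) parents[i].push_back(0);
    return parents;
}

// Counts tx together with all of its transitive ancestors.
ToyIndex CountAncestors(const ParentLists& parents, ToyIndex tx)
{
    std::vector<bool> seen(parents.size(), false);
    std::vector<ToyIndex> todo{tx};
    seen[tx] = true;
    ToyIndex count = 0;
    while (!todo.empty()) {
        ToyIndex cur = todo.back();
        todo.pop_back();
        ++count;
        for (ToyIndex parent : parents[cur]) {
            if (!seen[parent]) {
                seen[parent] = true;
                todo.push_back(parent);
            }
        }
    }
    return count;
}
```

In the chain, the last transaction's ancestor set spans the whole graph, whereas in the fan-out every non-root transaction has exactly two members in its ancestor set (itself and transaction 0).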
