Skip to content

Commit f4f1d6d

Browse files
committed
Merge bitcoin/bitcoin#27746: Rework validation logic for assumeutxo
a733dd7 Remove unused function `reliesOnAssumedValid` (Suhas Daftuar) d4a11ab Cache block index entry corresponding to assumeutxo snapshot base blockhash (Suhas Daftuar) 3556b85 Move CheckBlockIndex() from Chainstate to ChainstateManager (Suhas Daftuar) 0ce805b Documentation improvements for assumeutxo (Ryan Ofsky) 768690b Fix initialization of setBlockIndexCandidates when working with multiple chainstates (Suhas Daftuar) d43a1f1 Tighten requirements for adding elements to setBlockIndexCandidates (Suhas Daftuar) d0d40ea Move block-storage-related logic to ChainstateManager (Suhas Daftuar) 3cfc753 test: Clear block index flags when testing snapshots (Suhas Daftuar) 272fbc3 Update CheckBlockIndex invariants for chains based on an assumeutxo snapshot (Suhas Daftuar) 10c0571 Add wrapper for adding entries to a chainstate's block index candidates (Suhas Daftuar) 471da5f Move block-arrival information / preciousblock counters to ChainstateManager (Suhas Daftuar) 1cfc887 Remove CChain dependency in node/blockstorage (Suhas Daftuar) fe86a7c Explicitly track maximum block height stored in undo files (Suhas Daftuar) Pull request description: This PR proposes a clean up of the relationship between block storage and the chainstate objects, by moving the decision of whether to store a block on disk to something that is not chainstate-specific. Philosophically, the decision of whether to store a block on disk is related to validation rules that do not require any UTXO state; for anti-DoS reasons we were using some chainstate-specific heuristics, and those have been reworked here to achieve the proposed separation. This PR also fixes a bug in how a chainstate's `setBlockIndexCandidates` was being initialized; it should always have all the HAVE_DATA block index entries that have more work than the chain tip. During startup, we were not fully populating `setBlockIndexCandidates` in some scenarios involving multiple chainstates. Further, this PR establishes a concept that whenever we have 2 chainstates, that we always know the snapshotted chain's base block and the base block's hash must be an element of our block index. Given that, we can establish a new invariant that the background validation chainstate only needs to consider blocks leading to that snapshotted block entry as potential candidates for its tip. As a followup I would imagine that when writing net_processing logic to download blocks for the background chainstate, that we would use this concept to only download blocks towards the snapshotted entry as well. ACKs for top commit: achow101: ACK a733dd7 jamesob: reACK a733dd7 ([`jamesob/ackr/27746.5.sdaftuar.rework_validation_logic`](https://github.com/jamesob/bitcoin/tree/ackr/27746.5.sdaftuar.rework_validation_logic)) Sjors: Code review ACK a733dd7. ryanofsky: Code review ACK a733dd7. Just suggested changes since the last review. There are various small things that could be followed up on, but I think this is ready for merge. Tree-SHA512: 9ec17746f22b9c27082743ee581b8adceb2bd322fceafa507b428bdcc3ffb8b4c6601fc61cc7bb1161f890c3d38503e8b49474da7b5ab1b1f38bda7aa8668675
2 parents 44b05bf + a733dd7 commit f4f1d6d

14 files changed

+378
-254
lines changed

doc/design/assumeutxo.md

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -17,10 +17,9 @@ respectively generate and load UTXO snapshots. The utility script
1717

1818
- A new block index `nStatus` flag is introduced, `BLOCK_ASSUMED_VALID`, to mark block
1919
index entries that are required to be assumed-valid by a chainstate created
20-
from a UTXO snapshot. This flag is mostly used as a way to modify certain
20+
from a UTXO snapshot. This flag is used as a way to modify certain
2121
CheckBlockIndex() logic to account for index entries that are pending validation by a
22-
chainstate running asynchronously in the background. We also use this flag to control
23-
which index entries are added to setBlockIndexCandidates during LoadBlockIndex().
22+
chainstate running asynchronously in the background.
2423

2524
- The concept of UTXO snapshots is treated as an implementation detail that lives
2625
behind the ChainstateManager interface. The external presentation of the changes

src/bench/load_external.cpp

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -49,14 +49,13 @@ static void LoadExternalBlockFile(benchmark::Bench& bench)
4949
fclose(file);
5050
}
5151

52-
Chainstate& chainstate{testing_setup->m_node.chainman->ActiveChainstate()};
5352
std::multimap<uint256, FlatFilePos> blocks_with_unknown_parent;
5453
FlatFilePos pos;
5554
bench.run([&] {
5655
// "rb" is "binary, O_RDONLY", positioned to the start of the file.
5756
// The file will be closed by LoadExternalBlockFile().
5857
FILE* file{fsbridge::fopen(blkfile, "rb")};
59-
chainstate.LoadExternalBlockFile(file, &pos, &blocks_with_unknown_parent);
58+
testing_setup->m_node.chainman->LoadExternalBlockFile(file, &pos, &blocks_with_unknown_parent);
6059
});
6160
fs::remove(blkfile);
6261
}

src/chain.h

Lines changed: 14 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -113,10 +113,10 @@ enum BlockStatus : uint32_t {
113113
BLOCK_VALID_TRANSACTIONS = 3,
114114

115115
//! Outputs do not overspend inputs, no double spends, coinbase output ok, no immature coinbase spends, BIP30.
116-
//! Implies all parents are also at least CHAIN.
116+
//! Implies all parents are either at least VALID_CHAIN, or are ASSUMED_VALID
117117
BLOCK_VALID_CHAIN = 4,
118118

119-
//! Scripts & signatures ok. Implies all parents are also at least SCRIPTS.
119+
//! Scripts & signatures ok. Implies all parents are either at least VALID_SCRIPTS, or are ASSUMED_VALID.
120120
BLOCK_VALID_SCRIPTS = 5,
121121

122122
//! All validity bits.
@@ -134,10 +134,18 @@ enum BlockStatus : uint32_t {
134134
BLOCK_OPT_WITNESS = 128, //!< block data in blk*.dat was received with a witness-enforcing client
135135

136136
/**
137-
* If set, this indicates that the block index entry is assumed-valid.
138-
* Certain diagnostics will be skipped in e.g. CheckBlockIndex().
139-
* It almost certainly means that the block's full validation is pending
140-
* on a background chainstate. See `doc/design/assumeutxo.md`.
137+
* If ASSUMED_VALID is set, it means that this block has not been validated
138+
* and has validity status less than VALID_SCRIPTS. Also that it may have
139+
* descendant blocks with VALID_SCRIPTS set, because they can be validated
140+
* based on an assumeutxo snapshot.
141+
*
142+
* When an assumeutxo snapshot is loaded, the ASSUMED_VALID flag is added to
143+
* unvalidated blocks at the snapshot height and below. Then, as the background
144+
* validation progresses, and these blocks are validated, the ASSUMED_VALID
145+
* flags are removed. See `doc/design/assumeutxo.md` for details.
146+
*
147+
* This flag is only used to implement checks in CheckBlockIndex() and
148+
* should not be used elsewhere.
141149
*/
142150
BLOCK_ASSUMED_VALID = 256,
143151
};

src/node/blockstorage.cpp

Lines changed: 9 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -618,7 +618,7 @@ fs::path BlockManager::GetBlockPosFilename(const FlatFilePos& pos) const
618618
return BlockFileSeq().FileName(pos);
619619
}
620620

621-
bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, CChain& active_chain, uint64_t nTime, bool fKnown)
621+
bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime, bool fKnown)
622622
{
623623
LOCK(cs_LastBlockFile);
624624

@@ -644,7 +644,7 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
644644
// when the undo file is keeping up with the block file, we want to flush it explicitly
645645
// when it is lagging behind (more blocks arrive than are being connected), we let the
646646
// undo block write case handle it
647-
finalize_undo = (m_blockfile_info[nFile].nHeightLast == (unsigned int)active_chain.Tip()->nHeight);
647+
finalize_undo = (m_blockfile_info[nFile].nHeightLast == m_undo_height_in_last_blockfile);
648648
nFile++;
649649
if (m_blockfile_info.size() <= nFile) {
650650
m_blockfile_info.resize(nFile + 1);
@@ -660,6 +660,7 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
660660
}
661661
FlushBlockFile(!fKnown, finalize_undo);
662662
m_last_blockfile = nFile;
663+
m_undo_height_in_last_blockfile = 0; // No undo data yet in the new file, so reset our undo-height tracking.
663664
}
664665

665666
m_blockfile_info[nFile].AddBlock(nHeight, nTime);
@@ -749,8 +750,9 @@ bool BlockManager::WriteUndoDataForBlock(const CBlockUndo& blockundo, BlockValid
749750
// the FindBlockPos function
750751
if (_pos.nFile < m_last_blockfile && static_cast<uint32_t>(block.nHeight) == m_blockfile_info[_pos.nFile].nHeightLast) {
751752
FlushUndoFile(_pos.nFile, true);
753+
} else if (_pos.nFile == m_last_blockfile && static_cast<uint32_t>(block.nHeight) > m_undo_height_in_last_blockfile) {
754+
m_undo_height_in_last_blockfile = block.nHeight;
752755
}
753-
754756
// update nUndoPos in block index
755757
block.nUndoPos = _pos.nPos;
756758
block.nStatus |= BLOCK_HAVE_UNDO;
@@ -839,7 +841,7 @@ bool BlockManager::ReadRawBlockFromDisk(std::vector<uint8_t>& block, const FlatF
839841
return true;
840842
}
841843

842-
FlatFilePos BlockManager::SaveBlockToDisk(const CBlock& block, int nHeight, CChain& active_chain, const FlatFilePos* dbp)
844+
FlatFilePos BlockManager::SaveBlockToDisk(const CBlock& block, int nHeight, const FlatFilePos* dbp)
843845
{
844846
unsigned int nBlockSize = ::GetSerializeSize(block, CLIENT_VERSION);
845847
FlatFilePos blockPos;
@@ -852,7 +854,7 @@ FlatFilePos BlockManager::SaveBlockToDisk(const CBlock& block, int nHeight, CCha
852854
// we add BLOCK_SERIALIZATION_HEADER_SIZE only for new blocks since they will have the serialization header added when written to disk.
853855
nBlockSize += static_cast<unsigned int>(BLOCK_SERIALIZATION_HEADER_SIZE);
854856
}
855-
if (!FindBlockPos(blockPos, nBlockSize, nHeight, active_chain, block.GetBlockTime(), position_known)) {
857+
if (!FindBlockPos(blockPos, nBlockSize, nHeight, block.GetBlockTime(), position_known)) {
856858
error("%s: FindBlockPos failed", __func__);
857859
return FlatFilePos();
858860
}
@@ -905,7 +907,7 @@ void ImportBlocks(ChainstateManager& chainman, std::vector<fs::path> vImportFile
905907
break; // This error is logged in OpenBlockFile
906908
}
907909
LogPrintf("Reindexing block file blk%05u.dat...\n", (unsigned int)nFile);
908-
chainman.ActiveChainstate().LoadExternalBlockFile(file, &pos, &blocks_with_unknown_parent);
910+
chainman.LoadExternalBlockFile(file, &pos, &blocks_with_unknown_parent);
909911
if (chainman.m_interrupt) {
910912
LogPrintf("Interrupt requested. Exit %s\n", __func__);
911913
return;
@@ -924,7 +926,7 @@ void ImportBlocks(ChainstateManager& chainman, std::vector<fs::path> vImportFile
924926
FILE* file = fsbridge::fopen(path, "rb");
925927
if (file) {
926928
LogPrintf("Importing blocks file %s...\n", fs::PathToString(path));
927-
chainman.ActiveChainstate().LoadExternalBlockFile(file);
929+
chainman.LoadExternalBlockFile(file);
928930
if (chainman.m_interrupt) {
929931
LogPrintf("Interrupt requested. Exit %s\n", __func__);
930932
return;

src/node/blockstorage.h

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,6 @@ class BlockValidationState;
2424
class CBlock;
2525
class CBlockFileInfo;
2626
class CBlockUndo;
27-
class CChain;
2827
class CChainParams;
2928
class Chainstate;
3029
class ChainstateManager;
@@ -94,7 +93,7 @@ class BlockManager
9493
EXCLUSIVE_LOCKS_REQUIRED(cs_main);
9594
void FlushBlockFile(bool fFinalize = false, bool finalize_undo = false);
9695
void FlushUndoFile(int block_file, bool finalize = false);
97-
bool FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, CChain& active_chain, uint64_t nTime, bool fKnown);
96+
bool FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime, bool fKnown);
9897
bool FindUndoPos(BlockValidationState& state, int nFile, FlatFilePos& pos, unsigned int nAddSize);
9998

10099
FlatFileSeq BlockFileSeq() const;
@@ -128,6 +127,19 @@ class BlockManager
128127
RecursiveMutex cs_LastBlockFile;
129128
std::vector<CBlockFileInfo> m_blockfile_info;
130129
int m_last_blockfile = 0;
130+
131+
// Track the height of the highest block in m_last_blockfile whose undo
132+
// data has been written. Block data is written to block files in download
133+
// order, but is written to undo files in validation order, which is
134+
// usually in order by height. To avoid wasting disk space, undo files will
135+
// be trimmed whenever the corresponding block file is finalized and
136+
// the height of the highest block written to the block file equals the
137+
// height of the highest block written to the undo file. This is a
138+
// heuristic and can sometimes preemptively trim undo files that will write
139+
// more data later, and sometimes fail to trim undo files that can't have
140+
// more data written later.
141+
unsigned int m_undo_height_in_last_blockfile = 0;
142+
131143
/** Global flag to indicate we should check to see if there are
132144
* block/undo files that should be deleted. Set on startup
133145
* or if we allocate more file space when we're in prune mode
@@ -202,7 +214,7 @@ class BlockManager
202214
EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
203215

204216
/** Store block on disk. If dbp is not nullptr, then it provides the known position of the block within a block file on disk. */
205-
FlatFilePos SaveBlockToDisk(const CBlock& block, int nHeight, CChain& active_chain, const FlatFilePos* dbp);
217+
FlatFilePos SaveBlockToDisk(const CBlock& block, int nHeight, const FlatFilePos* dbp);
206218

207219
/** Whether running in -prune mode. */
208220
[[nodiscard]] bool IsPruneMode() const { return m_prune_mode; }

src/node/chainstate.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -221,7 +221,7 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
221221

222222
// A reload of the block index is required to recompute setBlockIndexCandidates
223223
// for the fully validated chainstate.
224-
chainman.ActiveChainstate().UnloadBlockIndex();
224+
chainman.ActiveChainstate().ClearBlockIndexCandidates();
225225

226226
auto [init_status, init_error] = CompleteChainstateInitialization(chainman, cache_sizes, options);
227227
if (init_status != ChainstateLoadStatus::SUCCESS) {

src/test/blockmanager_tests.cpp

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -30,22 +30,21 @@ BOOST_AUTO_TEST_CASE(blockmanager_find_block_pos)
3030
.notifications = notifications,
3131
};
3232
BlockManager blockman{m_node.kernel->interrupt, blockman_opts};
33-
CChain chain {};
3433
// simulate adding a genesis block normally
35-
BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0, chain, nullptr).nPos, BLOCK_SERIALIZATION_HEADER_SIZE);
34+
BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0, nullptr).nPos, BLOCK_SERIALIZATION_HEADER_SIZE);
3635
// simulate what happens during reindex
3736
// simulate a well-formed genesis block being found at offset 8 in the blk00000.dat file
3837
// the block is found at offset 8 because there is an 8 byte serialization header
3938
// consisting of 4 magic bytes + 4 length bytes before each block in a well-formed blk file.
4039
FlatFilePos pos{0, BLOCK_SERIALIZATION_HEADER_SIZE};
41-
BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0, chain, &pos).nPos, BLOCK_SERIALIZATION_HEADER_SIZE);
40+
BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0, &pos).nPos, BLOCK_SERIALIZATION_HEADER_SIZE);
4241
// now simulate what happens after reindex for the first new block processed
4342
// the actual block contents don't matter, just that it's a block.
4443
// verify that the write position is at offset 0x12d.
4544
// this is a check to make sure that https://github.com/bitcoin/bitcoin/issues/21379 does not recur
4645
// 8 bytes (for serialization header) + 285 (for serialized genesis block) = 293
4746
// add another 8 bytes for the second block's serialization header and we get 293 + 8 = 301
48-
FlatFilePos actual{blockman.SaveBlockToDisk(params->GenesisBlock(), 1, chain, nullptr)};
47+
FlatFilePos actual{blockman.SaveBlockToDisk(params->GenesisBlock(), 1, nullptr)};
4948
BOOST_CHECK_EQUAL(actual.nPos, BLOCK_SERIALIZATION_HEADER_SIZE + ::GetSerializeSize(params->GenesisBlock(), CLIENT_VERSION) + BLOCK_SERIALIZATION_HEADER_SIZE);
5049
}
5150

src/test/coinstatsindex_tests.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -98,7 +98,7 @@ BOOST_FIXTURE_TEST_CASE(coinstatsindex_unclean_shutdown, TestChain100Setup)
9898
LOCK(cs_main);
9999
BlockValidationState state;
100100
BOOST_CHECK(CheckBlock(block, state, params.GetConsensus()));
101-
BOOST_CHECK(chainstate.AcceptBlock(new_block, state, &new_block_index, true, nullptr, nullptr, true));
101+
BOOST_CHECK(m_node.chainman->AcceptBlock(new_block, state, &new_block_index, true, nullptr, nullptr, true));
102102
CCoinsViewCache view(&chainstate.CoinsTip());
103103
BOOST_CHECK(chainstate.ConnectBlock(block, state, new_block_index, view));
104104
}

src/test/fuzz/load_external_block_file.cpp

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,9 +35,9 @@ FUZZ_TARGET(load_external_block_file, .init = initialize_load_external_block_fil
3535
// Corresponds to the -reindex case (track orphan blocks across files).
3636
FlatFilePos flat_file_pos;
3737
std::multimap<uint256, FlatFilePos> blocks_with_unknown_parent;
38-
g_setup->m_node.chainman->ActiveChainstate().LoadExternalBlockFile(fuzzed_block_file, &flat_file_pos, &blocks_with_unknown_parent);
38+
g_setup->m_node.chainman->LoadExternalBlockFile(fuzzed_block_file, &flat_file_pos, &blocks_with_unknown_parent);
3939
} else {
4040
// Corresponds to the -loadblock= case (orphan blocks aren't tracked across files).
41-
g_setup->m_node.chainman->ActiveChainstate().LoadExternalBlockFile(fuzzed_block_file);
41+
g_setup->m_node.chainman->LoadExternalBlockFile(fuzzed_block_file);
4242
}
4343
}

src/test/util/chainstate.h

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,7 @@ CreateAndActivateUTXOSnapshot(
7171
// This is a stripped-down version of node::LoadChainstate which
7272
// preserves the block index.
7373
LOCK(::cs_main);
74+
CBlockIndex *orig_tip = node.chainman->ActiveChainstate().m_chain.Tip();
7475
uint256 gen_hash = node.chainman->ActiveChainstate().m_chain[0]->GetBlockHash();
7576
node.chainman->ResetChainstates();
7677
node.chainman->InitializeChainstate(node.mempool.get());
@@ -83,6 +84,22 @@ CreateAndActivateUTXOSnapshot(
8384
chain.setBlockIndexCandidates.insert(node.chainman->m_blockman.LookupBlockIndex(gen_hash));
8485
chain.LoadChainTip();
8586
node.chainman->MaybeRebalanceCaches();
87+
88+
// Reset the HAVE_DATA flags below the snapshot height, simulating
89+
// never-having-downloaded them in the first place.
90+
// TODO: perhaps we could improve this by using pruning to delete
91+
// these blocks instead
92+
CBlockIndex *pindex = orig_tip;
93+
while (pindex && pindex != chain.m_chain.Tip()) {
94+
pindex->nStatus &= ~BLOCK_HAVE_DATA;
95+
pindex->nStatus &= ~BLOCK_HAVE_UNDO;
96+
// We have to set the ASSUMED_VALID flag, because otherwise it
97+
// would not be possible to have a block index entry without HAVE_DATA
98+
// and with nTx > 0 (since we aren't setting the pruned flag);
99+
// see CheckBlockIndex().
100+
pindex->nStatus |= BLOCK_ASSUMED_VALID;
101+
pindex = pindex->pprev;
102+
}
86103
}
87104
BlockValidationState state;
88105
if (!node.chainman->ActiveChainstate().ActivateBestChain(state)) {

0 commit comments

Comments
 (0)