Skip to content

Commit 5871b5b

Browse files
committed
Merge bitcoin/bitcoin#25571: refactor: Make mapBlocksUnknownParent local, and rename it
dd065da refactor: Make mapBlocksUnknownParent local, and rename it (Hennadii Stepanov) Pull request description: This PR is a second attempt at #19594. This PR has two motivations: - Improve code hygiene by eliminating a global variable, `mapBlocksUnknownParent` - Fix fuzz test OOM when running too long ([see #19594 comment](bitcoin/bitcoin#19594 (comment))) A minor added advantage is to release `mapBlocksUnknownParent` memory when the reindexing phase is done. The current situation is somewhat similar to a memory leak because this map exists unused for the remaining lifetime of the process. It's true that this map should be empty of data elements after use, but its internal metadata (indexing structures, etc.) can have non-trivial size because there can be many thousands of simultaneous elements in this map. This PR helps our efforts to reduce the use of global variables. This variable isn't just global, it's hidden inside a function (it looks like a local variable but has the `static` attribute). This global variable exists because the `-reindex` processing code calls `LoadExternalBlockFile()` multiple times (once for each block file), but that function must preserve some state between calls (the `mapBlocksUnknownParent` map). This PR fixes this by allocating this map as a local variable in the caller's scope and passing it in on each call. When reindexing completes, the map goes out of scope and is deallocated. I tested this manually by reindexing on mainnet and signet. Also, the existing `feature_reindex.py` functional test passes. ACKs for top commit: mzumsande: re-ACK dd065da theStack: re-ACK dd065da shaavan: reACK dd065da Tree-SHA512: 9cd20e44d2fa1096dd405bc107bc065ea8f904f5b3f63080341b08d8cf57b790df565f58815c2f331377d044d5306708b4bf6bdfc5ef8d0ed85d8e97d744732c
2 parents b1c8ea4 + dd065da commit 5871b5b

File tree

4 files changed

+60
-12
lines changed

4 files changed

+60
-12
lines changed

src/node/blockstorage.cpp

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@
2121
#include <util/system.h>
2222
#include <validation.h>
2323

24+
#include <map>
2425
#include <unordered_map>
2526

2627
namespace node {
@@ -834,6 +835,9 @@ void ThreadImport(ChainstateManager& chainman, std::vector<fs::path> vImportFile
834835
// -reindex
835836
if (fReindex) {
836837
int nFile = 0;
838+
// Map of disk positions for blocks with unknown parent (only used for reindex);
839+
// parent hash -> child disk position, multiple children can have the same parent.
840+
std::multimap<uint256, FlatFilePos> blocks_with_unknown_parent;
837841
while (true) {
838842
FlatFilePos pos(nFile, 0);
839843
if (!fs::exists(GetBlockPosFilename(pos))) {
@@ -844,7 +848,7 @@ void ThreadImport(ChainstateManager& chainman, std::vector<fs::path> vImportFile
844848
break; // This error is logged in OpenBlockFile
845849
}
846850
LogPrintf("Reindexing block file blk%05u.dat...\n", (unsigned int)nFile);
847-
chainman.ActiveChainstate().LoadExternalBlockFile(file, &pos);
851+
chainman.ActiveChainstate().LoadExternalBlockFile(file, &pos, &blocks_with_unknown_parent);
848852
if (ShutdownRequested()) {
849853
LogPrintf("Shutdown requested. Exit %s\n", __func__);
850854
return;

src/test/fuzz/load_external_block_file.cpp

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,13 @@ FUZZ_TARGET_INIT(load_external_block_file, initialize_load_external_block_file)
3131
if (fuzzed_block_file == nullptr) {
3232
return;
3333
}
34-
FlatFilePos flat_file_pos;
35-
g_setup->m_node.chainman->ActiveChainstate().LoadExternalBlockFile(fuzzed_block_file, fuzzed_data_provider.ConsumeBool() ? &flat_file_pos : nullptr);
34+
if (fuzzed_data_provider.ConsumeBool()) {
35+
// Corresponds to the -reindex case (track orphan blocks across files).
36+
FlatFilePos flat_file_pos;
37+
std::multimap<uint256, FlatFilePos> blocks_with_unknown_parent;
38+
g_setup->m_node.chainman->ActiveChainstate().LoadExternalBlockFile(fuzzed_block_file, &flat_file_pos, &blocks_with_unknown_parent);
39+
} else {
40+
// Corresponds to the -loadblock= case (orphan blocks aren't tracked across files).
41+
g_setup->m_node.chainman->ActiveChainstate().LoadExternalBlockFile(fuzzed_block_file);
42+
}
3643
}

src/validation.cpp

Lines changed: 16 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -57,6 +57,7 @@
5757
#include <warnings.h>
5858

5959
#include <algorithm>
60+
#include <cassert>
6061
#include <chrono>
6162
#include <deque>
6263
#include <numeric>
@@ -4256,11 +4257,16 @@ bool CChainState::LoadGenesisBlock()
42564257
return true;
42574258
}
42584259

4259-
void CChainState::LoadExternalBlockFile(FILE* fileIn, FlatFilePos* dbp)
4260+
void CChainState::LoadExternalBlockFile(
4261+
FILE* fileIn,
4262+
FlatFilePos* dbp,
4263+
std::multimap<uint256, FlatFilePos>* blocks_with_unknown_parent)
42604264
{
42614265
AssertLockNotHeld(m_chainstate_mutex);
4262-
// Map of disk positions for blocks with unknown parent (only used for reindex)
4263-
static std::multimap<uint256, FlatFilePos> mapBlocksUnknownParent;
4266+
4267+
// Either both should be specified (-reindex), or neither (-loadblock).
4268+
assert(!dbp == !blocks_with_unknown_parent);
4269+
42644270
int64_t nStart = GetTimeMillis();
42654271

42664272
int nLoaded = 0;
@@ -4310,8 +4316,9 @@ void CChainState::LoadExternalBlockFile(FILE* fileIn, FlatFilePos* dbp)
43104316
if (hash != m_params.GetConsensus().hashGenesisBlock && !m_blockman.LookupBlockIndex(block.hashPrevBlock)) {
43114317
LogPrint(BCLog::REINDEX, "%s: Out of order block %s, parent %s not known\n", __func__, hash.ToString(),
43124318
block.hashPrevBlock.ToString());
4313-
if (dbp)
4314-
mapBlocksUnknownParent.insert(std::make_pair(block.hashPrevBlock, *dbp));
4319+
if (dbp && blocks_with_unknown_parent) {
4320+
blocks_with_unknown_parent->emplace(block.hashPrevBlock, *dbp);
4321+
}
43154322
continue;
43164323
}
43174324

@@ -4340,13 +4347,15 @@ void CChainState::LoadExternalBlockFile(FILE* fileIn, FlatFilePos* dbp)
43404347

43414348
NotifyHeaderTip(*this);
43424349

4350+
if (!blocks_with_unknown_parent) continue;
4351+
43434352
// Recursively process earlier encountered successors of this block
43444353
std::deque<uint256> queue;
43454354
queue.push_back(hash);
43464355
while (!queue.empty()) {
43474356
uint256 head = queue.front();
43484357
queue.pop_front();
4349-
std::pair<std::multimap<uint256, FlatFilePos>::iterator, std::multimap<uint256, FlatFilePos>::iterator> range = mapBlocksUnknownParent.equal_range(head);
4358+
auto range = blocks_with_unknown_parent->equal_range(head);
43504359
while (range.first != range.second) {
43514360
std::multimap<uint256, FlatFilePos>::iterator it = range.first;
43524361
std::shared_ptr<CBlock> pblockrecursive = std::make_shared<CBlock>();
@@ -4361,7 +4370,7 @@ void CChainState::LoadExternalBlockFile(FILE* fileIn, FlatFilePos* dbp)
43614370
}
43624371
}
43634372
range.first++;
4364-
mapBlocksUnknownParent.erase(it);
4373+
blocks_with_unknown_parent->erase(it);
43654374
NotifyHeaderTip(*this);
43664375
}
43674376
}

src/validation.h

Lines changed: 30 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -575,8 +575,36 @@ class CChainState
575575
bool ResizeCoinsCaches(size_t coinstip_size, size_t coinsdb_size)
576576
EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
577577

578-
/** Import blocks from an external file */
579-
void LoadExternalBlockFile(FILE* fileIn, FlatFilePos* dbp = nullptr)
578+
/**
579+
* Import blocks from an external file
580+
*
581+
* During reindexing, this function is called for each block file (datadir/blocks/blk?????.dat).
582+
* It reads all blocks contained in the given file and attempts to process them (add them to the
583+
* block index). The blocks may be out of order within each file and across files. Often this
584+
* function reads a block but finds that its parent hasn't been read yet, so the block can't be
585+
* processed yet. The function will add an entry to the blocks_with_unknown_parent map (which is
586+
* passed as an argument), so that when the block's parent is later read and processed, this
587+
* function can re-read the child block from disk and process it.
588+
*
589+
* Because a block's parent may be in a later file, not just later in the same file, the
590+
* blocks_with_unknown_parent map must be passed in and out with each call. It's a multimap,
591+
* rather than just a map, because multiple blocks may have the same parent (when chain splits
592+
* or stale blocks exist). It maps from parent-hash to child-disk-position.
593+
*
594+
* This function can also be used to read blocks from user-specified block files using the
595+
* -loadblock= option. There's no unknown-parent tracking, so the last two arguments are omitted.
596+
*
597+
*
598+
* @param[in] fileIn FILE handle to file containing blocks to read
599+
* @param[in] dbp (optional) Disk block position (only for reindex)
600+
* @param[in,out] blocks_with_unknown_parent (optional) Map of disk positions for blocks with
601+
* unknown parent, key is parent block hash
602+
* (only used for reindex)
603+
* */
604+
void LoadExternalBlockFile(
605+
FILE* fileIn,
606+
FlatFilePos* dbp = nullptr,
607+
std::multimap<uint256, FlatFilePos>* blocks_with_unknown_parent = nullptr)
580608
EXCLUSIVE_LOCKS_REQUIRED(!m_chainstate_mutex);
581609

582610
/**

0 commit comments

Comments
 (0)