Commit 6912a28

Merge bitcoin/bitcoin#25667: assumeutxo: snapshot initialization
bf95976 doc: add note about snapshot chainstate init (James O'Beirne)
e4d7995 test: add testcases for snapshot initialization (James O'Beirne)
cced4e7 test: move-only-ish: factor out LoadVerifyActivateChainstate() (James O'Beirne)
51fc924 test: allow on-disk coins and block tree dbs in tests (James O'Beirne)
3c36139 test: add reset_chainstate parameter for snapshot unittests (James O'Beirne)
00b357c validation: add ResetChainstates() (James O'Beirne)
3a29dfb move-only: test: make snapshot chainstate setup reusable (James O'Beirne)
8153bd9 blockmanager: avoid undefined behavior during FlushBlockFile (James O'Beirne)
ad67ff3 validation: remove snapshot datadirs upon validation failure (James O'Beirne)
34d1590 add utilities for deleting on-disk leveldb data (James O'Beirne)
252abd1 init: add utxo snapshot detection (James O'Beirne)
f9f1735 validation: rename snapshot chainstate dir (James O'Beirne)
d14bebf db: add StoragePath to CDBWrapper/CCoinsViewDB (James O'Beirne)

Pull request description:

  This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: bitcoin/bitcoin#15606)

  ---

  Half of the replacement for #24232. The original PR grew larger than expected throughout the review process.

  This change adds the ability to initialize a snapshot-based chainstate during init if one is detected on disk. This is of course unused as of now (aside from in unittests) given that we haven't yet enabled actually loading snapshots.

  Don't be scared! There are some big move-only commits in here.

  Accompanying changes include:

  - moving the snapshot coinsdb directory from being called `chainstate_[base blockhash]` to `chainstate_snapshot`, since we only support one snapshot in use at a time. This simplifies some logic, but it necessitates writing that base blockhash out to a file within the coinsdb dir. See [discussion here](bitcoin/bitcoin#24232 (comment)).
  - adding a simple fix in `FlushBlockFile()` that avoids a crash when attempting to flush to disk before `LoadBlockIndexDB()` is called, which happens when calling `MaybeRebalanceCaches()` during multiple chainstate init.
  - improving the unittest to allow testing with on-disk chainstates, which is necessary to test a simulated restart and re-initialization.

ACKs for top commit:
  naumenkogs:
    utACK bf95976
  ariard:
    Code Review ACK bf95976
  ryanofsky:
    Code review ACK bf95976. Changes since last review: rebasing, switching from CAutoFile to AutoFile, adding comments, switching from BOOST_CHECK to Assert in test util, using chainman.GetMutex() in tests, destroying one ChainstateManager before creating a new one in tests
  fjahr:
    utACK bf95976
  aureleoules:
    ACK bf95976

Tree-SHA512: 15ae75caf19f8d12a12d2647c52897904d27b265a7af6b4ae7b858592eeadb8f9da6c2394b6baebec90adc28742c053e3eb506119577dae7c1e722ebb3b7bcc0
2 parents 147d64d + bf95976 commit 6912a28

19 files changed: +662 -186 lines changed

doc/design/assumeutxo.md

Lines changed: 3 additions & 2 deletions
@@ -76,8 +76,9 @@ original chainstate remains in use as active.
 
 Once the snapshot chainstate is loaded and validated, it is promoted to active
 chainstate and a sync to tip begins. A new chainstate directory is created in the
-datadir for the snapshot chainstate called
-`chainstate_[SHA256 blockhash of snapshot base block]`.
+datadir for the snapshot chainstate called `chainstate_snapshot`. When this directory
+is present in the datadir, the snapshot chainstate will be detected and loaded as
+active on node startup (via `DetectSnapshotChainstate()`).
 
 | | |
 | ---------- | ----------- |

src/Makefile.am

Lines changed: 2 additions & 0 deletions
@@ -396,6 +396,7 @@ libbitcoin_node_a_SOURCES = \
   node/minisketchwrapper.cpp \
   node/psbt.cpp \
   node/transaction.cpp \
+  node/utxo_snapshot.cpp \
   node/validation_cache_args.cpp \
   noui.cpp \
   policy/fees.cpp \
@@ -902,6 +903,7 @@ libbitcoinkernel_la_SOURCES = \
   node/blockstorage.cpp \
   node/chainstate.cpp \
   node/interface_ui.cpp \
+  node/utxo_snapshot.cpp \
   policy/feerate.cpp \
   policy/fees.cpp \
   policy/packages.cpp \

src/dbwrapper.cpp

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ static leveldb::Options GetOptions(size_t nCacheSize)
 }
 
 CDBWrapper::CDBWrapper(const fs::path& path, size_t nCacheSize, bool fMemory, bool fWipe, bool obfuscate)
-    : m_name{fs::PathToString(path.stem())}
+    : m_name{fs::PathToString(path.stem())}, m_path{path}, m_is_memory{fMemory}
 {
     penv = nullptr;
     readoptions.verify_checksums = true;

src/dbwrapper.h

Lines changed: 18 additions & 0 deletions
@@ -39,6 +39,10 @@ class dbwrapper_error : public std::runtime_error
 
 class CDBWrapper;
 
+namespace dbwrapper {
+    using leveldb::DestroyDB;
+}
+
 /** These should be considered an implementation detail of the specific database.
  */
 namespace dbwrapper_private {
@@ -219,6 +223,12 @@ class CDBWrapper
 
     std::vector<unsigned char> CreateObfuscateKey() const;
 
+    //! path to filesystem storage
+    const fs::path m_path;
+
+    //! whether or not the database resides in memory
+    bool m_is_memory;
+
 public:
     /**
      * @param[in] path Location in the filesystem where leveldb data will be stored.
@@ -268,6 +278,14 @@ class CDBWrapper
         return WriteBatch(batch, fSync);
     }
 
+    //! @returns filesystem path to the on-disk data.
+    std::optional<fs::path> StoragePath() {
+        if (m_is_memory) {
+            return {};
+        }
+        return m_path;
+    }
+
     template <typename K>
     bool Exists(const K& key) const
     {
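
Reviewer note: the `StoragePath()` accessor above returns an empty optional for in-memory databases, so callers have to check before touching the filesystem. A minimal standalone sketch of that pattern follows. `KeyValueStore` and `RequireStoragePath` are hypothetical names used only for illustration, `std::filesystem` stands in for the project's `fs` wrapper, and unlike the diff the accessor here is marked `const`.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <utility>

// Hypothetical stand-in for a database wrapper; not the real CDBWrapper.
class KeyValueStore
{
    const std::filesystem::path m_path; // where on-disk data would live
    const bool m_is_memory;             // true if nothing is written to disk

public:
    KeyValueStore(std::filesystem::path path, bool in_memory)
        : m_path{std::move(path)}, m_is_memory{in_memory} {}

    // Empty optional means "there is no on-disk location to report".
    std::optional<std::filesystem::path> StoragePath() const
    {
        if (m_is_memory) return std::nullopt;
        return m_path;
    }
};

// Caller that needs an on-disk store, mirroring the `assert(chaindir)`
// sanity check used when writing the base blockhash file.
std::filesystem::path RequireStoragePath(const KeyValueStore& store)
{
    const std::optional<std::filesystem::path> path = store.StoragePath();
    assert(path); // an in-memory store has no directory to write into
    return *path;
}
```

Returning an optional rather than an empty path makes the "no storage" case impossible to confuse with a path relative to the current directory.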

src/node/blockstorage.cpp

Lines changed: 10 additions & 0 deletions
@@ -524,6 +524,16 @@ void BlockManager::FlushUndoFile(int block_file, bool finalize)
 void BlockManager::FlushBlockFile(bool fFinalize, bool finalize_undo)
 {
     LOCK(cs_LastBlockFile);
+
+    if (m_blockfile_info.size() < 1) {
+        // Return if we haven't loaded any blockfiles yet. This happens during
+        // chainstate init, when we call ChainstateManager::MaybeRebalanceCaches() (which
+        // then calls FlushStateToDisk()), resulting in a call to this function before we
+        // have populated `m_blockfile_info` via LoadBlockIndexDB().
+        return;
+    }
+    assert(static_cast<int>(m_blockfile_info.size()) > m_last_blockfile);
+
     FlatFilePos block_pos_old(m_last_blockfile, m_blockfile_info[m_last_blockfile].nSize);
     if (!BlockFileSeq().Flush(block_pos_old, fFinalize)) {
         AbortNode("Flushing block file to disk failed. This is likely the result of an I/O error.");
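
Reviewer note: the guard in this hunk prevents indexing into an empty vector, which would be undefined behavior. The same idea in isolation might look like the sketch below. `BlockFileInfo` and `LastBlockFileSize` are hypothetical names, and the real function flushes files rather than returning a size.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified record; the real block file info has more fields.
struct BlockFileInfo {
    unsigned int size_bytes{0};
};

// Returns the size of the last block file, or nullopt when no block file
// metadata has been loaded yet (the early-return case in the diff).
std::optional<unsigned int> LastBlockFileSize(const std::vector<BlockFileInfo>& infos,
                                              int last_blockfile)
{
    if (infos.empty()) {
        return std::nullopt; // nothing loaded: indexing would be out of bounds
    }
    // Once entries exist, the index must refer to one of them.
    assert(last_blockfile >= 0 &&
           static_cast<std::size_t>(last_blockfile) < infos.size());
    return infos[static_cast<std::size_t>(last_blockfile)].size_bytes;
}
```

The early return covers the expected "not initialized yet" state, while the assertion documents an invariant that should never fail after loading.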

src/node/chainstate.cpp

Lines changed: 21 additions & 3 deletions
@@ -48,10 +48,15 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
     }
 
     LOCK(cs_main);
-    chainman.InitializeChainstate(options.mempool);
     chainman.m_total_coinstip_cache = cache_sizes.coins;
     chainman.m_total_coinsdb_cache = cache_sizes.coins_db;
 
+    // Load the fully validated chainstate.
+    chainman.InitializeChainstate(options.mempool);
+
+    // Load a chain created from a UTXO snapshot, if any exist.
+    chainman.DetectSnapshotChainstate(options.mempool);
+
     auto& pblocktree{chainman.m_blockman.m_block_tree_db};
     // new CBlockTreeDB tries to delete the existing file, which
     // fails if it's still open from the previous loop. Close it first:
@@ -98,12 +103,20 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
         return {ChainstateLoadStatus::FAILURE, _("Error initializing block database")};
     }
 
+    // Conservative value which is arbitrarily chosen, as it will ultimately be changed
+    // by a call to `chainman.MaybeRebalanceCaches()`. We just need to make sure
+    // that the sum of the two caches (40%) does not exceed the allowable amount
+    // during this temporary initialization state.
+    double init_cache_fraction = 0.2;
+
     // At this point we're either in reindex or we've loaded a useful
     // block tree into BlockIndex()!
 
     for (Chainstate* chainstate : chainman.GetAll()) {
+        LogPrintf("Initializing chainstate %s\n", chainstate->ToString());
+
         chainstate->InitCoinsDB(
-            /*cache_size_bytes=*/cache_sizes.coins_db,
+            /*cache_size_bytes=*/chainman.m_total_coinsdb_cache * init_cache_fraction,
             /*in_memory=*/options.coins_db_in_memory,
             /*should_wipe=*/options.reindex || options.reindex_chainstate);
 
@@ -125,7 +138,7 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
         }
 
         // The on-disk coinsdb is now in a good state, create the cache
-        chainstate->InitCoinsCache(cache_sizes.coins);
+        chainstate->InitCoinsCache(chainman.m_total_coinstip_cache * init_cache_fraction);
         assert(chainstate->CanFlushToDisk());
 
         if (!is_coinsview_empty(chainstate)) {
@@ -146,6 +159,11 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
         };
     }
 
+    // Now that chainstates are loaded and we're able to flush to
+    // disk, rebalance the coins caches to desired levels based
+    // on the condition of each chainstate.
+    chainman.MaybeRebalanceCaches();
+
     return {ChainstateLoadStatus::SUCCESS, {}};
 }
 
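
Reviewer note: the comment on `init_cache_fraction` implies a simple budget: with at most two chainstates each given 20% of the total, the temporary allocation is 40% and cannot exceed the configured limit. A rough sketch of that arithmetic, using hypothetical helper names and ignoring the later rebalancing step entirely:

```cpp
#include <cassert>
#include <cstddef>

// Same conservative fraction as in the diff; the helpers are illustrative only.
constexpr double kInitCacheFraction = 0.2;

// Bytes handed to a single chainstate during the temporary init state.
std::size_t InitialCacheBytes(std::size_t total_bytes)
{
    return static_cast<std::size_t>(static_cast<double>(total_bytes) * kInitCacheFraction);
}

// Combined temporary allocation across all chainstates being initialized.
std::size_t TotalInitialCacheBytes(std::size_t total_bytes, std::size_t num_chainstates)
{
    const std::size_t sum = InitialCacheBytes(total_bytes) * num_chainstates;
    assert(sum <= total_bytes); // the temporary state must stay within budget
    return sum;
}
```

For a 450 MiB budget and two chainstates this gives roughly 90 MiB each and 180 MiB combined, leaving the rest unallocated until the caches are rebalanced.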

src/node/utxo_snapshot.cpp

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+// Copyright (c) 2022 The Bitcoin Core developers
+// Distributed under the MIT software license, see the accompanying
+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
+
+#include <node/utxo_snapshot.h>
+
+#include <fs.h>
+#include <logging.h>
+#include <streams.h>
+#include <uint256.h>
+#include <util/system.h>
+#include <validation.h>
+
+#include <cstdio>
+#include <optional>
+
+namespace node {
+
+bool WriteSnapshotBaseBlockhash(Chainstate& snapshot_chainstate)
+{
+    AssertLockHeld(::cs_main);
+    assert(snapshot_chainstate.m_from_snapshot_blockhash);
+
+    const std::optional<fs::path> chaindir = snapshot_chainstate.CoinsDB().StoragePath();
+    assert(chaindir); // Sanity check that chainstate isn't in-memory.
+    const fs::path write_to = *chaindir / node::SNAPSHOT_BLOCKHASH_FILENAME;
+
+    FILE* file{fsbridge::fopen(write_to, "wb")};
+    AutoFile afile{file};
+    if (afile.IsNull()) {
+        LogPrintf("[snapshot] failed to open base blockhash file for writing: %s\n",
+                  fs::PathToString(write_to));
+        return false;
+    }
+    afile << *snapshot_chainstate.m_from_snapshot_blockhash;
+
+    if (afile.fclose() != 0) {
+        LogPrintf("[snapshot] failed to close base blockhash file %s after writing\n",
+                  fs::PathToString(write_to));
+        return false;
+    }
+    return true;
+}
+
+std::optional<uint256> ReadSnapshotBaseBlockhash(fs::path chaindir)
+{
+    if (!fs::exists(chaindir)) {
+        LogPrintf("[snapshot] cannot read base blockhash: no chainstate dir " /* Continued */
+                  "exists at path %s\n", fs::PathToString(chaindir));
+        return std::nullopt;
+    }
+    const fs::path read_from = chaindir / node::SNAPSHOT_BLOCKHASH_FILENAME;
+    const std::string read_from_str = fs::PathToString(read_from);
+
+    if (!fs::exists(read_from)) {
+        LogPrintf("[snapshot] snapshot chainstate dir is malformed! no base blockhash file " /* Continued */
+                  "exists at path %s. Try deleting %s and calling loadtxoutset again?\n",
+                  fs::PathToString(chaindir), read_from_str);
+        return std::nullopt;
+    }
+
+    uint256 base_blockhash;
+    FILE* file{fsbridge::fopen(read_from, "rb")};
+    AutoFile afile{file};
+    if (afile.IsNull()) {
+        LogPrintf("[snapshot] failed to open base blockhash file for reading: %s\n",
+                  read_from_str);
+        return std::nullopt;
+    }
+    afile >> base_blockhash;
+
+    if (std::fgetc(afile.Get()) != EOF) {
+        LogPrintf("[snapshot] warning: unexpected trailing data in %s\n", read_from_str);
+    } else if (std::ferror(afile.Get())) {
+        LogPrintf("[snapshot] warning: i/o error reading %s\n", read_from_str);
+    }
+    return base_blockhash;
+}
+
+std::optional<fs::path> FindSnapshotChainstateDir()
+{
+    fs::path possible_dir =
+        gArgs.GetDataDirNet() / fs::u8path(strprintf("chainstate%s", SNAPSHOT_CHAINSTATE_SUFFIX));
+
+    if (fs::exists(possible_dir)) {
+        return possible_dir;
+    }
+    return std::nullopt;
+}
+
+} // namespace node
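
Reviewer note: taken together, the three functions in this file implement a small protocol: look for a fixed directory name in the datadir, and persist the 32-byte base blockhash in a file inside it so the snapshot chainstate can be reconstructed on restart. The sketch below reproduces that flow with the standard library only. It is not the Bitcoin Core code: `Hash256`, `FindSnapshotDir`, `WriteBaseBlockhash`, and `ReadBaseBlockhash` are hypothetical names, plain stream I/O replaces `AutoFile` serialization, there is no locking or logging, and a short read is treated as failure here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical stand-in for uint256: 32 raw bytes.
using Hash256 = std::array<std::uint8_t, 32>;

// File and directory names taken from the diff.
const fs::path kBlockhashFilename{"base_blockhash"};
const fs::path kSnapshotDirName{"chainstate_snapshot"};

// Return the snapshot chainstate directory if it exists under `datadir`.
std::optional<fs::path> FindSnapshotDir(const fs::path& datadir)
{
    fs::path candidate = datadir / kSnapshotDirName;
    if (fs::is_directory(candidate)) return candidate;
    return std::nullopt;
}

// Persist the base blockhash inside the chainstate directory.
bool WriteBaseBlockhash(const fs::path& chaindir, const Hash256& hash)
{
    assert(fs::is_directory(chaindir)); // an on-disk chainstate dir is required
    std::ofstream out{chaindir / kBlockhashFilename, std::ios::binary | std::ios::trunc};
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(hash.data()),
              static_cast<std::streamsize>(hash.size()));
    out.close(); // sets failbit if the final flush or close fails
    return !out.fail();
}

// Read the base blockhash back; nullopt if the file is missing or too short.
std::optional<Hash256> ReadBaseBlockhash(const fs::path& chaindir)
{
    std::ifstream in{chaindir / kBlockhashFilename, std::ios::binary};
    if (!in) return std::nullopt;
    Hash256 hash{};
    in.read(reinterpret_cast<char*>(hash.data()),
            static_cast<std::streamsize>(hash.size()));
    if (in.gcount() != static_cast<std::streamsize>(hash.size())) return std::nullopt;
    return hash;
}
```

Storing the hash in a file is the trade-off described in the PR text: the directory name no longer encodes the base block, so it has to be recorded somewhere the next startup can find it.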

src/node/utxo_snapshot.h

Lines changed: 33 additions & 0 deletions
@@ -6,8 +6,14 @@
 #ifndef BITCOIN_NODE_UTXO_SNAPSHOT_H
 #define BITCOIN_NODE_UTXO_SNAPSHOT_H
 
+#include <fs.h>
 #include <uint256.h>
 #include <serialize.h>
+#include <validation.h>
+
+#include <optional>
+
+extern RecursiveMutex cs_main;
 
 namespace node {
 //! Metadata describing a serialized version of a UTXO set from which an
@@ -33,6 +39,33 @@ class SnapshotMetadata
 
     SERIALIZE_METHODS(SnapshotMetadata, obj) { READWRITE(obj.m_base_blockhash, obj.m_coins_count); }
 };
+
+//! The file in the snapshot chainstate dir which stores the base blockhash. This is
+//! needed to reconstruct snapshot chainstates on init.
+//!
+//! Because we only allow loading a single snapshot at a time, there will only be one
+//! chainstate directory with this filename present within it.
+const fs::path SNAPSHOT_BLOCKHASH_FILENAME{"base_blockhash"};
+
+//! Write out the blockhash of the snapshot base block that was used to construct
+//! this chainstate. This value is read in during subsequent initializations and
+//! used to reconstruct snapshot-based chainstates.
+bool WriteSnapshotBaseBlockhash(Chainstate& snapshot_chainstate)
+    EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
+
+//! Read the blockhash of the snapshot base block that was used to construct the
+//! chainstate.
+std::optional<uint256> ReadSnapshotBaseBlockhash(fs::path chaindir)
+    EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
+
+//! Suffix appended to the chainstate (leveldb) dir when created based upon
+//! a snapshot.
+constexpr std::string_view SNAPSHOT_CHAINSTATE_SUFFIX = "_snapshot";
+
+
+//! Return a path to the snapshot-based chainstate dir, if one exists.
+std::optional<fs::path> FindSnapshotChainstateDir();
+
 } // namespace node
 
 #endif // BITCOIN_NODE_UTXO_SNAPSHOT_H

src/streams.h

Lines changed: 4 additions & 2 deletions
@@ -487,12 +487,14 @@ class AutoFile
     AutoFile(const AutoFile&) = delete;
     AutoFile& operator=(const AutoFile&) = delete;
 
-    void fclose()
+    int fclose()
     {
+        int retval{0};
         if (file) {
-            ::fclose(file);
+            retval = ::fclose(file);
             file = nullptr;
         }
+        return retval;
     }
 
     /** Get wrapped FILE* with transfer of ownership.
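
Reviewer note: propagating the result of `::fclose()` matters for writers, because buffered data may only hit the disk (and fail) at close time. A minimal sketch of an RAII file handle with that behavior, assuming nothing about the real `AutoFile` beyond what this hunk shows; `ScopedFile` and `WriteAll` are hypothetical names:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical RAII wrapper around FILE*; not the real AutoFile.
class ScopedFile
{
    std::FILE* m_file;

public:
    explicit ScopedFile(std::FILE* file) : m_file{file} {}
    ~ScopedFile() { close(); } // errors are necessarily ignored here

    ScopedFile(const ScopedFile&) = delete;
    ScopedFile& operator=(const ScopedFile&) = delete;

    bool IsNull() const { return m_file == nullptr; }
    std::FILE* Get() const { return m_file; }

    // Returns 0 on success or if already closed, otherwise fclose's error value.
    int close()
    {
        int retval{0};
        if (m_file) {
            retval = std::fclose(m_file);
            m_file = nullptr;
        }
        return retval;
    }
};

// Write a buffer and report success only if the close also succeeded.
bool WriteAll(const char* path, const char* data, std::size_t len)
{
    ScopedFile file{std::fopen(path, "wb")};
    if (file.IsNull()) return false;
    if (std::fwrite(data, 1, len, file.Get()) != len) return false;
    return file.close() == 0; // explicit close so flush errors are visible
}
```

Callers that care about durability close explicitly and check the result, and the destructor remains as a safety net for early returns.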

src/test/util/chainstate.h

Lines changed: 47 additions & 2 deletions
@@ -11,6 +11,7 @@
 #include <node/context.h>
 #include <node/utxo_snapshot.h>
 #include <rpc/blockchain.h>
+#include <test/util/setup_common.h>
 #include <validation.h>
 
 #include <univalue.h>
@@ -20,11 +21,24 @@ const auto NoMalleation = [](AutoFile& file, node::SnapshotMetadata& meta){};
 /**
  * Create and activate a UTXO snapshot, optionally providing a function to
  * malleate the snapshot.
+ *
+ * If `reset_chainstate` is true, reset the original chainstate back to the genesis
+ * block. This allows us to simulate more realistic conditions in which a snapshot is
+ * loaded into an otherwise mostly-uninitialized datadir. It also allows us to test
+ * conditions that would otherwise cause shutdowns based on the IBD chainstate going
+ * past the snapshot it generated.
  */
 template<typename F = decltype(NoMalleation)>
 static bool
-CreateAndActivateUTXOSnapshot(node::NodeContext& node, const fs::path root, F malleation = NoMalleation)
+CreateAndActivateUTXOSnapshot(
+    TestingSetup* fixture,
+    F malleation = NoMalleation,
+    bool reset_chainstate = false,
+    bool in_memory_chainstate = false)
 {
+    node::NodeContext& node = fixture->m_node;
+    fs::path root = fixture->m_path_root;
+
     // Write out a snapshot to the test's tempdir.
     //
     int height;
@@ -47,7 +61,38 @@ CreateAndActivateUTXOSnapshot(node::NodeContext& node, const fs::path root, F ma
 
     malleation(auto_infile, metadata);
 
-    return node.chainman->ActivateSnapshot(auto_infile, metadata, /*in_memory=*/true);
+    if (reset_chainstate) {
+        {
+            // What follows is code to selectively reset chainstate data without
+            // disturbing the existing BlockManager instance, which is needed to
+            // recognize the headers chain previously generated by the chainstate we're
+            // removing. Without those headers, we can't activate the snapshot below.
+            //
+            // This is a stripped-down version of node::LoadChainstate which
+            // preserves the block index.
+            LOCK(::cs_main);
+            uint256 gen_hash = node.chainman->ActiveChainstate().m_chain[0]->GetBlockHash();
+            node.chainman->ResetChainstates();
+            node.chainman->InitializeChainstate(node.mempool.get());
+            Chainstate& chain = node.chainman->ActiveChainstate();
+            Assert(chain.LoadGenesisBlock());
+            // These cache values will be corrected shortly in `MaybeRebalanceCaches`.
+            chain.InitCoinsDB(1 << 20, true, false, "");
+            chain.InitCoinsCache(1 << 20);
+            chain.CoinsTip().SetBestBlock(gen_hash);
+            chain.setBlockIndexCandidates.insert(node.chainman->m_blockman.LookupBlockIndex(gen_hash));
+            chain.LoadChainTip();
+            node.chainman->MaybeRebalanceCaches();
+        }
+        BlockValidationState state;
+        if (!node.chainman->ActiveChainstate().ActivateBestChain(state)) {
+            throw std::runtime_error(strprintf("ActivateBestChain failed. (%s)", state.ToString()));
+        }
+        Assert(
+            0 == WITH_LOCK(node.chainman->GetMutex(), return node.chainman->ActiveHeight()));
+    }
+
+    return node.chainman->ActivateSnapshot(auto_infile, metadata, in_memory_chainstate);
 }
 
 
