Commit a07e8ca

Merge #13033: Build txindex in parallel with validation
9b27047 [doc] Include txindex changes in the release notes. (Jim Posen)
ed77dd6 [test] Simple unit test for TxIndex. (Jim Posen)
6d772a3 [rpc] Public interfaces to GetTransaction block until synced. (Jim Posen)
a03f804 [index] Move disk IO logic from GetTransaction to TxIndex::FindTx. (Jim Posen)
e0a3b80 [validation] Replace tx index code in validation code with TxIndex. (Jim Posen)
8181db8 [init] Initialize and start TxIndex in init code. (Jim Posen)
f90c3a6 [index] TxIndex method to wait until caught up. (Jim Posen)
70d510d [index] Allow TxIndex sync thread to be interrupted. (Jim Posen)
94b4f8b [index] TxIndex initial sync thread. (Jim Posen)
34d68bf [index] Create new TxIndex class. (Jim Posen)
c88bcec [db] Migration for txindex data to new, separate database. (Jim Posen)
0cb8303 [db] Create separate database for txindex. (Jim Posen)

Pull request description:

I'm re-opening #11857 as a new pull request because the last one stopped loading for people

-------------------------------

This refactors the tx index code to be in its own class and get built concurrently with validation code. The main benefit is decoupling and moving the txindex into a separate DB. The primary motivation is to lay the groundwork for other indexers that might be desired (such as the [compact filters](bitcoin/bips#636)). The basic idea is that the TxIndex spins up its own thread, which first syncs the txindex to the current block index; once it is in sync, the BlockConnected ValidationInterface hook writes new blocks.

### DB changes

At the suggestion of some other developers, the txindex has been split out into a separate database. A data migration runs at startup on any nodes with a legacy txindex. Currently the migration blocks node initialization until complete.

### Open questions

- Should the migration of txindex data from the old DB to the new DB block in init or should it happen in a background thread? The downside to backgrounding it is that `getrawtransaction` would return an error message saying the txindex is syncing while the migration is running.

### Impact

In a sample size n=1 test where I synced nodes from scratch, the average [Index writing](https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L1903) time was 3.36ms in master and 1.72ms in this branch. The average time between `UpdateTip` log lines for sequential blocks between 400,000 and IBD end on mainnet was 0.297204s in master and 0.286134s in this branch. Most likely this is just variance in IBD times, but I can try with some more trials if people want.

Tree-SHA512: 451fd7d95df89dfafceaa723cdf0f7b137615b531cf5c5035cfb54e9ccc2026cec5ac85edbcf71b7f4e2f102e36e9202b8b3a667e1504a9e1a9976ab1f0079c4
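
The handoff the description outlines (a dedicated thread catches up first, then a block-connected callback takes over) can be sketched in isolation. This is a toy model and not the TxIndex code: `ToyIndex`, `ConnectBlock`, and the single mutex standing in for `cs_main` are all assumptions made for illustration, and blocks are plain integers.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical toy index: a background thread indexes the existing chain,
// then flips a flag so that later blocks are indexed from the callback path.
class ToyIndex
{
public:
    ~ToyIndex() { Stop(); }

    // Spawn the catch-up thread.
    void Start() { m_thread = std::thread([this] { ThreadSync(); }); }

    // Wait for the catch-up thread to finish (safe to call more than once).
    void Stop()
    {
        if (m_thread.joinable()) m_thread.join();
    }

    // Stand-in for a "block connected" notification from validation.
    void ConnectBlock(int block)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_chain.push_back(block);
        // Until catch-up completes, the sync thread will reach this block by
        // walking the chain, so the callback only indexes once synced.
        if (m_synced) {
            m_indexed.insert(block);
            ++m_next;
        }
    }

    bool IsSynced() const { return m_synced; }

    std::size_t IndexedCount() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_indexed.size();
    }

private:
    void ThreadSync()
    {
        while (true) {
            std::lock_guard<std::mutex> lock(m_mutex);
            if (m_next == m_chain.size()) {
                // No next block: caught up. The flag is set under the same
                // lock that guards the chain, so no block falls in between.
                m_synced = true;
                return;
            }
            m_indexed.insert(m_chain[m_next++]);
        }
    }

    mutable std::mutex m_mutex;
    std::vector<int> m_chain;   // blocks connected so far
    std::set<int> m_indexed;    // blocks the index has processed
    std::size_t m_next = 0;     // position of the next block to index
    std::atomic<bool> m_synced{false};
    std::thread m_thread;
};
```

In this sketch a block connected before the flag flips is always picked up by the sync thread, and one connected afterwards is always indexed by the callback, because both paths take the same lock. The real code also has to deal with queued notifications arriving out of step with the sync thread, which the toy model leaves out.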
2 parents 24106a8 + 9b27047 commit a07e8ca

18 files changed: 757 additions, 86 deletions

doc/files.md

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@
 * db.log: wallet database log file; moved to wallets/ directory on new installs since 0.16.0
 * debug.log: contains debug information and general logging generated by bitcoind or bitcoin-qt
 * fee_estimates.dat: stores statistics used to estimate minimum transaction fees and priorities required for confirmation; since 0.10.0
+* indexes/txindex/*: optional transaction index database (LevelDB); since 0.17.0
 * mempool.dat: dump of the mempool's transactions; since 0.14.0.
 * peers.dat: peer IP address database (custom format); since 0.7.0
 * wallet.dat: personal wallet (BDB) with keys and transactions; moved to wallets/ directory on new installs since 0.16.0

doc/release-notes-pr13033.md

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+Transaction index changes
+-------------------------
+
+The transaction index is now built separately from the main node procedure,
+meaning the `-txindex` flag can be toggled without a full reindex. If bitcoind
+is run with `-txindex` on a node that is already partially or fully synced
+without one, the transaction index will be built in the background and become
+available once caught up. When switching from running `-txindex` to running
+without the flag, the transaction index database will *not* be deleted
+automatically, meaning it could be turned back on at a later time without a full
+resync.
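
Turning the index back on without a full resync relies on the index remembering how far it got and finding that point again on the current chain. A minimal sketch of that lookup follows, assuming block ids are plain strings and the stored locator lists ids from newest to oldest; `ResumeHeight` is a hypothetical helper, and falling back to height 0 when nothing matches is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the height of the first locator entry found on the active chain.
// The locator is ordered newest to oldest, so the first hit is the most
// recent block that both the stored index state and the chain agree on.
std::size_t ResumeHeight(const std::vector<std::string>& active_chain,
                         const std::vector<std::string>& locator)
{
    for (const std::string& id : locator) {
        auto it = std::find(active_chain.begin(), active_chain.end(), id);
        if (it != active_chain.end()) {
            return static_cast<std::size_t>(it - active_chain.begin());
        }
    }
    // Assumed fallback: nothing in common, so start from the first block.
    return 0;
}
```

For example, with an active chain `g, a, b, c, d` and a stored locator `x, b, a, g` (where `x` sits on a branch that is no longer active), the sketch resumes from height 2, so only `c` and `d` remain to be indexed.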

src/Makefile.am

Lines changed: 2 additions & 0 deletions
@@ -103,6 +103,7 @@ BITCOIN_CORE_H = \
   fs.h \
   httprpc.h \
   httpserver.h \
+  index/txindex.h \
   indirectmap.h \
   init.h \
   interfaces/handler.h \
@@ -204,6 +205,7 @@ libbitcoin_server_a_SOURCES = \
   consensus/tx_verify.cpp \
   httprpc.cpp \
   httpserver.cpp \
+  index/txindex.cpp \
   init.cpp \
   dbwrapper.cpp \
   merkleblock.cpp \

src/Makefile.test.include

Lines changed: 1 addition & 0 deletions
@@ -83,6 +83,7 @@ BITCOIN_TESTS =\
   test/timedata_tests.cpp \
   test/torcontrol_tests.cpp \
   test/transaction_tests.cpp \
+  test/txindex_tests.cpp \
   test/txvalidation_tests.cpp \
   test/txvalidationcache_tests.cpp \
   test/versionbits_tests.cpp \

src/dbwrapper.h

Lines changed: 3 additions & 0 deletions
@@ -224,6 +224,9 @@ class CDBWrapper
     CDBWrapper(const fs::path& path, size_t nCacheSize, bool fMemory = false, bool fWipe = false, bool obfuscate = false);
     ~CDBWrapper();
 
+    CDBWrapper(const CDBWrapper&) = delete;
+    CDBWrapper& operator=(const CDBWrapper&) = delete;
+
     template <typename K, typename V>
     bool Read(const K& key, V& value) const
     {
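
The two deleted members make the wrapper non-copyable. The effect can be checked at compile time with a stand-in class; `ToyDBWrapper` and `MakeToyDB` below are hypothetical names, and the motivation suggested in the comments (avoiding two objects that both believe they own one database handle) is an assumption, since the commit does not state it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a class that owns a database handle.
class ToyDBWrapper
{
public:
    explicit ToyDBWrapper(std::string path) : m_path(std::move(path)) {}

    // Same pattern as in the diff: copying is a compile-time error, so two
    // objects can never end up sharing ownership of one handle by accident.
    ToyDBWrapper(const ToyDBWrapper&) = delete;
    ToyDBWrapper& operator=(const ToyDBWrapper&) = delete;

    const std::string& Path() const { return m_path; }

private:
    std::string m_path;
};

static_assert(!std::is_copy_constructible<ToyDBWrapper>::value,
              "copy construction is disabled");
static_assert(!std::is_copy_assignable<ToyDBWrapper>::value,
              "copy assignment is disabled");

// Ownership is handed around through a smart pointer instead of by copy.
std::unique_ptr<ToyDBWrapper> MakeToyDB(const std::string& path)
{
    return std::unique_ptr<ToyDBWrapper>(new ToyDBWrapper(path));
}
```

This matches how the new index holds its database in the diff, through a `std::unique_ptr` passed to the `TxIndex` constructor.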

src/index/txindex.cpp

Lines changed: 309 additions & 0 deletions
@@ -0,0 +1,309 @@
+// Copyright (c) 2017-2018 The Bitcoin Core developers
+// Distributed under the MIT software license, see the accompanying
+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
+
+#include <chainparams.h>
+#include <index/txindex.h>
+#include <init.h>
+#include <tinyformat.h>
+#include <ui_interface.h>
+#include <util.h>
+#include <validation.h>
+#include <warnings.h>
+
+constexpr int64_t SYNC_LOG_INTERVAL = 30; // seconds
+constexpr int64_t SYNC_LOCATOR_WRITE_INTERVAL = 30; // seconds
+
+std::unique_ptr<TxIndex> g_txindex;
+
+template<typename... Args>
+static void FatalError(const char* fmt, const Args&... args)
+{
+    std::string strMessage = tfm::format(fmt, args...);
+    SetMiscWarning(strMessage);
+    LogPrintf("*** %s\n", strMessage);
+    uiInterface.ThreadSafeMessageBox(
+        "Error: A fatal internal error occurred, see debug.log for details",
+        "", CClientUIInterface::MSG_ERROR);
+    StartShutdown();
+}
+
+TxIndex::TxIndex(std::unique_ptr<TxIndexDB> db) :
+    m_db(std::move(db)), m_synced(false), m_best_block_index(nullptr)
+{}
+
+TxIndex::~TxIndex()
+{
+    Interrupt();
+    Stop();
+}
+
+bool TxIndex::Init()
+{
+    LOCK(cs_main);
+
+    // Attempt to migrate txindex from the old database to the new one. Even if
+    // chain_tip is null, the node could be reindexing and we still want to
+    // delete txindex records in the old database.
+    if (!m_db->MigrateData(*pblocktree, chainActive.GetLocator())) {
+        return false;
+    }
+
+    CBlockLocator locator;
+    if (!m_db->ReadBestBlock(locator)) {
+        locator.SetNull();
+    }
+
+    m_best_block_index = FindForkInGlobalIndex(chainActive, locator);
+    m_synced = m_best_block_index.load() == chainActive.Tip();
+    return true;
+}
+
+static const CBlockIndex* NextSyncBlock(const CBlockIndex* pindex_prev)
+{
+    AssertLockHeld(cs_main);
+
+    if (!pindex_prev) {
+        return chainActive.Genesis();
+    }
+
+    const CBlockIndex* pindex = chainActive.Next(pindex_prev);
+    if (pindex) {
+        return pindex;
+    }
+
+    return chainActive.Next(chainActive.FindFork(pindex_prev));
+}
+
+void TxIndex::ThreadSync()
+{
+    const CBlockIndex* pindex = m_best_block_index.load();
+    if (!m_synced) {
+        auto& consensus_params = Params().GetConsensus();
+
+        int64_t last_log_time = 0;
+        int64_t last_locator_write_time = 0;
+        while (true) {
+            if (m_interrupt) {
+                WriteBestBlock(pindex);
+                return;
+            }
+
+            {
+                LOCK(cs_main);
+                const CBlockIndex* pindex_next = NextSyncBlock(pindex);
+                if (!pindex_next) {
+                    WriteBestBlock(pindex);
+                    m_best_block_index = pindex;
+                    m_synced = true;
+                    break;
+                }
+                pindex = pindex_next;
+            }
+
+            int64_t current_time = GetTime();
+            if (last_log_time + SYNC_LOG_INTERVAL < current_time) {
+                LogPrintf("Syncing txindex with block chain from height %d\n", pindex->nHeight);
+                last_log_time = current_time;
+            }
+
+            if (last_locator_write_time + SYNC_LOCATOR_WRITE_INTERVAL < current_time) {
+                WriteBestBlock(pindex);
+                last_locator_write_time = current_time;
+            }
+
+            CBlock block;
+            if (!ReadBlockFromDisk(block, pindex, consensus_params)) {
+                FatalError("%s: Failed to read block %s from disk",
+                           __func__, pindex->GetBlockHash().ToString());
+                return;
+            }
+            if (!WriteBlock(block, pindex)) {
+                FatalError("%s: Failed to write block %s to tx index database",
+                           __func__, pindex->GetBlockHash().ToString());
+                return;
+            }
+        }
+    }
+
+    if (pindex) {
+        LogPrintf("txindex is enabled at height %d\n", pindex->nHeight);
+    } else {
+        LogPrintf("txindex is enabled\n");
+    }
+}
+
+bool TxIndex::WriteBlock(const CBlock& block, const CBlockIndex* pindex)
+{
+    CDiskTxPos pos(pindex->GetBlockPos(), GetSizeOfCompactSize(block.vtx.size()));
+    std::vector<std::pair<uint256, CDiskTxPos>> vPos;
+    vPos.reserve(block.vtx.size());
+    for (const auto& tx : block.vtx) {
+        vPos.emplace_back(tx->GetHash(), pos);
+        pos.nTxOffset += ::GetSerializeSize(*tx, SER_DISK, CLIENT_VERSION);
+    }
+    return m_db->WriteTxs(vPos);
+}
+
+bool TxIndex::WriteBestBlock(const CBlockIndex* block_index)
+{
+    LOCK(cs_main);
+    if (!m_db->WriteBestBlock(chainActive.GetLocator(block_index))) {
+        return error("%s: Failed to write locator to disk", __func__);
+    }
+    return true;
+}
+
+void TxIndex::BlockConnected(const std::shared_ptr<const CBlock>& block, const CBlockIndex* pindex,
+                             const std::vector<CTransactionRef>& txn_conflicted)
+{
+    if (!m_synced) {
+        return;
+    }
+
+    const CBlockIndex* best_block_index = m_best_block_index.load();
+    if (!best_block_index) {
+        if (pindex->nHeight != 0) {
+            FatalError("%s: First block connected is not the genesis block (height=%d)",
+                       __func__, pindex->nHeight);
+            return;
+        }
+    } else {
+        // Ensure block connects to an ancestor of the current best block. This should be the case
+        // most of the time, but may not be immediately after the sync thread catches up and sets
+        // m_synced. Consider the case where there is a reorg and the blocks on the stale branch are
+        // in the ValidationInterface queue backlog even after the sync thread has caught up to the
+        // new chain tip. In this unlikely event, log a warning and let the queue clear.
+        if (best_block_index->GetAncestor(pindex->nHeight - 1) != pindex->pprev) {
+            LogPrintf("%s: WARNING: Block %s does not connect to an ancestor of " /* Continued */
+                      "known best chain (tip=%s); not updating txindex\n",
+                      __func__, pindex->GetBlockHash().ToString(),
+                      best_block_index->GetBlockHash().ToString());
+            return;
+        }
+    }
+
+    if (WriteBlock(*block, pindex)) {
+        m_best_block_index = pindex;
+    } else {
+        FatalError("%s: Failed to write block %s to txindex",
+                   __func__, pindex->GetBlockHash().ToString());
+        return;
+    }
+}
+
+void TxIndex::SetBestChain(const CBlockLocator& locator)
+{
+    if (!m_synced) {
+        return;
+    }
+
+    const uint256& locator_tip_hash = locator.vHave.front();
+    const CBlockIndex* locator_tip_index;
+    {
+        LOCK(cs_main);
+        locator_tip_index = LookupBlockIndex(locator_tip_hash);
+    }
+
+    if (!locator_tip_index) {
+        FatalError("%s: First block (hash=%s) in locator was not found",
+                   __func__, locator_tip_hash.ToString());
+        return;
+    }
+
+    // This checks that SetBestChain callbacks are received after BlockConnected. The check may fail
+    // immediately after the sync thread catches up and sets m_synced. Consider the case where
+    // there is a reorg and the blocks on the stale branch are in the ValidationInterface queue
+    // backlog even after the sync thread has caught up to the new chain tip. In this unlikely
+    // event, log a warning and let the queue clear.
+    const CBlockIndex* best_block_index = m_best_block_index.load();
+    if (best_block_index->GetAncestor(locator_tip_index->nHeight) != locator_tip_index) {
+        LogPrintf("%s: WARNING: Locator contains block (hash=%s) not on known best " /* Continued */
+                  "chain (tip=%s); not writing txindex locator\n",
+                  __func__, locator_tip_hash.ToString(),
+                  best_block_index->GetBlockHash().ToString());
+        return;
+    }
+
+    if (!m_db->WriteBestBlock(locator)) {
+        error("%s: Failed to write locator to disk", __func__);
+    }
+}
+
+bool TxIndex::BlockUntilSyncedToCurrentChain()
+{
+    AssertLockNotHeld(cs_main);
+
+    if (!m_synced) {
+        return false;
+    }
+
+    {
+        // Skip the queue-draining stuff if we know we're caught up with
+        // chainActive.Tip().
+        LOCK(cs_main);
+        const CBlockIndex* chain_tip = chainActive.Tip();
+        const CBlockIndex* best_block_index = m_best_block_index.load();
+        if (best_block_index->GetAncestor(chain_tip->nHeight) == chain_tip) {
+            return true;
+        }
+    }
+
+    LogPrintf("%s: txindex is catching up on block notifications\n", __func__);
+    SyncWithValidationInterfaceQueue();
+    return true;
+}
+
+bool TxIndex::FindTx(const uint256& tx_hash, uint256& block_hash, CTransactionRef& tx) const
+{
+    CDiskTxPos postx;
+    if (!m_db->ReadTxPos(tx_hash, postx)) {
+        return false;
+    }
+
+    CAutoFile file(OpenBlockFile(postx, true), SER_DISK, CLIENT_VERSION);
+    if (file.IsNull()) {
+        return error("%s: OpenBlockFile failed", __func__);
+    }
+    CBlockHeader header;
+    try {
+        file >> header;
+        fseek(file.Get(), postx.nTxOffset, SEEK_CUR);
+        file >> tx;
+    } catch (const std::exception& e) {
+        return error("%s: Deserialize or I/O error - %s", __func__, e.what());
+    }
+    if (tx->GetHash() != tx_hash) {
+        return error("%s: txid mismatch", __func__);
+    }
+    block_hash = header.GetHash();
+    return true;
+}
+
+void TxIndex::Interrupt()
+{
+    m_interrupt();
+}
+
+void TxIndex::Start()
+{
+    // Need to register this ValidationInterface before running Init(), so that
+    // callbacks are not missed if Init sets m_synced to true.
+    RegisterValidationInterface(this);
+    if (!Init()) {
+        FatalError("%s: txindex failed to initialize", __func__);
+        return;
+    }
+
+    m_thread_sync = std::thread(&TraceThread<std::function<void()>>, "txindex",
+                                std::bind(&TxIndex::ThreadSync, this));
+}
+
+void TxIndex::Stop()
+{
+    UnregisterValidationInterface(this);
+
+    if (m_thread_sync.joinable()) {
+        m_thread_sync.join();
+    }
+}
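
`WriteBlock` and `FindTx` above agree on one convention: each stored offset is measured from the end of the block header, starts just past the transaction count, and grows by the serialized size of each transaction. The arithmetic can be sketched as a toy model, assuming transactions are opaque byte strings and the block is an 80-byte header, a one-byte count, and the transaction bytes. `CompactSizeLen`, `TxOffsets`, `SerializeToyBlock`, and `ReadTx` are hypothetical helpers, not Bitcoin Core functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

constexpr std::size_t HEADER_SIZE = 80; // size of a serialized block header

// Number of bytes Bitcoin's CompactSize encoding uses for n.
std::size_t CompactSizeLen(std::uint64_t n)
{
    if (n < 253) return 1;
    if (n <= 0xFFFF) return 3;
    if (n <= 0xFFFFFFFF) return 5;
    return 9;
}

// Pairs each toy transaction with its offset from the end of the header.
std::vector<std::pair<std::string, std::size_t>> TxOffsets(const std::vector<std::string>& txs)
{
    std::vector<std::pair<std::string, std::size_t>> result;
    result.reserve(txs.size());
    std::size_t offset = CompactSizeLen(txs.size()); // skip the tx count
    for (const std::string& tx : txs) {
        result.emplace_back(tx, offset);
        offset += tx.size(); // next tx starts right after this one
    }
    return result;
}

// Toy layout: header bytes, a 1-byte count (so fewer than 253 txs), tx bytes.
std::string SerializeToyBlock(const std::vector<std::string>& txs)
{
    assert(txs.size() < 253);
    std::string out(HEADER_SIZE, 'H');
    out.push_back(static_cast<char>(txs.size()));
    for (const std::string& tx : txs) out += tx;
    return out;
}

// Mirrors the lookup order: move past the header, then seek by the offset.
std::string ReadTx(const std::string& block, std::size_t offset, std::size_t len)
{
    return block.substr(HEADER_SIZE + offset, len);
}
```

With the toy transactions `aaaa`, `bb`, and `cccccc`, the offsets come out as 1, 5, and 7, and `ReadTx` with offset 5 and length 2 returns `bb`. Storing header-relative offsets means a lookup can deserialize the header first (to recover the block hash) and then seek straight to the transaction.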
