Skip to content

Commit 8da0524

Browse files
committed
Merge #101: connection: run async cleanups in LIFO not FIFO order
6825523 connection: run async cleanups in LIFO not FIFO order (Ryan Ofsky) Pull request description: Run Connection class asynchronous cleanups in LIFO not FIFO order, which is more natural ordering and prevents a bitcoin wallet shutdown deadlock when the connection to the node process is broken. Also add comments better documenting cleanup order. This change fixes one of two shutdown deadlocks in the bitcoin wallet when the node process is killed in bitcoin/bitcoin#10102 as of bitcoin/bitcoin@3f12b43. The other deadlock is caused by the `ProxyServerCustom<WalletLoader>` destructor in that PR and is described in commit efe42cc from #100. The bitcoin wallet shutdown deadlocks have probably been around for some time, but were not encountered because before bitcoin/bitcoin@0b75315 from bitcoin/bitcoin#26606 there weren't any tests which killed the bitcoin node process and required the bitcoin-wallet process to shut down gracefully. Top commit has no ACKs. Tree-SHA512: cbbedcc71e4c3ebb541d95f2444151c4058fff3b6357bed5c1b2929d5311fd2a621aeed1af7f56a6159bb3c1ac8abf051acc22830f748edaa6b61ecd557f5074
2 parents 53ee9fa + 6825523 commit 8da0524

File tree

1 file changed

+25
-1
lines changed

1 file changed

+25
-1
lines changed

src/mp/proxy.cpp

Lines changed: 25 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -107,6 +107,14 @@ Connection::~Connection()
107107
CleanupIt Connection::addSyncCleanup(std::function<void()> fn)
108108
{
109109
std::unique_lock<std::mutex> lock(m_loop.m_mutex);
110+
// Add cleanup callbacks to the front of list, so sync cleanup functions run
111+
// in LIFO order. This is a good approach because sync cleanup functions are
112+
// added as client objects are created, and it is natural to clean up
113+
// objects in the reverse order they were created. In practice, however,
114+
// order should not be significant because the cleanup callbacks run
115+
// synchronously in a single batch when the connection is broken, and they
116+
// only reset the connection pointers in the client objects without actually
117+
// deleting the client objects.
110118
return m_sync_cleanup_fns.emplace(m_sync_cleanup_fns.begin(), std::move(fn));
111119
}
112120

@@ -119,7 +127,23 @@ void Connection::removeSyncCleanup(CleanupIt it)
119127
void Connection::addAsyncCleanup(std::function<void()> fn)
120128
{
121129
std::unique_lock<std::mutex> lock(m_loop.m_mutex);
122-
m_async_cleanup_fns.emplace(m_async_cleanup_fns.begin(), std::move(fn));
130+
// Add async cleanup callbacks to the back of the list. Unlike the sync
131+
// cleanup list, this list order is more significant because it determines
132+
// the order server objects are destroyed when there is a sudden disconnect,
133+
// and it is possible objects may need to be destroyed in a certain order.
134+
// This function is called in ProxyServerBase destructors, and since capnp
135+
// destroys ProxyServer objects in LIFO order, we should preserve this
136+
// order, and add cleanup callbacks to the end of the list so they can be
137+
// run starting from the beginning of the list.
138+
//
139+
// In bitcoin core, running these callbacks in the right order is
140+
// particularly important for the wallet process, because it uses blocking
141+
// shared_ptrs and requires Chain::Notification pointers owned by the node
142+
// process to be destroyed before the WalletLoader objects owned by the node
143+
// process, otherwise shared pointer counts of the CWallet objects (which
144+
// inherit from Chain::Notification) will not be 1 when WalletLoader
145+
// destructor runs and it will wait forever for them to be released.
146+
m_async_cleanup_fns.emplace(m_async_cleanup_fns.end(), std::move(fn));
123147
}
124148

125149
EventLoop::EventLoop(const char* exe_name, LogFn log_fn, void* context)

0 commit comments

Comments
 (0)