Skip to content

Commit 6825523

Browse files
committed
connection: run async cleanups in LIFO not FIFO order
Run Connection class asynchronous cleaups in LIFO not FIFO order, which is more natural ordering and prevents a bitcoin wallet shutdown deadlock when the connection to the node process is broken. Also add comments better documenting cleanup order.
1 parent e29f74e commit 6825523

File tree

1 file changed

+25
-1
lines changed

1 file changed

+25
-1
lines changed

src/mp/proxy.cpp

Lines changed: 25 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -107,6 +107,14 @@ Connection::~Connection()
107107
CleanupIt Connection::addSyncCleanup(std::function<void()> fn)
108108
{
109109
std::unique_lock<std::mutex> lock(m_loop.m_mutex);
110+
// Add cleanup callbacks to the front of list, so sync cleanup functions run
111+
// in LIFO order. This is a good approach because sync cleanup functions are
112+
// added as client objects are created, and it is natural to clean up
113+
// objects in the reverse order they were created. In practice, however,
114+
// order should not be significant because the cleanup callbacks run
115+
// synchronously in a single batch when the connection is broken, and they
116+
// only reset the connection pointers in the client objects without actually
117+
// deleting the client objects.
110118
return m_sync_cleanup_fns.emplace(m_sync_cleanup_fns.begin(), std::move(fn));
111119
}
112120

@@ -119,7 +127,23 @@ void Connection::removeSyncCleanup(CleanupIt it)
119127
void Connection::addAsyncCleanup(std::function<void()> fn)
120128
{
121129
std::unique_lock<std::mutex> lock(m_loop.m_mutex);
122-
m_async_cleanup_fns.emplace(m_async_cleanup_fns.begin(), std::move(fn));
130+
// Add async cleanup callbacks to the back of the list. Unlike the sync
131+
// cleanup list, this list order is more significant because it determines
132+
// the order server objects are destroyed when there is a sudden disconnect,
133+
// and it is possible objects may need to be destroyed in a certain order.
134+
// This function is called in ProxyServerBase destructors, and since capnp
135+
// destroys ProxyServer objects in LIFO order, we should preserve this
136+
// order, and add cleanup callbacks to the end of the list so they can be
137+
// run starting from the beginning of the list.
138+
//
139+
// In bitcoin core, running these callbacks in the right order is
140+
// particularly important for the wallet process, because it uses blocking
141+
// shared_ptrs and requires Chain::Notification pointers owned by the node
142+
// process to be destroyed before the WalletLoader objects owned by the node
143+
// process, otherwise shared pointer counts of the CWallet objects (which
144+
// inherit from Chain::Notification) will not be 1 when WalletLoader
145+
// destructor runs and it will wait forever for them to be released.
146+
m_async_cleanup_fns.emplace(m_async_cleanup_fns.end(), std::move(fn));
123147
}
124148

125149
EventLoop::EventLoop(const char* exe_name, LogFn log_fn, void* context)

0 commit comments

Comments
 (0)