Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions include/mp/proxy-io.h
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tangent: Why does EventLoop use Unix sockets within the same process to signal when there's a new m_post_fn to execute? Seems like there should be thread synchronization primitives with marginally less overhead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: https://github.com/bitcoin-core/libmultiprocess/pull/201/files#r2359060452

Tangent: Why does EventLoop use Unix sockets within the same process to signal when there's a new m_post_fn to execute? Seems like there should be thread synchronization primitives with marginally less overhead?

EventLoop documentation is meant to provide some background on this and specifically the link at the bottom https://groups.google.com/d/msg/capnproto/TuQFF1eH2-M/g81sHaTAAQAJ addresses this question.

Cap'n Proto doesn't use threads and assumes all code calling it runs on a single thread. There are benefits and costs that come from this but one of the costs is that external threads need to use I/O to communicate with cap'n proto's event loop. There isn't another way for threads to send signals. (Or at least there wasn't at time I wrote this, maybe newer versions of cap'n proto provide something)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main thing I was missing is the need for pumping the async IO. I assume waiting on a async IO-wrapped socket serves that purpose?

Was curious why you didn't use m_io_context->provider->newOneWayPipe() but when writing to the AsyncOutputStream a promise is returned with KJ_WARN_UNUSED_RESULT (https://github.com/capnproto/capnproto/blob/7db701e94ad00b8db06ede2bea5f527e2a6fa3de/c%2B%2B/src/kj/async-io.h#L121), which doesn't mirror what was in the example in Google Groups.

Q: I can see there's a ::close() call for [m_]post_fd, but none for m_wait_fd? Maybe it's garbage collected once the process dies, or does wrapSocketFd() assume responsibility for closing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #201 (comment)

Yes IIRC wrapSocketFd does take ownership. A lot of this code is also updated & made more generic in windows PR bitcoin/bitcoin#32387 in case you are curious.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tangent in unrelated file - regarding how easy it is to understand libmultiprocess:
TestSetup (in test/mp/test/test.cpp) is hard to read, maybe would benefit from adding comments with argument names of anonymous lambdas and explicit captures of variables. Also feeling resistance when having to learn that kj::heap returns something std::unique_ptr-like which implicit casts to raw pointer. Are the kj types only used when necessary to interface with Cap'n'Proto, and std everywhere else?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #201 (comment)

Tangent in unrelated file - regarding how easy it is to understand libmultiprocess: TestSetup (in test/mp/test/test.cpp) is hard to read, maybe would benefit from adding comments with argument names of anonymous lambdas and explicit captures of variables.

This is probably true and I'm currently working on test improvements where I'm adding more members to the TestSetup class and could add more comments.

I'm not sure I would always prefer explicit variable captures though. I think [&] is not just a way of capturing state and cutting down noise, but also an important way indicating that the lambda will only be called in the current scope, and is not some event handler that can be called at a later time. So I tend to like [&] for synchronous inline callbacks, and [x, y, z] for asynchronous event handlers.

Also feeling resistance when having to learn that kj::heap returns something std::unique_ptr-like which implicit casts to raw pointer. Are the kj types only used when necessary to interface with Cap'n'Proto, and std everywhere else?

I think basically yes but the word "only" is too strong. The code does tends to use kj library when cap'nproto types are involved and std library otherwise. But there are probably exceptions, and I haven't found kj types and macros to cause problems in practice. In general kj types tend to have nice debugging and safety features and have a real rationale behind them. They are not just NIH. I think kj::Own in particular has capabilities std::unique_ptr doesn't have because it doesn't bake the deleter into the type definition, making it easier to support things like memory pools and implement zero-copy techniques cap'n proto uses internally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your policy of only using [&] for indicating that the lambda will be called in current scope is nice when it comes to function-scope - as when creating server_connection, but less so in the case of TestSetup::server_disconnect when this is captured, which can perhaps be somewhat forgiven for a relatively simple lambda.


I agree that kj has some clever quirks that are missing from std. But is not a one-way street (implicit cast from Own to raw pointer for example). Would still advise towards using std when possible to make review more approachable. This passed mptest for example:

diff --git a/include/mp/proxy-io.h b/include/mp/proxy-io.h
index a0bdf13..6358c86 100644
--- a/include/mp/proxy-io.h
+++ b/include/mp/proxy-io.h
@@ -179,7 +179,7 @@ public:
 
     //! Run function on event loop thread. Does not return until function completes.
     //! Must be called while the loop() function is active.
-    void post(kj::Function<void()> fn);
+    void post(std::function<void()> fn);
 
     //! Wrapper around EventLoop::post that takes advantage of the
     //! fact that callable will not go out of scope to avoid requirement that it
@@ -231,7 +231,7 @@ public:
     std::thread m_async_thread;
 
     //! Callback function to run on event loop thread during post() or sync() call.
-    kj::Function<void()>* m_post_fn MP_GUARDED_BY(m_mutex) = nullptr;
+    std::function<void()>* m_post_fn MP_GUARDED_BY(m_mutex) = nullptr;
 
     //! Callback functions to run on async thread.
     std::optional<CleanupList> m_async_fns MP_GUARDED_BY(m_mutex);
diff --git a/src/mp/proxy.cpp b/src/mp/proxy.cpp
index 06825c9..6611cfb 100644
--- a/src/mp/proxy.cpp
+++ b/src/mp/proxy.cpp
@@ -263,7 +263,7 @@ void EventLoop::loop()
     m_cv.notify_all();
 }
 
-void EventLoop::post(kj::Function<void()> fn)
+void EventLoop::post(std::function<void()> fn)
 {
     if (std::this_thread::get_id() == m_thread_id) {
         fn();

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #201 (comment)

Nice find on lambda inconsistencies, that would make sense to clean up. Would be happy to review a PR with any cleanups and improvements like this.

On kj::Function vs std::function, this was actually changed recently in 52256e7 because std::function unlike kj::Function requires function objects to be copyable, which prevents writing lambdas that capture move-only objects. So I'm actually surprised that diff compiles. Maybe it compiles because it is only changing EventLoop::post not Waiter::post? Happy to change to whatever types work though.

Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,6 @@ struct ProxyClient<Thread> : public ProxyClientBase<Thread, ::capnp::Void>
ProxyClient(const ProxyClient&) = delete;
~ProxyClient();

void setDisconnectCallback(const std::function<void()>& fn);

//! Reference to callback function that is run if there is a sudden
//! disconnect and the Connection object is destroyed before this
//! ProxyClient<Thread> object. The callback will destroy this object and
Expand Down
42 changes: 26 additions & 16 deletions include/mp/type-context.h
Original file line number Diff line number Diff line change
Expand Up @@ -89,29 +89,39 @@ auto PassField(Priority<1>, TypeList<>, ServerContext& server_context, const Fn&
// need to update the map.
auto& thread_context = g_thread_context;
auto& request_threads = thread_context.request_threads;
auto [request_thread, inserted]{SetThread(
GuardedRef{thread_context.waiter->m_mutex, request_threads},
server.m_context.connection,
[&] { return context_arg.getCallbackThread(); })};
ConnThread request_thread;
bool inserted;
server.m_context.loop->sync([&] {
std::tie(request_thread, inserted) = SetThread(
GuardedRef{thread_context.waiter->m_mutex, request_threads}, server.m_context.connection,
[&] { return context_arg.getCallbackThread(); });
});

// If an entry was inserted into the requests_threads map,
// If an entry was inserted into the request_threads map,
// remove it after calling fn.invoke. If an entry was not
// inserted, one already existed, meaning this must be a
// recursive call (IPC call calling back to the caller which
// makes another IPC call), so avoid modifying the map.
const bool erase_thread{inserted};
KJ_DEFER(if (erase_thread) {
Lock lock(thread_context.waiter->m_mutex);
// Call erase here with a Connection* argument instead
// of an iterator argument, because the `request_thread`
// iterator may be invalid if the connection is closed
// during this function call. More specifically, the
// iterator may be invalid because SetThread adds a
// cleanup callback to the Connection destructor that
// erases the thread from the map, and also because the
// ProxyServer<Thread> destructor calls
// request_threads.clear().
request_threads.erase(server.m_context.connection);
// Erase the request_threads entry on the event loop
// thread with loop->sync(), so if the connection is
// broken there is not a race between this thread and
// the disconnect handler trying to destroy the thread
// client object.
server.m_context.loop->sync([&] {
// Look up the thread again without using existing
// iterator since entry may no longer be there after
// a disconnect. Destroy node after releasing
// Waiter::m_mutex, so the ProxyClient<Thread>
// destructor is able to use EventLoop::mutex
// without violating lock order.
ConnThreads::node_type removed;
{
Lock lock(thread_context.waiter->m_mutex);
removed = request_threads.extract(server.m_context.connection);
}
});
});
fn.invoke(server_context, args...);
}
Expand Down
32 changes: 22 additions & 10 deletions src/mp/proxy.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@

#include <atomic>
#include <capnp/capability.h>
#include <capnp/common.h> // IWYU pragma: keep
#include <capnp/rpc.h>
#include <condition_variable>
#include <functional>
Expand Down Expand Up @@ -80,6 +81,11 @@ ProxyContext::ProxyContext(Connection* connection) : connection(connection), loo

Connection::~Connection()
{
// Connection destructor is always called on the event loop thread. If this
// is a local disconnect, it will trigger I/O, so this needs to run on the
// event loop thread, and if there was a remote disconnect, this is called
// by an onDisconnect callback directly from the event loop thread.
assert(std::this_thread::get_id() == m_loop->m_thread_id);
// Shut down RPC system first, since this will garbage collect any
// ProxyServer objects that were not freed before the connection was closed.
// Typically all ProxyServer objects associated with this connection will be
Expand Down Expand Up @@ -155,6 +161,9 @@ CleanupIt Connection::addSyncCleanup(std::function<void()> fn)

void Connection::removeSyncCleanup(CleanupIt it)
{
// Require cleanup functions to be removed on the event loop thread to avoid
// needing to deal with them being removed in the middle of a disconnect.
assert(std::this_thread::get_id() == m_loop->m_thread_id);
const Lock lock(m_loop->m_mutex);
m_sync_cleanup_fns.erase(it);
}
Expand Down Expand Up @@ -306,6 +315,7 @@ bool EventLoop::done() const

std::tuple<ConnThread, bool> SetThread(GuardedRef<ConnThreads> threads, Connection* connection, const std::function<Thread::Client()>& make_thread)
{
assert(std::this_thread::get_id() == connection->m_loop->m_thread_id);
ConnThread thread;
bool inserted;
{
Expand All @@ -314,7 +324,7 @@ std::tuple<ConnThread, bool> SetThread(GuardedRef<ConnThreads> threads, Connecti
}
if (inserted) {
thread->second.emplace(make_thread(), connection, /* destroy_connection= */ false);
thread->second->setDisconnectCallback([threads, thread] {
thread->second->m_disconnect_cb = connection->addSyncCleanup([threads, thread] {
// Note: it is safe to use the `thread` iterator in this cleanup
// function, because the iterator would only be invalid if the map entry
// was removed, and if the map entry is removed the ProxyClient<Thread>
Expand All @@ -323,9 +333,10 @@ std::tuple<ConnThread, bool> SetThread(GuardedRef<ConnThreads> threads, Connecti
// Connection is being destroyed before thread client is, so reset
// thread client m_disconnect_cb member so thread client destructor does not
// try to unregister this callback after connection is destroyed.
thread->second->m_disconnect_cb.reset();

// Remove connection pointer about to be destroyed from the map
const Lock lock(threads.mutex);
thread->second->m_disconnect_cb.reset();
threads.ref.erase(thread);
});
}
Expand All @@ -338,17 +349,18 @@ ProxyClient<Thread>::~ProxyClient()
// cleanup callback that was registered to handle the connection being
// destroyed before the thread being destroyed.
if (m_disconnect_cb) {
m_context.connection->removeSyncCleanup(*m_disconnect_cb);
// Remove disconnect callback on the event loop thread with
// loop->sync(), so if the connection is broken there is not a race
// between this thread trying to remove the callback and the disconnect
// handler attempting to call it.
m_context.loop->sync([&]() {
if (m_disconnect_cb) {
m_context.connection->removeSyncCleanup(*m_disconnect_cb);
}
});
}
}

void ProxyClient<Thread>::setDisconnectCallback(const std::function<void()>& fn)
{
assert(fn);
assert(!m_disconnect_cb);
m_disconnect_cb = m_context.connection->addSyncCleanup(fn);
}

ProxyServer<Thread>::ProxyServer(ThreadContext& thread_context, std::thread&& thread)
: m_thread_context(thread_context), m_thread(std::move(thread))
{
Expand Down
Loading