-
Notifications
You must be signed in to change notification settings - Fork 37
bug: fix mptest hang, ProxyClient<Thread> deadlock in disconnect handler #201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
4e365b0
d60db60
9b07991
ca9b380
85df964
4a269b2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,7 @@ | ||
| CI_DESC="CI job running ThreadSanitizer" | ||
| CI_DIR=build-sanitize | ||
| export CXX=clang++ | ||
| export CXXFLAGS="-ggdb -Werror -Wall -Wextra -Wpedantic -Wthread-safety-analysis -Wno-unused-parameter -fsanitize=thread" | ||
| export CXXFLAGS="-ggdb -Werror -Wall -Wextra -Wpedantic -Wthread-safety -Wno-unused-parameter -fsanitize=thread" | ||
| CMAKE_ARGS=() | ||
| BUILD_ARGS=(-k -j4) | ||
| BUILD_TARGETS=(mptest) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tangent: Why does
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. re: https://github.com/bitcoin-core/libmultiprocess/pull/201/files#r2359060452
EventLoop documentation is meant to provide some background on this and specifically the link at the bottom https://groups.google.com/d/msg/capnproto/TuQFF1eH2-M/g81sHaTAAQAJ addresses this question. Cap'n Proto doesn't use threads and assumes all code calling it runs on a single thread. There are benefits and costs that come from this but one of the costs is that external threads need to use I/O to communicate with cap'n proto's event loop. There isn't another way for threads to send signals. (Or at least there wasn't at time I wrote this, maybe newer versions of cap'n proto provide something)
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the main thing I was missing is the need for pumping the async IO. I assume waiting on a async IO-wrapped socket serves that purpose? Was curious why you didn't use Q: I can see there's a
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. re: #201 (comment) Yes IIRC wrapSocketFd does take ownership. A lot of this code is also updated & made more generic in windows PR bitcoin/bitcoin#32387 in case you are curious.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tangent in unrelated file - regarding how easy it is to understand libmultiprocess:
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. re: #201 (comment)
This is probably true and I'm currently working on test improvements where I'm adding more members to the TestSetup class and could add more comments. I'm not sure I would always prefer explicit variable captures though. I think
I think basically yes but the word "only" is too strong. The code does tends to use kj library when cap'nproto types are involved and std library otherwise. But there are probably exceptions, and I haven't found kj types and macros to cause problems in practice. In general kj types tend to have nice debugging and safety features and have a real rationale behind them. They are not just NIH. I think kj::Own in particular has capabilities std::unique_ptr doesn't have because it doesn't bake the deleter into the type definition, making it easier to support things like memory pools and implement zero-copy techniques cap'n proto uses internally.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Your policy of only using I agree that diff --git a/include/mp/proxy-io.h b/include/mp/proxy-io.h
index a0bdf13..6358c86 100644
--- a/include/mp/proxy-io.h
+++ b/include/mp/proxy-io.h
@@ -179,7 +179,7 @@ public:
//! Run function on event loop thread. Does not return until function completes.
//! Must be called while the loop() function is active.
- void post(kj::Function<void()> fn);
+ void post(std::function<void()> fn);
//! Wrapper around EventLoop::post that takes advantage of the
//! fact that callable will not go out of scope to avoid requirement that it
@@ -231,7 +231,7 @@ public:
std::thread m_async_thread;
//! Callback function to run on event loop thread during post() or sync() call.
- kj::Function<void()>* m_post_fn MP_GUARDED_BY(m_mutex) = nullptr;
+ std::function<void()>* m_post_fn MP_GUARDED_BY(m_mutex) = nullptr;
//! Callback functions to run on async thread.
std::optional<CleanupList> m_async_fns MP_GUARDED_BY(m_mutex);
diff --git a/src/mp/proxy.cpp b/src/mp/proxy.cpp
index 06825c9..6611cfb 100644
--- a/src/mp/proxy.cpp
+++ b/src/mp/proxy.cpp
@@ -263,7 +263,7 @@ void EventLoop::loop()
m_cv.notify_all();
}
-void EventLoop::post(kj::Function<void()> fn)
+void EventLoop::post(std::function<void()> fn)
{
if (std::this_thread::get_id() == m_thread_id) {
fn();
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. re: #201 (comment) Nice find on lambda inconsistencies, that would make sense to clean up. Would be happy to review a PR with any cleanups and improvements like this. On kj::Function vs std::function, this was actually changed recently in 52256e7 because std::function unlike kj::Function requires function objects to be copyable, which prevents writing lambdas that capture move-only objects. So I'm actually surprised that diff compiles. Maybe it compiles because it is only changing EventLoop::post not Waiter::post? Happy to change to whatever types work though. |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -182,6 +182,17 @@ class MP_SCOPED_CAPABILITY Lock { | |||||
| std::unique_lock<std::mutex> m_lock; | ||||||
| }; | ||||||
|
|
||||||
| template<typename T> | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. meganit: Pretty sure Clang-format in main project prefers:
Suggested change
|
||||||
| struct GuardedRef | ||||||
| { | ||||||
| Mutex& mutex; | ||||||
| T& ref MP_GUARDED_BY(mutex); | ||||||
| }; | ||||||
|
|
||||||
| // CTAD for Clang 16: GuardedRef{mutex, x} -> GuardedRef<decltype(x)> | ||||||
| template <class U> | ||||||
| GuardedRef(Mutex&, U&) -> GuardedRef<U>; | ||||||
|
|
||||||
| //! Analog to std::lock_guard that unlocks instead of locks. | ||||||
| template <typename Lock> | ||||||
| struct UnlockGuard | ||||||
|
|
||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4e365b0:
This is unrelated to the PR, but why generate debugging information here? I'm referring to the use of the
-ggdbflagThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #201 (comment)
I haven't tried without -ggdb, but sanitizers like ThreadSanitizer print stack traces that I think depend on having this debug information. Cap'n Proto also can use addr2line to print stack traces internally when there are errors (though I don't think we have this option enabled)