-
Notifications
You must be signed in to change notification settings - Fork 37
Made SpawnProcess() behavior safe post fork() #237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The next commit will use it to create an alternative to ExecProcess that's safe to call post-fork. Added a note that ExecProcess is not safe to call post-fork. Since it's only used by mpgen at build time, this is fine, and lets of print useful errors.
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
97aeeec to
9909f1a
Compare
Evaluate fd_to_args and build argv in the parent before fork(). In the child between fork() and execvp(), avoid throwing/allocating and use _exit(1) on error. This likely fixes the occasional deadlock in Bitcoin Core functional test, noticed in bitcoin/bitcoin#34187.
9909f1a to
4e84072
Compare
|
Fixed and pre-empted a few other style nits. Hopefully fixed both CI build failures too (include-what-you-use). |
4e84072 to
dbbe02e
Compare
|
@DrahtBot code suggestion is stale, was fixed in the last push. |
dbbe02e to
78e50fb
Compare
|
Added code comments to explain how the test works, and to point out it could trigger false negatives (missing a regression), but won't have spurious failures. |
This test fails without the previous commit.
78e50fb to
0baf75c
Compare
It wasn't stale. The LLM with this input had a 10% chance of producing this slop response, and you hit it. 🎉 |
|
@maflcko I'm prepared for next time :-) https://gist.github.com/Sjors/dbe7e7f0159c7fe00f7dc914d1db4242 |
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 0baf75c. I wouldn't have guessed that this could cause bitcoin/bitcoin#34187, but it makes sense and could easily explain the echoipc child process randomly hanging.
I'm a little doubtful about how valuable the test in third commit is since it is long and only catches fd_to_args being called incorrectly in the child process, not MakeArgv, or future bad code that could be added there. But the test is pretty clean, and could be dropped if it ever becomes a maintenance burden.
Left some suggestions below, but happy to merge this as-is if preferred
| if (close(fds[pid ? 0 : 1]) != 0) { | ||
| throw std::system_error(errno, std::system_category(), "close"); | ||
| if (pid) throw std::system_error(errno, std::system_category(), "close"); | ||
| _exit(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Precompute argv before fork in SpawnProcess" (9d3c2b6)
Getting rid of the throw here makes sense but it'd also be nice to have diagnostic and not just return 1. Would suggest something like:
static constexpr char msg[] = "SpawnProcess(child): close(fds[1]) failed\n";
(void)::write(STDERR_FILENO, msg, sizeof(msg) - 1);
_exit(126);| ExecProcess(fd_to_args(fds[0])); | ||
|
|
||
| execvp(argv[0], argv.data()); | ||
| perror("execvp failed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Precompute argv before fork in SpawnProcess" (9d3c2b6)
Keeping perror is probaby the best choice for now even though strictly speaking it might not be safe and could hang. Would suggest adding a comment like "// NOTE: perror() is not async-signal-safe; calling it here in a post-fork child may still deadlock in multithreaded parents."
Note for future todo: It would be nice to improve this later by adding an error-reporting pipe, writing errno to it so parent process can provide a better diagnostic to code calling it. This idea came up previously in #213 (comment) / 286fe46
| // must _exit(1) (post-fork child must not throw). | ||
| if (close(fds[pid ? 0 : 1]) != 0) { | ||
| throw std::system_error(errno, std::system_category(), "close"); | ||
| if (pid) throw std::system_error(errno, std::system_category(), "close"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Precompute argv before fork in SpawnProcess" (9d3c2b6)
Note: It looks like there is a pre-existing bug here where if close(fds[0]) fails in the parent fd[1] will not be closed either and could be leaked. Maybe doesn't matter since if one close fails, not sure what chances of other one succeeding could be, but this might be good to address in followup.
|
|
||
| execvp(argv[0], argv.data()); | ||
| perror("execvp failed"); | ||
| _exit(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Precompute argv before fork in SpawnProcess" (9d3c2b6)
Would suggest replacing 1 with 127, since this is more distinctive and is also what bash returns in $? when a command fails to execute
Bitcoin Core functional tests for the
echoipcRPC method were occasionally timing out, see bitcoin/bitcoin#34187. A helpful LLM quickly figured out it was due to doing unsafe things afterfork().https://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html
Fix it by having
SpawnProcessevaluatefd_to_argsand buildargvin the parent beforefork().SpawnProcessnow avoidsExecProcess. I intentionally did not fix the latter, not just to reduce churn, but also because it's now only used bympgenand it doesn't seem worth losing the debug errors it (unsafely) throws.The last commit adds a test which reproduces the issue.