fix(ci): epoll on pidfd to wait for Firecracker exit #4847
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #4847 +/- ##
=======================================
Coverage 83.96% 83.96%
=======================================
Files 250 250
Lines 27756 27756
=======================================
Hits 23304 23304
Misses 4452 4452
Flags with carried forward coverage won't be shown.
e2f619f to b9452d1
3c6811d to bcf1564
Currently, we use psutil.pid_exists in a loop with a timeout of 10 seconds. This is racy, and we do sometimes hit the race in our CI. Replace this mechanism with a call to epoll() on the pidfd of the process. This should block deterministically until the process exits. If something else is wrong, we will hit the pytest timeout. Signed-off-by: Babis Chalios <[email protected]>
bcf1564 to 3eeb00c
@ShadowCurse I had to revert back to using
How can it hit the limit if you are waiting for 1 fd?
It's not about the number of PIDs you're waiting on. It's about the maximum PID value it can handle. Reading from
Since we only use
If the process with pid
I thought of doing that, but I think it makes the error handling much more complicated.
Ok, but to avoid issue with 2 calls:
Changes
Replace the psutil.pid_exists polling loop with a call to epoll() on the pidfd of the process. This should block deterministically until the process exits. If something else is wrong, we will hit the pytest timeout.
Reason
Currently, we use psutil.pid_exists in a loop with a timeout of 10 seconds. This is racy, and we do sometimes hit the race in our CI.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.