Contributor

@roypat roypat commented May 15, 2025

Print uffd handler logs whenever the test framework dumps debug information in response to an unexpected exception/other failure.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

If we fail to kill firecracker because it already died, we print a hint
about the uffd handler being potentially responsible. But this only
makes sense if a uffd handler was actually spawned, so restrict printing
of the hint to this case. While we're at it, also print the uffd logs in
this case, so that the truth of the hint can easily be verified (uffd
killed firecracker iff uffd panicked).

Signed-off-by: Patrick Roy <[email protected]>
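
A minimal sketch of the conditional hint described in the commit message above, assuming the framework can tell whether a uffd handler was spawned. The function name, its parameters, and the message wording are hypothetical and are not taken from the Firecracker test framework:

```python
from typing import List, Optional


def failed_kill_diagnostics(pid: int, uffd_log: Optional[str]) -> List[str]:
    """Build the lines to print when killing firecracker fails because it is already dead.

    `uffd_log` is None when no uffd handler was spawned for this microVM
    (hypothetical convention, chosen only for this sketch).
    """
    lines = [f"Failed to kill firecracker (pid {pid}): process is already dead."]
    if uffd_log is not None:
        # The hint only makes sense if a uffd handler actually existed.
        lines.append("Hint: the uffd handler may have killed firecracker if it panicked.")
        # Include the handler's log so the hint can be checked against it.
        lines.append("uffd handler log:")
        lines.extend(uffd_log.splitlines())
    return lines


if __name__ == "__main__":
    # No handler spawned: only the failure message is produced.
    print("\n".join(failed_kill_diagnostics(1234, None)))
    # Handler spawned: failure message, hint, and the handler's log.
    print("\n".join(failed_kill_diagnostics(1234, "hypothetical handler output")))
```

Returning the lines instead of printing them directly keeps the decision logic easy to check in isolation.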

codecov bot commented May 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.93%. Comparing base (55a84be) to head (74ba11c).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5211      +/-   ##
==========================================
+ Coverage   82.88%   82.93%   +0.05%     
==========================================
  Files         250      250              
  Lines       26936    26936              
==========================================
+ Hits        22325    22339      +14     
+ Misses       4611     4597      -14     
| Flag | Coverage Δ |
|---|---|
| 5.10-c5n.metal | 83.37% <ø> (+<0.01%) ⬆️ |
| 5.10-m5n.metal | 83.36% <ø> (ø) |
| 5.10-m6a.metal | 82.59% <ø> (ø) |
| 5.10-m6g.metal | 79.19% <ø> (ø) |
| 5.10-m6i.metal | 83.36% <ø> (ø) |
| 5.10-m7a.metal-48xl | 82.57% <ø> (?) |
| 5.10-m7g.metal | 79.19% <ø> (ø) |
| 5.10-m7i.metal-24xl | 83.32% <ø> (?) |
| 5.10-m7i.metal-48xl | 83.32% <ø> (?) |
| 5.10-m8g.metal-24xl | 79.19% <ø> (?) |
| 5.10-m8g.metal-48xl | 79.19% <ø> (?) |
| 6.1-c5n.metal | 83.41% <ø> (+<0.01%) ⬆️ |
| 6.1-m5n.metal | 83.41% <ø> (ø) |
| 6.1-m6a.metal | 82.63% <ø> (ø) |
| 6.1-m6g.metal | 79.19% <ø> (-0.01%) ⬇️ |
| 6.1-m6i.metal | 83.40% <ø> (-0.01%) ⬇️ |
| 6.1-m7a.metal-48xl | 82.62% <ø> (?) |
| 6.1-m7g.metal | 79.19% <ø> (ø) |
| 6.1-m7i.metal-24xl | 83.42% <ø> (?) |
| 6.1-m7i.metal-48xl | 83.42% <ø> (?) |
| 6.1-m8g.metal-24xl | 79.19% <ø> (?) |
| 6.1-m8g.metal-48xl | 79.19% <ø> (?) |


Move the snapshot creation latency test into test_snapshot_ab.py. It
won't be automatically A/B-tested (because pipeline_perf.py refers to
specific test names inside this file), but technically it could be. No
functional change intended.

Doing this because I was getting annoyed at not being able to
tab-complete `test_s` into `test_snapshot_ab`.

Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat force-pushed the test-failure-debug-help branch from 655ff65 to 5845843 Compare May 15, 2025 14:21
We have two places where the test framework dumps a bunch of debug info:
When something fails via SSH, and when we fail to kill a microVM because
it's already dead. Let's unify both of these to dump the _same_ debug
information, by having both call _dump_debug_information (generalized
slightly so that it no longer hardcodes "SSH gone wrong" as the error
message, and extended to also print uffd logs).

Signed-off-by: Patrick Roy <[email protected]>
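
The unification described in the commit message above can be sketched roughly as follows. Only the name `_dump_debug_information` comes from the commit message; its signature, the class, the attribute names, and the two caller methods are assumptions made for illustration:

```python
from typing import Optional


class MicrovmSketch:
    """Hypothetical, heavily simplified stand-in for the framework's microVM class."""

    def __init__(self, firecracker_log: str, uffd_log: Optional[str] = None):
        self.firecracker_log = firecracker_log
        # None means no uffd handler was spawned (assumption for this sketch).
        self.uffd_log = uffd_log

    def _dump_debug_information(self, what: str) -> str:
        """Collect the same debug information for every failure path.

        `what` replaces a previously hardcoded error message.
        """
        sections = [what, "firecracker log:", self.firecracker_log]
        if self.uffd_log is not None:
            sections += ["uffd handler log:", self.uffd_log]
        return "\n".join(sections)

    def report_ssh_failure(self) -> str:
        # Failure path 1: something went wrong over SSH.
        return self._dump_debug_information("Failure while executing a command over SSH")

    def report_failed_kill(self) -> str:
        # Failure path 2: the microVM was already dead when we tried to kill it.
        return self._dump_debug_information("Failed to kill firecracker: process is already dead")


if __name__ == "__main__":
    vm = MicrovmSketch("hypothetical firecracker log", "hypothetical uffd log")
    print(vm.report_ssh_failure())
    print(vm.report_failed_kill())
```

Both failure paths differ only in their first line, so any later addition to the dump shows up in both places.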
@roypat roypat force-pushed the test-failure-debug-help branch from 5845843 to 74ba11c Compare May 15, 2025 14:30
@roypat roypat added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label May 15, 2025
@roypat roypat merged commit 880e146 into firecracker-microvm:main May 15, 2025
7 of 8 checks passed

3 participants