Skip to content

Add option to redirect guest serial console output to file #5350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Aug 18, 2025

Conversation

roypat
Copy link
Contributor

@roypat roypat commented Aug 6, 2025

Add a new logger sub-configuration to specify a file into which guest serial console output should be written (instead of stdout).

Reason

Getting serial output from our integration tests would make debugging intermittent failures where SSH didn't work / the guest became unresponsive a lot easier

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkbuild --all to verify that the PR passes
    build checks on all supported architectures.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

Copy link

codecov bot commented Aug 6, 2025

Codecov Report

❌ Patch coverage is 73.61111% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.29%. Comparing base (e7504ae) to head (ee0bfd8).
⚠️ Report is 13 commits behind head on main.

Files with missing lines Patch % Lines
src/vmm/src/device_manager/mod.rs 66.66% 7 Missing ⚠️
src/vmm/src/devices/legacy/serial.rs 0.00% 4 Missing ⚠️
src/vmm/src/lib.rs 70.00% 3 Missing ⚠️
src/vmm/src/rpc_interface.rs 0.00% 3 Missing ⚠️
src/firecracker/src/api_server/request/serial.rs 87.50% 1 Missing ⚠️
src/vmm/src/resources.rs 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5350      +/-   ##
==========================================
+ Coverage   82.27%   82.29%   +0.01%     
==========================================
  Files         265      266       +1     
  Lines       30585    30619      +34     
==========================================
+ Hits        25165    25199      +34     
  Misses       5420     5420              
Flag Coverage Δ
5.10-c5n.metal 82.25% <69.23%> (-0.05%) ⬇️
5.10-m5n.metal 82.25% <69.23%> (-0.05%) ⬇️
5.10-m6a.metal 81.52% <69.23%> (-0.05%) ⬇️
5.10-m6g.metal 78.86% <66.17%> (-0.02%) ⬇️
5.10-m6i.metal 82.24% <69.23%> (-0.05%) ⬇️
5.10-m7a.metal-48xl 81.50% <69.23%> (?)
5.10-m7g.metal 78.86% <66.17%> (-0.02%) ⬇️
5.10-m7i.metal-24xl 82.21% <69.23%> (?)
5.10-m7i.metal-48xl 82.21% <69.23%> (?)
5.10-m8g.metal-24xl 78.86% <66.17%> (?)
5.10-m8g.metal-48xl 78.86% <66.17%> (?)
6.1-c5n.metal 82.29% <69.23%> (-0.05%) ⬇️
6.1-m5n.metal 82.29% <69.23%> (-0.05%) ⬇️
6.1-m6a.metal 81.57% <69.23%> (-0.05%) ⬇️
6.1-m6g.metal 78.86% <66.17%> (-0.02%) ⬇️
6.1-m6i.metal 82.28% <69.23%> (-0.05%) ⬇️
6.1-m7a.metal-48xl 81.55% <69.23%> (?)
6.1-m7g.metal 78.86% <66.17%> (-0.02%) ⬇️
6.1-m7i.metal-24xl 82.30% <69.23%> (?)
6.1-m7i.metal-48xl 82.30% <69.23%> (?)
6.1-m8g.metal-24xl 78.86% <66.17%> (?)
6.1-m8g.metal-48xl 78.86% <66.17%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@roypat roypat force-pushed the more-debug branch 2 times, most recently from 15dd953 to 78977d3 Compare August 7, 2025 07:48
@roypat roypat added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Aug 7, 2025
@roypat roypat changed the base branch from main to feature/pcie August 7, 2025 09:40
@roypat roypat force-pushed the more-debug branch 4 times, most recently from 0eedf63 to c4665a6 Compare August 7, 2025 11:54
@roypat roypat force-pushed the more-debug branch 6 times, most recently from aa06cf0 to c91280a Compare August 8, 2025 08:33
@roypat roypat force-pushed the more-debug branch 2 times, most recently from 1382aca to 1255df3 Compare August 12, 2025 13:29
@roypat roypat force-pushed the more-debug branch 2 times, most recently from 7f7e4df to 5dd94ee Compare August 12, 2025 14:58
@roypat roypat force-pushed the more-debug branch 3 times, most recently from e46633c to 049d1b3 Compare August 12, 2025 15:49
@roypat roypat changed the base branch from feature/pcie to main August 12, 2025 15:49
@roypat roypat force-pushed the more-debug branch 3 times, most recently from 2b363cb to 926e8d3 Compare August 14, 2025 14:13
roypat added 13 commits August 14, 2025 15:14
This type is only useful if we are dynamically dispatching io via the
Read/Write traits, but we are not doing that (we converted everything to
static dispatch a while back, for better or for worse, and to avoid
conflicts with the PCI branch, I'm not reverting that here).

Signed-off-by: Patrick Roy <[email protected]>
The `event_observer` field in the Vmm struct is of type Option<Stdin>.
There are two problems
1. With the way the code is written, it will never be `None`
2. `Stdin` is a singleton, there is no need to store it _anywhere_.

With that in mind, we can just remove this field, and update its two
uses to just directly operate on std::io::stdin(). Since it never
`None', we can also remove the logic that matches and handles the `None`
case. Furthermore, the `Drop` impl used to print the same error message
twice in case resetting stdin to canonical mode failed, so fix that to
only print it once.

Signed-off-by: Patrick Roy <[email protected]>
We always called it right before setup_serial_device(), so just move the
call inside. This will also be helpful once setup_serial_device() can
setup the serial device to print to a file, where we dont need to mess
with stdout in the first place.

Signed-off-by: Patrick Roy <[email protected]>
Have teh aarch64 serial device restoration path use the common serial
device initialization code in setup_serial_device(). The only functional
change here is that now we do set_stdou_nonblocking() on the restore
path, although I would wager that not doing so in the past was actually
an oversight, because we do it for all booted VMs, and for restored VMs
on x86.

Signed-off-by: Patrick Roy <[email protected]>
we only write to it, so no need to pass .read(true).

Signed-off-by: Patrick Roy <[email protected]>
Add a new field, `serial_out_path`, to the logger configuration
(available both via API and CLI) which can be set to the path of a file
into which the output of the guest's serial console should be dumped.

Have the file behave identically to our log file (e.g. it must already
exist, Firecracker does not create it).

If we redirect serial output to a log file, disable serial input via
stdin.

Signed-off-by: Patrick Roy <[email protected]>
Copy the screen log file (which contains firecracker's stdout/guest
dmesg if non-daemonized / no log file specified) to the test_results
directory, to be included in artifacts, instead of explicitly reading
one file and subsequently writing another file in Python code.

Signed-off-by: Patrick Roy <[email protected]>
If SSH to guest is failing, we don't really get much useful in terms of
logs in our ci artifacts. Enable serial console so that we get guest
dmesg always, at least if we have the API server around (which is the
case for the vast majority of tests).

Since serial console is enabled by default now, this means we can stop
explicitly passing modified bootargs in a bunch of tests now.

Disable it for the tests that interact with the serial console via
stdin/stdout (here we are not daemonizing, so we are capturing serial
output anyway).

Disable pylint's "too-many-statements" lint, because it started firing
in microvm.spawn()

Signed-off-by: Patrick Roy <[email protected]>
Add a test that verifies having serial output go to a file works.

Signed-off-by: Patrick Roy <[email protected]>
In test_serial_console_login, we were implementing something that takes
three lines (send and receive data via serial) via an overcomplicated
state machine system. Eliminate the 100 lines of state machine code and
instead just... do the 3 lines of serial.tx/rx.

Signed-off-by: Patrick Roy <[email protected]>
If a test fails, take a snapshot fo the microvms involved, and upload it
as a CI artifact, for post-mortem analysis.

Signed-off-by: Patrick Roy <[email protected]>
These have only caused confusion, as they're output often makes no
sense, and we generally just end up ignoring them (but have discussions
about it every single time). Let's remove them.

Signed-off-by: Patrick Roy <[email protected]>
When A/B-testing, the "A" revision firecracker binary does not know
about the new --serial-out-path CLI argument. Thus, disable the serial
console for all A/B-tests. It gets quite ugly to do for the
vulnerability tests, because the .spawn() call is burried deep in the
fixtures.

This commit should be reverted after merging the PR

Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat requested a review from bchalios August 18, 2025 11:20
Copy link
Contributor

@bchalios bchalios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me. Just a comment regarding the state machine dropping

@roypat roypat enabled auto-merge (rebase) August 18, 2025 12:44
@roypat roypat merged commit a2d1c9e into firecracker-microvm:main Aug 18, 2025
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Awaiting review Indicates that a pull request is ready to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants