@Manciukic Manciukic commented Jul 9, 2025

Changes

Fix miscellaneous bugs introduced by PCI:

  • PCI devices could only use 24 IRQ lines on x86, which severely limited the number of devices that could be attached to the bus
  • virtio-pci devices couldn't be patched
  • virtio-pci balloon device couldn't be found
  • virtio-pci network couldn't be renamed on restore

Also add test coverage that would have caught the bugs above: the PCI configuration moves to the VM factory, so that the uvm_plain* fixtures also run the tests on PCIe kernels.

Additionally, run more tests on PCI. Ideally, every test should run on PCI unless there's a specific reason not to.

Reason

Not all tests were running on PCIe, hiding some issues.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

codecov bot commented Jul 9, 2025

Codecov Report

Attention: Patch coverage is 47.32143% with 59 lines in your changes missing coverage. Please review.

Project coverage is 80.16%. Comparing base (19f2790) to head (f4e07c7).
Report is 16 commits behind head on feature/pcie.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/vmm/src/device_manager/mod.rs | 0.00% | 21 Missing ⚠️ |
| src/vmm/src/lib.rs | 0.00% | 16 Missing ⚠️ |
| src/vmm/src/persist.rs | 0.00% | 16 Missing ⚠️ |
| src/vmm/src/arch/aarch64/gic/gicv2/mod.rs | 0.00% | 3 Missing ⚠️ |
| ...c/vmm/src/arch/aarch64/gic/gicv2/regs/dist_regs.rs | 0.00% | 2 Missing ⚠️ |
| src/vmm/src/device_manager/mmio.rs | 94.11% | 1 Missing ⚠️ |

Additional details and impacted files
@@               Coverage Diff                @@
##           feature/pcie    #5300      +/-   ##
================================================
- Coverage         80.20%   80.16%   -0.05%     
================================================
  Files               265      265              
  Lines             30832    30863      +31     
================================================
+ Hits              24730    24741      +11     
- Misses             6102     6122      +20     
| Flag | Coverage Δ |
|---|---|
| 5.10-c5n.metal | 79.90% <43.15%> (-0.05%) ⬇️ |
| 5.10-m5n.metal | 79.90% <43.15%> (-0.06%) ⬇️ |
| 5.10-m6a.metal | 79.09% <43.15%> (-0.07%) ⬇️ |
| 5.10-m6g.metal | 76.49% <43.80%> (-0.05%) ⬇️ |
| 5.10-m6i.metal | 79.90% <43.15%> (-0.05%) ⬇️ |
| 5.10-m7a.metal-48xl | 79.08% <43.15%> (-0.06%) ⬇️ |
| 5.10-m7g.metal | 76.49% <43.80%> (-0.05%) ⬇️ |
| 5.10-m7i.metal-24xl | 79.86% <43.15%> (-0.06%) ⬇️ |
| 5.10-m7i.metal-48xl | 79.87% <43.15%> (-0.05%) ⬇️ |
| 5.10-m8g.metal-24xl | 76.49% <43.80%> (-0.05%) ⬇️ |
| 5.10-m8g.metal-48xl | 76.49% <43.80%> (-0.05%) ⬇️ |
| 6.1-c5n.metal | 79.94% <43.15%> (-0.06%) ⬇️ |
| 6.1-m5n.metal | 79.94% <43.15%> (-0.06%) ⬇️ |
| 6.1-m6a.metal | 79.14% <43.15%> (-0.06%) ⬇️ |
| 6.1-m6g.metal | 76.49% <43.80%> (-0.05%) ⬇️ |
| 6.1-m6i.metal | 79.94% <43.15%> (-0.05%) ⬇️ |
| 6.1-m7a.metal-48xl | 79.13% <43.15%> (-0.06%) ⬇️ |
| 6.1-m7g.metal | 76.49% <43.80%> (-0.05%) ⬇️ |
| 6.1-m7i.metal-24xl | 79.95% <43.15%> (-0.05%) ⬇️ |
| 6.1-m7i.metal-48xl | 79.95% <43.15%> (-0.06%) ⬇️ |
| 6.1-m8g.metal-24xl | 76.49% <43.80%> (-0.05%) ⬇️ |
| 6.1-m8g.metal-48xl | 76.49% <43.80%> (-0.05%) ⬇️ |

@Manciukic Manciukic force-pushed the pcie/run-all-tests-on-pcie branch 7 times, most recently from 9f992ac to fcaff05 Compare July 9, 2025 16:26
@Manciukic Manciukic mentioned this pull request Jul 10, 2025
10 tasks
@Manciukic Manciukic force-pushed the pcie/run-all-tests-on-pcie branch 3 times, most recently from 104d1e1 to f1dd2b4 Compare July 10, 2025 13:33
The VMM only checked the MMIO device manager when looking for the device
to update. Use the generic device manager instead.

Also update unit tests that expect a specific string.

Signed-off-by: Riccardo Mancini <[email protected]>
The code managing the balloon logic is only looking at the mmio device
manager. Make it use the generic device manager to find the device.

Signed-off-by: Riccardo Mancini <[email protected]>
The device rename wasn't working on PCI devices because the code only
checked the MMIO state.
Fix the bug by looking for the device to rename in both the mmio and pci
states.

Signed-off-by: Riccardo Mancini <[email protected]>
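
The three fixes above share one pattern: a lookup that consulted only the MMIO side now goes through something that knows about both transports. The sketch below is a toy Python model of that idea, not the Rust code from this PR; the class, its fields, and the method names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DeviceManager:
    """Toy model (hypothetical): one registry per transport, keyed by device id."""

    mmio_devices: Dict[str, str] = field(default_factory=dict)
    pci_devices: Dict[str, str] = field(default_factory=dict)

    def find_mmio_only(self, device_id: str) -> Optional[str]:
        # Behaviour described as buggy above: PCI devices are invisible.
        return self.mmio_devices.get(device_id)

    def find(self, device_id: str) -> Optional[str]:
        # Behaviour after the fix: check every transport before giving up.
        for registry in (self.mmio_devices, self.pci_devices):
            if device_id in registry:
                return registry[device_id]
        return None


manager = DeviceManager(pci_devices={"balloon": "virtio-pci balloon"})
print(manager.find_mmio_only("balloon"))  # None
print(manager.find("balloon"))            # virtio-pci balloon
```

A single lookup entry point means callers such as patch, balloon, and rename logic do not need to know which transport a device sits on.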
@Manciukic Manciukic force-pushed the pcie/run-all-tests-on-pcie branch 8 times, most recently from 56bb419 to 43484d5 Compare July 15, 2025 15:22
@Manciukic Manciukic marked this pull request as ready for review July 15, 2025 16:03
@Manciukic Manciukic added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Jul 15, 2025
@Manciukic Manciukic force-pushed the pcie/run-all-tests-on-pcie branch from 43484d5 to e6e1683 Compare July 16, 2025 08:49
Currently, we're limited to 24 GSI lines, which is too few for PCI
devices. Keep the current ranges as "legacy GSI", and create a new range
for "MSI GSI" that goes up to the KVM theoretical maximum of 4096 lines.

Signed-off-by: Riccardo Mancini <[email protected]>
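
As a rough illustration of the split described in this commit, the Python sketch below models two GSI ranges with a simple bump allocator. Only the numbers 24 and 4096 come from the commit message. The exact range boundaries, the assumption that the MSI range starts right after the legacy one, and every name here are assumptions, not the Rust implementation.

```python
GSI_LEGACY_NUM = 24  # from the commit message: the previous limit
GSI_MAX = 4096       # from the commit message: KVM's theoretical maximum

# Assumption: the ranges are contiguous and MSI starts right after legacy.
GSI_LEGACY = range(0, GSI_LEGACY_NUM)
GSI_MSI = range(GSI_LEGACY_NUM, GSI_MAX)


class GsiAllocator:
    """Hypothetical bump allocator handing out GSIs from one range."""

    def __init__(self, gsi_range: range) -> None:
        self._next = gsi_range.start
        self._end = gsi_range.stop

    def allocate(self) -> int:
        if self._next >= self._end:
            raise RuntimeError("no free GSI left in this range")
        gsi = self._next
        self._next += 1
        return gsi


legacy = GsiAllocator(GSI_LEGACY)
msi = GsiAllocator(GSI_MSI)
print(legacy.allocate(), msi.allocate())  # 0 24
```

Keeping the two ranges separate lets devices that need legacy interrupt lines keep their existing numbering while PCI devices draw from the much larger pool.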
To have a more consistent naming, it's best to use GSI instead of IRQ,
at least in places where it's meant just as an abstract index.

Signed-off-by: Riccardo Mancini <[email protected]>
Manciukic added 11 commits July 16, 2025 10:23
This patch makes 2 changes to make the test work on PCI:
 - simplify logic to find device address to be generic irrespective of
   ACPI/no-ACPI, PCI/no-PCI
 - move config offset from within the C program to the python test, as
   it's different between MMIO (0x100) and PCI (0x4000)

Signed-off-by: Riccardo Mancini <[email protected]>
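
A minimal sketch of what choosing the offset on the Python side could look like. The two offsets are the ones quoted in the commit message. The function name, the transport keys, the example base address, and the assumption that the config space sits at base plus offset are illustrative only.

```python
# Offsets quoted in the commit message above.
VIRTIO_CONFIG_OFFSET = {"mmio": 0x100, "pci": 0x4000}


def config_space_address(device_base: int, transport: str) -> int:
    """Return the address of the device config space (hypothetical helper).

    Assumes the config space lives at `device_base + offset`.
    """
    return device_base + VIRTIO_CONFIG_OFFSET[transport]


base = 0x1000_0000  # arbitrary example address, not taken from the PR
print(hex(config_space_address(base, "mmio")))  # 0x10000100
print(hex(config_space_address(base, "pci")))   # 0x10004000
```

With the offset computed by the test, the guest-side C program can take it as an argument and stay transport-agnostic.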
Tell systemd not to use "predictable names" for network devices (e.g.
enp0s1), but to keep the ethN names set by the kernel.
This is equivalent to passing net.ifnames=0 to the kernel command line.

Signed-off-by: Riccardo Mancini <[email protected]>
pci=off is just an optimization to skip the probing; it shouldn't matter
to the functionality of the tests. Drop it to allow them to run with
PCI.

Signed-off-by: Riccardo Mancini <[email protected]>
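
For tests that build their own kernel command line, dropping the flag amounts to removing one token. The helper below is a hypothetical sketch, and the boot-args string is only an example value.

```python
def drop_boot_arg(boot_args: str, arg: str = "pci=off") -> str:
    """Return boot_args without any token equal to `arg` (hypothetical helper)."""
    return " ".join(tok for tok in boot_args.split() if tok != arg)


print(drop_boot_arg("console=ttyS0 reboot=k panic=1 pci=off"))
# console=ttyS0 reboot=k panic=1
```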
All tests using uvm_plain or uvm_plain_any will start using PCI as well,
allowing more coverage for the PCI code.
This requires moving the PCI configuration to the VM factory from the
spawn method.

Signed-off-by: Riccardo Mancini <[email protected]>
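
The move from the spawn method to the factory can be pictured as follows. This is a self-contained toy model, not the test framework's code: the `Microvm` and `MicrovmFactory` classes, the `pci` keyword, the `plain_vm_variants` helper, and the kernel and rootfs strings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterable, Iterator


@dataclass
class Microvm:
    kernel: str
    rootfs: str
    pci: bool
    spawned: bool = False

    def spawn(self) -> "Microvm":
        # PCI was already decided at build time, so spawn() takes no PCI flag.
        self.spawned = True
        return self


class MicrovmFactory:
    def build(self, kernel: str, rootfs: str, pci: bool = False) -> Microvm:
        return Microvm(kernel=kernel, rootfs=rootfs, pci=pci)


def plain_vm_variants(
    factory: MicrovmFactory, kernels: Iterable[str], rootfs: str
) -> Iterator[Microvm]:
    """Yield one unspawned VM per (kernel, pci) pair, as a parametrized fixture might."""
    for kernel, pci in product(kernels, (False, True)):
        yield factory.build(kernel, rootfs, pci=pci)


variants = list(
    plain_vm_variants(MicrovmFactory(), ["linux-5.10", "linux-6.1"], "rootfs.img")
)
print(len(variants))  # 4
```

Because the PCI choice is made when the VM is built, any test that receives an unspawned VM from a fixture gets the PCI variant without changing its own spawn call.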
This patch replaces direct microvm_factory.build calls with a uvm_plain*
fixture wherever that is equivalent to the previous behaviour.

In particular:
 - microvm_factory.build(guest_kernel_linux_5_10, rootfs) => uvm_plain
 - microvm_factory.build(guest_kernel, rootfs) => uvm_plain_any

Signed-off-by: Riccardo Mancini <[email protected]>
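
The first mapping above can be illustrated with a before/after pair. To keep the sketch runnable without pytest, the fixtures are modelled as plain functions and the factory is a stub. The fixture names come from the commit message, while the function bodies and the stub are hypothetical.

```python
class StubFactory:
    """Stand-in for the real microvm_factory fixture (hypothetical)."""

    def build(self, kernel, rootfs):
        return {"kernel": kernel, "rootfs": rootfs}


def uvm_plain(microvm_factory, guest_kernel_linux_5_10, rootfs):
    """Assumed shape of the shared fixture: the build call lives here once."""
    return microvm_factory.build(guest_kernel_linux_5_10, rootfs)


# Before: the test builds the VM itself.
def check_kernel_old(microvm_factory, guest_kernel_linux_5_10, rootfs):
    vm = microvm_factory.build(guest_kernel_linux_5_10, rootfs)
    return vm["kernel"]


# After: the test only asks for the fixture.
def check_kernel_new(uvm_plain):
    return uvm_plain["kernel"]


factory = StubFactory()
vm = uvm_plain(factory, "linux-5.10", "rootfs.img")
print(check_kernel_old(factory, "linux-5.10", "rootfs.img") == check_kernel_new(vm))  # True
```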
Simplify the test code by introducing two new fixtures that are used in
a few places in the code.
This will also allow these tests to run on PCI.

Signed-off-by: Riccardo Mancini <[email protected]>
Run the test_run_concurrency with PCI enabled as well.

Signed-off-by: Riccardo Mancini <[email protected]>
Refactor the code to use common fixtures and run all the tests with PCI
enabled as well.

Signed-off-by: Riccardo Mancini <[email protected]>
Run the initrd tests also with PCI enabled to verify everything is still
working correctly.

Signed-off-by: Riccardo Mancini <[email protected]>
Run test_memory_overhead performance test also with PCI enabled.

Signed-off-by: Riccardo Mancini <[email protected]>
Run the restore latency tests also with PCI enabled to verify there is
no change.

Signed-off-by: Riccardo Mancini <[email protected]>
@Manciukic Manciukic force-pushed the pcie/run-all-tests-on-pcie branch from e6e1683 to f4e07c7 Compare July 16, 2025 09:23
@Manciukic Manciukic changed the title Run all tests on PCIe kernel Miscellaneous PCI fixes and more test coverage Jul 16, 2025
@roypat roypat left a comment

I am really looking forward to the day when we can drop the non-ACPI/PCI versions of these fixtures.

@roypat roypat enabled auto-merge (rebase) July 17, 2025 08:44
@roypat roypat merged commit ba962ba into firecracker-microvm:feature/pcie Jul 17, 2025
6 of 7 checks passed

Labels

Status: Awaiting review Indicates that a pull request is ready to be reviewed

3 participants