Skip to content

Conversation

@roypat
Copy link
Contributor

@roypat roypat commented Jun 20, 2025

As part of #5260 we noticed that parts of our virtio queue validation logic was duplicated between Queue::is_valid and Queue::initialize, resulting in most queue invariants being checked twice needlessly. This PR fixed this by merging the two functions, and having all validation happen right before activation (refusing to activate the device if validation of any of its queues fails).

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

Just call initialize() again. It does some needless alignment checking,
but that's not really harmful.

Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat force-pushed the vring-cleanup branch 3 times, most recently from 6fbb59c to f8cff4e Compare June 23, 2025 08:51
@codecov
Copy link

codecov bot commented Jun 23, 2025

Codecov Report

Attention: Patch coverage is 87.50000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 82.91%. Comparing base (ea6a553) to head (375549b).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
src/vmm/src/devices/virtio/device.rs 0.00% 3 Missing ⚠️
src/vmm/src/persist.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5276      +/-   ##
==========================================
+ Coverage   82.83%   82.91%   +0.08%     
==========================================
  Files         250      250              
  Lines       26968    26902      -66     
==========================================
- Hits        22338    22306      -32     
+ Misses       4630     4596      -34     
Flag Coverage Δ
5.10-c5n.metal 83.35% <87.50%> (+0.04%) ⬆️
5.10-m5n.metal 83.34% <87.50%> (+0.03%) ⬆️
5.10-m6a.metal 82.56% <87.50%> (+0.04%) ⬆️
5.10-m6g.metal 79.17% <87.50%> (+0.02%) ⬆️
5.10-m6i.metal 83.34% <87.50%> (+0.03%) ⬆️
5.10-m7a.metal-48xl 82.55% <87.50%> (?)
5.10-m7g.metal 79.17% <87.50%> (+0.02%) ⬆️
5.10-m7i.metal-24xl 83.31% <87.50%> (?)
5.10-m7i.metal-48xl 83.30% <87.50%> (?)
5.10-m8g.metal-24xl 79.17% <87.50%> (?)
5.10-m8g.metal-48xl 79.17% <87.50%> (?)
6.1-c5n.metal 83.40% <87.50%> (+0.04%) ⬆️
6.1-m5n.metal 83.40% <87.50%> (+0.03%) ⬆️
6.1-m6a.metal 82.61% <87.50%> (+0.03%) ⬆️
6.1-m6g.metal 79.17% <87.50%> (+0.02%) ⬆️
6.1-m6i.metal 83.38% <87.50%> (+0.02%) ⬆️
6.1-m7a.metal-48xl 82.59% <87.50%> (?)
6.1-m7g.metal 79.17% <87.50%> (+0.02%) ⬆️
6.1-m7i.metal-24xl 83.41% <87.50%> (?)
6.1-m7i.metal-48xl 83.41% <87.50%> (?)
6.1-m8g.metal-24xl 79.16% <87.50%> (?)
6.1-m8g.metal-48xl 79.17% <87.50%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

roypat added 3 commits June 23, 2025 10:01
Factor out the alignment checks on the vring components into
get_slice_ptr, instead of writing them out 3 times in initialize().
While we're at it, also explain why its okay to only alignment check the
GPA and not the HVAs as well.

Signed-off-by: Patrick Roy <[email protected]>
Vring validation was a bit awkwardly split across two functions which
did overlapping sets of checks: Queue::initialize verified alignment and
memory accesses, while Queue::is_valid additionally checked Queue::ready
and Queue::size. However, on the activation path, both were called,
meanign we checked alignment twice (.initialize() is called in
.activate(), but we only call .activate() if .is_valid() returned true).
This is confusing at best, and at worst made us potentially virtio spec
incompliant: If the quest tried to activate a virtio device, but
this failed because some vring was not valid (in terms of
Queue::is_valid), then Firecracker would silently ignore the activation
request. Now, it instead marks the device as needing reset, and notifies
the guest of its failure to properly configure the vrings.

While we're at it, also remove some duplicated checks from the vring
restoration code: .initialize() is called for activated devices, so
there's no need to later validate the size specifically again, and also
no need for the additional call to is_valid().

Fix up some unit tests that activate virtio devices where some queues do
not satisfy the old Queue::is_valid() checks, as now these checks must
pass for activation to succeed. The only interesting fix here is in
test_virtiodev_sanity_checks in virtio/persist.rs, which can be seen as
a symptom of a bug fix: Previously, restoration code refused to load
snapshots that had their queue size set to a value larger than
Queue::max_size, even if a device was not activated. This is arguably
wrong, as The guest can configure a queue to have a size greater than
max size no problem, and never activate the device for example, in which
case prior to this commit Firecracker would refuse to resume snapshots
taken of such VMs.

Signed-off-by: Patrick Roy <[email protected]>
verify_size only had assertions about our mocks, which is not very
useful (in fact, the second assertion was trivially true, no matter what
we did). So let's just remove it.

Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat marked this pull request as ready for review June 23, 2025 09:35
@roypat roypat added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Jun 23, 2025
@Manciukic Manciukic enabled auto-merge (rebase) June 24, 2025 15:01
@Manciukic Manciukic merged commit ca044db into firecracker-microvm:main Jun 24, 2025
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Awaiting review Indicates that a pull request is ready to be reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants