
Conversation

roypat
Contributor

@roypat roypat commented Mar 31, 2025

Changes

Move the GuestMemoryMmap that keeps track of a Firecracker VM's memory regions into struct Vm, from struct Vmm, and then update some related functionality (such as various snapshot helper functions) to operate on struct Vm instead of struct Vmm.

Reason

So far, Firecracker has always dealt with a single collection of homogeneous guest memory regions. However, #5108 introduces a second type of memory region, which makes the set of memory regions heterogeneous, and other features, such as virtio-pmem support, will come with similar "special" memory regions. By moving guest memory management into Vm, we encapsulate this heterogeneity and allow the guest memory configuration to be built incrementally.

Admittedly, nothing right now benefits from this refactor (it doesn't even reduce LoC 😢), but it will most certainly make life on the feature branch easier, since we won't have to continuously rebase this 😅
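As a purely illustrative sketch of the encapsulation this enables (all names hypothetical, not Firecracker's actual types), `Vm` can own a mixed set of region kinds while callers stay oblivious to the heterogeneity:

```rust
/// Hypothetical region kinds; #5108 and virtio-pmem would add variants here.
#[derive(PartialEq)]
pub enum RegionKind {
    Normal,
    Swiotlb,
}

pub struct MemRegion {
    pub start: u64,
    pub len: usize,
    pub kind: RegionKind,
}

pub struct Vm {
    regions: Vec<MemRegion>,
}

impl Vm {
    pub fn new(regions: Vec<MemRegion>) -> Self {
        Vm { regions }
    }

    /// Callers iterate "plain" guest memory without needing to know
    /// which other kinds of regions the Vm is tracking.
    pub fn normal_regions(&self) -> impl Iterator<Item = &MemRegion> + '_ {
        self.regions.iter().filter(|r| r.kind == RegionKind::Normal)
    }
}
```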

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

@roypat roypat force-pushed the guest-mem-in-vm branch 3 times, most recently from d9de063 to 7933d66 Compare March 31, 2025 11:06
@roypat roypat marked this pull request as ready for review March 31, 2025 11:18
roypat added 14 commits March 31, 2025 13:17
The explicit calls to GuestMemoryMmap::create ended up being touched in about 5 of the following commits, so let's reduce the churn by putting them into a function.

Signed-off-by: Patrick Roy <[email protected]>
Have all of them operate on some sort of list of (GuestAddress, usize)
tuples, for consistency.

Signed-off-by: Patrick Roy <[email protected]>
This struct will hold all the fields that are common between the x86_64 and aarch64 definitions of `struct ArchVm`. For now, this is only the `VmFd`, but this will grow.

Signed-off-by: Patrick Roy <[email protected]>
This is to prepare for memory regions being registered to a VM
incrementally.

While we're at it, add a unit test for it. It completes in roughly 50s
on my machine, and will become more useful later once we actually allow
building mmap regions incrementally.

Signed-off-by: Patrick Roy <[email protected]>
Semantically, this is where it belongs - the memory is associated with the virtual machine, not the virtual machine manager.

But independently of this, it is needed for borrowchk reasons: In the
future, each Vm will have two guest memory regions associated with it.
One for traditional guest memory, and another that is dedicated to I/O
(swiotlb region).  Storing these two GuestMemoryMmap in `struct Vm`
allows for a function as follows:

fn io_memory(&self) -> &GuestMemoryMmap {
    self.io_memory.as_ref().unwrap_or(&self.normal_memory)
}

Such a function cannot live in `struct Vmm`, as that would mean borrowing the guest memory implies borrowing the entire `Vmm` object (rustc is not intelligent enough to project split borrows past function calls - e.g. it doesn't know that the reference returned by io_memory() really only borrows the io_memory and normal_memory fields, not everything else). This means it could not be used at callsites where such splitting of borrows is required (e.g. everywhere both a guest memory reference _and_ another Vmm field are passed by reference to a function).
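The borrow-splitting limitation can be reproduced with a minimal sketch (types simplified, not Firecracker's actual structs):

```rust
struct Vmm {
    guest_memory: Vec<u8>,
    dirty_pages: u64,
}

impl Vmm {
    /// Borrowing through a method borrows *all* of `self` as far as
    /// rustc is concerned, not just the field being returned.
    fn memory(&self) -> &Vec<u8> {
        &self.guest_memory
    }
}

fn demo() -> (usize, u64) {
    let mut vmm = Vmm { guest_memory: vec![0; 16], dirty_pages: 0 };

    // Direct field borrows can be split: rustc sees disjoint fields.
    let mem = &vmm.guest_memory;
    vmm.dirty_pages += 1; // OK: `dirty_pages` is disjoint from `guest_memory`
    let len = mem.len();  // `mem` is still live across the mutation

    // The same pattern through the accessor is rejected:
    //     let mem = vmm.memory();
    //     vmm.dirty_pages += 1; // ERROR: cannot borrow `vmm` as mutable
    //     let _ = mem.len();    //        while it is also borrowed as immutable
    assert_eq!(vmm.memory().len(), len); // the accessor alone is fine

    (len, vmm.dirty_pages)
}
```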

The `Option`-ness of the guest_memory field is transitory.

Signed-off-by: Patrick Roy <[email protected]>
This is in preparation of guest memory itself only getting set up after
the call to create_vmm_and_vcpus().

Signed-off-by: Patrick Roy <[email protected]>
Add `Vm::register_memory_region[s]`, which allow adding new memory
regions to a Vm incrementally (e.g. without overwriting the previously
registered regions).

To use this, have the functions in the `memory` module hand out Vec<GuestMemoryRegion> instead of GuestMemoryMmap, because vm_memory does not offer APIs for extracting the regions of a GuestMemoryMmap again.
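A sketch of what incremental registration might look like (slot bookkeeping heavily simplified; the real implementation registers each region with KVM via the KVM_SET_USER_MEMORY_REGION ioctl):

```rust
/// Simplified stand-in for a guest memory region: (guest address, length).
type Region = (u64, usize);

#[derive(Default)]
struct Vm {
    regions: Vec<Region>,
}

impl Vm {
    /// Registers one region without discarding previously registered ones,
    /// returning the kvm-style slot number it was assigned.
    fn register_memory_region(&mut self, region: Region) -> u32 {
        let slot = self.regions.len() as u32;
        // Real code would issue the KVM_SET_USER_MEMORY_REGION ioctl here.
        self.regions.push(region);
        slot
    }

    /// Convenience wrapper over register_memory_region for a batch.
    fn register_memory_regions(&mut self, regions: Vec<Region>) {
        for r in regions {
            self.register_memory_region(r);
        }
    }
}
```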

Signed-off-by: Patrick Roy <[email protected]>
Since the guest memory is now stored in `struct Vm`, we can more easily
store the guest memory state inside the VmState.

Signed-off-by: Patrick Roy <[email protected]>
Now that guest memory is tracked inside struct Vm instead of struct Vmm,
the dirty bitmap tracking functions can also start living in struct Vm.

While we're touching it anyway, update the types used for kvm memslot numbers to the native u32. This saves us a few ugly usize->u32 casts.

Also change the error type so as not to drag VmmError into the vm module, where it really doesn't belong.

Signed-off-by: Patrick Roy <[email protected]>
Do this in create_snapshot() instead of snapshot_memory_to_file(), as it was the only operation inside snapshot_memory_to_file() that required access to a struct Vmm, opening the gates for moving the rest of snapshot_memory_to_file() into struct Vm.

Signed-off-by: Patrick Roy <[email protected]>
struct Vm encapsulates the memory, so having this function live there
makes sense. It also prepares us for the future where multiple
GuestMemoryMmap objects need to be managed.

Signed-off-by: Patrick Roy <[email protected]>
Avoid some .map_err() calls using #[from] and the auto-From::from
calling of the question mark operator.
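The pattern being described, sketched with a hand-written `From` impl (error and function names hypothetical; in the actual code, thiserror's `#[from]` attribute generates an equivalent impl):

```rust
use std::num::ParseIntError;

// Without a From impl, callers need `.map_err(ConfigError::Parse)?`;
// with one, a plain `?` performs the conversion automatically.
#[derive(Debug)]
enum ConfigError {
    Parse(ParseIntError),
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

/// Parses a memory size in MiB and returns it in bytes.
fn parse_mem_size(s: &str) -> Result<u64, ConfigError> {
    let mib: u64 = s.parse()?; // `?` calls From::from on the error for us
    Ok(mib << 20)
}
```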

Signed-off-by: Patrick Roy <[email protected]>
We can just immediately propagate errors to the caller using `?` syntax,
which saves us the "if (...).is_ok()" check.
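A before/after sketch of that simplification (function names hypothetical):

```rust
fn restore_step() -> Result<u32, String> {
    Ok(42)
}

// Before: explicit is_ok() check plus a hand-rolled error path
// that discards the original error.
fn restore_verbose() -> Result<u32, String> {
    let res = restore_step();
    if res.is_ok() {
        Ok(res.unwrap())
    } else {
        Err("restore failed".to_string())
    }
}

// After: `?` propagates the original error directly to the caller.
fn restore() -> Result<u32, String> {
    Ok(restore_step()?)
}
```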

Signed-off-by: Patrick Roy <[email protected]>
Otherwise it's impossible to figure out what precisely went wrong during
queue restoration.

Signed-off-by: Patrick Roy <[email protected]>

codecov bot commented Mar 31, 2025

Codecov Report

Attention: Patch coverage is 85.94249% with 44 lines in your changes missing coverage. Please review.

Project coverage is 83.03%. Comparing base (1f1c5c0) to head (8055e0a).
Report is 18 commits behind head on main.

Files with missing lines Patch % Lines
src/vmm/src/vstate/memory.rs 77.02% 17 Missing ⚠️
src/vmm/src/persist.rs 58.33% 10 Missing ⚠️
src/vmm/src/arch/x86_64/mod.rs 75.00% 6 Missing ⚠️
src/vmm/src/arch/aarch64/mod.rs 84.00% 4 Missing ⚠️
src/vmm/src/vstate/vm.rs 97.32% 3 Missing ⚠️
src/vmm/src/resources.rs 66.66% 2 Missing ⚠️
src/vmm/src/builder.rs 94.11% 1 Missing ⚠️
src/vmm/src/lib.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5125      +/-   ##
==========================================
- Coverage   83.07%   83.03%   -0.05%     
==========================================
  Files         250      250              
  Lines       26901    26911      +10     
==========================================
- Hits        22349    22345       -4     
- Misses       4552     4566      +14     
Flag Coverage Δ
5.10-c5n.metal 83.55% <85.76%> (+0.01%) ⬆️
5.10-m5n.metal 83.55% <85.76%> (+0.01%) ⬆️
5.10-m6a.metal 82.71% <85.76%> (+<0.01%) ⬆️
5.10-m6g.metal 79.37% <85.92%> (-0.05%) ⬇️
5.10-m6i.metal 83.54% <85.76%> (+0.01%) ⬆️
5.10-m7a.metal-48xl 82.71% <85.76%> (?)
5.10-m7g.metal 79.37% <85.92%> (-0.05%) ⬇️
6.1-c5n.metal 83.60% <85.76%> (+0.01%) ⬆️
6.1-m5n.metal 83.59% <85.76%> (+0.01%) ⬆️
6.1-m6a.metal 82.76% <85.76%> (+0.01%) ⬆️
6.1-m6g.metal 79.37% <85.92%> (-0.05%) ⬇️
6.1-m6i.metal 83.59% <85.76%> (+0.01%) ⬆️
6.1-m7a.metal-48xl 82.76% <85.76%> (?)
6.1-m7g.metal 79.36% <85.92%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.

@roypat roypat added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Mar 31, 2025
@kalyazin kalyazin requested a review from Copilot April 1, 2025 13:04

@Copilot Copilot AI left a comment


Pull Request Overview

This pull request refactors the guest memory management by moving GuestMemoryMmap from Vmm into Vm and updating all corresponding usages in production and test code. Key changes include:

  • Replacing vmm.guest_memory with vmm.vm.guest_memory() in various modules.
  • Updating test helpers and signatures in builder, architecture, and device modules to accommodate the new guest memory API.
  • Minor improvements to error message formatting in snapshot persistence.

Reviewed Changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/vmm/src/gdb/* Updated memory read/write calls to use vm.guest_memory()
src/vmm/src/devices/virtio/* Adjusted guest memory calls and test module visibility
src/vmm/src/builder.rs Updated memory initialization and snapshot restore
src/vmm/src/arch/x86_64/* Updated arch_memory_regions and guest memory handling adjustments
src/vmm/src/arch/aarch64/* Updated guest memory calls and VM state save/restore
src/vmm/benches/memory_access.rs Minor improvement in accessing the first memory region
Comments suppressed due to low confidence (3)

src/vmm/src/devices/virtio/vhost_user.rs:462

  • [nitpick] The test module's visibility has been changed from private to pub(crate). If this change is intentional for test accessibility purposes, consider adding a comment to document the rationale behind exposing the tests.
pub(crate) mod tests {

src/vmm/src/devices/virtio/persist.rs:25

  • [nitpick] The error message for QueueConstruction now includes a placeholder. Please verify that this formatting meets your intended error reporting style and matches all related formatting conventions.
/// Could not restore queue: {0}

src/vmm/benches/memory_access.rs:16

  • [nitpick] Switching from iter().next() to first() improves clarity; please ensure that this change preserves the intended ordering and behavior of the memory regions.
let ptr = memory.first().unwrap().as_ptr();

kalyazin
kalyazin previously approved these changes Apr 1, 2025
@roypat roypat requested review from JackThomson2 and bchalios April 1, 2025 13:59
JackThomson2
JackThomson2 previously approved these changes Apr 1, 2025
@bchalios
Copy link
Contributor

bchalios commented Apr 2, 2025

All in all, this is a great change I think. To answer your initial question, it definitely helps massively in case we want to build virtio-pmem at some point.

@roypat roypat dismissed stale reviews from JackThomson2 and kalyazin via d9474b4 April 2, 2025 12:44
@roypat roypat force-pushed the guest-mem-in-vm branch 4 times, most recently from 2c66a1b to 66f9d7f Compare April 2, 2025 12:56
roypat added 4 commits April 2, 2025 15:59
This allows having regions that start somewhere other than guest
physical address 0. Useful in case not all regions are allocated at
once, and each batch needs to be placed after all already allocated
regions.

Since the resulting code has some intricacies (and I actually got it
wrong the first time around), write a kani harness to validate that
we're computing the regions correctly.

On ARM, add a warning in case we hit some questionable logic around
truncation of memory sizes if they exceed architectural bounds (although
if anyone is hitting this, they're already doing something wrong anyway,
because it'd require a VM with more than 1022GiB of memory).
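The placement logic being described can be sketched as follows (gap bounds illustrative; Firecracker's real x86_64 constants and the kani-verified implementation differ in detail). A hypothetical `arch_memory_regions` places `size` bytes starting at `offset`, splitting around a 32-bit MMIO gap so that later batches can start after already-allocated regions instead of at guest physical address 0:

```rust
/// Illustrative 32-bit MMIO gap: [3 GiB, 4 GiB).
const GAP_START: u64 = 3 << 30;
const GAP_END: u64 = 4 << 30;

/// Computes (start, len) regions for `size` bytes placed at `offset`,
/// skipping the MMIO gap.
fn arch_memory_regions(offset: u64, size: u64) -> Vec<(u64, u64)> {
    let mut regions = Vec::new();
    let mut remaining = size;
    let mut start = offset;

    // If the batch would begin inside the gap, hop over it first.
    if (GAP_START..GAP_END).contains(&start) {
        start = GAP_END;
    }
    // The part that fits below the gap.
    if start < GAP_START && remaining > 0 {
        let below = remaining.min(GAP_START - start);
        regions.push((start, below));
        remaining -= below;
        start = GAP_END;
    }
    // Everything else goes above the gap.
    if remaining > 0 {
        regions.push((start, remaining));
    }
    regions
}
```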

Signed-off-by: Patrick Roy <[email protected]>
This way the non-snapshot path doesn't have to deal with passing in `None`.

Signed-off-by: Patrick Roy <[email protected]>
Only enabling it when a balloon device is present doesn't really gain us anything, because the only time this feature is checked in the kernel is as part of madvise handling - but without a balloon device we won't ever call madvise, so it makes no difference whether we enable the feature or not (it has been unconditionally available since Linux 4.10).

Signed-off-by: Patrick Roy <[email protected]>
Dropping the VM object closes the VM FD, which can cause the test to
hang.

Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat force-pushed the guest-mem-in-vm branch from fb880e7 to 1fc262b Compare April 2, 2025 15:01
@roypat roypat enabled auto-merge (rebase) April 2, 2025 15:22
@roypat roypat merged commit 45fc21b into firecracker-microvm:main Apr 2, 2025
6 of 7 checks passed

Labels

Status: Awaiting review Indicates that a pull request is ready to be reviewed


4 participants