-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Refactor: Move GuestMemoryMmap
into struct Vm
#5125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d9de063
to
7933d66
Compare
The explicit calls to GuestMemoryMmap::create ended up being touched in like 5 of the following commits, so let's reduce the churn by putting it into a function. Signed-off-by: Patrick Roy <[email protected]>
Have all of them operate on some sort of list of (GuestAddress, usize) tuples, for consistency. Signed-off-by: Patrick Roy <[email protected]>
This struct will hold all the fields that are common between the x86_64 and aarch64 definition of `struct ArchVm`. For now, this is only the `VmFd`, but this will grow. Signed-off-by: Patrick Roy <[email protected]>
This is to prepare for memory regions being registered to a VM incrementally. While we're at it, add a unit test for it. It completes in roughly 50s on my machine, and will become more useful later once we actually allow building mmap regions incrementally. Signed-off-by: Patrick Roy <[email protected]>
Semantically, this is where it belong - the memory is associated with the virtual machine, not the virtual machine manager. But independently of this, it is needed for borrowchk reasons: In the future, each Vm will have two guest memory regions associated with it. One for traditional guest memory, and another that is dedicated to I/O (swiotlb region). Storing these two GuestMemoryMmap in `struct Vm` allows for a function as follows: fn io_memory(&self) -> &GuestMemoryMmap { self.io_memory.as_ref().unwrap_or(&self.normal_memory) } Such a function cannot live in `struct Vmm`, as that would mean borrowing the guest memory implies borrowing the entire `Vmm` object (because rustc is not intelligent enough to project split borrows past function calls - e.g. it doesn't know the references returned by io_memory() really only borrows the io_memory and normal_memory fields, not everything else.). However, this means it cannot be used in callsites where such splitting of borrows is required (e.g. everywhere where both a guest memory reference _and_ another Vmm field are passed by reference to a function). The `Option`-ness of the the guest_memory field is transitory. Signed-off-by: Patrick Roy <[email protected]>
This is in preparation of guest memory itself only getting set up after the call to create_vmm_and_vcpus(). Signed-off-by: Patrick Roy <[email protected]>
Add `Vm::register_memory_region[s]`, which allow adding new memory regions to a Vm incrementally (e.g. without overwriting the previously registered regions). To use this, have the functions in the `memory` module hand out Vec<GuestMemoryRegion> instead of GuestMemoryMmap, because vm_memory does not offer APIs for extracing the regions of a GuestMemoryMmap again. Signed-off-by: Patrick Roy <[email protected]>
Since the guest memory is now stored in `struct Vm`, we can more easily store the guest memory state inside the VmState. Signed-off-by: Patrick Roy <[email protected]>
Now that guest memory is tracked inside struct Vm instead of struct Vmm, the dirty bitmap tracking functions can also start living in struct Vm. While we're touching it anyway, update the types used to kvm memslot numbers to be the native u32. This saves us a few ugly usize->u32 cast. Also change the error type as to not drag in VmmError into the vm module, where it really doesn't belong. Signed-off-by: Patrick Roy <[email protected]>
Do this in create_snapshot() instead of snapshot_memory_to_file(), as it was the only operation inside snapshot_memory_to_file() that required access to a struct Vmm, opening the gates for move the rest of snapshot_memory_to_file() into struct Vm. Signed-off-by: Patrick Roy <[email protected]>
struct Vm encapsulates the memory, so having this function live there makes sense. It also prepares us for the future where multiple GuestMemoryMmap objects need to be managed. Signed-off-by: Patrick Roy <[email protected]>
Avoid some .map_err() calls using #[from] and the auto-From::from calling of the question mark operator. Signed-off-by: Patrick Roy <[email protected]>
We can just immediately propagate errors to the caller using `?` syntax, which saves us the "if (...).is_ok()" check. Signed-off-by: Patrick Roy <[email protected]>
Otherwise it's impossible to figure out what precisely went wrong during queue restoration. Signed-off-by: Patrick Roy <[email protected]>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5125 +/- ##
==========================================
- Coverage 83.07% 83.03% -0.05%
==========================================
Files 250 250
Lines 26901 26911 +10
==========================================
- Hits 22349 22345 -4
- Misses 4552 4566 +14
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request refactors the guest memory management by moving GuestMemoryMmap from Vmm into Vm and updating all corresponding usages in production and test code. Key changes include:
- Replacing vmm.guest_memory with vmm.vm.guest_memory() in various modules.
- Updating test helpers and signatures in builder, architecture, and device modules to accommodate the new guest memory API.
- Minor improvements to error message formatting in snapshot persistence.
Reviewed Changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
src/vmm/src/gdb/* | Updated memory read/write calls to use vm.guest_memory() |
src/vmm/src/devices/virtio/* | Adjusted guest memory calls and test module visibility |
src/vmm/src/builder.rs | Updated memory initialization and snapshot restore |
src/vmm/src/arch/x86_64/* | Updated arch_memory_regions and guest memory handling adjustments |
src/vmm/src/arch/aarch64/* | Updated guest memory calls and VM state save/restore |
src/vmm/benches/memory_access.rs | Minor improvement in accessing the first memory region |
Comments suppressed due to low confidence (3)
src/vmm/src/devices/virtio/vhost_user.rs:462
- [nitpick] The test module's visibility has been changed from private to pub(crate). If this change is intentional for test accessibility purposes, consider adding a comment to document the rationale behind exposing the tests.
pub(crate) mod tests {
src/vmm/src/devices/virtio/persist.rs:25
- [nitpick] The error message for QueueConstruction now includes a placeholder. Please verify that this formatting meets your intended error reporting style and matches all related formatting conventions.
/// Could not restore queue: {0}
src/vmm/benches/memory_access.rs:16
- [nitpick] Switching from iter().next() to first() improves clarity; please ensure that this change preserves the intended ordering and behavior of the memory regions.
let ptr = memory.first().unwrap().as_ptr();
All in all, this is a great change I think. To answer your initial question, It definitely helps massively in case we want to build |
2c66a1b
to
66f9d7f
Compare
This allows having regions that start somewhere other than guest physical address 0. Useful in case not all regions are allocated at once, and each batch needs to be placed after all already allocated regions. Since the resulting code has some intricacies (and I actually got it wrong the first time around), write a kani harness to validate that we're computing the regions correctly. On ARM, add a warning in case we hit some questionable logic around truncation of memory sizes if they exceed architectural bounds (although if anyone is hitting this, they're already doing something wrong anyway, because it'd require a VM with more than 1022GiB of memory). Signed-off-by: Patrick Roy <[email protected]>
This way the non-snapshot path doesnt have to deal with passing in `None`. Signed-off-by: Patrick Roy <[email protected]>
Only enabling it when a balloon device is present doesn't really gain us anything, because the only time this feature is checked in the kernel is as part of madvise handling - but without balloon devices we wont ever call madvise, so it makes no difference whether we enabled the feature or not (it is unconditionally available since linux 4.10). Signed-off-by: Patrick Roy <[email protected]>
Dropping the VM object closes the VM FD, which can cause the test to hang. Signed-off-by: Patrick Roy <[email protected]>
Changes
Move the
GuestMemoryMmap
that keeps track of a Firecracker VM's memory regions intostruct Vm
, fromstruct Vmm
, and then update some related functionality (such as various snapshot helper functions) to operate onstruct Vm
instead ofstruct Vmm
.Reason
So far, Firecracker always dealt with a single collection of homogeneous guest memory regions. However, #5108 introduces a second type of memory regions, which makes the set of memory regions heterogeneous, and other features, such as virtio-pmem support, will come with similar "special" memory regions. By moving the guest memory management into
Vm
, we encapsule this heterogeneity and allow incremental building of guest memory configuration.Admittedly, there's nothing right now that would benefit from this refactor (it doesn't even reduce LoC 😢), but it will most certainly make life on the feature branch easier to not have to continuously rebase this 😅
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md
.PR Checklist
tools/devtool checkstyle
to verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md
.Runbook for Firecracker API changes.
integration tests.
TODO
.rust-vmm
.