[virtio-pmem] Implementation #5463
Codecov Report
@@ Coverage Diff @@
## main #5463 +/- ##
==========================================
- Coverage 82.79% 82.56% -0.23%
==========================================
Files 263 269 +6
Lines 27223 27736 +513
==========================================
+ Hits 22538 22899 +361
- Misses 4685 4837 +152
d8f9547
to
5970613
Compare
a8bedbb
to
1d2aeb2
Compare
1d2aeb2
to
7d83503
Compare
- we should update docs/device-api.md
- changelog entry
- any performance tests? We could check how fast we can read or write the entire pmem, or maybe integrate it with the block tests using fio
}
}

fn write_config(&mut self, _offset: u64, _data: &[u8]) {}
we could log unexpected attempts to write
we don't log such writes in any device, so I don't think we should do it here.
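For reference, the logging suggestion above could look roughly like the sketch below. It is only an illustration: `ConfigSpace` and `ReadOnlyConfigDevice` are hypothetical stand-ins, not Firecracker's actual device trait or pmem type, and `eprintln!` stands in for whatever logger or metric a real device would use.

```rust
/// Hypothetical stand-in for a virtio device trait (assumption, not the real API).
trait ConfigSpace {
    fn write_config(&mut self, offset: u64, data: &[u8]);
}

/// Hypothetical device whose config space is read-only for the guest.
#[derive(Default)]
struct ReadOnlyConfigDevice {
    /// Number of guest writes to the config space that were ignored.
    ignored_config_writes: u64,
}

impl ConfigSpace for ReadOnlyConfigDevice {
    fn write_config(&mut self, offset: u64, data: &[u8]) {
        // The write is still ignored; it is only counted and reported.
        self.ignored_config_writes += 1;
        eprintln!(
            "ignoring guest write to read-only config space: offset={:#x}, len={}",
            offset,
            data.len()
        );
    }
}
```

Whether this is worth doing is the open question in the thread: the reply notes that no other device logs such writes.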
\"root_device\": true,
\"read_only\": false
}"
```
we should probably mention snapshot/restore behaviour as well, and also security considerations about sharing memory (which we do not recommend).
We can also mention performance considerations: i.e. even though the pages are in memory, the guest still needs to exit to the kernel to set up the page table mappings. Using hugetlbfs to back the file would be faster (but will consume memory).
added snapshot, security and performance sections
Regarding hugetlbfs: its main use is to create shareable memory regions, and it does not support writes (e.g. you cannot copy a file to hugetlbfs, only create and resize files). So I don't think we need to explicitly mention it as a backing for pmem, since the main use we expect is actual files as backing storage.
yeah, we can leave that aside I believe, as long as we mention faulting as a cost to pay. What I actually had in mind was tmpfs with huge=always, which internally uses THP to back the files in memory.
added mention of tmpfs and hugetlbfs
docs/pmem.md
Outdated
> `DAX` support is not uniform for all file systems. Check the documentation for
> the file system you want to use before enabling `DAX`.
it works on ext4, right? does it need any specific options (e.g. a 4096 block size) or does it just work?
yes, the fs needs to have a block size equal to the host page size. Added a link to the kernel docs about DAX support.
7d83503
to
efd93ea
Compare
9a554b4
to
4b4779b
Compare
Went half way through. Here's an initial set of comments
/// Adds an existing pmem device in the builder.
pub fn add_device(&mut self, device: Arc<Mutex<Pmem>>) {
self.devices.push(device);
shouldn't we add this to the corresponding index?
In any case, could you also add a unit test for this one as well?
unit test for what?
a unit test which ensures that add_device does what you think it's doing. But back to my initial question, shouldn't add_device add device in the correct place in self.devices, according to the device index?
> a unit test which ensures that add_device does what you think it's doing.
Is a single line, self.devices.push(device);, so elusive that it needs a unit test?
The order of devices only matters during VM boot if any of them is a root device. Otherwise the order is not important. add_device is only used during snapshot restore, and even in that case the order is preserved, since the configs for the devices are stored in the same order as during VM boot (with the configs function).
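To make the disagreement concrete, here is a minimal sketch of the kind of check being asked for: it asserts that devices added one by one come back in insertion order. `Pmem` and `PmemBuilder` below are simplified, hypothetical stand-ins (the real types in the PR carry more state), and `device_ids` is a helper invented for this illustration.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical, stripped-down stand-in for the pmem device type.
struct Pmem {
    id: String,
}

/// Hypothetical stand-in for the builder discussed in this thread.
#[derive(Default)]
struct PmemBuilder {
    devices: Vec<Arc<Mutex<Pmem>>>,
}

impl PmemBuilder {
    /// Adds an existing pmem device to the builder (appends to the end).
    fn add_device(&mut self, device: Arc<Mutex<Pmem>>) {
        self.devices.push(device);
    }
}

/// Helper for the illustration: ids of the stored devices, in stored order.
fn device_ids(builder: &PmemBuilder) -> Vec<String> {
    builder
        .devices
        .iter()
        .map(|d| d.lock().unwrap().id.clone())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn add_device_preserves_insertion_order() {
        let mut builder = PmemBuilder::default();
        for id in ["pmem0", "pmem1", "pmem2"] {
            builder.add_device(Arc::new(Mutex::new(Pmem { id: id.to_string() })));
        }
        assert_eq!(device_ids(&builder), vec!["pmem0", "pmem1", "pmem2"]);
    }
}
```

A test like this would pin down the ordering assumption that snapshot restore relies on, so a later change to `add_device` could not silently break it.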
7eff29f
to
4a19190
Compare
4f309d6
to
de6031e
Compare
msync is used by the virtio-pmem device to trigger a sync of mmapped file content to the underlying file. Signed-off-by: Egor Lazarchuk <[email protected]>
Add implementations of device, event handling, metrics. Add device config and builder types for API use. Signed-off-by: Egor Lazarchuk <[email protected]>
Update VmResources type with virtio-pmem configuration field to allow virtio-pmem devices to be configured through config files and later through API calls. Signed-off-by: Egor Lazarchuk <[email protected]>
Both virtio-block and virtio-pmem can act as root devices for a VM. Add a check to prevent specifying more than 1 root device for a VM. Signed-off-by: Egor Lazarchuk <[email protected]>
Add /pmem/id PUT request for virtio-pmem configuration. Add corresponding metrics. Signed-off-by: Egor Lazarchuk <[email protected]>
Virtio-pmem devices need to allocate a memory region in guest physical memory. The safe place to do this is past 64bit MMIO region. Signed-off-by: Egor Lazarchuk <[email protected]>
Add a counter for KVM slot ids into VmCommon struct. This is done because virtio-pmem device needs to obtain its KVM slot id independently from number of slots in GuestMemoryMmap. Signed-off-by: Egor Lazarchuk <[email protected]>
Add methods to attach virtio-pmem devices to Vmm. Add methods to create KVM memory slot for virtio-pmem devices. Signed-off-by: Egor Lazarchuk <[email protected]>
Add logic to store and restore virtio-pmem device information in a snapshot. Signed-off-by: Egor Lazarchuk <[email protected]>
Add functional and API tests for virtio-pmem device and its configuration fields. Signed-off-by: Egor Lazarchuk <[email protected]>
Expose virtio-pmem metrics in the logger, so they are exported in metrics.json. Update integration tests to expect new metrics. Signed-off-by: Egor Lazarchuk <[email protected]>
Add description of pmem APIs in swagger file and device-api.md Signed-off-by: Egor Lazarchuk <[email protected]>
Add new document about virtio-pmem configuration and usage. Signed-off-by: Egor Lazarchuk <[email protected]>
Add a note about addition of virtio-pmem device. Signed-off-by: Egor Lazarchuk <[email protected]>
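Two of the commits above describe small invariants: at most one root device across virtio-block and virtio-pmem, and a KVM slot id counter that is independent of the number of guest memory regions. The sketch below illustrates both under stated assumptions; `RootDeviceError`, `check_single_root` and `SlotAllocator` are hypothetical names for this illustration, not the types or functions used in the PR.

```rust
/// Hypothetical error for the root-device check.
#[derive(Debug, PartialEq)]
enum RootDeviceError {
    /// More than one device was marked as root; carries the number found.
    MultipleRootDevices(usize),
}

/// Rejects a configuration in which more than one device, counted across
/// both block and pmem devices, has its root flag set.
fn check_single_root(
    block_root_flags: &[bool],
    pmem_root_flags: &[bool],
) -> Result<(), RootDeviceError> {
    let roots = block_root_flags
        .iter()
        .chain(pmem_root_flags)
        .filter(|&&is_root| is_root)
        .count();
    if roots > 1 {
        Err(RootDeviceError::MultipleRootDevices(roots))
    } else {
        Ok(())
    }
}

/// Hypothetical monotonically increasing counter for KVM memory slot ids.
#[derive(Default)]
struct SlotAllocator {
    next_slot: u32,
}

impl SlotAllocator {
    /// Returns the next unused slot id, or `None` if the counter would overflow.
    fn allocate(&mut self) -> Option<u32> {
        let slot = self.next_slot;
        self.next_slot = self.next_slot.checked_add(1)?;
        Some(slot)
    }
}
```

Keeping the counter separate from the guest memory layout means a pmem region can claim a slot without the caller having to know how many regular memory regions were registered before it.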
de6031e
to
d8c695a
Compare
Changes
Add virtio-pmem device support.
Closes #5448
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.
PR Checklist
tools/devtool checkbuild --all to verify that the PR passes build checks on all supported architectures.
tools/devtool checkstyle to verify that the PR passes the automated style checks.
[...] how they are solving the problem in a clear and encompassing way.
[...] in the PR.
[...] CHANGELOG.md.
[...] Runbook for Firecracker API changes.
[...] integration tests.
[...] TODO.
[...] rust-vmm.