fix(test): cleanup snapshot files better #5126

roypat · 2025-03-31T16:41:26Z

It turns out that some of our snapshot tests are not exactly exemplary at cleaning up snapshot memory files after they're done. This is partly because Microvm.restore_from_snapshot was largely oblivious to the fact that a uffd-restored snapshot does not need to be copied into the jail again (the mmap it would be copied for never happens), and partly because I messed up the build_n_from_snapshot function, which did not clean up snapshots in "incremental" mode.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

I have read and understand CONTRIBUTING.md.
I have run tools/devtool checkstyle to verify that the PR passes the
automated style checks.
I have described what is done in these changes, why they are needed, and
how they are solving the problem in a clear and encompassing way.
I have updated any relevant documentation (both in code and in the docs)
in the PR.
I have mentioned all user-facing changes in CHANGELOG.md.
If a specific issue led to this PR, this PR closes the issue.
When making API changes, I have followed the
Runbook for Firecracker API changes.
I have tested all new and changed functionalities in unit tests and/or
integration tests.
I have linked an issue to every new TODO.

This functionality cannot be added in rust-vmm.

The restore_from_snapshot function did not integrate well with uffd-based snapshot restore: even if a UFFD path was specified, it still created a copy of the snapshot memory file inside the chroot, even though the UFFD handler had already set this up in spawn_pf_handler. Fix this, and while we're at it, also remove the need to pass in the uffd handler and snapshot file explicitly when using uffd-based restore: spawn_pf_handler sets the uffd_handler field of the microvm object, and that object can easily be made to also contain the snapshot from which page faults are being served.

Signed-off-by: Patrick Roy <[email protected]>
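The copy-or-skip decision described in this commit message can be sketched roughly as follows. This is only an illustration, not the code from the PR: `stage_memory_backend`, its parameters, and the returned dictionary layout are hypothetical names chosen for this example, and the real `Microvm.restore_from_snapshot` does considerably more.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def stage_memory_backend(mem_file: Path, chroot: Path, uffd_socket: Optional[Path]) -> dict:
    """Describe the memory backend for a restore (hypothetical helper).

    Only illustrates when the memory file has to be copied into the chroot.
    """
    if uffd_socket is not None:
        # UFFD restore: the page-fault handler already serves the memory file,
        # so the VMM never mmaps it and no copy inside the chroot is needed.
        return {"backend_type": "Uffd", "backend_path": str(uffd_socket)}

    # File-backed restore: the VMM mmaps the file, so it must be visible in the jail.
    jailed = chroot / mem_file.name
    shutil.copyfile(mem_file, jailed)
    return {"backend_type": "File", "backend_path": jailed.name}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        mem = root / "mem.bin"
        mem.write_bytes(b"\0" * 16)
        chroot = root / "chroot"
        chroot.mkdir()
        # With a UFFD socket given, the chroot stays empty.
        print(stage_memory_backend(mem, chroot, root / "uffd.sock"))
        print(sorted(p.name for p in chroot.iterdir()))
```

In this sketch the file is copied only on the file-backed path, which is the behaviour the commit message asks for; the previous behaviour corresponds to copying unconditionally before the `if`.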

codecov · 2025-03-31T16:50:45Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.09%. Comparing base (b38ec33) to head (b5652f5).
Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #5126   +/-   ##
=======================================
  Coverage   83.09%   83.09%           
=======================================
  Files         250      250           
  Lines       26920    26920           
=======================================
  Hits        22368    22368           
  Misses       4552     4552

Flag	Coverage Δ
5.10-c5n.metal	`83.55% <ø> (+<0.01%)`	⬆️
5.10-m5n.metal	`83.55% <ø> (-0.01%)`	⬇️
5.10-m6a.metal	`82.72% <ø> (-0.01%)`	⬇️
5.10-m6g.metal	`79.43% <ø> (ø)`
5.10-m6i.metal	`83.54% <ø> (ø)`
5.10-m7a.metal-48xl	`82.71% <ø> (?)`
5.10-m7g.metal	`79.43% <ø> (ø)`
6.1-c5n.metal	`83.60% <ø> (ø)`
6.1-m5n.metal	`83.59% <ø> (ø)`
6.1-m6a.metal	`82.76% <ø> (ø)`
6.1-m6g.metal	`79.43% <ø> (+<0.01%)`	⬆️
6.1-m6i.metal	`83.59% <ø> (+<0.01%)`	⬆️
6.1-m7a.metal-48xl	`82.76% <ø> (?)`
6.1-m7g.metal	`79.43% <ø> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown.

If build_n_from_snapshot is asked to build many snapshots incrementally, it doesn't clean up after itself properly: each iteration ends by creating a new snapshot of the VM, and this snapshot is passed to iteration n+1. There, a copy is created inside the new VM's chroot, and iteration n+1 ends by creating a new snapshot for iteration n+2 and deleting the copy of the snapshot inside the chroot. However, we never delete the snapshot created in iteration n, so more snapshots accumulate with each iteration. Fix this by having the function delete the snapshot created in iteration n after iteration n+2 has finished successfully. The idea is that in case of failure, we will still have the snapshot created in iteration n+1 (the one which caused a failure in n+2), and also the snapshot created in n (the last snapshot known to have gone through a test iteration successfully, namely iteration n+1).

Signed-off-by: Patrick Roy <[email protected]>
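The retention rule described in this commit message (keep the two most recent snapshots, delete the one from iteration n once iteration n+2 has succeeded) can be sketched as below. This is a simplified model under stated assumptions, not the PR's implementation: `take_snapshot` and `run_incremental` are hypothetical stand-ins, snapshots are modelled as small files, and the `fail_at` parameter exists only to simulate a failing iteration.

```python
import tempfile
from collections import deque
from pathlib import Path
from typing import Optional


def take_snapshot(workdir: Path, index: int) -> Path:
    """Stand-in for snapshotting a VM: writes a small placeholder file."""
    path = workdir / f"snapshot_{index}.mem"
    path.write_bytes(b"\0")
    return path


def run_incremental(workdir: Path, iterations: int, fail_at: Optional[int] = None) -> list:
    """Run `iterations` restore/snapshot rounds, keeping at most two snapshots."""
    kept = deque([take_snapshot(workdir, 0)])  # base snapshot
    for i in range(1, iterations + 1):
        # Iteration i would restore from kept[-1] and run the test body here.
        if i == fail_at:
            raise RuntimeError(f"iteration {i} failed")
        kept.append(take_snapshot(workdir, i))
        if len(kept) > 2:
            # Iteration i succeeded, so the snapshot from iteration i-2 is no longer needed.
            kept.popleft().unlink()
    return list(kept)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        remaining = run_incremental(Path(tmp), 5)
        print([p.name for p in remaining])
```

If an iteration fails in this model, the two files left on disk are the snapshot that was being restored when the failure happened and the one before it, which is the pair the commit message wants to preserve for debugging.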

roypat · 2025-04-01T10:14:06Z

Force-pushed to fix a typo and remove a leftover commented-out debug statement from testing.

roypat force-pushed the snapshot-test-cleanup branch from 565afda to eca928f Compare April 1, 2025 08:15

roypat added the "Status: Awaiting review" label (indicates that a pull request is ready to be reviewed) Apr 1, 2025

pb8o previously approved these changes Apr 1, 2025

roypat dismissed pb8o’s stale review via 698edb2 April 1, 2025 09:38

roypat force-pushed the snapshot-test-cleanup branch from eca928f to 698edb2 Compare April 1, 2025 09:38

roypat force-pushed the snapshot-test-cleanup branch from 698edb2 to b5652f5 Compare April 1, 2025 09:39

roypat requested a review from pb8o April 1, 2025 10:13

kalyazin approved these changes Apr 1, 2025

zulinx86 approved these changes Apr 1, 2025

zulinx86 merged commit 85622c6 into firecracker-microvm:main Apr 1, 2025
6 of 7 checks passed
