Commit 4c8e328

fix(doc): Improve accuracy of snapshot documentation
Fix various minor errors:

- Drop some specifics on the cgroups v1 disclaimer, because all supported host kernel versions are "5.4+"
- Do not claim that creating a snapshot has no effect on the running VM, because that's not true.
- Cut down on some repeated and confusing information / examples near the end.

Signed-off-by: Patrick Roy <[email protected]>
1 parent 5225680 commit 4c8e328

1 file changed: +18 -33 lines changed


docs/snapshotting/snapshot-support.md

Lines changed: 18 additions & 33 deletions
@@ -122,7 +122,7 @@ the feature can be combined with guest_memfd support in Firecracker.

 ### Limitations

-- High snapshot latency on 5.4+ host kernels due to cgroups V1. We strongly
+- High snapshot restoration latency when cgroups V1 are in use. We strongly
   recommend to deploy snapshots on cgroups V2 enabled hosts for the implied
   kernel versions -
   [related issue](https://github.com/firecracker-microvm/firecracker/issues/2129).
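
Whether a host falls under this limitation can be checked before deploying snapshots. The following is a minimal sketch and not part of the Firecracker documentation: it assumes cgroups are mounted at `/sys/fs/cgroup` and relies on the root of a cgroups V2 (unified) hierarchy exposing a `cgroup.controllers` file. The function name `detect_cgroup_version` is hypothetical.

```bash
#!/usr/bin/env bash
# Print "v2" if the given directory is the root of a cgroups V2 (unified)
# hierarchy, and "v1" otherwise (including when nothing is mounted there).
# $1: cgroup mount point to inspect (assumed default: /sys/fs/cgroup)
detect_cgroup_version() {
    local root="${1:-/sys/fs/cgroup}"
    # Only the root of a V2 hierarchy contains a `cgroup.controllers` file.
    if [ -f "${root}/cgroup.controllers" ]; then
        echo "v2"
    else
        echo "v1"
    fi
}

# Example:
#   detect_cgroup_version              # inspects /sys/fs/cgroup
#   detect_cgroup_version /some/mount  # inspects another mount point
```

On hosts running in hybrid mode, `/sys/fs/cgroup` is not itself a V2 mount, so this sketch reports `v1` there.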
@@ -145,10 +145,11 @@ the feature can be combined with guest_memfd support in Firecracker.
   resumed from snapshot load memory on-demand from the snapshot and
   copy-on-write to anonymous memory.
 - Resuming from a snapshot is optimized for speed, while taking a snapshot
-  involves some extra CPU cycles for synchronously writing dirty memory pages to
-  the memory snapshot file. Taking a snapshot of a fresh microVM, on which dirty
-  pages tracking is not enabled, results in the full contents of guest memory
-  being written to the snapshot.
+  involves some extra CPU cycles for synchronously writing memory pages to the
+  memory snapshot file. Taking a full snapshot of a microVM, on which dirty page
+  tracking is not enabled, results in the full contents of guest memory being
+  written to the snapshot, and particularly, in all guest memory being faulted
+  in.
 - The _memory file_ and _microVM state file_ are generated by Firecracker on
   snapshot creation. The disk contents are _not_ explicitly flushed to their
   backing files.
@@ -356,10 +357,12 @@ Enabling this support enables KVM dirty page tracking, so it comes at a cost
 (which consists of CPU cycles spent by KVM accounting for dirtied pages); it
 should only be used when needed.

-Creating a snapshot will **not** influence state, will **not** stop or end the
-microVM, it can be used as before, so the microVM can be resumed if you still
-want to use it. At this point, in case you plan to continue using the current
-microVM, you should make sure to also copy the disk backing files.
+Creating a snapshot has some minor effects on the currently running microVM:
+
+- The vsock device is [reset](#vsock-device-reset), causing the driver to
+  terminate connection on resumption.
+- On x86_64, a notification for KVM-clock is injected to notify the guest about
+  being paused.

 ### Resuming the microVM
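
The pause, create-snapshot, and resume steps touched by this hunk, together with the commit's note that `track_dirty_pages` must be set again on `LoadSnapshot`, can be sketched as a few shell helpers. This is an illustration under stated assumptions, not the documentation's own example: the helper names (`fc_request`, `snapshot_create_body`, `snapshot_load_body`) and the `FC_SOCKET` variable are hypothetical, and the endpoint and field names reflect the Firecracker API as generally documented, so they should be checked against the API specification of the version in use.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; verify endpoints and fields
# against the API specification shipped with your Firecracker version.

# Send one JSON request to the Firecracker API socket.
# $1: HTTP method, $2: request path, $3: JSON body
fc_request() {
    curl --unix-socket "${FC_SOCKET:-/tmp/firecracker.socket}" -i \
        -X "$1" "http://localhost$2" \
        -H 'Accept: application/json' \
        -H 'Content-Type: application/json' \
        -d "$3"
}

# Build the body for `PUT /snapshot/create`.
# $1: "Full" or "Diff", $2: microVM state file path, $3: memory file path
# Paths are inserted verbatim, so they must not contain `"` or `\`.
snapshot_create_body() {
    printf '{"snapshot_type": "%s", "snapshot_path": "%s", "mem_file_path": "%s"}\n' \
        "$1" "$2" "$3"
}

# Build the body for `PUT /snapshot/load`. Dirty page tracking is passed
# explicitly because that setting is not stored in the snapshot.
# $1: microVM state file path, $2: memory file path, $3: "true" or "false"
snapshot_load_body() {
    local track="${3:-false}"
    case "$track" in
        true|false) ;;
        *) echo "track_dirty_pages must be true or false" >&2; return 1 ;;
    esac
    printf '{"snapshot_path": "%s", "mem_backend": {"backend_type": "File", "backend_path": "%s"}, "track_dirty_pages": %s, "resume_vm": false}\n' \
        "$1" "$2" "$track"
}

# Example flow against a running Firecracker process:
#   fc_request PATCH /vm '{"state": "Paused"}'
#   fc_request PUT /snapshot/create "$(snapshot_create_body Full ./snapshot_file ./mem_file)"
#   fc_request PATCH /vm '{"state": "Resumed"}'
#
# Later, in a fresh Firecracker process, before any other configuration:
#   fc_request PUT /snapshot/load "$(snapshot_load_body ./snapshot_file ./mem_file true)"
```

Keeping the body builders separate from the `curl` wrapper makes the payloads easy to inspect or log before anything is sent to the API socket.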

@@ -384,8 +387,8 @@ ignored (microVM remains in the running state). **Effects**:
 ### Loading snapshots

 If you want to load a snapshot, you can do that only **before** the microVM is
-configured (the only resources that can be configured prior are the Logger and
-the Metrics systems) by sending the following API command:
+configured (the only resources that can be configured prior are the logger and
+the metrics systems) by sending the following API command:

 ```bash
 curl --unix-socket /tmp/firecracker.socket -i \
@@ -472,28 +475,10 @@ to the new Firecracker process as they were to the original one.
 - _on failure_: A specific error is reported and then the current Firecracker
   process is ended (as it might be in an invalid state).

-*Notes*: Please, keep in mind that only by setting to true
-`enable_diff_snapshots`, when loading a snapshot, or `track_dirty_pages`, when
-configuring the machine on a fresh microVM, you can then create a `diff`
-snapshot. Also, `track_dirty_pages` is not saved when creating a snapshot, so
-you need to explicitly set `enable_diff_snapshots` when sending
-`LoadSnapshot`command if you want to be able to do diff snapshots from a loaded
-microVM. Another thing that you should be aware of is the following: if a fresh
-microVM can create diff snapshots, then if you create a **full** snapshot, the
-memory file contains the whole guest memory, while if you create a **diff** one,
-that file is sparse and only contains the guest dirtied pages. With these in
-mind, some possible snapshotting scenarios are the following:
-
-- `Boot from a fresh microVM` -> `Pause` -> `Create snapshot` -> `Resume` ->
-  `Pause` -> `Create snapshot` -> ... ;
-- `Boot from a fresh microVM` -> `Pause` -> `Create snapshot` -> `Resume` ->
-  `Pause` -> `Resume` -> ... -> `Pause` -> `Create snapshot` -> ... ;
-- `Load snapshot` -> `Resume` -> `Pause` -> `Create snapshot` -> `Resume` ->
-  `Pause` -> `Create snapshot` -> ... ;
-- `Load snapshot` -> `Resume` -> `Pause` -> `Create snapshot` -> `Resume` ->
-  `Pause` -> `Resume` -> ... -> `Pause` -> `Create snapshot` -> ... ; where
-  `Create snapshot` can refer to either a full or a diff snapshot for all the
-  aforementioned flows.
+*Notes*: The `track_dirty_pages` configuration is not saved when creating a
+snapshot, so you need to explicitly set `track_dirty_pages` again when sending
+the `LoadSnapshot` command if you want to be able to do dirty page tracking
+based diff snapshots from a loaded microVM.

 It is also worth knowing, a microVM that is restored from snapshot will be
 resumed with the guest OS wall-clock continuing from the moment of the snapshot
