Commit 74a5161

Manciukic authored and bchalios committed
doc(vsock): clarify vsock support on snapshot/restore

The current documentation is not clear, mentioning that the device could break if active at time of snapshot. It also mentions that this is fixed by resetting the device at snapshot time, closing all open connections.

This patch clarifies that vsock connections are explicitly closed by a transport reset, in order to avoid the aforementioned issue. This doesn't impact listening sockets as they are able to accept new connections after restore.

Signed-off-by: Riccardo Mancini <[email protected]>

1 parent b4cb9d3 commit 74a5161

1 file changed

+15
-24
lines changed

docs/snapshotting/snapshot-support.md

Lines changed: 15 additions & 24 deletions
@@ -57,7 +57,10 @@ creation process).
5757

5858
Both network and vsock packet loss can be expected on guests that are resumed
5959
from snapshots in another Firecracker process. It is also not guaranteed that
60-
the state of the network connections survives the process.
60+
the state of the network connections survives the process. Furthermore, vsock
61+
connections that are open when the snapshot is taken are closed, but existing
62+
vsock listen sockets in the guest still remain active and can accept new
63+
connections after resume (see [Vsock device reset](#vsock-device-reset)).
6164

6265
In order to make restoring possible, Firecracker snapshots save the full state
6366
of the following resources:
@@ -141,8 +144,6 @@ The snapshot functionality is still in developer preview due to the following:
141144
- Guest network connectivity is not guaranteed to be preserved after resume. For
142145
recommendations related to guest network connectivity for clones please see
143146
[Network connectivity for clones](network-for-clones.md).
144-
- Vsock device does not have full snapshotting support. Please see
145-
[Vsock device limitation](#vsock-device-limitation).
146147
- Snapshotting on arm64 works for both GICv2 and GICv3 enabled guests. However,
147148
restoring between different GIC version is not possible.
148149
- If a [CPU template](../cpu_templates/cpu-templates.md) is not used on x86_64,
@@ -606,29 +607,19 @@ identifiers, cached random numbers, cryptographic tokens, etc **will** still be
606607
replicated across multiple microVMs resumed from the same snapshot. Users need
607608
to implement mechanisms for ensuring de-duplication of such state, where needed.
608609

609-
## Vsock device limitation
610+
## Vsock device reset
610611

611-
Vsock must be inactive during snapshot. Vsock device can break if snapshotted
612-
while having active connections. Firecracker snapshots do not capture any
613-
inflight network or vsock (through the linux unix domain socket backend) traffic
614-
that has left or not yet entered Firecracker.
615-
616-
The above, coupled with the fact that Vsock control protocol is not resilient to
617-
vsock packet loss, leads to Vsock device breakage when doing a snapshot while
618-
there are active Vsock connections.
619-
620-
As a solution to the above issue, active Vsock connections prior to snapshotting
621-
the VM are forcibly closed by sending a specific event called
622-
`VIRTIO_VSOCK_EVENT_TRANSPORT_RESET`. The event is sent on `SnapshotCreate`. On
612+
The vsock device is reset across snapshot/restore to avoid inconsistent state
613+
between device and driver leading to breakage
614+
([#2218](https://github.com/firecracker-microvm/firecracker/issues/2218)). This
615+
is done by sending a `VIRTIO_VSOCK_EVENT_TRANSPORT_RESET` event to the guest
616+
driver during `SnapshotCreate`
617+
([#2562](https://github.com/firecracker-microvm/firecracker/pull/2562)). On
623618
`SnapshotResume`, when the VM becomes active again, the vsock driver closes all
624-
existing connections. Listen sockets still remain active. Users wanting to build
625-
vsock applications that use the snapshot capability have to take this into
626-
consideration. More details about this event can be found in the official Virtio
627-
document [here](https://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.pdf),
628-
section 5.10.6.6 Device Events.
629-
630-
Firecracker handles sending the `reset` event to the vsock driver, thus the
631-
customers are no longer responsible for closing active connections.
619+
existing connections. Existing listen sockets still remain active, but their CID
620+
is updated to reflect the current `guest_cid`. More details about this event can
621+
be found in the official Virtio document
622+
[here](https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-4080006).
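
For guest applications, the practical consequence of the behaviour described above is that a server should treat a reset connection as routine and keep accepting on the same listen socket. The following is a minimal sketch of that pattern, not code from Firecracker. The names `serve_forever`, `echo` and `make_vsock_listener` are hypothetical, and it assumes that a connection closed by the transport reset surfaces in Python as `ConnectionResetError` or `BrokenPipeError`.

```python
import socket


def serve_forever(listener, handle, max_conns=None):
    """Accept connections on `listener` and pass each one to `handle`.

    A reset connection only ends that one connection; the loop then goes
    back to accept() on the same listening socket. `max_conns` exists only
    so the loop can be bounded in tests; None means run indefinitely.
    Returns the number of connections that were accepted.
    """
    served = 0
    while max_conns is None or served < max_conns:
        conn, _peer = listener.accept()
        with conn:
            try:
                handle(conn)
            except (ConnectionResetError, BrokenPipeError):
                # Assumption: this is how a connection closed by the
                # transport reset shows up. Drop it and keep listening.
                pass
        served += 1
    return served


def echo(conn):
    """Echo bytes back to the peer until it closes the connection."""
    while True:
        data = conn.recv(4096)
        if not data:
            return
        conn.sendall(data)


def make_vsock_listener(port):
    """Create a guest-side vsock listen socket (Linux only)."""
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    # VMADDR_CID_ANY avoids hard-coding the guest CID, which may differ
    # after restore.
    sock.bind((socket.VMADDR_CID_ANY, port))
    sock.listen()
    return sock
```

Inside a guest, `serve_forever(make_vsock_listener(5000), echo)` would run the loop on port 5000 (an arbitrary example port). Binding to `VMADDR_CID_ANY` rather than a fixed CID is a design choice made here so that the listener does not depend on the CID in use when the snapshot was taken.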
632623

633624
## VMGenID device limitation
634625
