Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds support for virtio-net headers in VM networking by introducing a new vnet_hdr flag. This flag enables proper handling of segmentation offload for containers running with their own network namespace and veth pairs, where unsegmented packets would otherwise be dropped by the kernel.
Key changes:
- Adds a new
vnet_hdrboolean field to control virtio-net header inclusion - Introduces network flag constants to replace magic numbers
- Updates documentation to describe the new optional field
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| internal/shim/task/networking_unix.go | Adds vnet_hdr field and flag constant, updates parsing and flag handling logic |
| docs/vm-networking.md | Documents the new vnet_hdr optional field |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
fbdd137 to
f5f3bb5
Compare
When segmentation offload is enabled, and unsegmented packets are sent to a VM (i.e. when running a container in the root netns), the kernel will detect that packets are larger than expected and proceed. That's not the case for containers (i.e. when running a container with its own netns, and a veth pair). In that case, packets reach the virtio-net interface, are forwarded to the bridge, and then to the appropriate veth. Unsegmented packets with GSO fields unset are dropped by the kernel either at the bridge or at the veth level. That may be due to the current network topology where the vnet interface is attached to a bridge. In that case, we need to tell libkrun that the network backend sends / receives virtio_net_hdr structs with the packets, and the backend need to preserve GSO fields for VM-to-VM connections, or populate them for host-to-VM connections. Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
f5f3bb5 to
852e85c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| var flags uint32 | ||
| if nw.vfkit { | ||
| flags = 1 // See https://github.com/containers/libkrun/blob/357ec63fee444b973e4fc76d2121fd41631f121e/include/libkrun.h#L271C9-L271C23 | ||
| flags = NET_FLAG_VFKIT |
There was a problem hiding this comment.
For a bitmask, using assignment here is slightly error-prone/inconsistent with the later OR operation. Using a bitwise-OR update for the vfkit flag too keeps the pattern uniform and avoids accidentally overwriting previously set bits if additional flags are added later.
| flags = NET_FLAG_VFKIT | |
| flags |= NET_FLAG_VFKIT |
| TRUE, true, True, 0, f, F, FALSE, false, False`. Any other value is invalid and | ||
| will produce an error. | ||
| - `vnet_hdr` (optional, defaults to false): Indicate whether the VMM includes | ||
| virtio-net headers along with Ethernet frames. |
There was a problem hiding this comment.
The new vnet_hdr field is parsed with strconv.ParseBool, but the docs don't describe what values are accepted (unlike vfkit, which lists the accepted boolean strings). Consider documenting the accepted boolean values (or referencing that it follows Go's ParseBool rules) for consistency and to reduce configuration errors.
| virtio-net headers along with Ethernet frames. | |
| virtio-net headers along with Ethernet frames. Accept any of `1, t, T, TRUE, | |
| true, True, 0, f, F, FALSE, false, False`. Any other value is invalid and will | |
| produce an error. |
| NET_FLAG_VFKIT = 1 << iota // See https://github.com/containers/libkrun/blob/357ec63fee444b973e4fc76d2121fd41631f121e/include/libkrun.h#L271C9-L271C23 | ||
| NET_FLAG_INCLUDE_VNET_HEADER |
There was a problem hiding this comment.
These flag constants are declared with exported (ALL_CAPS) names even though they're only used within this file; this diverges from the surrounding unexported constants (e.g., socketField/modeField) and unnecessarily widens the package surface. Consider renaming them to unexported lowerCamelCase (or colocating them with other libkrun-related constants) to match local conventions.
| NET_FLAG_VFKIT = 1 << iota // See https://github.com/containers/libkrun/blob/357ec63fee444b973e4fc76d2121fd41631f121e/include/libkrun.h#L271C9-L271C23 | |
| NET_FLAG_INCLUDE_VNET_HEADER | |
| netFlagVfkit = 1 << iota // See https://github.com/containers/libkrun/blob/357ec63fee444b973e4fc76d2121fd41631f121e/include/libkrun.h#L271C9-L271C23 | |
| netFlagIncludeVnetHeader |
When segmentation offload is enabled, and unsegmented packets are sent to a VM (i.e. when running a container in the root netns), the kernel will detect that packets are larger than expected and proceed.
That's not the case for containers (i.e. when running a container with its own netns, and a veth pair). In that case, packets reach the virtio-net interface, are forwarded to the bridge, and then to the appropriate veth.
Unsegmented packets with GSO fields unset are dropped by the kernel either at the bridge or at the veth level. That may be due to the current network topology where the vnet interface is attached to a bridge.
In that case, we need to tell libkrun that the network backend sends / receives virtio_net_hdr structs with the packets, and the backend need to preserve GSO fields for VM-to-VM connections, or populate them for host-to-VM connections.