verity volumes: pre-load docker images and data into CVMs #752
Open
h4x3rotab wants to merge 4 commits into
Add read-only verity volumes -- extra virtio-blk disks a CVM can mount
instead of pulling and unpacking their contents. A volume is declared in
app-compose.json as `{ verity_root, target }`, and attached at deploy time
with `--volume <name>`: the vmm looks the name up under cvm.volumes_dir and
attaches it read-only. Because verity_root is part of the measured compose,
the guest can check the bytes it's handed against it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
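A minimal sketch of the declaration and the name lookup described in this commit, for illustration only. The `volumes` key, the placeholder root values, the example mount path, and the helper names `declared_volumes` and `resolve_volume` are all assumptions; only the `{ verity_root, target }` shape and the lookup of a name under the volumes directory come from the commit message.

```python
import json
from pathlib import Path

# Hypothetical compose fragment: the "volumes" key and the values are
# assumptions; only the { verity_root, target } shape is from the description.
COMPOSE = json.loads("""
{
  "volumes": [
    {"verity_root": "<root hash of the image volume>", "target": "docker"},
    {"verity_root": "<root hash of the data volume>", "target": "/data/models"}
  ]
}
""")


def declared_volumes(compose: dict) -> list[tuple[str, str]]:
    """Return (verity_root, target) pairs; a missing field raises KeyError."""
    return [(v["verity_root"], v["target"]) for v in compose.get("volumes", [])]


def resolve_volume(volumes_dir: Path, name: str) -> Path:
    """Map a --volume name to a path under volumes_dir, rejecting escapes."""
    if not name or name in (".", "..") or "/" in name or "\\" in name:
        raise ValueError(f"invalid volume name: {name!r}")
    return volumes_dir / name
```

Rejecting separators in the name keeps a lookup from leaving the volumes directory; whether the vmm enforces this in the same way is not stated here.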
Just before dockerd starts, dstack-prepare.sh runs the seeding helper. For each volume declared in the compose it finds the matching disk by opening it with veritysetup against the measured verity_root, then either seeds docker's overlay2 store (target "docker") so the images are already present, or mounts the volume at a path (a data volume). It's fail-safe throughout: a volume that's missing or doesn't verify is skipped, and its images just pull.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
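The matching and fail-safe behaviour described above could be sketched roughly as follows. This is not the helper's actual code: `Volume`, `plan_seeding`, and the action labels are hypothetical names, and the `opens_with_root` callback stands in for opening a disk with veritysetup against the measured root so the sketch runs without block devices.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Volume:
    verity_root: str  # measured root hash taken from the compose
    target: str       # "docker" or a mount path


def plan_seeding(
    volumes: Iterable[Volume],
    disks: Iterable[str],
    opens_with_root: Callable[[str, str], bool],
) -> list[tuple[str, str, str]]:
    """Match each declared volume to a disk and decide what to do with it.

    opens_with_root(disk, root) is an injected stand-in for the veritysetup
    open step; it returns True only when the disk verifies against the root.
    """
    plan = []
    free = list(disks)
    for vol in volumes:
        disk = next((d for d in free if opens_with_root(d, vol.verity_root)), None)
        if disk is None:
            # Fail-safe: skip the volume, so its images are pulled as usual.
            continue
        free.remove(disk)
        action = "seed-overlay2" if vol.target == "docker" else "mount"
        plan.append((disk, action, vol.target))
    return plan
```

A volume with no verifying disk simply produces no plan entry, which is the point of the fail-safe: nothing unverified is ever seeded or mounted.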
`dstack verity <image>...` builds a squashfs + dm-verity volume that pre-extracts docker images (or, with `--dir`, a plain directory) so a CVM can start without pulling or unpacking them. It fetches images itself through oci-client -- no docker daemon -- and lays out the overlay2 store deterministically (each layer's directory id is its chain-id, with a fixed timestamp and salt). The same inputs always produce the same verity_root, so anyone can recompute it from the pinned image digests and confirm what's in the volume without trusting the builder. `dstack deploy --volume` attaches the result.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
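For readers unfamiliar with chain IDs, the sketch below computes them from a layer list using the rule in the OCI image specification. It only illustrates why a directory id derived from the chain-id is deterministic; `layer_dir_name` is a hypothetical helper, and whether the tool uses the bare hex digest or some other encoding as the directory name is an assumption.

```python
import hashlib


def chain_ids(diff_ids: list[str]) -> list[str]:
    """Compute the chain ID of every layer from its ordered diff IDs.

    OCI image-spec rule:
      ChainID(L0)     = DiffID(L0)
      ChainID(L0..Ln) = sha256(ChainID(L0..Ln-1) + " " + DiffID(Ln))
    """
    out: list[str] = []
    for diff_id in diff_ids:
        if not out:
            out.append(diff_id)
        else:
            data = f"{out[-1]} {diff_id}".encode()
            out.append("sha256:" + hashlib.sha256(data).hexdigest())
    return out


def layer_dir_name(chain_id: str) -> str:
    """Hypothetical naming: use the hex part of the chain ID as the directory id."""
    return chain_id.split(":", 1)[1]
```

Because a chain ID depends only on the ordered diff IDs, two independent builds of the same pinned image arrive at the same directory names, unlike an id generated randomly at extraction time.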
What this does
Starting a CVM is slow mostly because of image extraction — decompressing layers and writing millions of files to the encrypted disk (~30 s for a 3 GB image, a couple of minutes for 7 GB). A verity volume skips that: a read-only, dm-verity-protected disk, built once, whose layers are already extracted. The CVM mounts it and verifies blocks lazily as the app reads them — no pull, no unpacking — and one volume can back many CVMs.
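To make "verifies blocks lazily" concrete, here is a simplified hash-tree sketch. It is not dm-verity's on-disk format (dm-verity packs many hashes into each hash block rather than building a binary tree); the binary layout, the odd-node duplication rule, and the names `build_tree` and `verify_block` are assumptions chosen for brevity. What it does show is the principle: one trusted root lets a reader check a single block using only a few sibling hashes.

```python
import hashlib

BLOCK = 4096  # data block size used by this sketch


def _h(salt: bytes, data: bytes) -> bytes:
    return hashlib.sha256(salt + data).digest()


def build_tree(image: bytes, salt: bytes) -> list[list[bytes]]:
    """Return hash levels, leaves first; the last level holds the single root."""
    blocks = [image[i:i + BLOCK].ljust(BLOCK, b"\0") for i in range(0, len(image), BLOCK)]
    if not blocks:
        blocks = [b"\0" * BLOCK]
    levels = [[_h(salt, b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        if len(prev) % 2:
            prev = prev + [prev[-1]]  # pair an odd last node with itself
        levels.append([_h(salt, prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def verify_block(block: bytes, index: int, levels: list[list[bytes]],
                 salt: bytes, root: bytes) -> bool:
    """Check one block against the trusted root using only sibling hashes."""
    node = _h(salt, block.ljust(BLOCK, b"\0"))
    for level in levels[:-1]:
        sib = index ^ 1
        sibling = level[sib] if sib < len(level) else node
        node = _h(salt, node + sibling) if index % 2 == 0 else _h(salt, sibling + node)
        index //= 2
    return node == root
```

The stored levels are untrusted input; only `root` has to come from a trusted source, which in this PR is the `verity_root` in the measured compose.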
How it fits together
- `dstack verity <image>…` builds the volume. It fetches the images itself (via `oci-client`, no docker daemon) and lays out docker's overlay2 store deterministically, then seals it with squashfs + dm-verity. The same inputs give the same `verity_root`, byte for byte, so anyone can recompute it from the pinned digests and confirm what's inside. `--dir` packs a plain directory (e.g. model weights) instead.
- In the guest, dstack-prepare.sh matches each attached disk against the `verity_root` declared in `app-compose.json`. It's fail-safe: a missing or mismatched volume falls back to a normal pull.
- The vmm attaches the volume by name (`dstack deploy --volume`), resolved under `volumes_dir`.
Why it's safe
`verity_root` lives in `app-compose.json`, so it's measured into `app_id`. The guest trusts the root, not the host or the builder, and dm-verity rejects anything that doesn't match. The build needs no TEE, so it can run in CI and be attested with SLSA provenance on top.
Design and trade-offs are in docs/verity-volumes.md. Validated end to end on Intel TDX (first boot, reboot of the same instance, two volumes in one CVM) and reproducible across independent builds.
🤖 Generated with Claude Code