---
title: TPM Mux for System VMs
description: Architecture, implementation, and test plan for shared hardware TPM access in Ghaf system VMs
---

## Overview

Ghaf system VMs can share one hardware TPM through a host-side mux layer instead of direct passthrough. This design keeps the trust anchor in hardware while avoiding direct concurrent access from multiple guests.

The current non-riscv64 system VM profile uses this model for:

- `admin-vm`
- `audio-vm`
- `gui-vm`
- `net-vm`
On riscv64, system VMs continue to use an emulated TPM.

## Why This Exists

Direct passthrough from multiple guests to one TPM can lead to command contention, unreliable startup sequencing, and lockup-like timeout behavior under load. The TPM mux architecture addresses this by introducing controlled fan-out on the host:

- one backend hardware TPM resource manager device (`/dev/tpmrm0` by default)
- one per-VM forwarder process
- one per-VM proxy endpoint exposed to QEMU

This preserves VM isolation while making startup and runtime behavior more predictable.

## Architecture

The system is composed of four layers:

1. **Host TPM backend**
   - `tpm2-abrmd` and host kernel TPM stack
   - hardware device path defaults to `/dev/tpmrm0`
2. **Per-VM mux forwarder**
   - service `ghaf-vtpm-forwarder-<vm>.service`
   - binary `vtpm-abrmd-forwarder`
3. **QEMU guest TPM device wiring**
   - `-tpmdev passthrough,id=tpm0,path=/run/ghaf-vtpm/<vm>.tpm,cancel-path=/tmp/cancel`
   - `-device tpm-tis` (x86_64) or `tpm-tis-device` (aarch64)
4. **Guest userspace consumers**
   - storage encryption helpers
   - SPIFFE DevID provisioning and attestation
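
For a single VM, the QEMU wiring in layer 3 can be sketched as extra guest arguments. This is an illustrative fragment only — the VM name `net-vm` is an example, and the real wiring is generated by the Ghaf modules rather than written by hand:

```nix
{
  # Sketch of layer 3 for an x86_64 guest named net-vm; paths follow the
  # defaults above. On aarch64 the frontend would be tpm-tis-device instead.
  microvm.qemu.extraArgs = [
    "-tpmdev" "passthrough,id=tpm0,path=/run/ghaf-vtpm/net-vm.tpm,cancel-path=/tmp/cancel"
    "-device" "tpm-tis,tpmdev=tpm0"
  ];
}
```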

### Boot Ordering

Host systemd ordering enforces forwarder readiness before VM launch:

- `ghaf-vtpm-forwarder-<vm>.service` starts before `microvm@<vm>.service`
- `microvm@<vm>.service` requires the corresponding forwarder service
- forwarders use `Type=notify` and become ready only after link endpoint setup
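
In NixOS module terms, the ordering above can be sketched roughly as follows for one VM. The unit names follow the `ghaf-vtpm-forwarder-<vm>.service` / `microvm@<vm>.service` patterns from this section; the exact generated configuration may differ:

```nix
{
  # Illustrative only: the host mux module generates this per mux-enabled VM.
  systemd.services."microvm@net-vm" = {
    # Start only after the forwarder signals readiness (Type=notify), so the
    # VM never races the link endpoint setup at boot.
    after = [ "ghaf-vtpm-forwarder-net-vm.service" ];
    # Requires = hard dependency: if the forwarder fails to start, the VM
    # does not start either.
    requires = [ "ghaf-vtpm-forwarder-net-vm.service" ];
  };
}
```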

## Implementation Details

### Host Module

Host orchestration is defined in `modules/microvm/host/tpm-mux.nix`:

- enables host TPM stack and `tpm2-abrmd`
- loads kernel module `tpm_vtpm_proxy`
- creates runtime directory (`/run/ghaf-vtpm` by default)
- creates one forwarder service per mux-enabled VM
- auto-discovers VM list when `ghaf.virtualization.microvm-host.tpmMux.vms = [ ]`
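
Enabling the host side might look like the following sketch. Only the `ghaf.virtualization.microvm-host.tpmMux.vms` option path appears in this document; the `enable` attribute is an assumption about the module's shape:

```nix
{
  ghaf.virtualization.microvm-host.tpmMux = {
    enable = true; # assumed attribute name
    # An empty list ([ ]) auto-discovers mux-enabled VMs; a profile can
    # instead pin an explicit list:
    vms = [ "admin-vm" "audio-vm" "gui-vm" "net-vm" ];
  };
}
```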

### Guest Module

Guest TPM mode wiring is in `modules/microvm/common/vm-tpm.nix`:

- exactly one TPM mode can be enabled (`passthrough`, `muxed`, or `emulated`)
- mux mode exports the host path as the QEMU passthrough backend
- guest `tpm0` permissions are configured for TPM userspace components
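
As a sketch of the exclusive-mode constraint, a guest configuration might select mux mode like this. The option names here are hypothetical — only the module path and the three mode names come from this section:

```nix
{
  # Hypothetical option layout: exactly one of the three modes may be
  # enabled at a time; the module is expected to assert this.
  ghaf.virtualization.tpm.passthrough.enable = false;
  ghaf.virtualization.tpm.muxed.enable = true;
  ghaf.virtualization.tpm.emulated.enable = false;
}
```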

### System VM Base Integration

System VM defaults now select mux mode on non-riscv64 when storage encryption is enabled:

- `modules/microvm/sysvms/adminvm-base.nix`
- `modules/microvm/sysvms/audiovm-base.nix`
- `modules/microvm/sysvms/guivm-base.nix`
- `modules/microvm/sysvms/netvm-base.nix`

The laptop profile host mux config currently sets an explicit system VM list:

- `modules/profiles/laptop-x86.nix`

## SPIFFE / DevID Considerations

The TPM mux layer is transparent to SPIFFE from a consumer perspective (`/dev/tpm0` inside guests), but it affects timing and reliability characteristics. For TPM DevID flows, keep these guardrails:

- ensure provisioning and attestation are resilient to transient TPM retries/timeouts
- validate DevID cert public key matches VM TPM key before accepting cached certs
- prefer restart-safe provisioning behavior over one-shot assumptions
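
One way to make provisioning restart-safe under transient TPM contention is a systemd-level retry policy. This is an illustrative hardening sketch, not the actual Ghaf unit configuration; the `Restart` values are assumptions:

```nix
{
  systemd.services.spire-devid-provision.serviceConfig = {
    # Retry after transient TPM timeouts instead of failing permanently on
    # the first error; a short delay avoids hammering a contended TPM.
    Restart = "on-failure";
    RestartSec = "5s";
  };
}
```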

## Testing Strategy

Use a staged test plan for each change.

### 1. Static and Policy Checks

- `nix fmt -- --fail-on-change`
- `nix develop --command reuse lint`

### 2. Build and Evaluation

- evaluate affected target(s)
- build image/closure that includes updated VM bases and host mux module

### 3. Boot and Service Readiness

On the host:

- verify `tpm2-abrmd.service` is active
- verify `ghaf-vtpm-forwarder-<vm>.service` is active for each system VM
- verify `microvm@<vm>.service` starts after the matching forwarder

In each VM (`admin-vm`, `audio-vm`, `gui-vm`, `net-vm`):

- verify `/dev/tpm0` exists and has the expected ownership/mode
- run a basic command smoke test, for example `tpm2_getrandom 8`

### 4. Concurrency and Stress

Run concurrent TPM command loops in all system VMs and monitor:

- VM command success rate
- forwarder restarts and error counters
- host kernel TPM timeout messages

Gate condition: a sustained test window with no forwarder crashes and an acceptable command success ratio.

### 5. SPIFFE End-to-End

For TPM DevID-enabled VMs:

- restart `spire-devid-provision` and `spire-agent`
- verify successful node attestation in agent logs
- verify matching successful request completion in server logs
- verify agent restarts load the existing SVID without re-entering failure loops

## Troubleshooting Checklist

If a VM shows TPM failures:

1. Confirm the forwarder service for that VM is active on the host.
2. Confirm the VM started after the forwarder (systemd ordering).
3. Check host logs for `tpm tpm2: Operation Timed out` or retry loops.
4. Check forwarder logs for backend receive latency and proxy write/read errors.
5. For SPIFFE failures, verify the DevID cert/public-key match before retrying attestation.

## Known Constraints

- Shared hardware TPM is still a serialized resource; under heavy multi-VM load, latency spikes can occur.
- Some TPM commands are more sensitive to contention and timeout behavior.
- Operational reliability depends on both mux correctness and consumer retry/backoff behavior.

## Related Documents

- [Virtualized TPM for guests](/ghaf/overview/arch/guest-tpm)
- [Ghaf Architecture Overview](/ghaf/overview/arch/system-architecture)