.. _confidential-containers-deploy:

*******************************************************
Deploy Confidential Containers with NVIDIA GPU Operator
*******************************************************

This page describes how to deploy Confidential Containers using the NVIDIA GPU Operator.
For an overview of Confidential Containers, refer to :ref:`early-access-gpu-operator-confidential-containers-kata`.

.. note::

   Early Access features are not supported in production environments and are not functionally complete. Early Access features provide a preview of upcoming product features, enabling customers to test functionality and provide feedback during the development process. These releases may not have complete documentation, and testing is limited. Additionally, API and architectural designs are not final and may change in the future.

.. _coco-prerequisites:

Prerequisites
=============

* You are using a supported platform for confidential containers. For more information, refer to :ref:`supported-platforms`. In particular:

  * You selected and configured your hardware and BIOS to support confidential computing.
  * You installed and configured Ubuntu 25.10 as the host OS, with its default kernel, to support confidential computing.
  * You validated that the Linux kernel is SNP-aware.

* Your hosts are configured to enable hardware virtualization and Access Control Services (ACS). With some AMD CPUs and BIOSes, ACS might be grouped under Advanced Error Reporting (AER). Enabling these features is typically performed by configuring the host BIOS.
* Your hosts are configured to support IOMMU.

  * If the output from running ``ls /sys/kernel/iommu_groups`` includes 0, 1, and so on, then your host is configured for IOMMU.
  * If the host is not configured or you are unsure, add the ``intel_iommu=on`` Linux kernel command-line argument. For most Linux distributions, you add the argument to the ``/etc/default/grub`` file, for instance::

      ...
      GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on modprobe.blacklist=nouveau"
      ...

  * Run ``sudo update-grub`` after making the change to update the bootloader configuration, and then reboot the host.

* You have a Kubernetes cluster and you have cluster administrator privileges. For this cluster, you are using containerd 2.1 and Kubernetes v1.34; these versions have been validated with the kata-containers project and are recommended. You use a ``runtimeRequestTimeout`` of more than 5 minutes in your `kubelet configuration <https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/>`_ because the current method of pulling container images inside the confidential container can exceed the two-minute default timeout when large container images are used (a minimal configuration sketch follows this list).
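
The following is a minimal ``KubeletConfiguration`` sketch for the timeout requirement above. The file path and the other fields depend on how your cluster was provisioned (for example, kubeadm-based clusters typically use ``/var/lib/kubelet/config.yaml``), and ``10m0s`` is an illustrative value, not a required one; only the ``runtimeRequestTimeout`` field is the relevant setting here.

.. code-block:: yaml

   # Fragment of a kubelet configuration file; merge into your existing
   # configuration rather than replacing it.
   apiVersion: kubelet.config.k8s.io/v1beta1
   kind: KubeletConfiguration
   # Raise the default 2m0s runtime request timeout so that large container
   # images can be pulled inside the confidential guest.
   runtimeRequestTimeout: 10m0s

Restart the kubelet on each affected node after changing its configuration.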

.. _installation-and-configuration:

Installation and Configuration
==============================

Overview
--------

Installing and configuring your cluster to support the NVIDIA GPU Operator with confidential containers consists of the following high-level steps:

1. Label the worker nodes that you want to use with confidential containers.

   This step ensures that you can continue to run traditional container workloads with GPU or vGPU workloads on some nodes in your cluster. Alternatively, you can set the default sandbox workload parameter to ``vm-passthrough`` to run confidential containers on all worker nodes when you install the GPU Operator.

2. Install the latest Kata Containers Helm chart (minimum version: 3.24.0).

   This step installs all required components from the Kata Containers project, including the Kata Containers runtime binary, the runtime configuration, and the UVM kernel and initrd that NVIDIA uses for confidential containers and native Kata containers.

3. Install the latest version of the NVIDIA GPU Operator (minimum version: v25.10.0).

   You install the Operator and specify options to deploy the operands that are required for confidential containers.

After installation, you can change the confidential computing mode and run a sample GPU workload in a confidential container.

Label Nodes and Install the Kata Containers Helm Chart
--------------------------------------------------------

Perform the following steps to install and verify the Kata Containers Helm chart:

1. Label the nodes on which you intend to run confidential containers as follows::

      $ kubectl label node <node-name> nvidia.com/gpu.workload.config=vm-passthrough
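
   As an optional check (not part of the upstream procedure), you can confirm which nodes carry the label by using a standard label selector::

      $ kubectl get nodes -l nvidia.com/gpu.workload.config=vm-passthrough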

2. Set the Kata Containers version and chart location in environment variables::

      $ export VERSION="3.24.0"
      $ export CHART="oci://ghcr.io/kata-containers/kata-deploy-charts/kata-deploy"

3. Install the chart::

      $ helm install kata-deploy \
          --namespace kata-system \
          --create-namespace \
          -f "https://raw.githubusercontent.com/kata-containers/kata-containers/refs/tags/${VERSION}/tools/packaging/kata-deploy/helm-chart/kata-deploy/try-kata-nvidia-gpu.values.yaml" \
          --set nfd.enabled=false \
          --set shims.qemu-nvidia-gpu-tdx.enabled=false \
          --wait --timeout 10m --atomic \
          "${CHART}" --version "${VERSION}"

   *Example Output*::

      Pulled: ghcr.io/kata-containers/kata-deploy-charts/kata-deploy:3.24.0
      Digest: sha256:d87e4f3d93b7d60eccdb3f368610f2b5ca111bfcd7133e654d08cfd192fb3351
      NAME: kata-deploy
      LAST DEPLOYED: Wed Dec 17 20:01:53 2025
      NAMESPACE: kata-system
      STATUS: deployed
      REVISION: 1
      TEST SUITE: None

4. Optional: View the pod in the ``kata-system`` namespace and ensure that it is running::

      $ kubectl get pod,svc -n kata-system

   *Example Output*::

      NAME                    READY   STATUS    RESTARTS   AGE
      pod/kata-deploy-4f658   1/1     Running   0          21s

   Wait a few minutes for kata-deploy to create the base runtime classes.

5. Verify that the ``kata-qemu-nvidia-gpu`` and ``kata-qemu-nvidia-gpu-snp`` runtime classes are available::

      $ kubectl get runtimeclass

   *Example Output*::

      NAME                       HANDLER                    AGE
      kata-qemu-nvidia-gpu       kata-qemu-nvidia-gpu       40s
      kata-qemu-nvidia-gpu-snp   kata-qemu-nvidia-gpu-snp   40s

Install the NVIDIA GPU Operator
--------------------------------

Perform the following steps to install the Operator for use with confidential containers:

1. Add and update the NVIDIA Helm repository::

      $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
          && helm repo update

2. Specify at least the following options when you install the Operator. If you want to run Confidential Containers by default on all worker nodes, also specify ``--set sandboxWorkloads.defaultWorkload=vm-passthrough``::

      $ helm install --wait --generate-name \
          -n gpu-operator --create-namespace \
          nvidia/gpu-operator \
          --set sandboxWorkloads.enabled=true \
          --set kataManager.enabled=true \
          --set kataManager.config.runtimeClasses=null \
          --set kataManager.repository=nvcr.io/nvidia/cloud-native \
          --set kataManager.image=k8s-kata-manager \
          --set kataManager.version=v0.2.4 \
          --set ccManager.enabled=true \
          --set ccManager.defaultMode=on \
          --set ccManager.repository=nvcr.io/nvidia/cloud-native \
          --set ccManager.image=k8s-cc-manager \
          --set ccManager.version=v0.2.0 \
          --set sandboxDevicePlugin.repository=nvcr.io/nvidia/cloud-native \
          --set sandboxDevicePlugin.image=nvidia-sandbox-device-plugin \
          --set sandboxDevicePlugin.version=v0.0.1 \
          --set 'sandboxDevicePlugin.env[0].name=P_GPU_ALIAS' \
          --set 'sandboxDevicePlugin.env[0].value=pgpu' \
          --set nfd.enabled=true \
          --set nfd.nodefeaturerules=true

   *Example Output*::

      NAME: gpu-operator-1766001809
      LAST DEPLOYED: Wed Dec 17 20:03:29 2025
      NAMESPACE: gpu-operator
      STATUS: deployed
      REVISION: 1
      TEST SUITE: None

3. Verify that all GPU Operator pods, especially the Kata Manager, Confidential Computing Manager, Sandbox Device Plugin, and VFIO Manager operands, are running::

      $ kubectl get pods -n gpu-operator

   *Example Output*::

      NAME                                                              READY   STATUS    RESTARTS   AGE
      gpu-operator-1766001809-node-feature-discovery-gc-75776475sxzkp   1/1     Running   0          86s
      gpu-operator-1766001809-node-feature-discovery-master-6869lxq2g   1/1     Running   0          86s
      gpu-operator-1766001809-node-feature-discovery-worker-mh4cv       1/1     Running   0          86s
      gpu-operator-f48fd66b-vtfrl                                        1/1     Running   0          86s
      nvidia-cc-manager-7z74t                                            1/1     Running   0          61s
      nvidia-kata-manager-k8ctm                                          1/1     Running   0          62s
      nvidia-sandbox-device-plugin-daemonset-d5rvg                       1/1     Running   0          30s
      nvidia-sandbox-validator-6xnzc                                     1/1     Running   1          30s
      nvidia-vfio-manager-h229x                                          1/1     Running   0          62s

4. If the nvidia-cc-manager pod is *not* running, label your CC-capable nodes manually; the node labeling capabilities in the Early Access version are not complete. To label your nodes, run::

      $ kubectl label node <node-name> nvidia.com/cc.capable=true
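
   As an optional check (not part of the upstream procedure), you can list the nodes that carry the label by using a standard label selector::

      $ kubectl get nodes -l nvidia.com/cc.capable=true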

5. Optional: If you have host access to the worker node, you can perform the following validation steps:

   a. Confirm that the host uses the ``vfio-pci`` device driver for GPUs::

         $ lspci -nnk -d 10de:

      *Example Output*::

         65:00.0 3D controller [0302]: NVIDIA Corporation xxxxxxx [xxx] [10de:xxxx] (rev xx)
                 Subsystem: NVIDIA Corporation xxxxxxx [xxx] [10de:xxxx]
                 Kernel driver in use: vfio-pci
                 Kernel modules: nvidiafb, nouveau

   b. Confirm that the kata-deploy functionality installed the ``kata-qemu-nvidia-gpu-snp`` and ``kata-qemu-nvidia-gpu`` runtime class files::

         $ ls -l /opt/kata/share/defaults/kata-containers/ | grep nvidia

      *Example Output*::

         -rw-r--r-- 1 root root  3333 Dec 17 20:01 configuration-qemu-nvidia-gpu-snp.toml
         -rw-r--r-- 1 root root 30812 Dec 12 17:41 configuration-qemu-nvidia-gpu-tdx.toml
         -rw-r--r-- 1 root root 30279 Dec 12 17:41 configuration-qemu-nvidia-gpu.toml

   c. Confirm that the kata-deploy functionality installed the UVM components::

         $ ls -l /opt/kata/share/kata-containers/ | grep nvidia

      *Example Output*::

         lrwxrwxrwx 1 root root 58 Dec 17 20:01 kata-containers-initrd-nvidia-gpu-confidential.img -> kata-ubuntu-noble-nvidia-gpu-confidential-580.95.05.initrd
         lrwxrwxrwx 1 root root 45 Dec 17 20:01 kata-containers-initrd-nvidia-gpu.img -> kata-ubuntu-noble-nvidia-gpu-580.95.05.initrd
         lrwxrwxrwx 1 root root 57 Dec 17 20:01 kata-containers-nvidia-gpu-confidential.img -> kata-ubuntu-noble-nvidia-gpu-confidential-580.95.05.image
         lrwxrwxrwx 1 root root 44 Dec 17 20:01 kata-containers-nvidia-gpu.img -> kata-ubuntu-noble-nvidia-gpu-580.95.05.image
         lrwxrwxrwx 1 root root 42 Dec 17 20:01 vmlinux-nvidia-gpu-confidential.container -> vmlinux-6.16.7-173-nvidia-gpu-confidential
         lrwxrwxrwx 1 root root 30 Dec 17 20:01 vmlinux-nvidia-gpu.container -> vmlinux-6.12.47-173-nvidia-gpu

Run a Sample Workload
----------------------

Perform the following steps to run a sample GPU workload in a confidential container:

1. Create a file, such as the following ``cuda-vectoradd-kata.yaml`` sample, specifying the ``kata-qemu-nvidia-gpu-snp`` runtime class:

   .. code-block:: yaml

      apiVersion: v1
      kind: Pod
      metadata:
        name: cuda-vectoradd-kata
        namespace: default
        annotations:
          io.katacontainers.config.hypervisor.kernel_params: "nvrc.smi.srs=1"
      spec:
        runtimeClassName: kata-qemu-nvidia-gpu-snp
        restartPolicy: Never
        containers:
        - name: cuda-vectoradd
          image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
          resources:
            limits:
              nvidia.com/pgpu: "1"
              memory: 16Gi

2. Create the pod::

      $ kubectl apply -f cuda-vectoradd-kata.yaml
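
   Booting the confidential VM and pulling the container image inside the guest can take several minutes. As an optional convenience (not part of the upstream procedure), you can wait for the pod to complete before reading its logs::

      $ kubectl wait --for=jsonpath='{.status.phase}'=Succeeded \
          pod/cuda-vectoradd-kata -n default --timeout=15m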

3. View the logs from the pod after the container starts::

      $ kubectl logs -n default cuda-vectoradd-kata

   *Example Output*::

      [Vector addition of 50000 elements]
      Copy input data from the host memory to the CUDA device
      CUDA kernel launch with 196 blocks of 256 threads
      Copy output data from the CUDA device to the host memory
      Test PASSED
      Done

4. Delete the pod::

      $ kubectl delete -f cuda-vectoradd-kata.yaml

.. _managing-confidential-computing-mode:

Managing the Confidential Computing Mode
=========================================

You can set the default confidential computing mode of the NVIDIA GPUs by setting the ``ccManager.defaultMode=<on|off|devtools>`` option. The default value is ``off``. You can set this option when you install the NVIDIA GPU Operator, or afterward by modifying the ``cluster-policy`` instance of the ``ClusterPolicy`` object.

When you change the mode, the manager performs the following actions:

* Evicts the other GPU Operator operands from the node.

  However, the manager does not drain user workloads. You must make sure that no user workloads are running on the node before you change the mode.

* Unbinds the GPU from the VFIO PCI device driver.
* Changes the mode and resets the GPU.
* Reschedules the other GPU Operator operands.

Three modes are supported:

* ``on`` -- Enable confidential computing.
* ``off`` -- Disable confidential computing.
* ``devtools`` -- Development mode for software development and debugging.

You can set a cluster-wide default mode and you can set the mode on individual nodes. The mode that you set on a node takes precedence over the cluster-wide default mode.

Setting a Cluster-Wide Default Mode
------------------------------------

To set a cluster-wide mode, patch the ``ccManager.defaultMode`` field as in the following example::

   $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
       --type=merge \
       -p '{"spec": {"ccManager": {"defaultMode": "on"}}}'
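
As an optional check (not part of the upstream procedure), you can read the value back from the ``ClusterPolicy`` object::

   $ kubectl get clusterpolicies.nvidia.com/cluster-policy \
       -o jsonpath='{.spec.ccManager.defaultMode}'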

Setting a Node-Level Mode
--------------------------

To set a node-level mode, apply the ``nvidia.com/cc.mode=<on|off|devtools>`` label as in the following example::

   $ kubectl label node <node-name> nvidia.com/cc.mode=on --overwrite

The mode that you set on a node takes precedence over the cluster-wide default mode.
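
If you later want a node to follow the cluster-wide default again, you can remove the node-level label with the standard ``kubectl`` trailing-dash syntax (an optional step, not part of the upstream procedure)::

   $ kubectl label node <node-name> nvidia.com/cc.mode-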

Verifying a Mode Change
------------------------

To verify that a mode change, whether cluster-wide or node-level, was successful, view the ``nvidia.com/cc.mode`` and ``nvidia.com/cc.mode.state`` node labels::

   $ kubectl get node <node-name> -o json | \
       jq '.metadata.labels | with_entries(select(.key | startswith("nvidia.com/cc.mode")))'

Example output when CC mode is disabled:

.. code-block:: json

   {
     "nvidia.com/cc.mode": "off",
     "nvidia.com/cc.mode.state": "on"
   }

Example output when CC mode is enabled:

.. code-block:: json

   {
     "nvidia.com/cc.mode": "on",
     "nvidia.com/cc.mode.state": "on"
   }

The ``nvidia.com/cc.mode.state`` label is either ``off`` or ``on``: ``off`` means that the mode transition is still in progress, and ``on`` means that the transition is complete.

.. _attestation:

Attestation
===========

Confidential Containers has built-in remote attestation support for the CPU and GPU. Attestation allows a workload owner to cryptographically validate the guest TCB. This process is facilitated by components inside the guest rootfs. When a secret resource is required inside the confidential guest (to decrypt a container image or a model, for instance), the guest components detect which CPU and GPU enclaves are in use and collect hardware evidence from them. This evidence is sent to a remote verifier and broker known as Trustee, which evaluates the evidence and conditionally releases secrets. Features that depend on secrets therefore depend on attestation; these features include pulling encrypted images, authenticated registry support, sealed secrets, direct workload requests for secrets, and more. To use these features, Trustee must first be provisioned in a trusted environment.

Trustee can be set up by following the `upstream documentation <https://confidentialcontainers.org/docs/attestation/installation/>`_, with one key requirement for attesting NVIDIA devices: Trustee must be configured to use the remote NVIDIA verifier, which uses NRAS to evaluate the evidence. This verifier is not enabled by default. Enabling the remote verifier assumes that you have entered into a `licensing agreement <https://docs.nvidia.com/attestation/cloud-services/latest/license.html>`_ covering NVIDIA attestation services.

To enable the remote verifier, add the following lines to the Trustee configuration file::

   [attestation_service.verifier_config.nvidia_verifier]
   type = "Remote"

If you are using the Docker Compose Trustee deployment, add the verifier type to ``kbs/config/as-config.json`` prior to starting Trustee.

Per the upstream documentation, add the following annotation to the workload to point the guest components to Trustee::

   io.katacontainers.config.hypervisor.kernel_params: "agent.aa_kbc_params=cc_kbc::http://<kbs-ip>:<kbs-port>"
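
For illustration only, the following sketch shows the annotation in the context of the earlier ``cuda-vectoradd-kata.yaml`` sample; ``<kbs-ip>`` and ``<kbs-port>`` are placeholders for your Trustee deployment. Because attestation sets the GPU to ready, as noted later on this page, the ``nvrc.smi.srs=1`` parameter from the earlier sample is omitted here.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: cuda-vectoradd-kata
     namespace: default
     annotations:
       # Point the guest components at the Trustee key broker service (KBS).
       io.katacontainers.config.hypervisor.kernel_params: "agent.aa_kbc_params=cc_kbc::http://<kbs-ip>:<kbs-port>"
   spec:
     runtimeClassName: kata-qemu-nvidia-gpu-snp
     restartPolicy: Never
     containers:
     - name: cuda-vectoradd
       image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
       resources:
         limits:
           nvidia.com/pgpu: "1"
           memory: 16Gi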

Now the guest can be used with attestation. For more information about how to provision Trustee with resources and policies, see the upstream documentation.

During attestation, the GPU is set to ready. As a result, when you run a workload that performs attestation, it is not necessary to set the ``nvrc.smi.srs=1`` kernel parameter.

If attestation does not succeed, debugging is best done by inspecting the Trustee log. You can enable debug mode by setting ``RUST_LOG=debug`` in the Trustee environment.

.. _additional-resources:

Additional Resources
====================

* NVIDIA Confidential Computing documentation: https://docs.nvidia.com/confidential-computing
* Trustee upstream documentation: https://confidentialcontainers.org/docs/attestation/
* Trustee NVIDIA verifier documentation: https://github.com/confidential-containers/trustee/blob/main/deps/verifier/src/nvidia/README.md
