The NVIDIA Device Plugin for Kubernetes exposes GPUs to the Kubernetes scheduler, allowing pods to request nvidia.com/gpu resources.
- NVIDIA Device Plugin: 0.17.1
```bash
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm install nvdp nvdp/nvidia-device-plugin \
  --version 0.17.1 \
  --namespace nvidia-device-plugin \
  --create-namespace
```

I initially tried the full NVIDIA GPU Operator, which bundles the driver, container runtime, device plugin, and monitoring into a single install. It didn't work reliably on this setup (Debian 13 with manually installed drivers), so it was uninstalled:
```bash
helm uninstall gpu-operator -n gpu-operator
kubectl delete namespace gpu-operator --force --grace-period=0
```

The standalone device plugin via Helm is simpler and more predictable when you already have:
- NVIDIA drivers installed at the OS level
- containerd configured with the NVIDIA runtime
- CDI (Container Device Interface) generated
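
The last two prerequisites can be satisfied with the NVIDIA Container Toolkit. A sketch of the host-side setup, assuming `nvidia-ctk` is already installed (the output path is the toolkit's default CDI location):

```bash
# Add the NVIDIA runtime to /etc/containerd/config.toml
sudo nvidia-ctk runtime configure --runtime=containerd
sudo systemctl restart containerd

# Generate a CDI spec describing the GPUs on this host
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# Confirm the spec lists the expected devices
nvidia-ctk cdi list
```

These commands require root and NVIDIA drivers on the host; regenerate the CDI spec after any driver upgrade.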
```bash
# Check the device plugin daemonset is running
kubectl -n nvidia-device-plugin get pods

# Check GPUs are visible to the scheduler
kubectl get nodes -o jsonpath='{.items[*].status.allocatable.nvidia\.com/gpu}'
# Expected output: 3

# List GPU devices
kubectl -n nvidia-device-plugin logs ds/nvdp-nvidia-device-plugin | grep "Device found"
```

If pods show `CreateContainerConfigError` with "endpoint not found in cache", restart the device plugin daemonset:
```bash
kubectl -n nvidia-device-plugin rollout restart daemonset nvdp-nvidia-device-plugin
```

See Known Issues for more details.
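
As an end-to-end check, a pod can request one of the GPUs and run `nvidia-smi`. A minimal sketch (the pod name and CUDA image tag are illustrative, not part of this setup):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test        # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # any CUDA base image works
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # scheduler places the pod on a node with a free GPU
```

Apply it with `kubectl apply -f`, then `kubectl logs gpu-smoke-test` should show the `nvidia-smi` device table if the driver, runtime, and plugin are all wired up correctly.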