Remove bash/shell dependencies from the final image to reduce security risks #912

@leiyiz

FR description

Currently, the k8s-dra-driver-gpu container image relies on several bash scripts during runtime:

  1. hack/kubelet-plugin-prestart.sh (Used as the entrypoint for the init-kubelet-plugin initContainer).
  2. scripts/bind_to_driver.sh (Executed via os/exec in the Go kubelet plugin).
  3. scripts/unbind_from_driver.sh (Executed via os/exec in the Go kubelet plugin).

Because of these runtime dependencies, the Dockerfile has to explicitly download and copy a static bash binary (ghcr.io/nvidia/k8s-dra-driver-gpu:v25.12.0-dev-839e966a) into the final image.

This reliance on a shell prevents the project from adopting shell-free distroless base images (like gcr.io/distroless/static or gcr.io/distroless/base). In highly regulated enterprise Kubernetes environments, where supply chain security and a minimal attack surface are critical, security policies increasingly flag containers that ship a shell.

Describe the solution you'd like

I propose we refactor the logic currently contained in these bash scripts directly into the Go codebase. This would allow us to completely eliminate the bash dependency from the final container image.

Proposed Implementation Path:

  1. bind_to_driver.sh & unbind_from_driver.sh:

    • These scripts primarily read from and write to sysfs (/sys/bus/pci/...) and procfs (/proc/driver/nvidia/...).
    • We can replace the exec.Command calls in cmd/gpu-kubelet-plugin/vfio-device.go with native Go file I/O (os.WriteFile,
      os.ReadFile, etc.). This will also improve error handling, as we won't have to parse stdout/stderr from a subprocess.
  2. kubelet-plugin-prestart.sh (lower priority):

    • This script is the entrypoint of the init container. It loops, checking for the presence and health of nvidia-smi and libnvidia-ml.so.1 on the host mount.
    • We can implement this logic as a new subcommand in the existing gpu-kubelet-plugin binary (e.g., gpu-kubelet-plugin prestart-init) or as a tiny, separate Go binary built alongside the others.

Additional thoughts

I'd also love to hear whether there were specific reasons for choosing shell scripts over Go code.

Labels

debuggability (issue/pr related to the ability to debug the system), security
