Kubernetes-native way to handle vGPU tokens and license validation.
NVIDIA vGPU guests fetch licenses using JWT client configuration tokens via the nvidia-gridd service. While the NVIDIA GPU Operator facilitates licensing for its own driver deployments via mounted ConfigMaps, it doesn't solve the problem for systems with pre-installed vGPU drivers. In these environments, there is no built-in mechanism to automatically update expired JWTs, forcing manual intervention on every node. This project addresses that gap by offering a Kubernetes-native solution to streamline the token update process for these pre-installed driver setups.
## Install tools
The vGPU token operator is deployed using a Helm chart, which can be found in `charts/`.
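For example, a basic install from a local checkout of this repository might look like the following (a sketch; the chart directory, release name, and namespace flags are assumptions):

```sh
# Install the chart from the local charts/ directory into the vgpu-system namespace
helm install vgpu-token-operator ./charts/vgpu-token-operator \
  --namespace vgpu-system \
  --create-namespace
```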
Absolutely required:

- Kubernetes cluster
- NVIDIA GPU Operator deployed

Things you probably have if you're looking at this project:

- A valid vGPU token with a corresponding license server
- vGPU drivers on the host hypervisor
- A VM image with the vGPU driver installed
To deploy the Helm chart to your cluster during development, run (NOTE: set `OCI_REPOSITORY` to a repository that you have push access to):

```sh
make helm-install-snapshot
```
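For example, `OCI_REPOSITORY` can be passed directly on the `make` command line (the registry path below is a placeholder):

```sh
make helm-install-snapshot OCI_REPOSITORY=ghcr.io/<your-org>/vgpu-token-operator
```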
- Create the secret.

  NOTE: It is critical that the key for the token value is `client_configuration_token.tok`. Otherwise, the mounts for the DaemonSet will fail.

  ```yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: client-config-token
    namespace: vgpu-system
  stringData:
    client_configuration_token.tok: "${VGPU_TOKEN_VALUE}"
  ```
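  Alternatively, the same secret can be created directly from the token file with `kubectl` (a sketch; the local file path is a placeholder):

  ```sh
  # Creates the secret with the required client_configuration_token.tok key
  kubectl create secret generic client-config-token \
    --namespace vgpu-system \
    --from-file=client_configuration_token.tok=/path/to/client_configuration_token.tok
  ```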
- Create the `VGPUToken` object, setting `tokenSecretRef` to the same name as the secret created above.

  ```yaml
  apiVersion: vgpu-token.nutanix.com/v1alpha1
  kind: VGPUToken
  metadata:
    name: vgpu-token
    namespace: vgpu-system
  spec:
    tokenSecretRef:
      name: client-config-token
  ```
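Both resources can then be applied with `kubectl` (a sketch; the manifest file names are placeholders, and the resource name used by `kubectl get` depends on how the CRD is registered):

```sh
kubectl apply -f client-config-token-secret.yaml
kubectl apply -f vgpu-token.yaml

# Confirm the custom resource exists (plural resource name assumed to be "vgputokens")
kubectl get vgputokens -n vgpu-system
```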
After creating these resources, the token secret should be mounted on the host at `/etc/nvidia/ClientConfigToken/client_configuration_token.tok`.
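One way to confirm the file landed on a node is an ephemeral node debug pod, which mounts the node's root filesystem under `/host` (a sketch; the node name is a placeholder, and the debug pod should be deleted afterwards):

```sh
kubectl debug node/<gpu-node-name> -it --image=busybox -- \
  ls -l /host/etc/nvidia/ClientConfigToken/
```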
Finally, we can verify licensing by creating a pod that runs `nvidia-smi -q` and checking the license status in its output:

```yaml
apiVersion: v1
kind: Pod
metadata:
  generateName: gpu-pod-
  labels:
    test: gpu-pod
spec:
  restartPolicy: OnFailure
  containers:
    - name: gpu-pod
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0
      command: ["nvidia-smi", "-q"]
      resources:
        limits:
          nvidia.com/gpu: 1
  nodeSelector:
    "nvidia.com/gpu.present": "true"
  tolerations:
    - key: "nvidia.com/gpu"
      operator: "Exists"
      effect: "NoSchedule"
```
In the pod logs, you should see that the product is licensed:

```
vGPU Software Licensed Product
    Product Name    : NVIDIA Virtual Compute Server
    License Status  : Licensed (Expiry: 2025-6-5 15:22:24 GMT)
```