Examples

Common usage patterns and scenarios for Mock NVML.

Basic Usage

Default Configuration (8x A100)

cd pkg/gpu/mocknvml
make
LD_LIBRARY_PATH=. nvidia-smi

With A100 YAML Profile

LD_LIBRARY_PATH=. MOCK_NVML_CONFIG=configs/mock-nvml-config-a100.yaml nvidia-smi

With GB200 YAML Profile

LD_LIBRARY_PATH=. MOCK_NVML_CONFIG=configs/mock-nvml-config-gb200.yaml nvidia-smi

nvidia-smi Commands

List GPUs

LD_LIBRARY_PATH=. nvidia-smi -L

Output:

GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-12345678-1234-1234-1234-123456780000)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-12345678-1234-1234-1234-123456780001)
...

Full Query

LD_LIBRARY_PATH=. MOCK_NVML_CONFIG=configs/mock-nvml-config-a100.yaml nvidia-smi -q

The mock library supports all fields in nvidia-smi -q output including temperature, power, clocks, ECC, PCIe, memory, utilization, persistence mode, GSP firmware, and energy consumption. Fields backed by configuration return profile values; unimplemented fields show "N/A".

XML Query

LD_LIBRARY_PATH=. nvidia-smi -x -q > gpu-report.xml

Produces valid XML output parseable by automated tools. Useful for testing GPU monitoring pipelines that consume nvidia-smi -x -q.

Query Specific Details

# Memory only
LD_LIBRARY_PATH=. nvidia-smi -q -d MEMORY

# Temperature only
LD_LIBRARY_PATH=. nvidia-smi -q -d TEMPERATURE

# Power only
LD_LIBRARY_PATH=. nvidia-smi -q -d POWER

# Clocks only
LD_LIBRARY_PATH=. nvidia-smi -q -d CLOCK

# ECC only
LD_LIBRARY_PATH=. nvidia-smi -q -d ECC

# PCIe only
LD_LIBRARY_PATH=. nvidia-smi -q -d PCIE

# Utilization only
LD_LIBRARY_PATH=. nvidia-smi -q -d UTILIZATION

CSV Output

LD_LIBRARY_PATH=. nvidia-smi --query-gpu=index,name,uuid,memory.total,power.draw,temperature.gpu --format=csv

Output:

index, name, uuid, memory.total [MiB], power.draw [W], temperature.gpu
0, NVIDIA A100-SXM4-40GB, GPU-12345678-1234-1234-1234-123456780000, 40960 MiB, 72.00 W, 33
1, NVIDIA A100-SXM4-40GB, GPU-12345678-1234-1234-1234-123456780001, 40960 MiB, 72.00 W, 34
...

Query Specific GPU

LD_LIBRARY_PATH=. nvidia-smi -i 0 -q

Custom GPU Profiles

Single GPU Configuration

# single-gpu.yaml
version: "1.0"

system:
  driver_version: "550.163.01"
  nvml_version: "12.550.163.01"
  cuda_version: "12.4"
  cuda_version_major: 12
  cuda_version_minor: 4

device_defaults:
  name: "NVIDIA RTX 4090"
  architecture: "ada"
  memory:
    total_bytes: 25769803776  # 24 GiB
  power:
    default_limit_mw: 450000
    current_draw_mw: 50000
  thermal:
    temperature_gpu_c: 45

devices:
  - index: 0
    uuid: "GPU-RTX4090-0000-0000-0000-000000000000"
    pci:
      bus_id: "00000000:01:00.0"

LD_LIBRARY_PATH=. MOCK_NVML_CONFIG=single-gpu.yaml nvidia-smi

Mixed GPU Configuration

# mixed-gpus.yaml
version: "1.0"

system:
  driver_version: "550.163.01"
  cuda_version_major: 12
  cuda_version_minor: 4

device_defaults:
  name: "NVIDIA A100"
  architecture: "ampere"
  memory:
    total_bytes: 42949672960

devices:
  - index: 0
    uuid: "GPU-A100-0000"
    name: "NVIDIA A100-SXM4-80GB"
    memory:
      total_bytes: 85899345920  # 80 GiB
    pci:
      bus_id: "00000000:07:00.0"
  
  - index: 1
    uuid: "GPU-H100-0001"
    name: "NVIDIA H100"
    architecture: "hopper"
    memory:
      total_bytes: 85899345920
    pci:
      bus_id: "00000000:0F:00.0"

Simulating GPU Under Load

# gpu-under-load.yaml
version: "1.0"

system:
  driver_version: "550.163.01"
  cuda_version_major: 12
  cuda_version_minor: 4

device_defaults:
  name: "NVIDIA A100-SXM4-40GB"
  memory:
    total_bytes: 42949672960
    used_bytes: 32212254720    # 30 GiB used
    free_bytes: 10737418240
  power:
    current_draw_mw: 350000    # 350W (near limit)
  thermal:
    temperature_gpu_c: 72      # Hot
  utilization:
    gpu: 95                    # 95% GPU utilization
    memory: 75                 # 75% memory bandwidth
  clocks:
    graphics_current: 1350
    sm_current: 1350
  clocks_throttle_reasons:
    sw_power_cap: true

devices:
  - index: 0
    uuid: "GPU-LOAD-0000"
    pci:
      bus_id: "00000000:07:00.0"
    processes:
      - pid: 12345
        type: "C"
        name: "python"
        used_memory_mib: 30720

Testing Scenarios

Testing NVIDIA Device Plugin

# Build mock library
make -C pkg/gpu/mocknvml

# Set up environment
export LD_LIBRARY_PATH=$(pwd)/pkg/gpu/mocknvml
export MOCK_NVML_CONFIG=$(pwd)/pkg/gpu/mocknvml/configs/mock-nvml-config-a100.yaml

# Run device plugin
./k8s-device-plugin --mps-root=""

Testing DCGM

export LD_LIBRARY_PATH=$(pwd)/pkg/gpu/mocknvml
export MOCK_NVML_CONFIG=$(pwd)/pkg/gpu/mocknvml/configs/mock-nvml-config-a100.yaml

# Run DCGM exporter
dcgm-exporter

CI/CD Pipeline Example

# .github/workflows/gpu-tests.yaml
name: GPU Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25'
      
      - name: Build Mock NVML
        run: make -C pkg/gpu/mocknvml docker-build
      
      - name: Run GPU Tests
        env:
          LD_LIBRARY_PATH: ./pkg/gpu/mocknvml
          MOCK_NVML_CONFIG: ./pkg/gpu/mocknvml/configs/mock-nvml-config-a100.yaml
        run: go test ./... -tags=gpu -v
      
      - name: Verify nvidia-smi
        env:
          LD_LIBRARY_PATH: ./pkg/gpu/mocknvml
          MOCK_NVML_CONFIG: ./pkg/gpu/mocknvml/configs/mock-nvml-config-a100.yaml
        run: nvidia-smi -L

Docker Integration

# Dockerfile
FROM golang:1.25

# Copy mock NVML library
COPY pkg/gpu/mocknvml/libnvidia-ml.so* /usr/local/lib/
COPY pkg/gpu/mocknvml/configs/mock-nvml-config-a100.yaml /etc/mock-nvml-config.yaml

# Set environment
ENV LD_LIBRARY_PATH=/usr/local/lib
ENV MOCK_NVML_CONFIG=/etc/mock-nvml-config.yaml

# Your application
COPY myapp /app/myapp
CMD ["/app/myapp"]

docker build -t myapp-with-mock-gpu .
docker run myapp-with-mock-gpu nvidia-smi

Kubernetes ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: mock-nvml-config
data:
  config.yaml: |
    version: "1.0"
    system:
      driver_version: "550.163.01"
      cuda_version_major: 12
      cuda_version_minor: 4
    device_defaults:
      name: "NVIDIA A100-SXM4-40GB"
      memory:
        total_bytes: 42949672960
    devices:
      - index: 0
        uuid: "GPU-TEST-0000"
        pci:
          bus_id: "00000000:00:00.0"
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  containers:
    - name: test
      image: myapp:latest
      env:
        - name: LD_LIBRARY_PATH
          value: /mock-nvml
        - name: MOCK_NVML_CONFIG
          value: /config/config.yaml
      volumeMounts:
        - name: mock-nvml
          mountPath: /mock-nvml
        - name: config
          mountPath: /config
  volumes:
    - name: mock-nvml
      hostPath:
        path: /path/to/mock-nvml
    - name: config
      configMap:
        name: mock-nvml-config

Debugging

Enable Debug Logging

MOCK_NVML_DEBUG=1 LD_LIBRARY_PATH=. nvidia-smi

Output includes:

[CONFIG] Loaded YAML config: 8 devices, driver 550.163.01
[ENGINE] Creating devices from YAML config
[DEVICE 0] Created: name=NVIDIA A100-SXM4-40GB uuid=GPU-12345678-...
[NVML] nvmlInit_v2
[NVML] nvmlDeviceGetCount
[NVML] nvmlDeviceGetHandleByIndex(0)
[NVML] nvmlDeviceGetName -> NVIDIA A100-SXM4-40GB
[NVML] nvmlDeviceGetTemperature(sensor=0) -> 33
...

Verify Library Loading

# Check library is being loaded
LD_DEBUG=libs LD_LIBRARY_PATH=. nvidia-smi 2>&1 | grep nvml

# List exported symbols
nm -D pkg/gpu/mocknvml/libnvidia-ml.so | grep nvml | head -20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Examples

Basic Usage

Default Configuration (8x A100)

With A100 YAML Profile

With GB200 YAML Profile

nvidia-smi Commands

List GPUs

Full Query

XML Query

Query Specific Details

CSV Output

Query Specific GPU

Custom GPU Profiles

Single GPU Configuration

Mixed GPU Configuration

Simulating GPU Under Load

Testing Scenarios

Testing NVIDIA Device Plugin

Testing DCGM

CI/CD Pipeline Example

Docker Integration

Kubernetes ConfigMap

Debugging

Enable Debug Logging

Verify Library Loading

FilesExpand file tree

examples.md

Latest commit

History

examples.md

File metadata and controls

Examples

Basic Usage

Default Configuration (8x A100)

With A100 YAML Profile

With GB200 YAML Profile

nvidia-smi Commands

List GPUs

Full Query

XML Query

Query Specific Details

CSV Output

Query Specific GPU

Custom GPU Profiles

Single GPU Configuration

Mixed GPU Configuration

Simulating GPU Under Load

Testing Scenarios

Testing NVIDIA Device Plugin

Testing DCGM

CI/CD Pipeline Example

Docker Integration

Kubernetes ConfigMap

Debugging

Enable Debug Logging

Verify Library Loading