
Commit c9442f1 (parent: 1ace8b2)

changes in response to kevin's last review

Signed-off-by: Dr. Jan-Philip Gehrcke <jgehrcke@nvidia.com>

2 files changed: 17 additions, 7 deletions

gpu-operator/dra-cds.rst

Lines changed: 2 additions & 2 deletions
@@ -18,13 +18,13 @@ Motivation
 NVIDIA's `GB200 NVL72 <https://www.nvidia.com/en-us/data-center/gb200-nvl72/>`_ and comparable systems are designed specifically around Multi-Node NVLink (`MNNVL <https://docs.nvidia.com/multi-node-nvlink-systems/mnnvl-user-guide/overview.html>`_) to turn a rack of GPU machines -- each with a small number of GPUs -- into a supercomputer with a large number of GPUs communicating at high bandwidth (1.8 TB/s chip-to-chip, and over `130 TB/s cumulative bandwidth <https://docs.nvidia.com/multi-node-nvlink-systems/multi-node-tuning-guide/overview.html#fifth-generation-nvlink>`_ on a GB200 NVL72).

 NVIDIA's DRA Driver for GPUs enables MNNVL for Kubernetes workloads by introducing a new concept -- the **ComputeDomain**:
-when workload requests a ComputeDomain, NVIDIA's DRA Driver for GPUs performs all the heavy lifting required for sharing GPU memory **securely** via NVLink among all pods that comprise the workload.
+when a workload requests a ComputeDomain, NVIDIA's DRA Driver for GPUs performs all the heavy lifting required for sharing GPU memory **securely** via NVLink among all pods that comprise the workload.

 .. note::

    Users may appreciate to know that -- under the hood -- NVIDIA Internode Memory Exchange (`IMEX <https://docs.nvidia.com/multi-node-nvlink-systems/mnnvl-user-guide/overview.html#internode-memory-exchange-service>`_) primitives need to be orchestrated for mapping GPU memory over NVLink *securely*: IMEX provides an access control system to lock down GPU memory even between GPUs on the same NVLink partition.

-   A design goal of this DRA driver is to make IMEX, as much as possible, an implementation detail that workload authors and cluster operators do not need to be concerned with: the driver launches and/or reconfigures IMEX daemons and establishes and injects IMEX channels into containers as needed.
+   A design goal of this DRA driver is to make IMEX, as much as possible, an implementation detail that workload authors and cluster operators do not need to be concerned with: the driver launches and/or reconfigures IMEX daemons and establishes and injects `IMEX channels <https://docs.nvidia.com/multi-node-nvlink-systems/imex-guide/imexchannels.html>`_ into containers as needed.


 .. _dra-docs-cd-guarantees:
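
The changed paragraphs above describe the ComputeDomain workflow only in prose. As a purely illustrative sketch (resource names are placeholders, and the ``resource.nvidia.com/v1beta1`` API group and the ``numNodes``/``channel`` fields are assumed from the driver's upstream 25.3.0 examples, so they may differ in other releases), requesting a ComputeDomain for a two-node workload could look like this:

    $ kubectl apply -f - <<EOF
    apiVersion: resource.nvidia.com/v1beta1
    kind: ComputeDomain
    metadata:
      name: demo-compute-domain
    spec:
      # Number of nodes expected to join this ComputeDomain.
      numNodes: 2
      channel:
        resourceClaimTemplate:
          # Name of the ResourceClaimTemplate the driver generates; workload
          # pods reference it to get an IMEX channel injected.
          name: demo-compute-domain-channel
    EOF

Pods that reference the generated resource claim template are then wired up by the driver with the IMEX daemons and channels described in the note above.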

gpu-operator/dra-intro-install.rst

Lines changed: 15 additions & 5 deletions
@@ -49,8 +49,8 @@ Prerequisites

 - Kubernetes v1.32 or newer.
 - DRA and corresponding API groups must be enabled (`see Kubernetes docs <https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#enabling-dynamic-resource-allocation>`_).
-- GPU Driver 565 or later.
-- NVIDIA's GPU Operator v25.3.0 or later, installed with CDI enabled (use the ``--set cdi.enabled=true`` commandline argument during ``helm install``). For reference, please refer to the GPU Operator `installation documentation <https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html#common-chart-customization-options>`__.
+- NVIDIA GPU Driver 565 or later.
+- While not strictly required, we recommend using NVIDIA's GPU Operator v25.3.0 or later, installed with CDI enabled (use the ``--set cdi.enabled=true`` commandline argument during ``helm install``). For reference, please refer to the GPU Operator `installation documentation <https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html#common-chart-customization-options>`__.

 ..
    For convenience, the following example shows how to enable CDI upon GPU Operator installation:
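
For the CDI prerequisite above, enabling CDI at GPU Operator install time could look roughly like the following (a sketch only: the release name, namespace, and chart selection are assumptions, and the linked installation documentation remains authoritative):

    $ helm install gpu-operator nvidia/gpu-operator \
        --create-namespace --namespace gpu-operator \
        --set cdi.enabled=true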
@@ -80,15 +80,25 @@ Configure and Helm-install the driver
       $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
           && helm repo update

-#. Install the driver, providing install-time configuration parameters. Example:
+#. Install the DRA driver, providing install-time configuration parameters.
+
+   Example for *Operator-provided* GPU driver:

    .. code-block:: console

       $ helm install nvidia-dra-driver-gpu nvidia/nvidia-dra-driver-gpu \
         --version="25.3.0-rc.4" \
-        --create-namespace \
-        --namespace nvidia-dra-driver-gpu \
+        --create-namespace --namespace nvidia-dra-driver-gpu \
+        --set resources.gpus.enabled=false \
         --set nvidiaDriverRoot=/run/nvidia/driver \
+
+   Example for *host-provided* GPU driver:
+
+   .. code-block:: console
+
+      $ helm install nvidia-dra-driver-gpu nvidia/nvidia-dra-driver-gpu \
+        --version="25.3.0-rc.4" \
+        --create-namespace --namespace nvidia-dra-driver-gpu \
         --set resources.gpus.enabled=false

 All install-time configuration parameters can be listed by running ``helm show values nvidia/nvidia-dra-driver-gpu``.
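
After running either ``helm install`` variant above, a quick sanity check is to confirm that the release exists and that the driver's pods come up in the chosen namespace (pod names and counts vary by cluster; shown only as an example check):

    $ helm list --namespace nvidia-dra-driver-gpu
    $ kubectl get pods --namespace nvidia-dra-driver-gpu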
