Commit bae04dc (1 parent: 4c2ceac)

update openshift docs (#174)

* update vgpu and gpu pass through example in gpu/openshift docs, update openshift docs for clarity

Signed-off-by: Abigail McCarthy <[email protected]>

File tree: 2 files changed (+29, -10 lines)

gpu-operator/gpu-operator-kubevirt.rst

Lines changed: 11 additions & 5 deletions

@@ -224,7 +224,9 @@ The following example shows how to permit the A10 GPU device and A10-24Q vGPU de

         Subsystem: NVIDIA Corporation GA102GL [A10] [10de:1482]
         Kernel modules: nvidiafb, nouveau

-#. Modify the ``KubeVirt`` custom resource like the following partial example:
+#. Modify the ``KubeVirt`` custom resource like the following partial example.
+
+   Add

    .. code-block:: yaml

@@ -235,20 +237,24 @@ The following example shows how to permit the A10 GPU device and A10-24Q vGPU de

       featureGates:
       - GPU
       - DisableMDEVConfiguration
-      permittedHostDevices:
-        pciHostDevices:
+      permittedHostDevices: # Defines VM devices to import.
+        pciHostDevices: # Include for GPU passthrough
         - externalResourceProvider: true
           pciVendorSelector: 10DE:2236
           resourceName: nvidia.com/GA102GL_A10
-      mediatedDevices:
+      mediatedDevices: # Include for vGPU
        - externalResourceProvider: true
          mdevNameSelector: NVIDIA A10-24Q
          resourceName: nvidia.com/NVIDIA_A10-24Q
     ...

 Replace the values in the YAML as follows:

-* ``pciDeviceSelector`` and ``resourceName`` under ``pciHostDevices`` to correspond to your GPU model.
+* Include ``permittedHostDevices`` for GPU passthrough.
+
+* Include ``mediatedDevices`` for vGPU.
+
+* ``pciVendorSelector`` and ``resourceName`` under ``pciHostDevices`` to correspond to your GPU model.

 * ``mdevNameSelector`` and ``resourceName`` under ``mediatedDevices`` to correspond to your vGPU type.
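For readers who want the end state rather than the diff, the patched ``KubeVirt`` custom resource would look roughly like the sketch below. The ``spec.configuration.developerConfiguration`` nesting and the metadata values are assumptions based on the KubeVirt API, since the diff only shows a partial example:

```yaml
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt          # assumed default name
  namespace: kubevirt     # assumed default namespace
spec:
  configuration:
    developerConfiguration:
      featureGates:
      - GPU
      - DisableMDEVConfiguration
    permittedHostDevices:           # Defines VM devices to import.
      pciHostDevices:               # Include for GPU passthrough
      - externalResourceProvider: true
        pciVendorSelector: "10DE:2236"        # match your GPU model
        resourceName: nvidia.com/GA102GL_A10  # match your GPU model
      mediatedDevices:              # Include for vGPU
      - externalResourceProvider: true
        mdevNameSelector: NVIDIA A10-24Q           # match your vGPU type
        resourceName: nvidia.com/NVIDIA_A10-24Q    # match your vGPU type
```

As in the diff, include ``pciHostDevices`` only for GPU passthrough and ``mediatedDevices`` only for vGPU; both are shown here for completeness.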

openshift/openshift-virtualization.rst

Lines changed: 18 additions & 5 deletions
@@ -96,6 +96,9 @@ Prerequisites

    hyperconverged.hco.kubevirt.io/kubevirt-hyperconverged patched

+* If you plan to use NVIDIA vGPU and your GPUs are based on the NVIDIA Ampere architecture or later, SR-IOV must be enabled in the BIOS. Refer to the `NVIDIA vGPU Documentation <https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#prereqs-vgpu>`_ to ensure you have met all of the prerequisites for using NVIDIA vGPU.
+
+
 **********************************
 Enabling the IOMMU driver on hosts
 **********************************
@@ -297,14 +300,20 @@ Create the cluster policy using the CLI:

 #. Modify the ``clusterpolicy.json`` file as follows:

-   .. note:: The ``vgpuManager`` options are only required if you want to use the NVIDIA vGPU. If you are only using GPU passthrough, these options should not be set.
-
    * sandboxWorkloads.enabled=true
    * vgpuManager.enabled=true
    * vgpuManager.repository=<path to private repository>
    * vgpuManager.image=vgpu-manager
    * vgpuManager.version=<driver version>
    * vgpuManager.imagePullSecrets={<name of image pull secret>}
+
+   The ``vgpuManager`` options are only required if you want to use NVIDIA vGPU. If you are only using GPU passthrough, do not set these options.
+
+   In general, the ``sandboxWorkloads.enabled`` flag in ``ClusterPolicy`` controls whether the GPU Operator can provision GPU worker nodes for virtual machine workloads, in addition to container workloads. This flag is disabled by default, meaning all nodes are provisioned with the same software, which enables container workloads, and the ``nvidia.com/gpu.workload.config`` node label is not used.
+
+   The term ``sandboxing`` refers to running software in a separate, isolated environment, typically for added security (for example, a virtual machine). We use the term ``sandbox workloads`` for workloads that run in a virtual machine, irrespective of the virtualization technology used.

 #. Apply the changes:
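Taken together, the bullet items above could map into ``clusterpolicy.json`` roughly as in the following sketch. The nesting under ``spec`` is an assumption based on the GPU Operator ``ClusterPolicy`` schema; the diff itself only lists the option paths:

```json
{
  "spec": {
    "sandboxWorkloads": {
      "enabled": true
    },
    "vgpuManager": {
      "enabled": true,
      "repository": "<path to private repository>",
      "image": "vgpu-manager",
      "version": "<driver version>",
      "imagePullSecrets": ["<name of image pull secret>"]
    }
  }
}
```

Leave out the ``vgpuManager`` block entirely when using only GPU passthrough, per the note above.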

@@ -370,19 +379,23 @@ The following example permits the A10 GPU device and A10-24Q vGPU device.

    spec:
      featureGates:
        disableMDevConfiguration: true
-     permittedHostDevices:
-       pciHostDevices:
+     permittedHostDevices: # Defines VM devices to import.
+       pciHostDevices: # Include for GPU passthrough
        - externalResourceProvider: true
          pciDeviceSelector: 10DE:2236
          resourceName: nvidia.com/GA102GL_A10
-     mediatedDevices:
+     mediatedDevices: # Include for vGPU
       - externalResourceProvider: true
         mdevNameSelector: NVIDIA A10-24Q
         resourceName: nvidia.com/NVIDIA_A10-24Q
    ...

 Replace the values in the YAML as follows:

+* Include ``permittedHostDevices`` for GPU passthrough.
+
+* Include ``mediatedDevices`` for vGPU.
+
 * ``pciDeviceSelector`` and ``resourceName`` under ``pciHostDevices`` to correspond to your GPU model.

 * ``mdevNameSelector`` and ``resourceName`` under ``mediatedDevices`` to correspond to your vGPU type.
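On OpenShift Virtualization this configuration lives in the ``HyperConverged`` custom resource rather than ``KubeVirt`` directly. A sketch of the assembled end state of the hunk above follows; the ``apiVersion``, ``name``, and ``namespace`` shown are assumptions based on a typical OpenShift Virtualization install, not taken from the diff:

```yaml
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged   # assumed default name
  namespace: openshift-cnv        # assumed default namespace
spec:
  featureGates:
    disableMDevConfiguration: true
  permittedHostDevices:           # Defines VM devices to import.
    pciHostDevices:               # Include for GPU passthrough
    - externalResourceProvider: true
      pciDeviceSelector: "10DE:2236"        # match your GPU model
      resourceName: nvidia.com/GA102GL_A10  # match your GPU model
    mediatedDevices:              # Include for vGPU
    - externalResourceProvider: true
      mdevNameSelector: NVIDIA A10-24Q          # match your vGPU type
      resourceName: nvidia.com/NVIDIA_A10-24Q   # match your vGPU type
```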
