diff --git a/openshift/graphics/cluster_policy_configure_vgpu.png b/openshift/graphics/cluster_policy_configure_vgpu.png
new file mode 100644
index 000000000..9f82ae458
Binary files /dev/null and b/openshift/graphics/cluster_policy_configure_vgpu.png differ
diff --git a/openshift/graphics/cluster_policy_enable_sandbox_workloads.png b/openshift/graphics/cluster_policy_enable_sandbox_workloads.png
new file mode 100644
index 000000000..b074dbe7c
Binary files /dev/null and b/openshift/graphics/cluster_policy_enable_sandbox_workloads.png differ
diff --git a/openshift/graphics/cluster_policy_vGPU_confg.png b/openshift/graphics/cluster_policy_vGPU_confg.png
new file mode 100644
index 000000000..98f5343ca
Binary files /dev/null and b/openshift/graphics/cluster_policy_vGPU_confg.png differ
diff --git a/openshift/graphics/create_cluster_policy.png b/openshift/graphics/create_cluster_policy.png
new file mode 100644
index 000000000..81cb481ed
Binary files /dev/null and b/openshift/graphics/create_cluster_policy.png differ
diff --git a/openshift/graphics/navigate_to_cluster_policy.png b/openshift/graphics/navigate_to_cluster_policy.png
new file mode 100644
index 000000000..d205f3bfd
Binary files /dev/null and b/openshift/graphics/navigate_to_cluster_policy.png differ
diff --git a/openshift/openshift-virtualization.rst b/openshift/openshift-virtualization.rst
index d5e9dfd79..bb42695ee 100644
--- a/openshift/openshift-virtualization.rst
+++ b/openshift/openshift-virtualization.rst
@@ -51,12 +51,14 @@
 * ``VFIO Manager`` - Optional. To load vfio-pci and bind it to all GPUs on the node.
 * ``Sandbox Device Plugin`` - Optional. To discover and advertise the passthrough GPUs to the kubelet.
+* ``Sandbox Validator`` - Optional. Validates that the Sandbox Device Plugin is working.
 
 Node C receives the following software components:
 
 * ``NVIDIA vGPU Manager`` - To install the driver.
 * ``NVIDIA vGPU Device Manager`` - To create vGPU devices on the node.
 * ``Sandbox Device Plugin`` -Optional. To discover and advertise the vGPU devices to kubelet.
+* ``Sandbox Validator`` - Optional. Validates that the Sandbox Device Plugin is working.
 
 
 ******************************************
@@ -286,8 +288,8 @@
 private container registry).
 
 .. _install-cluster-policy-vGPU:
 
-Creating a ClusterPolicy for the GPU Operator
-=============================================
+Creating a ClusterPolicy for the GPU Operator using the OpenShift Container Platform CLI
+=========================================================================================
 
 As a cluster administrator, you can create a ClusterPolicy using the OpenShift Container Platform CLI. Create the cluster policy using the CLI:
@@ -330,6 +332,52 @@ Without additional configuration, the GPU Operator creates a default set of devi
 To learn more about how the vGPU Device Manager and configure which types of vGPU devices get created in your cluster, refer to :ref:`vGPU Device Configuration`.
 
+Creating a ClusterPolicy for the GPU Operator using the OpenShift Container Platform Web Console
+=================================================================================================
+
+As a cluster administrator, you can create a ClusterPolicy using the OpenShift Container Platform web console.
+
+#. Navigate to **Operators** > **Installed Operators** and find your installed NVIDIA GPU Operator.
+
+#. Under *Provided APIs*, click **ClusterPolicy**.
+
+   .. image:: graphics/navigate_to_cluster_policy.png
+
+#. Click **Create ClusterPolicy**.
+
+   .. image:: graphics/create_cluster_policy.png
+
+#. Expand the **NVIDIA GPU/vGPU Driver config** section.
+
+#. Expand the **Sandbox Workloads config** section and select the checkbox to enable sandbox workloads.
+
+   The sandbox workloads setting in the ``ClusterPolicy`` controls whether the GPU Operator can provision GPU worker nodes for virtual machine workloads in addition to container workloads. This setting is disabled by default, meaning that all nodes are provisioned with the same software, which enables container workloads only, and the ``nvidia.com/gpu.workload.config`` node label is not used.
+
+   The term ``sandboxing`` refers to running software in a separate, isolated environment, typically for added security (for example, a virtual machine). We use the term ``sandbox workloads`` to signify workloads that run in a virtual machine, irrespective of the virtualization technology used.
+
+   .. image:: graphics/cluster_policy_enable_sandbox_workloads.png
+
+#. If you are planning to use NVIDIA vGPU, expand the **NVIDIA vGPU Manager config** section and fill in your desired configuration settings, including:
+
+   * Select the **enabled** checkbox to enable the NVIDIA vGPU Manager.
+   * Add your **imagePullSecrets**.
+   * Under *driverManager*, fill in **repository** with the path to your private repository.
+   * Under *env*, fill in **image** with ``vgpu-manager`` and **version** with your driver version.
+
+   If you are only using GPU passthrough, you do not need to fill out this section.
+
+   .. image:: graphics/cluster_policy_configure_vgpu.png
+
+#. Click **Create** to create the ClusterPolicy.
+
+   The vGPU Device Manager, deployed by the GPU Operator, automatically creates vGPU devices that can be assigned to KubeVirt VMs.
+   Without additional configuration, the GPU Operator creates a default set of devices on all GPUs.
+   To learn more about the vGPU Device Manager and how to configure which types of vGPU devices get created in your cluster, refer to :ref:`vGPU Device Configuration`.
+   A minimal sketch of the resulting ``ClusterPolicy`` resource is shown below.
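+
+The web console form ultimately creates a ``ClusterPolicy`` custom resource. The following is a
+minimal, illustrative sketch of the relevant portion of that resource when sandbox workloads and
+the NVIDIA vGPU Manager are both enabled. The repository, version, and pull secret values are
+placeholders, and the exact field names should be verified against the ClusterPolicy CRD that is
+installed in your cluster.
+
+.. code-block:: yaml
+
+    apiVersion: nvidia.com/v1
+    kind: ClusterPolicy
+    metadata:
+      name: gpu-cluster-policy
+    spec:
+      sandboxWorkloads:
+        # Allow the GPU Operator to provision nodes for VM workloads in addition to containers.
+        enabled: true
+        # Workload type assumed for nodes without an nvidia.com/gpu.workload.config label.
+        defaultWorkload: container
+      vgpuManager:
+        # Only required when using NVIDIA vGPU; leave disabled if you only use GPU passthrough.
+        enabled: true
+        repository: <path to your private repository>
+        image: vgpu-manager
+        version: <your vGPU driver version>
+        imagePullSecrets:
+          - <your pull secret>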
+
+
 
 *******************************************************
 Add GPU Resources to the HyperConverged Custom Resource
 *******************************************************
@@ -404,7 +452,6 @@
 The following example permits the A10 GPU device and A10-24Q vGPU device.
 
 Refer to the `KubeVirt user guide `_ for more information on the configuration options.
 
-
 About Mediated Devices
 ======================