Commit 7e862ab

Merge pull request #65985 from ousleyp/cnv-24223
CNV#24223: updating mediated devices assembly for NVIDIA operator
2 parents ac90b48 + 363f6d3 commit 7e862ab

21 files changed: 280 additions and 271 deletions

_topic_maps/_topic_map.yml

Lines changed: 2 additions & 4 deletions
@@ -3803,10 +3803,8 @@ Topics:
     File: virt-schedule-vms
   - Name: Configuring PCI passthrough
     File: virt-configuring-pci-passthrough
-  - Name: Configuring vGPU passthrough
-    File: virt-configuring-vgpu-passthrough
-  - Name: Configuring mediated devices
-    File: virt-configuring-mediated-devices
+  - Name: Configuring virtual GPUs
+    File: virt-configuring-virtual-gpus
   - Name: Enabling descheduler evictions on virtual machines
     File: virt-enabling-descheduler-evictions
   - Name: About high availability for virtual machines

modules/about-using-gpu-operator.adoc

Lines changed: 3 additions & 13 deletions
@@ -1,21 +1,11 @@
 // Module included in the following assemblies:
 //
-// * virt/virtual_machines/advanced_vm_management/virt-configuring-mediated-devices.adoc
-
+// * virt/virtual_machines/advanced_vm_management/virt-configuring-virtual-gpus.adoc

 :_content-type: CONCEPT
 [id="about-using-nvidia-gpu_{context}"]
 = About using the NVIDIA GPU Operator

-The NVIDIA GPU Operator manages NVIDIA GPU resources in a {product-title} cluster and automates tasks related to bootstrapping GPU nodes. Because the GPU is a special resource in the cluster, you must install some components before you can deploy application workloads to the GPU. These components include the NVIDIA drivers that enable the compute unified device architecture (CUDA), Kubernetes device plugin, container runtime, and other features such as automatic node labeling, monitoring, and more.
-
-[NOTE]
-====
-The NVIDIA GPU Operator is supported only by NVIDIA. For more information about obtaining support from NVIDIA, see link:https://access.redhat.com/solutions/5174941[Obtaining Support from NVIDIA].
-====
-
-There are two ways to enable GPUs with {product-title} {VirtProductName}: the {product-title}-native way described here and by using the NVIDIA GPU Operator.
-
-The NVIDIA GPU Operator is a Kubernetes Operator that uses {product-title} {VirtProductName} to provision GPUs for virtualized workloads running on {product-title}. With the Operator, you can easily provision and manage GPU-enabled virtual machines to run complex artificial intelligence/machine learning (AI/ML) workloads on the same platform as their other workloads. The Operator also provides an easy way to scale the GPU capacity of their infrastructure, enabling rapid growth of GPU-based workloads.
+You can use the NVIDIA GPU Operator with {VirtProductName} to rapidly provision worker nodes for running GPU-enabled virtual machines (VMs). The NVIDIA GPU Operator manages NVIDIA GPU resources in an {product-title} cluster and automates tasks that are required when preparing nodes for GPU workloads.

-For more information about using the NVIDIA GPU Operator to provision worker nodes for running GPU-accelerated VMs, see link:https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/openshift/openshift-virtualization.html[NVIDIA GPU Operator with OpenShift Virtualization].
+Before you can deploy application workloads to a GPU resource, you must install components such as the NVIDIA drivers that enable the compute unified device architecture (CUDA), Kubernetes device plugin, container runtime, and other features, such as automatic node labeling and monitoring. By automating these tasks, you can quickly scale the GPU capacity of your infrastructure. The NVIDIA GPU Operator can especially facilitate provisioning complex artificial intelligence and machine learning (AI/ML) workloads.

modules/using-mediated-devices.adoc

Lines changed: 0 additions & 9 deletions
This file was deleted.
Lines changed: 6 additions & 9 deletions
@@ -1,23 +1,20 @@
 // Module included in the following assemblies:
 //
-// * virt/virtual_machines/advanced_vm_management/virt-configuring-mediated-devices.adoc
+// * virt/virtual_machines/advanced_vm_management/virt-configuring-virtual-gpus.adoc

 :_content-type: CONCEPT
-
 [id="about-changing-removing-mediated-devices_{context}"]
 = About changing and removing mediated devices

-The cluster's mediated device configuration can be updated with {VirtProductName} by:
+You can reconfigure or remove mediated devices in several ways:

-* Editing the `HyperConverged` CR and change the contents of the `mediatedDevicesTypes` stanza.
+* Edit the `HyperConverged` CR and change the contents of the `mediatedDeviceTypes` stanza.

-* Changing the node labels that match the `nodeMediatedDeviceTypes` node selector.
+* Change the node labels that match the `nodeMediatedDeviceTypes` node selector.

-* Removing the device information from the `spec.mediatedDevicesConfiguration` and `spec.permittedHostDevices` stanzas of the `HyperConverged` CR.
+* Remove the device information from the `spec.mediatedDevicesConfiguration` and `spec.permittedHostDevices` stanzas of the `HyperConverged` CR.
 +
 [NOTE]
 ====
 If you remove the device information from the `spec.permittedHostDevices` stanza without also removing it from the `spec.mediatedDevicesConfiguration` stanza, you cannot create a new mediated device type on the same node. To properly remove mediated devices, remove the device information from both stanzas.
-====
-
-Depending on the specific changes, these actions cause {VirtProductName} to reconfigure mediated devices or remove them from the cluster nodes.
+====
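
For orientation, the stanzas that this module names might look roughly like the following in the `HyperConverged` CR. This is a minimal sketch and not part of the commit: the mediated device types (`nvidia-231`, `nvidia-233`), the node hostname, and the `GRID T4-2Q` selector are placeholder values assumed for illustration.

```yaml
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  mediatedDevicesConfiguration:
    # Cluster-wide default mediated device types (placeholder value)
    mediatedDeviceTypes:
    - nvidia-231
    # Per-node overrides, matched by node labels (placeholder hostname and type)
    nodeMediatedDeviceTypes:
    - mediatedDeviceTypes:
      - nvidia-233
      nodeSelector:
        kubernetes.io/hostname: node-11.example.com
  permittedHostDevices:
    # Exposes the mediated device to VMs under a resource name (placeholder values)
    mediatedDevices:
    - mdevNameSelector: GRID T4-2Q
      resourceName: nvidia.com/GRID_T4-2Q
```

In terms of this sketch, removing a device cleanly means deleting its entries under both `mediatedDevicesConfiguration` and `permittedHostDevices`, as the note in the module describes.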

modules/virt-about-using-virtual-gpus.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 // Module included in the following assemblies:
 //
-// * virt/virtual_machines/advanced_vm_management/virt-configuring-mediated-devices.adoc
+// * virt/virtual_machines/advanced_vm_management/virt-configuring-virtual-gpus.adoc

 :_content-type: CONCEPT
 [id="virt-about-using-virtual-gpus_{context}"]

modules/virt-add-remove-mediated-devices.adoc

Lines changed: 0 additions & 9 deletions
This file was deleted.

modules/virt-adding-kernel-arguments-enable-iommu.adoc

Lines changed: 7 additions & 4 deletions
@@ -1,19 +1,22 @@
 // Module included in the following assemblies:
 //
 // * virt/virtual_machines/advanced_vm_management/configuring-pci-passthrough.adoc
+// * virt/virtual_machines/advanced_vm_management/virt-configuring-virtual-gpus.adoc

 :_content-type: PROCEDURE
 [id="virt-adding-kernel-arguments-enable-IOMMU_{context}"]
 = Adding kernel arguments to enable the IOMMU driver

-To enable the IOMMU (Input-Output Memory Management Unit) driver in the kernel, create the `MachineConfig` object and add the kernel arguments.
+To enable the IOMMU driver in the kernel, create the `MachineConfig` object and add the kernel arguments.

 .Prerequisites
-* Administrative privilege to a working {product-title} cluster.
-* Intel or AMD CPU hardware.
-* Intel Virtualization Technology for Directed I/O extensions or AMD IOMMU in the BIOS (Basic Input/Output System) is enabled.
+
+* You have cluster administrator permissions.
+* Your CPU hardware is Intel or AMD.
+* You enabled Intel Virtualization Technology for Directed I/O extensions or AMD IOMMU in the BIOS.

 .Procedure
+
 . Create a `MachineConfig` object that identifies the kernel argument. The following example shows a kernel argument for an Intel CPU.

 +
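
The hunk ends before the example that the procedure step refers to. As a rough illustration only, a `MachineConfig` object that adds the Intel IOMMU kernel argument to worker nodes could look like the following sketch. The object name `100-worker-iommu`, the `worker` role label, and the Ignition version are assumptions made for this example and are not taken from the diff.

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    # Applies the configuration to the worker machine config pool (assumed role)
    machineconfiguration.openshift.io/role: worker
  name: 100-worker-iommu   # hypothetical name
spec:
  config:
    ignition:
      version: 3.2.0
  kernelArguments:
    # Use amd_iommu=on instead on AMD CPU hardware
    - intel_iommu=on
```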

modules/virt-assign-vgpu-passthrough-to-vm.adoc

Lines changed: 0 additions & 31 deletions
This file was deleted.

modules/virt-assigning-mediated-device-virtual-machine.adoc renamed to modules/virt-assigning-vgpu-vm-cli.adoc

Lines changed: 6 additions & 5 deletions
@@ -1,16 +1,17 @@
 // Module included in the following assemblies:
 //
-// * virt/virtual_machines/advanced_vm_management/virt-configuring-mediated-devices.adoc
+// * virt/virtual_machines/advanced_vm_management/virt-configuring-virtual-gpus.adoc

 :_content-type: PROCEDURE
-[id="virt-assigning-mediated-device-virtual-machine_{context}"]
-= Assigning a mediated device to a virtual machine
+[id="virt-assigning-mdev-vm-cli_{context}"]
+= Assigning a vGPU to a VM by using the CLI

-Assign mediated devices such as virtual GPUs (vGPUs) to virtual machines.
+Assign mediated devices such as virtual GPUs (vGPUs) to virtual machines (VMs).

 .Prerequisites

 * The mediated device is configured in the `HyperConverged` custom resource.
+* The VM is stopped.

 .Procedure

@@ -27,7 +28,7 @@ spec:
       gpus:
       - deviceName: nvidia.com/TU104GL_Tesla_T4 <1>
         name: gpu1 <2>
-      - deviceName: nvidia.com/GRID_T4-1Q
+      - deviceName: nvidia.com/GRID_T4-2Q
         name: gpu2
 ----
 <1> The resource name associated with the mediated device.
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+// Module included in the following assemblies:
+//
+// * virt/virtual_machines/advanced_vm_management/virt-configuring-virtual-gpus.adoc
+
+[id="virt-assigning-vgpu-vm-web_{context}"]
+= Assigning a vGPU to a VM by using the web console
+
+You can assign virtual GPUs to virtual machines by using the {product-title} web console.
+[NOTE]
+====
+You can add hardware devices to virtual machines created from customized templates or a YAML file. You cannot add devices to pre-supplied boot source templates for specific operating systems.
+====
+
+.Prerequisites
+
+* The vGPU is configured as a mediated device in your cluster.
+** To view the devices that are connected to your cluster, click *Compute* -> *Hardware Devices* from the side menu.
+* The VM is stopped.
+
+.Procedure
+
+. In the {product-title} web console, click *Virtualization* -> *VirtualMachines* from the side menu.
+. Select the VM that you want to assign the device to.
+. On the *Details* tab, click *GPU devices*.
+. Click *Add GPU device*.
+. Enter an identifying value in the *Name* field.
+. From the *Device name* list, select the device that you want to add to the VM.
+. Click *Save*.
+
+.Verification
+* To confirm that the devices were added to the VM, click the *YAML* tab and review the `VirtualMachine` configuration. Mediated devices are added to the `spec.domain.devices` stanza.
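
When reviewing the YAML in the verification step, an assigned vGPU would appear roughly as in the following fragment. This is a sketch under assumptions: the VM name `example-vm` is hypothetical, the device values are borrowed from the CLI module in this commit, and only the fields relevant to GPU assignment are shown. In a `VirtualMachine` object, the `domain` stanza sits under `spec.template.spec`.

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: example-vm   # hypothetical VM name
spec:
  template:
    spec:
      domain:
        devices:
          gpus:
          # deviceName is the resource name that the mediated device is exposed under
          - deviceName: nvidia.com/GRID_T4-2Q
            # name is the identifying value entered in the Name field
            name: gpu2
```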
