Commit 73575d6

GPU Operator 24.9.1 (#134)
* GPU Operator 24.9.1

Signed-off-by: Christopher Desiniotis <[email protected]>

* Address review comments

Signed-off-by: Christopher Desiniotis <[email protected]>

---------

Signed-off-by: Christopher Desiniotis <[email protected]>
1 parent 49af9fb commit 73575d6

6 files changed: 57 additions and 18 deletions

gpu-operator/life-cycle-policy.rst

Lines changed: 4 additions & 5 deletions
@@ -91,21 +91,20 @@ Refer to :ref:`Upgrading the NVIDIA GPU Operator` for more information.
  * - NVIDIA GPU Driver
  - | `565.57.01 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-565-57-01/index.html>`_
  | `560.35.03 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-560-35-03/index.html>`_
- | `550.127.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-550-127-08/index.html>`_ (recommended),
- | `550.127.05 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-550-127-05/index.html>`_ (default),
+ | `550.127.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-550-127-08/index.html>`_ (default),
  | `535.216.03 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-535-216-03/index.html>`_

  * - NVIDIA Driver Manager for Kubernetes
  - `v0.7.0 <https://ngc.nvidia.com/catalog/containers/nvidia:cloud-native:k8s-driver-manager>`__

  * - NVIDIA Container Toolkit
- - `1.17.0 <https://github.com/NVIDIA/nvidia-container-toolkit/releases>`__
+ - `1.17.3 <https://github.com/NVIDIA/nvidia-container-toolkit/releases>`__

  * - NVIDIA Kubernetes Device Plugin
  - `0.17.0 <https://github.com/NVIDIA/k8s-device-plugin/releases>`__

  * - DCGM Exporter
- - `3.3.8-3.6.0 <https://github.com/NVIDIA/dcgm-exporter/releases>`__
+ - `3.3.9-3.6.1 <https://github.com/NVIDIA/dcgm-exporter/releases>`__

  * - Node Feature Discovery
  - v0.16.6
@@ -118,7 +117,7 @@ Refer to :ref:`Upgrading the NVIDIA GPU Operator` for more information.
  - `0.10.0 <https://github.com/NVIDIA/mig-parted/tree/main/deployments/gpu-operator>`__

  * - DCGM
- - `3.3.8-1 <https://docs.nvidia.com/datacenter/dcgm/latest/release-notes/changelog.html>`__
+ - `3.3.9-1 <https://docs.nvidia.com/datacenter/dcgm/latest/release-notes/changelog.html>`__

  * - Validator for NVIDIA GPU Operator
  - ${version}

gpu-operator/platform-support.rst

Lines changed: 2 additions & 2 deletions
@@ -471,7 +471,7 @@ Support for GPUDirect RDMA

  Supported operating systems and NVIDIA GPU Drivers with GPUDirect RDMA.

- - Ubuntu 20.04 and 22.04 LTS with Network Operator 24.7.0
+ - Ubuntu 20.04 and 22.04 LTS with Network Operator 24.10.0
  - Red Hat OpenShift 4.12 and higher with Network Operator 23.10.0

  For information about configuring GPUDirect RDMA, refer to :doc:`gpu-operator-rdma`.
@@ -482,7 +482,7 @@ Support for GPUDirect Storage

  Supported operating systems and NVIDIA GPU Drivers with GPUDirect Storage.

- - Ubuntu 20.04 and 22.04 LTS with Network Operator 24.7.0
+ - Ubuntu 20.04 and 22.04 LTS with Network Operator 24.10.0
  - Red Hat OpenShift Container Platform 4.12 and higher

  .. note::

gpu-operator/release-notes.rst

Lines changed: 40 additions & 0 deletions
@@ -34,6 +34,46 @@ See the :ref:`GPU Operator Component Matrix` for a list of software components a

  ----

+ .. _v24.9.1:
+
+ 24.9.1
+ ======
+
+ .. _v24.9.1-new-features:
+
+ New Features
+ ------------
+
+ * Added support for the NVIDIA Data Center GPU Driver versions 550.127.08 and 535.216.03.
+ Refer to the :ref:`GPU Operator Component Matrix`
+ on the platform support page.
+
+ * Added support for the following software component versions:
+
+ - NVIDIA Container Toolkit v1.17.3
+ - NVIDIA DCGM v3.3.9-1
+ - NVIDIA DCGM Exporter v3.3.9-3.6.1
+
+ * Added support for NVIDIA Network Operator v24.10.0.
+ Refer to :ref:`Support for GPUDirect RDMA` and :ref:`Support for GPUDirect Storage`.
+
+ * Added an ``all-balanced`` MIG profile for H200 NVL which creates the following GPU instances:
+
+ * ``1g.18gb`` :math:`\times` 2
+ * ``2g.35gb`` :math:`\times` 1
+ * ``3g.71gb`` :math:`\times` 1
+
+ .. _v24.9.1-fixed-issues:
+
+ Fixed Issues
+ ------------
+
+ * Fixed an issue where NVIDIA Container Toolkit would fail to start on Rancher RKE2, K3s, and Canonical MicroK8s.
+ Refer to Github `issue #1109 <https://github.com/NVIDIA/gpu-operator/issues/1109>`__ for more details.
+
+ * Fixed an issue where events were not being generated by the NVIDIA driver upgrade controller.
+ Refer to Github `issue #1101 <https://github.com/NVIDIA/gpu-operator/issues/1101>`__ for more details.
+
  .. _v24.9.0:

  24.9.0
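
Reviewer note: the ``all-balanced`` layout added to the release notes above can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. It assumes profile names follow the plain ``<N>g.<M>gb`` pattern, where N is the number of compute slices and M is the nominal memory label in GB, and the helper names `parse_profile` and `totals` are hypothetical.

```python
import re

# Instance counts copied from the 24.9.1 release note for H200 NVL.
ALL_BALANCED_H200_NVL = {"1g.18gb": 2, "2g.35gb": 1, "3g.71gb": 1}


def parse_profile(name):
    """Return (compute_slices, memory_gb) for a name such as '1g.18gb'.

    Only the plain '<N>g.<M>gb' form is handled; names with extra
    suffixes are rejected rather than guessed at.
    """
    match = re.fullmatch(r"(\d+)g\.(\d+)gb", name)
    if match is None:
        raise ValueError(f"unrecognized MIG profile name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def totals(layout):
    """Sum compute slices and nominal memory labels across all instances."""
    slices = sum(parse_profile(profile)[0] * count for profile, count in layout.items())
    memory = sum(parse_profile(profile)[1] * count for profile, count in layout.items())
    return slices, memory


compute_slices, nominal_memory_gb = totals(ALL_BALANCED_H200_NVL)
print(compute_slices, nominal_memory_gb)
```

The four instances add up to 7 compute slices. The memory figure is only a sum of the rounded labels in the profile names, so it should be read as nominal rather than as the exact device capacity.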

gpu-operator/versions.json

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,10 @@
  {
- "latest": "24.9.0",
+ "latest": "24.9.1",
  "versions":
  [
+ {
+ "version": "24.9.1"
+ },
  {
  "version": "24.9.0"
  },
@@ -16,9 +19,6 @@
  },
  {
  "version": "24.3.0"
- },
- {
- "version": "23.9.2"
  }
  ]
  }
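
Reviewer note: the same edit recurs in both ``versions.json`` files: the new release goes to the top of the list, ``latest`` is bumped, and the oldest entry is dropped. A minimal sketch of that update follows. It is not the repository's tooling, the helper name `add_release` is hypothetical, and the sample data is abbreviated because the diff does not show the entries between 24.9.0 and 24.3.0.

```python
import json


def add_release(doc, new_version, drop_oldest=True):
    """Return a new versions.json-style dict with new_version listed first.

    Assumes the "versions" list is ordered newest to oldest, as in the diff.
    The input dict is left unmodified.
    """
    versions = [entry for entry in doc["versions"] if entry["version"] != new_version]
    versions.insert(0, {"version": new_version})
    if drop_oldest and len(versions) > len(doc["versions"]):
        versions.pop()  # remove the oldest entry to keep the list length stable
    return {"latest": new_version, "versions": versions}


# Abbreviated sample; the real file lists more releases between these entries.
before = {
    "latest": "24.9.0",
    "versions": [{"version": "24.9.0"}, {"version": "24.3.0"}, {"version": "23.9.2"}],
}
after = add_release(before, "24.9.1")
print(json.dumps(after, indent=2))
```

Returning a new dict rather than mutating the input keeps the helper easy to test against both files.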

openshift/versions.json

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,10 @@
  {
- "latest": "24.9.0",
+ "latest": "24.9.1",
  "versions":
  [
+ {
+ "version": "24.9.1"
+ },
  {
  "version": "24.9.0"
  },
@@ -13,9 +16,6 @@
  },
  {
  "version": "24.3.0"
- },
- {
- "version": "23.9.2"
  }
  ]
  }

repo.toml

Lines changed: 3 additions & 3 deletions
@@ -142,8 +142,8 @@ output_format = "linkcheck"
  docs_root = "${root}/gpu-operator"
  project = "gpu-operator"
  name = "NVIDIA GPU Operator"
- version = "24.9.0"
- source_substitutions = { version = "v24.9.0", recommended = "550.127.08" }
+ version = "24.9.1"
+ source_substitutions = { version = "v24.9.1", recommended = "550.127.08" }
  copyright_start = 2020
  sphinx_exclude_patterns = [
  "life-cycle-policy.rst",
@@ -201,7 +201,7 @@ output_format = "linkcheck"
  docs_root = "${root}/openshift"
  project = "gpu-operator-openshift"
  name = "NVIDIA GPU Operator on Red Hat OpenShift Container Platform"
- version = "24.9.0"
+ version = "24.9.1"
  copyright_start = 2020
  sphinx_exclude_patterns = [
  "get-entitlement.rst",
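
Reviewer note: the ``source_substitutions`` table in ``repo.toml`` appears to feed placeholders such as the ``${version}`` entry in the Validator row of ``life-cycle-policy.rst``. The commit does not show how the docs build performs that substitution, so the sketch below uses Python's standard `string.Template` purely as a stand-in to illustrate the expected result.

```python
from string import Template

# Values copied from the 24.9.1 source_substitutions line in repo.toml.
source_substitutions = {"version": "v24.9.1", "recommended": "550.127.08"}

# Line taken from life-cycle-policy.rst, where the validator version is a placeholder.
rst_line = "- ${version}"

# substitute() replaces ${version}; unused keys such as "recommended" are ignored.
rendered = Template(rst_line).substitute(source_substitutions)
print(rendered)
```

Under that assumption the Validator row would render as ``- v24.9.1`` after this commit, which is why bumping ``source_substitutions`` is enough to update that table entry.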
