Commit 5105e8f

v25.3.2 driver update (#221)
* added new drivers
  Signed-off-by: Andrew Chen <[email protected]>
* incorporated feedback
  Signed-off-by: Andrew Chen <[email protected]>

---------

Signed-off-by: Andrew Chen <[email protected]>
1 parent f6612c2 commit 5105e8f

4 files changed, 18 additions and 9 deletions

gpu-operator/life-cycle-policy.rst

Lines changed: 4 additions & 2 deletions
@@ -92,8 +92,10 @@ Refer to :ref:`Upgrading the NVIDIA GPU Operator` for more information.

  * - NVIDIA GPU Driver |ki|_
  - | `575.57.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-575-57-08/index.html>`_
- | `570.158.01 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-158-01/index.html>`_ (recommended)
- | `570.148.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-148-08/index.html>`_ (default)
+ | `570.172.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-172-08/index.html>`_ (default, recommended)
+ | `570.158.01 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-158-01/index.html>`_
+ | `570.148.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-148-08/index.html>`_
+ | `535.261.03 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-535-261-03/index.html>`_
  | `550.163.01 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-550-163-01/index.html>`_
  | `535.247.01 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-535-247-01/index.html>`_

gpu-operator/release-notes.rst

Lines changed: 11 additions & 4 deletions
@@ -48,6 +48,11 @@ New Features
  - NVIDIA Kubernetes Device Plugin/NVIDIA GPU Feature Discovery v0.17.3
  - NVIDIA MIG Manager for Kubernetes v0.12.2

+ * Added support for the following NVIDIA Data Center GPU Driver versions:
+
+ - 570.172.08 (default, recommended)
+ - 535.261.03
+
  .. _v25.3.2-known-issues:

  Known Issues
@@ -56,8 +61,8 @@ Known Issues
  * For drivers 570.124.06, 570.133.20, 570.148.08, and 570.158.01,
  GPU workloads cannot be scheduled on nodes that have a mix of MIG slices and full GPUs.
  This manifests as GPU pods getting stuck indefinitely in the ``Pending`` state.
- NVIDIA recommends that you downgrade the driver to version 570.86.15 to work around this issue.
- For more detailed information, see GitHub issue #1361 <https://github.com/NVIDIA/gpu-operator/issue/1361>__.
+ NVIDIA recommends that you upgrade the driver to version 570.172.08 to avoid this issue.
+ For more detailed information, see GitHub issue https://github.com/NVIDIA/gpu-operator/issue/1361.

  * Configuring the Operator to enable CDI is not supported on Rancher Kubernetes Engine 2 (RKE2).

@@ -83,7 +88,9 @@ New Features

  * Added support for the following NVIDIA Data Center GPU Driver versions:

- - 570.148.08 (default, recommended)
+ - 570.172.08 (default, recommended)
+ - 535.261.03
+ - 570.148.08
  - 570.133.20
  - 550.163.01
  - 535.247.01
@@ -106,7 +113,7 @@ Known Issues
  * For drivers 570.124.06, 570.133.20, 570.148.08, and 570.158.01,
  GPU workloads cannot be scheduled on nodes that have a mix of MIG slices and full GPUs.
  This manifests as GPU pods getting stuck indefinitely in the ``Pending`` state.
- It's recommended that you downgrade the driver to version 570.86.15 to work around this issue.
+ NVIDIA recommends that you upgrade the driver to version 570.172.08 to avoid this issue.
  For more detailed information, see GitHub issue https://github.com/NVIDIA/gpu-operator/issues/1361.

  * GPU Operator in CDI mode is not operational with RKE2.

openshift/install-gpu-ocp.rst

Lines changed: 2 additions & 2 deletions
@@ -213,7 +213,7 @@ Create the cluster policy using the web console

  .. note:: For OpenShift 4.12 with GPU Operator 25.3.1 or later, you must expand the **Driver** section and set the following fields:

- - **version**: 570.148.08 (or another supported version)
+ - **version**: 570.172.08 (or another supported version)
  - **image**: driver (or another supported image)
  - **repository**: nvcr.io/nvidia (or another supported repository)

@@ -244,7 +244,7 @@ Create the cluster policy using the CLI
  "driver": {
  "repository": "nvcr.io/nvidia",
  "image": "driver",
- "version": "570.148.08"
+ "version": "570.172.08"
  }

  .. code-block:: console

repo.toml

Lines changed: 1 addition & 1 deletion
@@ -167,7 +167,7 @@ docs_root = "${root}/gpu-operator"
  project = "gpu-operator"
  name = "NVIDIA GPU Operator"
  version = "25.3.2"
- source_substitutions = { version = "v25.3.2", recommended = "570.148.08" }
+ source_substitutions = { version = "v25.3.2", recommended = "570.172.08" }
  copyright_start = 2020
  sphinx_exclude_patterns = [
  "life-cycle-policy.rst",
