Commit f8f6a0b

docs-preview committed
Pushing changes to GitHub Pages.
1 parent 0e32312 commit f8f6a0b

4 files changed, 41 additions and 26 deletions

review/pr-326/gpu-operator/latest/gpu-operator-rdma.html

Lines changed: 4 additions & 4 deletions
@@ -663,8 +663,8 @@ <h3>Verifying the Installation by Performing a Data Transfer<a class="headerlink
663663
</li>
664664
<li><p>Start two pods that run the <code class="docutils literal notranslate"><span class="pre">mellanox/cuda-perftest</span></code> container on two different nodes in the cluster.</p>
665665
<div class="sd-tab-set docutils">
666-
<input checked="checked" id="994fe36e-4aec-480f-b770-ec11b27d0640" name="3a0e4e32-f6b4-4fa0-b7a4-209598f5061b" type="radio">
667-
</input><label class="sd-tab-label" for="994fe36e-4aec-480f-b770-ec11b27d0640">
666+
<input checked="checked" id="6aa8d393-1760-4501-8707-4f366871a278" name="f45ec1a2-2e1f-4f32-8d92-ce790f991d03" type="radio">
667+
</input><label class="sd-tab-label" for="6aa8d393-1760-4501-8707-4f366871a278">
668668
demo-pod-1</label><div class="sd-tab-content docutils">
669669
<ul>
670670
<li><p>Create a file, such as <code class="docutils literal notranslate"><span class="pre">demo-pod-1.yaml</span></code>, for the first pod with contents like the following:</p>
@@ -709,8 +709,8 @@ <h3>Verifying the Installation by Performing a Data Transfer<a class="headerlink
709709
</li>
710710
</ul>
711711
</div>
712-
<input id="4199b5e1-442f-423a-aaac-b1ec14df9012" name="3a0e4e32-f6b4-4fa0-b7a4-209598f5061b" type="radio">
713-
</input><label class="sd-tab-label" for="4199b5e1-442f-423a-aaac-b1ec14df9012">
712+
<input id="17295069-b0a2-480f-b60f-2c89ad66b8a7" name="f45ec1a2-2e1f-4f32-8d92-ce790f991d03" type="radio">
713+
</input><label class="sd-tab-label" for="17295069-b0a2-480f-b60f-2c89ad66b8a7">
714714
demo-pod-2</label><div class="sd-tab-content docutils">
715715
<ul>
716716
<li><p>Create a file, such as <code class="docutils literal notranslate"><span class="pre">demo-pod-2.yaml</span></code>, for the second pod with contents like the following:</p>

review/pr-326/gpu-operator/latest/platform-support.html

Lines changed: 15 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -541,7 +541,7 @@
541541
<div class="line">for Kubernetes</div>
542542
</div>
543543
</td>
544-
<td colspan="3"><p><a class="reference external" href="https://github.com/NVIDIA/k8s-device-plugin/releases">0.18.0</a></p></td>
544+
<td colspan="3"><p><a class="reference external" href="https://github.com/NVIDIA/k8s-device-plugin/releases">0.18.1</a></p></td>
545545
</tr>
546546
<tr class="row-even"><td><p>NVIDIA MIG Manager for Kubernetes</p></td>
547547
<td><p><a class="reference external" href="https://github.com/NVIDIA/mig-parted/blob/main/CHANGELOG.md">0.13.0</a></p></td>
@@ -606,8 +606,8 @@
606606
<span id="supported-nvidia-gpus-and-systems"></span><h3>Supported NVIDIA Data Center GPUs and Systems<a class="headerlink" href="#supported-nvidia-data-center-gpus-and-systems" title="Permalink to this headline">#</a></h3>
607607
<p>The following NVIDIA data center GPUs are supported on x86 based platforms:</p>
608608
<div class="sd-tab-set docutils">
609-
<input id="ea568169-87c6-4d2d-bca1-15ae15c6dbbe" name="0aa77cd6-d50a-44e8-a9f6-c5dfe35c0517" type="radio">
610-
</input><label class="sd-tab-label" for="ea568169-87c6-4d2d-bca1-15ae15c6dbbe">
609+
<input id="60179b7f-538a-458e-ab99-ecfb3f5b0058" name="d2877256-0545-494b-acf4-27fad023740d" type="radio">
610+
</input><label class="sd-tab-label" for="60179b7f-538a-458e-ab99-ecfb3f5b0058">
611611
GH-series Products</label><div class="sd-tab-content docutils">
612612
<div class="pst-scrollable-table-container"><table class="table">
613613
<colgroup>
@@ -635,8 +635,8 @@
635635
argument to the <code class="docutils literal notranslate"><span class="pre">helm</span></code> command.
636636
Refer to <a class="reference internal" href="getting-started.html#common-chart-customization-options"><span class="std std-ref">Common Chart Customization Options</span></a> for more information.</p>
637637
</div>
638-
<input checked="checked" id="0fd1d81c-820a-412b-af64-829d838ffeec" name="0aa77cd6-d50a-44e8-a9f6-c5dfe35c0517" type="radio">
639-
</input><label class="sd-tab-label" for="0fd1d81c-820a-412b-af64-829d838ffeec">
638+
<input checked="checked" id="652cb045-b891-4456-82ff-fd7d45e59ff5" name="d2877256-0545-494b-acf4-27fad023740d" type="radio">
639+
</input><label class="sd-tab-label" for="652cb045-b891-4456-82ff-fd7d45e59ff5">
640640
A, H and L-series Products</label><div class="sd-tab-content docutils">
641641
<div class="pst-scrollable-table-container"><table class="table">
642642
<colgroup>
@@ -766,8 +766,8 @@
766766
</ul>
767767
</div>
768768
</div>
769-
<input id="91849565-e4e9-4fa2-8247-cc0eba911990" name="0aa77cd6-d50a-44e8-a9f6-c5dfe35c0517" type="radio">
770-
</input><label class="sd-tab-label" for="91849565-e4e9-4fa2-8247-cc0eba911990">
769+
<input id="4de84fd6-9cdf-4890-b7c7-bba57cd2f2ea" name="d2877256-0545-494b-acf4-27fad023740d" type="radio">
770+
</input><label class="sd-tab-label" for="4de84fd6-9cdf-4890-b7c7-bba57cd2f2ea">
771771
D,T and V-series Products</label><div class="sd-tab-content docutils">
772772
<div class="pst-scrollable-table-container"><table class="table">
773773
<colgroup>
@@ -806,8 +806,8 @@
806806
</table>
807807
</div>
808808
</div>
809-
<input id="4e4ed7ae-3604-4b38-8e43-6cad55428c9b" name="0aa77cd6-d50a-44e8-a9f6-c5dfe35c0517" type="radio">
810-
</input><label class="sd-tab-label" for="4e4ed7ae-3604-4b38-8e43-6cad55428c9b">
809+
<input id="da235615-4366-43be-8640-b3b166950d1e" name="d2877256-0545-494b-acf4-27fad023740d" type="radio">
810+
</input><label class="sd-tab-label" for="da235615-4366-43be-8640-b3b166950d1e">
811811
RTX / T-series Products</label><div class="sd-tab-content docutils">
812812
<div class="pst-scrollable-table-container"><table class="table">
813813
<colgroup>
@@ -888,8 +888,8 @@
888888
</ul>
889889
</div>
890890
</div>
891-
<input id="e302d1ee-0c7a-46a6-9786-cf5416f24419" name="0aa77cd6-d50a-44e8-a9f6-c5dfe35c0517" type="radio">
892-
</input><label class="sd-tab-label" for="e302d1ee-0c7a-46a6-9786-cf5416f24419">
891+
<input id="dccdb9e1-0f30-4fbe-b492-437fee005969" name="d2877256-0545-494b-acf4-27fad023740d" type="radio">
892+
</input><label class="sd-tab-label" for="dccdb9e1-0f30-4fbe-b492-437fee005969">
893893
B-series Products</label><div class="sd-tab-content docutils">
894894
<div class="pst-scrollable-table-container"><table class="table">
895895
<colgroup>
@@ -1031,8 +1031,8 @@
10311031
<span id="container-platforms"></span><h3>Supported Operating Systems and Kubernetes Platforms<a class="headerlink" href="#supported-operating-systems-and-kubernetes-platforms" title="Permalink to this headline">#</a></h3>
10321032
<p>The GPU Operator has been validated in the following scenarios:</p>
10331033
<div class="sd-tab-set docutils">
1034-
<input checked="checked" id="6822002b-8057-4ea9-93d5-5afdb65198c4" name="29b9c7bf-1213-4a02-bdbd-205e9232d6d2" type="radio">
1035-
</input><label class="sd-tab-label" for="6822002b-8057-4ea9-93d5-5afdb65198c4">
1034+
<input checked="checked" id="a9d712ad-9d9f-4c18-8d14-54643e824248" name="e0b14666-5586-498a-b3fa-1c6ebf421ccd" type="radio">
1035+
</input><label class="sd-tab-label" for="a9d712ad-9d9f-4c18-8d14-54643e824248">
10361036
Bare Metal / Virtual Machines with GPU Passthrough and NVIDIA vGPU</label><div class="sd-tab-content docutils">
10371037
<div class="pst-scrollable-table-container"><table class="table">
10381038
<colgroup>
@@ -1174,8 +1174,8 @@
11741174
<p>Red Hat OpenShift Container Platform is supported on AWS, Azure, GCP, and OCI (Oracle) Virtual Machine or Bare Metal instances with T4, V100, L4, L40s, A10, A100, H100, and H200.</p>
11751175
</div>
11761176
</div>
1177-
<input id="94f97dc4-35d4-4e3a-bcd4-d7f5dbe12f5d" name="29b9c7bf-1213-4a02-bdbd-205e9232d6d2" type="radio">
1178-
</input><label class="sd-tab-label" for="94f97dc4-35d4-4e3a-bcd4-d7f5dbe12f5d">
1177+
<input id="ab49b6ee-9e4c-4ce2-95bc-60dd7f785e7e" name="e0b14666-5586-498a-b3fa-1c6ebf421ccd" type="radio">
1178+
</input><label class="sd-tab-label" for="ab49b6ee-9e4c-4ce2-95bc-60dd7f785e7e">
11791179
Cloud Service Providers</label><div class="sd-tab-content docutils">
11801180
<div class="pst-scrollable-table-container"><table class="table">
11811181
<colgroup>

review/pr-326/gpu-operator/latest/release-notes.html

Lines changed: 21 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -445,9 +445,9 @@
445445
<span id="id1"></span><h2>25.10.1<a class="headerlink" href="#v25-10-1" title="Permalink to this headline">#</a></h2>
446446
<section id="new-features">
447447
<h3>New Features<a class="headerlink" href="#new-features" title="Permalink to this headline">#</a></h3>
448-
<ul class="simple">
449-
<li><p>Updated software component versions:</p>
450448
<ul>
449+
<li><p>Updated software component versions:</p>
450+
<ul class="simple">
451451
<li><p>NVIDIA Container Toolkit v1.18.1</p></li>
452452
<li><p>NVIDIA DCGM v4.4.2-1</p></li>
453453
<li><p>NVIDIA DCGM Exporter v4.4.2-4.7.0</p></li>
@@ -458,19 +458,34 @@ <h3>New Features<a class="headerlink" href="#new-features" title="Permalink to t
458458
</ul>
459459
</li>
460460
<li><p>Added support for this NVIDIA Data Center GPU Driver version:</p>
461-
<ul>
461+
<ul class="simple">
462462
<li><p>580.105.08 (default)</p></li>
463463
</ul>
464464
</li>
465+
<li><p>Add HPC job mapping support to DCGM Exporter to collect metrics for HPC jobs running on the cluster.</p>
466+
<p>Configure the HPC job mapping by setting the <code class="docutils literal notranslate"><span class="pre">dcgmExporter.hpcJobMapping.enabled</span></code> field to <code class="docutils literal notranslate"><span class="pre">true</span></code> in the ClusterPolicy custom resource.
467+
Set <code class="docutils literal notranslate"><span class="pre">dcgmExporter.hpcJobMapping.directory</span></code> with the directory path where HPC job mapping files are created by the workload manager.
468+
The default directory is <code class="docutils literal notranslate"><span class="pre">/var/lib/dcgm-exporter/job-mapping</span></code>.</p>
469+
</li>
470+
<li><p>Improved the cluster policy reconciler to be more resilient to race conditions during node updates.</p></li>
465471
</ul>
466472
</section>
467473
<section id="fixed-issues">
468474
<h3>Fixed Issues<a class="headerlink" href="#fixed-issues" title="Permalink to this headline">#</a></h3>
469475
<ul class="simple">
470-
<li><p>Fixed a bug where driver images were being incorrectly assigned in multi-nodepool clusters.</p></li>
476+
<li><p>Fixed the following known issue introduced in GPU Operator v25.10.0:</p>
477+
<ul>
478+
<li><p>When using cri-o as the container runtime, several GPU Operator pods can be stuck in the <code class="docutils literal notranslate"><span class="pre">Init:RunContainerError</span></code> or <code class="docutils literal notranslate"><span class="pre">Init:CreateContainerError</span></code> state during GPU Operator installation or upgrade, or during GPU driver daemonset upgrade.</p></li>
479+
<li><p>NVIDIA Container Toolkit 1.18.0 overwrites the imports field in the top-level containerd configuration file, so any previously imported paths are lost.
480+
This was fixed in NVIDIA Container Toolkit v1.18.1.</p></li>
481+
</ul>
482+
</li>
483+
<li><p>Fixed a race condition where user-supplied NVIDIA kernel module parameters were sometimes not being applied by the driver daemonset.
484+
For more information, refer to <a class="reference external" href="https://github.com/NVIDIA/gpu-operator/pull/1939">PR #1939</a>.</p></li>
485+
<li><p>Fixed a bug where driver images were being incorrectly assigned in multi-nodepool clusters.
486+
For more information, refer to <a class="reference external" href="https://github.com/NVIDIA/gpu-operator/issues/1622">Issue #1622</a>.</p></li>
471487
<li><p>Fixed a bug where the GPU Operator Helm chart template was not assigning the correct namespace to resources it created.</p></li>
472-
<li><p>Fixed a bug where the ClusterPolicy reconciler would fail when it attempted to update node labels on a cluster.</p></li>
473-
<li><p>Fixed a bug where the k8s-driver-manager would wait indefinitely when MOFED is enabled despite the MOFED being pre-installed on the host.</p></li>
488+
<li><p>Fixed a bug where the k8s-driver-manager would wait indefinitely when MOFED is enabled and <code class="docutils literal notranslate"><span class="pre">USE_HOST_MOFED</span></code> is set to true despite the MOFED being pre-installed on the host.</p></li>
474489
</ul>
475490
</section>
476491
</section>

review/pr-326/gpu-operator/latest/searchindex.js

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.
