[CI] Revert CUDA, PyTorch and ONNX upgrade #18787
Conversation
This reverts commit ac70260.
Summary of Changes: Hello @mshr-h, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request aims to resolve recent OpenCL test failures observed in the CI pipeline, which are believed to stem from an incompatibility introduced by a recent CUDA version upgrade. The changes revert the CUDA base image to a known stable version and downgrade dependent libraries such as PyTorch, ONNX, and their runtimes to restore compatibility and the reliability of the continuous integration tests.
Code Review
This pull request reverts the CUDA and PyTorch versions to fix a CI failure, which is reflected in the changes to docker/Dockerfile.ci_gpu and docker/install/ubuntu_install_onnx.sh. The changes seem correct for the stated purpose. I've added one suggestion to refactor the installation script in docker/install/ubuntu_install_onnx.sh to reduce code duplication and improve maintainability.
| if [ "$PYTHON_VERSION" == "3.9" ]; then | ||
| pip3 install \ | ||
| onnx==1.16.0 \ | ||
| onnxruntime==1.19.2 \ | ||
| onnxoptimizer==0.2.7 | ||
|
|
||
| if [ "$DEVICE" == "cuda" ]; then | ||
| if [ "$DEVICE" == "cuda" ]; then | ||
| pip3 install \ | ||
| torch==2.7.0 \ | ||
| torchvision==0.22.0 \ | ||
| --index-url https://download.pytorch.org/whl/cu118 | ||
| else | ||
| pip3 install \ | ||
| torch==2.7.0 \ | ||
| torchvision==0.22.0 \ | ||
| --extra-index-url https://download.pytorch.org/whl/cpu | ||
| fi | ||
| elif [ "$PYTHON_VERSION" == "3.11" ]; then | ||
| pip3 install \ | ||
| torch==2.10.0 \ | ||
| torchvision==0.25.0 | ||
| onnx==1.17.0 \ | ||
| onnxruntime==1.20.1 \ | ||
| onnxoptimizer==0.2.7 | ||
|
|
||
| if [ "$DEVICE" == "cuda" ]; then | ||
| pip3 install \ | ||
| torch==2.7.0 \ | ||
| torchvision==0.22.0 \ | ||
| --index-url https://download.pytorch.org/whl/cu118 | ||
| else | ||
| pip3 install \ | ||
| torch==2.7.0 \ | ||
| torchvision==0.22.0 \ | ||
| --extra-index-url https://download.pytorch.org/whl/cpu | ||
| fi | ||
| else | ||
| pip3 install \ | ||
| torch==2.10.0 \ | ||
| torchvision==0.25.0 \ | ||
| --extra-index-url https://download.pytorch.org/whl/cpu | ||
| onnx==1.12.0 \ | ||
| onnxruntime==1.12.1 \ | ||
| onnxoptimizer==0.2.7 | ||
|
|
||
| if [ "$DEVICE" == "cuda" ]; then | ||
| pip3 install \ | ||
| torch==2.4.1 \ | ||
| torchvision==0.19.1 | ||
| else | ||
| pip3 install \ | ||
| torch==2.4.1 \ | ||
| torchvision==0.19.1 \ | ||
| --extra-index-url https://download.pytorch.org/whl/cpu | ||
| fi | ||
| fi |
This if/elif/else block contains significant code duplication for installing torch and onnx packages across different Python versions. You can refactor this by defining package version strings in variables based on the Python version, and then using these variables in shared pip3 install commands. This will make the script more readable and easier to maintain in the future.
if [ "$PYTHON_VERSION" == "3.9" ]; then
ONNX_PKGS="onnx==1.16.0 onnxruntime==1.19.2 onnxoptimizer==0.2.7"
elif [ "$PYTHON_VERSION" == "3.11" ]; then
ONNX_PKGS="onnx==1.17.0 onnxruntime==1.20.1 onnxoptimizer==0.2.7"
else
ONNX_PKGS="onnx==1.12.0 onnxruntime==1.12.1 onnxoptimizer==0.2.7"
fi
pip3 install ${ONNX_PKGS}
if [ "$PYTHON_VERSION" == "3.9" ] || [ "$PYTHON_VERSION" == "3.11" ]; then
TORCH_PKGS="torch==2.7.0 torchvision==0.22.0"
TORCH_CUDA_ARGS="--index-url https://download.pytorch.org/whl/cu118"
else
TORCH_PKGS="torch==2.4.1 torchvision==0.19.1"
TORCH_CUDA_ARGS=""
fi
if [ "$DEVICE" == "cuda" ]; then
pip3 install ${TORCH_PKGS} ${TORCH_CUDA_ARGS}
else
pip3 install ${TORCH_PKGS} --extra-index-url https://download.pytorch.org/whl/cpu
fi
Do we know which test was failing? I feel it is important for the CI to be up to date in terms of CUDA/torch versions. So for the case of OpenCL, perhaps we can temporarily skip some of the tests?
I guess all of the OpenCL tests were failing. @tqchen
Agree.
I'm trying to skip all OpenCL tests and see if it passes. https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-gpu/detail/PR-18775/36/pipeline
Yes, I think it is OK to skip the OpenCL tests for now.
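For reference, here is a minimal sketch of how the OpenCL tests could be deselected with pytest's keyword filter. It only illustrates the idea discussed above; the test directory and selector are assumptions, not the exact mechanism used in the CI change.

#!/usr/bin/env bash
set -euo pipefail

# Temporarily deselect OpenCL tests while the CUDA incompatibility is
# investigated. "-k 'not opencl'" skips any test whose id or keywords
# contain "opencl". The directory below is an assumed test location,
# not necessarily the one the CI job runs.
python3 -m pytest tests/python -k "not opencl"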
onnx==1.20.1 \
onnxruntime==1.23.2 \
onnxoptimizer==0.4.2
future \
Let us wait and see if skip works
Closing as the skip works.
With the 20260214-152058-2a448ce4 images, OpenCL tests are failing with a segmentation fault. I can't reproduce it on my local machine, but I guess it's due to the CUDA version upgrade. This PR reverts the upgrade and also downgrades PyTorch to match the CUDA compatibility.
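As an aside, a quick way to sanity-check that the reverted base image and the downgraded PyTorch wheel agree on CUDA versions is sketched below; this is a generic verification snippet run inside the CI container, not part of this PR.

#!/usr/bin/env bash
set -euo pipefail

# Print the CUDA toolkit shipped in the image and the CUDA build the
# installed torch wheel was compiled against; a cu118 wheel, for example,
# should pair with a CUDA 11.8 base image.
nvcc --version | grep release
python3 -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"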