Collaborator

@winglian winglian commented Oct 20, 2025

Summary by CodeRabbit

  • Chores
    • Updated testing infrastructure to include PyTorch 2.9.0.
    • Migrated primary test environment to CUDA 12.8.1.
    • Deprecated older CUDA 12.6.3 test configurations.

Contributor

coderabbitai bot commented Oct 20, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Updated .github/workflows/tests.yml to expand the PyTorch test matrix by adding version 2.9.0, and migrated the large e2e test configuration from CUDA 12.6.3 with PyTorch 2.7.1 to CUDA 12.8.1 with PyTorch 2.9.0.

Changes

| Cohort / File(s) | Summary |
|---|---|
| GitHub Actions workflow configuration (.github/workflows/tests.yml) | Added PyTorch 2.9.0 to pytest and pytest-sdist test matrices; migrated large e2e job from CUDA 12.6.3/PyTorch 2.7.1 to CUDA 12.8.1/PyTorch 2.9.0; commented out legacy CUDA 12.6 configurations |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~8 minutes

Possibly related PRs

  • #3221: Adds PyTorch 2.9.0 and CUDA 12.8.1 support to the same workflow matrices and dockerfile references.
  • #3106: Modifies CI matrices to add and adjust CUDA 12.8 GPU test variants and PyTorch versions.
  • #3034: Updates workflow matrices and Dockerfile references to migrate from CUDA 12.6.3 to CUDA 12.8.1 with newer PyTorch versions.

Suggested labels

ready to merge

Suggested reviewers

  • djsaunde
  • SalmanMohammadi

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "add torch 2.9.0 to ci" directly corresponds to the main changes documented in the raw summary, which shows that PyTorch 2.9.0 has been added to the test matrix in multiple CI jobs (.github/workflows/tests.yml). The title is concise, clear, and specific enough that a teammate reviewing commit history would immediately understand the primary change. While the changeset also includes CUDA version updates (from 12.6.3 to 12.8.1), the focus on adding PyTorch 2.9.0 aligns with the PR branch name "torch-290-ci" and represents the central objective of the modification. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/tests.yml (1)

234-252: Clean up duplicate commented-out configurations.

The two near-identical commented-out blocks (lines 240–245 and 246–252; the second differs only by an added dockerfile key) should be consolidated into a single entry to reduce noise and improve maintainability. Also, confirm that migrating the gate e2e test from PyTorch 2.7.1 to 2.9.0 is intentional, as this changes which version must pass before the rest of the e2e suite runs.

If consolidating is desired, apply this diff to remove the duplicate:

           - cuda: 128
             cuda_version: 12.8.1
             python_version: "3.11"
             pytorch: 2.9.0
             num_gpus: 1
             axolotl_extras:
-#          - cuda: 126
-#            cuda_version: 12.6.3
-#            python_version: "3.11"
-#            pytorch: 2.7.1
-#            num_gpus: 1
-#            axolotl_extras:
-#          - cuda: 126
-#            cuda_version: 12.6.3
-#            python_version: "3.11"
-#            pytorch: 2.7.1
-#            num_gpus: 1
-#            axolotl_extras:
-#            dockerfile: "Dockerfile-uv.jinja"
+#          - cuda: 126
+#            cuda_version: 12.6.3
+#            python_version: "3.11"
+#            pytorch: 2.7.1
+#            num_gpus: 1
+#            axolotl_extras:
+#            dockerfile: "Dockerfile-uv.jinja"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 383f220 and 935b3ff.

📒 Files selected for processing (1)
  • .github/workflows/tests.yml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: PyTest (3.11, 2.9.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.9.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
  • GitHub Check: PyTest (3.11, 2.8.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
🔇 Additional comments (3)
.github/workflows/tests.yml (3)

58-58: Unit test matrix changes look good.

Adding PyTorch 2.9.0 to both pytest and pytest-sdist matrices is straightforward and enables version compatibility testing for the new release.

Also applies to: 133-133
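
To illustrate what the extra matrix value does, here is a minimal sketch that expands a two-axis matrix into one job per combination and formats names in the same shape as the checks listed above (for example "PyTest (3.11, 2.9.0)"). The axis names `python_version` and `pytorch_version` are assumptions, since the workflow file itself is not quoted here; only the version values come from the check names in this review.

```python
from itertools import product

# Assumed axis names (hypothetical); the values are taken from the check names above.
matrix = {
    "python_version": ["3.11"],
    "pytorch_version": ["2.7.1", "2.8.0", "2.9.0"],
}


def expand(matrix):
    """Return one dict per combination of axis values (Cartesian product)."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


def check_name(job_name, combo):
    """Format a display name such as 'PyTest (3.11, 2.9.0)'."""
    return f"{job_name} ({', '.join(combo.values())})"


jobs = expand(matrix)
names = [check_name("PyTest", combo) for combo in jobs]
for name in names:
    print(name)
```

Adding one value to a single axis adds one job per combination of the other axes, so with one Python version this change adds exactly one job to each of the two unit test matrices.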


295-307: PyTorch 2.9.0 is missing from the full e2e test matrix.

The gate job (docker-e2e-tests-1st) now tests PyTorch 2.9.0 as the primary configuration, but the full e2e matrix (docker-e2e-tests) only includes PyTorch 2.7.1 and 2.8.0. If 2.9.0 is the target for this PR, it should also run in the full test suite after the gate passes, or the strategy should be documented.

If 2.9.0 should be included in the full matrix, add an entry like:

         include:
           - cuda: 128
             cuda_version: 12.8.1
             python_version: "3.11"
             pytorch: 2.7.1
             num_gpus: 1
             axolotl_extras:
           - cuda: 128
             cuda_version: 12.8.1
             python_version: "3.11"
             pytorch: 2.8.0
             num_gpus: 1
             gpu_type: "B200"
             axolotl_extras: fbgemm-gpu
+          - cuda: 128
+            cuda_version: 12.8.1
+            python_version: "3.11"
+            pytorch: 2.9.0
+            num_gpus: 1
+            axolotl_extras:
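
The coverage gap described in this comment can be checked mechanically. The sketch below is a hypothetical helper, not part of the repository: plain dicts stand in for the parsed `include` entries, with values copied from the entries quoted in this review, and the function reports PyTorch versions that the gate job uses but the full matrix does not.

```python
# Dicts stand in for parsed matrix entries; values mirror those quoted in this review.
gate_entries = [
    {"cuda_version": "12.8.1", "python_version": "3.11", "pytorch": "2.9.0"},
]
full_entries = [
    {"cuda_version": "12.8.1", "python_version": "3.11", "pytorch": "2.7.1"},
    {"cuda_version": "12.8.1", "python_version": "3.11", "pytorch": "2.8.0"},
]


def missing_versions(gate, full, key="pytorch"):
    """Return values of `key` used by the gate job but absent from the full matrix."""
    covered = {entry[key] for entry in full}
    return sorted({entry[key] for entry in gate} - covered)


print(missing_versions(gate_entries, full_entries))

# After adding the suggested 2.9.0 entry, nothing is missing.
patched = full_entries + [
    {"cuda_version": "12.8.1", "python_version": "3.11", "pytorch": "2.9.0"},
]
print(missing_versions(gate_entries, patched))
```

A check like this could run as a lint step, though whether the gap is a problem depends on whether the gate-only strategy is intentional.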

345-350: Cleanup job still uses outdated CUDA/PyTorch versions.

The docker-e2e-cleanup job remains pinned to CUDA 126 (12.6.3) with PyTorch 2.7.1. Confirm whether this is intentional for backward-compatibility cleanup or if it should be updated to align with the new gate configuration.
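
One way to audit the remaining CUDA 12.6.3 pins, both live and commented out, is a plain text scan of the workflow file. The sketch below is a rough illustration under stated assumptions: the embedded excerpt is hypothetical and only imitates the shape of the entries quoted in this review, and `find_pins` is an illustrative helper, not an existing tool.

```python
import re

# Hypothetical excerpt shaped like the entries quoted in this review; not the real file.
WORKFLOW_TEXT = """\
          - cuda: 128
            cuda_version: 12.8.1
            pytorch: 2.9.0
#          - cuda: 126
#            cuda_version: 12.6.3
#            pytorch: 2.7.1
          - cuda: 126
            cuda_version: 12.6.3
            pytorch: 2.7.1
"""

# Matches a cuda_version line, optionally prefixed by a YAML comment marker.
PATTERN = re.compile(r"^(?P<comment>\s*#)?\s*cuda_version:\s*(?P<version>\S+)")


def find_pins(text, version):
    """Return (line_number, is_commented) for each line pinning `version`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = PATTERN.match(line)
        if match and match.group("version") == version:
            hits.append((lineno, match.group("comment") is not None))
    return hits


for lineno, commented in find_pins(WORKFLOW_TEXT, "12.6.3"):
    state = "commented out" if commented else "active"
    print(f"line {lineno}: cuda_version 12.6.3 ({state})")
```

Scanning raw text instead of parsing the YAML is deliberate here: a YAML parser drops comments, so the commented-out legacy entries would not show up.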


codecov bot commented Oct 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
