upgrade transformers==5.3.0 trl==0.29.0 kernels #3459

Merged
winglian merged 26 commits into main from transformers-530
Mar 6, 2026
winglian (Collaborator) commented Mar 5, 2026

Description

Motivation and Context

How has this been tested?

AI Usage Disclaimer

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

Summary by CodeRabbit

  • Chores

    • Updated dependencies including transformers, deepspeed, trl, hf_xet, and kernels to newer versions for improved compatibility and performance.
    • Updated CI/CD pipeline CUDA configuration.
  • Tests

    • Updated tokenizer test assertions and expected token sequences for improved accuracy.

coderabbitai bot (Contributor) commented Mar 5, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Walkthrough

This pull request updates dependencies in requirements.txt (transformers, deepspeed, trl, hf_xet, kernels), adjusts CUDA matrix entries in CI workflows from version 12.9.1 to 12.8.1, adds an optional model parameter to the OptimizerMixin.create_optimizer method, and updates test assertions for tokenizer type validation and expected token IDs.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI Workflow Configuration (.github/workflows/tests.yml) | Replaced CUDA version 12.9.1 with 12.8.1 in docker-e2e-tests-1st and docker-e2e-cleanup job matrices. |
| Dependency Management (requirements.txt) | Updated multiple dependencies: transformers (5.2.0→5.3.0), trl (0.28.0→0.29.0), hf_xet (1.2.0→1.3.2), kernels (0.12.1→0.12.2), and adjusted deepspeed version constraint (>=0.18.3 to >=0.18.6,<0.19.0). |
| Core Trainer Modifications (src/axolotl/core/trainers/mixins/optimizer.py) | Added optional model parameter to create_optimizer() method; optimization model selection now respects provided parameter over instance model when specified. |
| Test Assertions (tests/prompt_strategies/test_chat_templates.py, tests/test_tokenizers.py) | Added LlamaTokenizer class assertion for phi35 tokenizer; updated expected token IDs in test_phi35 (EOT markers: 22172→12199; label sequences: 1781→16773) and test_tokenizers (added token path: [1, 32000, 1404]→[1, 32000, 1792]). |
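
The create_optimizer() change summarized above can be pictured with a minimal sketch. Only the fallback rule (use the passed model when one is given, otherwise the instance model) comes from the summary. The class name OptimizerMixinSketch, the string stand-ins for models, and the dict returned in place of a real optimizer are hypothetical and are not taken from the axolotl source.

```python
from typing import Any, Optional


class OptimizerMixinSketch:
    """Hypothetical stand-in for a trainer mixin; not the actual axolotl class."""

    def __init__(self, model: Any) -> None:
        self.model = model
        self.optimizer: Optional[dict] = None

    def create_optimizer(self, model: Optional[Any] = None) -> dict:
        # Prefer the explicitly passed model; otherwise fall back to the
        # model stored on the instance.
        opt_model = self.model if model is None else model
        # A real trainer would build an optimizer over opt_model's parameters.
        # This sketch only records which model was selected.
        self.optimizer = {"model": opt_model}
        return self.optimizer


trainer = OptimizerMixinSketch(model="instance-model")
default_choice = trainer.create_optimizer()["model"]
override_choice = trainer.create_optimizer(model="passed-model")["model"]
```

Keeping the parameter optional with a default of None means existing call sites that pass no argument would continue to behave as before, assuming the real method follows the same pattern.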

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • #3407: Updates transformers dependency version in requirements.txt with overlapping release pattern.
  • #3272: Modifies CI workflow CUDA matrix entries to include CUDA 12.8.1 configurations.
  • #3214: Updates both transformers and trl dependencies in requirements.txt as part of coordinated release.

Suggested labels

scheduled_release

Suggested reviewers

  • NanoCode012
  • SalmanMohammadi

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main changes: upgrades to transformers (5.3.0), trl (0.29.0), and kernels, which are the primary dependency updates documented in requirements.txt. |

github-actions bot (Contributor) commented Mar 6, 2026

📖 Documentation Preview: https://69aabc1b2436da3456e58629--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit bc23aa9

winglian requested a review from NanoCode012 on March 6, 2026 02:42

codecov bot commented Mar 6, 2026

[
"sft_cfg",
"rm_cfg",
# "rm_cfg", # TODO fix for num_labels = 2 vs 1

Collaborator commented:

What does this mean?

NanoCode012 added the scheduled_release label ("This PR is slated for the upcoming release") on Mar 6, 2026

tokenizer = load_tokenizer(cfg)
assert tokenizer("<|im_start|>user")["input_ids"] == [1, 32000, 1404]
assert "LlamaTokenizer" in tokenizer.__class__.__name__
assert tokenizer("<|im_start|>user")["input_ids"] == [1, 32000, 1792]

Collaborator commented:

Why did this change?

Collaborator (Author) replied:

there was a tokenizers change upstream

winglian merged commit cada93c into main on Mar 6, 2026
24 checks passed
winglian deleted the transformers-530 branch on March 6, 2026 14:11