Add Nemotron 3 to tests via tiny model by sergiopaniego · Pull Request #5278 · huggingface/trl

sergiopaniego · 2026-03-12T11:53:47Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

Note

Low Risk
Low risk: changes are limited to test fixtures and unit tests, gated on transformers>=5.3.0, with a CPU-only fallback to avoid known NemotronH kernel/gradient-checkpointing incompatibilities.

Overview
Adds a new tiny NemotronH (hybrid Mamba-Attention) causal LM to the generate_tiny_models.py script so it can be published under trl-internal-testing for CI.

Extends SFTTrainer and DPOTrainer parametrized training tests to include tiny-NemotronHForCausalLM (skipped on older transformers), and conditionally disables gradient checkpointing + forces CPU for this model to avoid Mamba kernel stride constraints with tiny dimensions.

Written by Cursor Bugbot for commit 3c8f9d4. This will update automatically on new commits.
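The version gate described in the overview could look roughly like the sketch below. It is not the PR's code: the helper names `version_tuple` and `supports_nemotron_h` are hypothetical, and the parsing is a deliberate simplification that ignores pre-release suffixes (a real test suite would more likely rely on `packaging.version`).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def version_tuple(raw: str) -> Tuple[int, ...]:
    """Parse the leading 'X.Y.Z' of a version string into a tuple of ints.

    Simplification: anything after the leading digits of each piece
    (e.g. 'dev0', 'rc1') is ignored.
    """
    parts = []
    for piece in raw.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def supports_nemotron_h(installed: Optional[str] = None, minimum: str = "5.3.0") -> bool:
    """Return True if the given (or installed) transformers version is >= minimum."""
    if installed is None:
        try:
            installed = version("transformers")
        except PackageNotFoundError:
            # transformers is not installed, so the model cannot be tested.
            return False
    return version_tuple(installed) >= version_tuple(minimum)
```

In a pytest suite, the result would typically feed a `pytest.mark.skipif` mark attached to the `tiny-NemotronHForCausalLM` entry via `pytest.param`, so that older environments skip only that parametrization.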

HuggingFaceDocBuilderDev · 2026-03-12T11:58:35Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

albertvillanova

Thanks!

The CI is red:

  FAILED tests/test_dpo_trainer.py::TestDPOTrainer::test_train[trl-internal-testing/tiny-NemotronHForCausalLM] - RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8
  FAILED tests/test_sft_trainer.py::TestSFTTrainer::test_train[trl-internal-testing/tiny-NemotronHForCausalLM] - RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

tests/test_sft_trainer.py

qgallouedec

thanks!! just a few comments

qgallouedec · 2026-03-13T15:48:10Z

scripts/generate_tiny_models.py

+    use_mamba_kernels=False,  # CPU-friendly for testing
+)
+model = NemotronHForCausalLM(config).to(dtype=torch.bfloat16)
+init_weights_tiny_model(model)


Can you cast backbone.layers.[N].mixer.D and backbone.layers.[N].mixer.A_log to fp32? It seems these two parameters are in fp32 in the reference model, and we want to stay as close to it as possible.

See how we do this for Qwen3.5: https://github.com/huggingface/trl/pull/5278/changes#diff-dd3349f840a26de373fc88378e6fcded0b75423da8a34f7cfa6ac573b7398b8bL404
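A minimal sketch of the requested cast, assuming the parameter names follow the `backbone.layers.[N].mixer.D` / `A_log` pattern quoted in the comment. The helper names are hypothetical, and this is not necessarily how the Qwen3.5 branch of the script does it.

```python
import re

# Matches the two Mamba mixer parameters named in the review comment,
# e.g. "backbone.layers.3.mixer.D" or "backbone.layers.3.mixer.A_log".
_FP32_PARAM = re.compile(r"(^|\.)backbone\.layers\.\d+\.mixer\.(D|A_log)$")


def should_be_fp32(param_name: str) -> bool:
    """Return True for parameters that should be kept in float32."""
    return _FP32_PARAM.search(param_name) is not None


def cast_mixer_params_to_fp32(model) -> list:
    """Cast the matching parameters of `model` to float32 in place.

    Returns the names of the parameters that were cast.
    """
    import torch  # imported lazily so the name filter works without torch

    cast_names = []
    for name, param in model.named_parameters():
        if should_be_fp32(name):
            param.data = param.data.to(torch.float32)
            cast_names.append(name)
    return cast_names
```

The cast would have to run after `.to(dtype=torch.bfloat16)`, since that call converts every floating-point parameter, including these two.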

qgallouedec · 2026-03-13T15:50:03Z

tests/test_dpo_trainer.py

+        kwargs = {}
+        if "NemotronH" in model_id:
+            kwargs["gradient_checkpointing"] = False
+            kwargs["use_cpu"] = True


Really not sure about this. We don't train on CPU, so why test on it? Plus, we wouldn't know if a GPU-specific issue is introduced.

qgallouedec · 2026-03-13T15:51:34Z

Thanks!

The CI is red:

  FAILED tests/test_dpo_trainer.py::TestDPOTrainer::test_train[trl-internal-testing/tiny-NemotronHForCausalLM] - RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8
  FAILED tests/test_sft_trainer.py::TestSFTTrainer::test_train[trl-internal-testing/tiny-NemotronHForCausalLM] - RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8

Is it possible that this error originates from the parameters used to build the model?
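One way to reason about this question is to work out the strides the error message refers to. The sketch below assumes that "channel last" means a `(batch, channels, seqlen)` view over a contiguous `(batch, seqlen, channels)` buffer; under that assumption `x.stride(2)` equals the channel count, so a tiny model whose convolution channel count is not a multiple of 8 would trip the check. The function names are hypothetical, and which config fields determine that channel count is not established in this thread.

```python
def channel_last_strides(batch: int, channels: int, seqlen: int) -> tuple:
    """Element strides of a (batch, channels, seqlen) view whose memory
    layout is a contiguous (batch, seqlen, channels) buffer."""
    return (seqlen * channels, 1, channels)


def satisfies_stride_constraint(batch: int, channels: int, seqlen: int, multiple: int = 8) -> bool:
    """Check the condition from the error message: stride(0) and stride(2)
    must both be multiples of `multiple`."""
    stride0, _, stride2 = channel_last_strides(batch, channels, seqlen)
    return stride0 % multiple == 0 and stride2 % multiple == 0
```

For example, `satisfies_stride_constraint(2, 16, 5)` holds, while a channel count of 12 fails regardless of sequence length, which is consistent with the idea that the tiny model's dimensions, rather than the trainer, trigger the error.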

sergiopaniego added 3 commits March 12, 2026 12:51

Add Nemotron 3 to tests via tiny model

dd4f3a6

Code quality

529ef04

Merge branch 'main' into nemotron3-tiny-tests

875a46c

sergiopaniego added 2 commits March 12, 2026 14:41

Updated

b79861f

Merge branch 'nemotron3-tiny-tests' of github.com:huggingface/trl int…

27cef67

…o nemotron3-tiny-tests

albertvillanova reviewed Mar 12, 2026

sergiopaniego added 3 commits March 12, 2026 15:35

Update

4f9b1f8

Merge branch 'main' into nemotron3-tiny-tests

8122d6d

Update

c555040

cursor bot reviewed Mar 13, 2026

Cursor review

3c8f9d4

qgallouedec reviewed Mar 13, 2026

sergiopaniego wants to merge 9 commits into main from nemotron3-tiny-tests

4 participants
