
fix num_labels= 1 test fail #3493

Open
ved1beta wants to merge 5 commits into axolotl-ai-cloud:main from ved1beta:trl_num_lables

Conversation

ved1beta (Contributor) commented Mar 13, 2026

Description

Add a set_reward_model_defaults model validator in validation.py that automatically sets num_labels=1 when reward_model: true is set and num_labels=2 when process_reward_model: true is set, and defaults model_type to the appropriate AutoModel class.
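Based on the description above, the defaulting logic might look roughly like the following standalone sketch. This is a plain function over a raw config dict, not the actual pydantic validator on TrainingValidationMixin, and the specific AutoModel class names used as defaults are assumptions:

```python
# Sketch of the defaulting behavior described in the PR, written as a
# plain function for illustration; the real code is a pydantic
# @model_validator(mode="before") on TrainingValidationMixin.

def set_reward_model_defaults(data: dict) -> dict:
    """Fill in num_labels/model_type defaults for reward-model configs."""
    if data.get("reward_model"):
        if data.get("num_labels") is None:
            data["num_labels"] = 1  # single scalar reward score
        if data.get("model_type") is None:
            # Assumed default class name; the PR only says
            # "the appropriate AutoModel class".
            data["model_type"] = "AutoModelForSequenceClassification"
    if data.get("process_reward_model"):
        if data.get("num_labels") is None:
            data["num_labels"] = 2  # per-step labels for process rewards
        if data.get("model_type") is None:
            data["model_type"] = "AutoModelForTokenClassification"
    return data
```

Note that explicitly provided values win: the `is None` guards mean a user-supplied num_labels or model_type is never overwritten.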

Re-enable rm_cfg in test_builder_w_rm_trainers, which was previously disabled pending this fix.

How has this been tested?

  • test_reward_model_defaults
  • test_process_reward_model_defaults

Summary by CodeRabbit

Release Notes

  • New Features

    • Added automatic default configuration for reward model settings, intelligently setting model type and label counts based on configuration context.
  • Tests

    • Enhanced validation test coverage for reward model defaults to ensure configuration reliability.

coderabbitai bot commented Mar 13, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9938837a-40bf-47f4-a7a7-3060b93b64e9

📝 Walkthrough

A new pre-validation hook was added to automatically set default values for reward model configurations. When reward_model or process_reward_model flags are present, missing num_labels and model_type fields are populated with appropriate defaults. Test coverage was extended to validate this behavior.

Changes

  • Validation hook (src/axolotl/utils/schemas/validation.py): Added a set_reward_model_defaults() method to TrainingValidationMixin that sets num_labels defaults (1 for reward_model, 2 for process_reward_model) and model_type to the appropriate AutoModel classes when not provided.
  • Test coverage (tests/core/test_builders.py, tests/patched/test_validation.py): Uncommented the reward model fixture in the builder test. Added two new test methods validating default values for reward_model and process_reward_model configurations.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks: 1 passed, 2 failed

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title Check (❓ Inconclusive): The title "fix num_labels= 1 test fail" is vague and lacks specific context about what is being fixed. It mentions "num_labels" and "test fail" but doesn't clearly convey the actual implementation of adding reward model default validators. Consider a more descriptive title such as "Add reward model default validators for num_labels and model_type" or "Auto-set num_labels and model_type for reward models" to better convey the core change.

✅ Passed checks (1 passed)

  • Description Check (✅ Passed): Check skipped - CodeRabbit's high-level summary is enabled.


coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/axolotl/utils/schemas/validation.py (1)

260-270: Consider adding mutual exclusivity validation for reward_model and process_reward_model.

If both flags are set simultaneously, process_reward_model values will silently override reward_model defaults. While this edge case is unlikely in practice, you may want to add explicit validation to catch misconfiguration early.

♻️ Optional enhancement to validate mutual exclusivity:

```diff
 @model_validator(mode="before")
 @classmethod
 def set_reward_model_defaults(cls, data):
+    if data.get("reward_model") and data.get("process_reward_model"):
+        raise ValueError(
+            "reward_model and process_reward_model are mutually exclusive"
+        )
+
     if data.get("reward_model"):
         if data.get("num_labels") is None:
             data["num_labels"] = 1
```
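The suggested guard can be exercised as a standalone sketch (a hypothetical helper separate from the actual pydantic mixin), which shows the failure mode the reviewer wants to catch:

```python
# Standalone sketch of the reviewer's suggested mutual-exclusivity guard.
# check_reward_model_flags is a hypothetical helper, not axolotl code.

def check_reward_model_flags(data: dict) -> dict:
    """Reject configs that enable both reward-model modes at once."""
    if data.get("reward_model") and data.get("process_reward_model"):
        raise ValueError(
            "reward_model and process_reward_model are mutually exclusive"
        )
    return data


if __name__ == "__main__":
    # A config with only one flag passes through unchanged.
    check_reward_model_flags({"reward_model": True})
    # Setting both flags now fails loudly instead of silently letting
    # process_reward_model defaults override reward_model ones.
    try:
        check_reward_model_flags(
            {"reward_model": True, "process_reward_model": True}
        )
    except ValueError as err:
        print(err)
```

Running the guard before the defaulting logic means misconfigured YAML fails at validation time rather than producing a model with surprising num_labels.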
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In src/axolotl/utils/schemas/validation.py around lines 260-270, add a
mutual-exclusivity check before the existing defaulting logic so that
"reward_model" and "process_reward_model" cannot both be set: inspect the
incoming data dict at the top of the block that sets defaults for
reward_model/process_reward_model and, if both data.get("reward_model") and
data.get("process_reward_model") are truthy, raise a validation error (e.g.,
ValueError or the module's ValidationError) with a clear message; keep the rest
of the defaulting code for "model_type" and "num_labels" unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: aae302b6-5fcf-4ae3-97b9-144cb6dfddba

📥 Commits

Reviewing files that changed from the base of the PR and between 083c5a0 and 70cf3f4.

📒 Files selected for processing (3)
  • src/axolotl/utils/schemas/validation.py
  • tests/core/test_builders.py
  • tests/patched/test_validation.py

codecov bot commented Mar 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
