feat(pt): add warmup_ratio setting to set warmup steps conveniently #5134

OutisLi · 2026-01-07T08:38:51Z

Summary by CodeRabbit

New Features
- Added warmup_ratio to specify warm-up as a ratio of total steps and warmup_start_factor to control the warm-up starting scale; warmup_steps still takes precedence when provided.
Bug Fixes / Validation
- Enforced validation so that warm-up values stay within valid ranges and warm-up steps remain fewer than total steps; a warning is emitted if a positive ratio yields zero warm-up steps.
Documentation
- Updated training argument docs to include the new options.
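
For orientation, here is a hypothetical training section that uses the new options, written as a Python dict. The values are illustrative only, and placing the keys under `"training"` next to `numb_steps` is an assumption based on the options being added to `training_args`.

```python
# Illustrative configuration fragment (values are made up for the example).
training_config = {
    "training": {
        "numb_steps": 100000,        # total training steps
        "warmup_ratio": 0.05,        # warm-up length as a fraction of numb_steps
        "warmup_start_factor": 0.1,  # LR scale at step 0 of the linear warm-up
        # "warmup_steps" is omitted on purpose: when it is provided,
        # it takes precedence over "warmup_ratio".
    }
}
```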

coderabbitai · 2026-01-07T08:42:19Z

📝 Walkthrough

Added two warm-up options: warmup_ratio (optional float) to compute warm-up steps as int(warmup_ratio * num_steps) when warmup_steps is absent, and warmup_start_factor (float, default 0.0) to set the initial LR scale for linear warm-up. Existing warm-up validation and precedence of explicit warmup_steps are preserved.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Warm-up Argument Definitions `deepmd/utils/argcheck.py` | Added `warmup_ratio` (optional float) and `warmup_start_factor` (optional float, default `0.0`) to `training_args`; added `doc_warmup_ratio` and `doc_warmup_start_factor` doc strings and included them in assembled training docs. |
| Warm-up Calculation & Schedule `deepmd/pt/train/training.py` | Resolve warm-up steps by: use explicit `warmup_steps` if provided; else compute `warmup_steps = int(warmup_ratio * num_steps)` if `warmup_ratio` provided; else `0`. Validate `warmup_ratio` in [0,1) and warn if ratio>0 yields 0 steps. Added `warmup_start_factor` and updated linear warm-up formula to: if `step < warmup_steps`, scale = `warmup_start_factor + (1 - warmup_start_factor) * (step / warmup_steps)`; otherwise use existing exponential schedule scaled by start LR. Existing validation that warm-up < total steps retained. |
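
The resolution order described in the table can be sketched as a standalone function. This is a minimal sketch, not the code in `training.py`: the helper name `resolve_warmup_steps` is hypothetical, and raising `ValueError` for the "warm-up < total steps" check is an assumption (the review text refers to an existing assertion).

```python
import logging
from typing import Optional

log = logging.getLogger(__name__)


def resolve_warmup_steps(
    num_steps: int,
    warmup_steps: Optional[int] = None,
    warmup_ratio: Optional[float] = None,
) -> int:
    """Hypothetical helper: explicit warmup_steps, else warmup_ratio, else 0."""
    if warmup_steps is not None:
        # Explicit step count takes precedence over the ratio.
        resolved = warmup_steps
    elif warmup_ratio is not None:
        if not 0 <= warmup_ratio < 1:
            raise ValueError(f"warmup_ratio must be in [0, 1), got {warmup_ratio}")
        # int() truncates toward zero, so small ratios can yield 0 steps.
        resolved = int(warmup_ratio * num_steps)
        if warmup_ratio > 0 and resolved == 0:
            log.warning(
                "warmup_ratio %s with %d steps yields 0 warm-up steps",
                warmup_ratio,
                num_steps,
            )
    else:
        resolved = 0
    # Assumption: mirror the retained check that warm-up is shorter than training.
    if resolved != 0 and resolved >= num_steps:
        raise ValueError("warm-up steps must be less than total steps")
    return resolved
```

With `num_steps=1000`, a ratio of `0.25` resolves to 250 warm-up steps, while passing `warmup_steps=10` alongside the same ratio resolves to 10.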

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding warmup_ratio as a convenience parameter for setting warmup steps in the PyTorch training module. |

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @deepmd/pt/train/training.py:
- Around line 418-426: The warmup_ratio is used to compute self.warmup_steps
without validation and with int() truncation; add a validation step: if
warmup_ratio is provided ensure 0.0 <= warmup_ratio < 1.0 and raise a clear
ValueError mentioning warmup_ratio when out of range, then compute
self.warmup_steps using round(warmup_ratio * self.num_steps) (or keep int() but
add a comment if truncation is intentional) and if round(...) == 0 while
warmup_ratio > 0, set self.warmup_steps = 1 to avoid producing zero warmup
steps; update the error message used by the existing assertion (the warmup_steps
check) to include warmup_ratio when raising.
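
The reviewer's alternative (round instead of truncate, then clamp to at least one step) can be sketched as follows. The function name `warmup_steps_rounded` is hypothetical, and this is not what the PR summary describes as the final behavior, which keeps `int()` and warns when a positive ratio yields zero steps.

```python
def warmup_steps_rounded(warmup_ratio: float, num_steps: int) -> int:
    """Hypothetical variant of the suggestion: round, then clamp to >= 1."""
    if not 0.0 <= warmup_ratio < 1.0:
        raise ValueError(f"warmup_ratio must be in [0, 1), got {warmup_ratio}")
    # Python's round() returns an int here and rounds halves to the nearest even.
    steps = round(warmup_ratio * num_steps)
    if steps == 0 and warmup_ratio > 0:
        # Avoid silently disabling warm-up for a small positive ratio.
        steps = 1
    return steps
```

One trade-off of this variant: rounding can push the result up to `num_steps` for ratios close to 1 (for example `0.9999` with 100 steps), so the separate check that warm-up steps stay below total steps would still be needed.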

🧹 Nitpick comments (1)

deepmd/utils/argcheck.py (1)
3344-3349: Consider adding range validation for warmup_ratio in the argument schema.

The warmup_ratio parameter lacks explicit range constraints. Based on the implementation in training.py, valid values should be in the range [0, 1) to ensure the computed warmup_steps doesn't exceed num_steps. Adding validation here would provide clearer error messages during configuration validation rather than at runtime.
♻️ Suggested improvement with validation

While the dargs Argument class may not directly support min/max validation for float types, you could document the expected range more explicitly and consider adding a validation function:
 Argument(
     "warmup_ratio",
     float,
     optional=True,
-    doc=doc_only_pt_supported + doc_warmup_ratio,
+    doc=doc_only_pt_supported + doc_warmup_ratio + " Valid range: [0, 1).",
 ),
Alternatively, if custom validation is supported, add explicit bounds checking to fail fast during configuration parsing.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe1662d and 4946566.

📒 Files selected for processing (2)

deepmd/pt/train/training.py
deepmd/utils/argcheck.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)

GitHub Check: CodeQL analysis (python)
GitHub Check: Agent
GitHub Check: Test Python (6, 3.13)
GitHub Check: Test Python (5, 3.13)
GitHub Check: Test Python (1, 3.13)
GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
GitHub Check: Test Python (3, 3.10)
GitHub Check: Build C++ (cpu, cpu)
GitHub Check: Test Python (5, 3.10)
GitHub Check: Build C++ (clang, clang)
GitHub Check: Test Python (2, 3.10)
GitHub Check: Test Python (3, 3.13)
GitHub Check: Test Python (4, 3.13)
GitHub Check: Build C++ (rocm, rocm)
GitHub Check: Build C++ (cuda120, cuda)
GitHub Check: Test Python (4, 3.10)
GitHub Check: Test Python (2, 3.13)
GitHub Check: Test Python (6, 3.10)
GitHub Check: Test Python (1, 3.10)
GitHub Check: Build wheels for cp311-macosx_arm64
GitHub Check: Build wheels for cp311-win_amd64
GitHub Check: Build wheels for cp310-manylinux_aarch64
GitHub Check: Build wheels for cp311-manylinux_x86_64
GitHub Check: Build wheels for cp311-macosx_x86_64
GitHub Check: Test C++ (false, true, true, false)
GitHub Check: Analyze (python)
GitHub Check: Analyze (c-cpp)
GitHub Check: Test C++ (true, false, false, true)
GitHub Check: Test C++ (true, true, true, false)
GitHub Check: Test C++ (false, false, false, true)

Copilot

Pull request overview

This PR adds a convenient warmup_ratio parameter for PyTorch training that allows users to specify warmup steps as a ratio of total training steps, rather than an absolute number.

Key changes:

- Added warmup_ratio configuration parameter that calculates warmup steps as a fraction of numb_steps
- Implemented precedence logic where warmup_steps takes priority over warmup_ratio when both are specified

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| deepmd/utils/argcheck.py | Adds documentation and argument definition for the new `warmup_ratio` parameter, marked as PyTorch-only |
| deepmd/pt/train/training.py | Implements the warmup_ratio logic by calculating `warmup_steps = int(warmup_ratio * num_steps)` when warmup_steps is not explicitly set |

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

deepmd/pt/train/training.py (2)
424-425: Consider clarifying why the upper bound is exclusive.

The validation correctly enforces warmup_ratio < 1, but the error message doesn't explain why 1.0 is excluded. Users might wonder why the full range isn't [0, 1].
📝 Suggested improvement
         if not 0 <= warmup_ratio < 1:
-            raise ValueError(f"warmup_ratio must be in [0, 1), got {warmup_ratio}")
+            raise ValueError(
+                f"warmup_ratio must be in [0, 1) to leave steps for training, got {warmup_ratio}"
+            )
This also addresses the Ruff TRY003 style hint by breaking the message into multiple lines.
435-435: Add validation for warmup_start_factor to prevent unexpected behavior.

The warmup_start_factor parameter lacks range validation. While the most common use case is [0, 1] (starting from 0% to 100% of the target learning rate), negative or > 1 values could cause unexpected learning rate schedules.
🛡️ Proposed validation

Add validation after line 435:
self.warmup_start_factor = training_params.get("warmup_start_factor", 0.0)
if not 0 <= self.warmup_start_factor <= 1:
    log.warning(
        f"warmup_start_factor is typically in [0, 1], got {self.warmup_start_factor}. "
        f"This may result in unusual learning rate schedules."
    )
Alternatively, if values outside [0, 1] should be disallowed, use raise ValueError instead of log.warning.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 636c9dd and 186262c.

📒 Files selected for processing (2)

deepmd/pt/train/training.py
deepmd/utils/argcheck.py

🚧 Files skipped from review as they are similar to previous changes (1)

deepmd/utils/argcheck.py

🧰 Additional context used

🪛 Ruff (0.14.10)

deepmd/pt/train/training.py

425-425: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (36)

GitHub Check: Test Python (9, 3.10)
GitHub Check: Test Python (10, 3.10)
GitHub Check: Test Python (12, 3.13)
GitHub Check: Test Python (9, 3.13)
GitHub Check: Test Python (11, 3.13)
GitHub Check: Test Python (4, 3.13)
GitHub Check: Test Python (10, 3.13)
GitHub Check: Test Python (7, 3.13)
GitHub Check: Test Python (8, 3.13)
GitHub Check: Test Python (8, 3.10)
GitHub Check: Test Python (4, 3.10)
GitHub Check: Test Python (6, 3.10)
GitHub Check: Test Python (5, 3.13)
GitHub Check: Test Python (7, 3.10)
GitHub Check: Test Python (5, 3.10)
GitHub Check: Test Python (2, 3.10)
GitHub Check: Test Python (1, 3.13)
GitHub Check: Test Python (1, 3.10)
GitHub Check: Test Python (3, 3.10)
GitHub Check: Test Python (2, 3.13)
GitHub Check: Build wheels for cp311-win_amd64
GitHub Check: Build wheels for cp310-manylinux_aarch64
GitHub Check: Build C++ (clang, clang)
GitHub Check: Build C++ (rocm, rocm)
GitHub Check: Build wheels for cp311-macosx_x86_64
GitHub Check: Build C++ (cuda120, cuda)
GitHub Check: Build wheels for cp311-macosx_arm64
GitHub Check: Build wheels for cp311-manylinux_x86_64
GitHub Check: Build C++ (cpu, cpu)
GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
GitHub Check: Test C++ (false, true, true, false)
GitHub Check: Test C++ (true, false, false, true)
GitHub Check: Test C++ (true, true, true, false)
GitHub Check: Test C++ (false, false, false, true)
GitHub Check: Analyze (python)
GitHub Check: Analyze (c-cpp)

🔇 Additional comments (2)

deepmd/pt/train/training.py (2)

419-434: LGTM! Warmup configuration logic is well-structured.

The precedence (explicit warmup_steps → warmup_ratio → default 0) is clear and correctly implements the feature. The validation and warning for truncation are helpful for users.

685-691: LGTM! Linear warmup implementation is mathematically correct.

The modified warmup schedule correctly interpolates the learning rate multiplier from warmup_start_factor to 1.0 over warmup_steps steps. The division by zero is protected by the conditional check, and the formula produces the expected linear ramp.
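
The linear ramp described above can be sketched as a small function. The name `warmup_scale` is hypothetical, and returning `1.0` after warm-up is a placeholder assumption standing in for the existing post-warm-up schedule, which is not reproduced here.

```python
def warmup_scale(step: int, warmup_steps: int, warmup_start_factor: float = 0.0) -> float:
    """Hypothetical LR multiplier: linear ramp from warmup_start_factor to 1.0."""
    if step < warmup_steps:
        # The branch is only taken when warmup_steps > step >= 0,
        # so the division below never sees warmup_steps == 0.
        return warmup_start_factor + (1.0 - warmup_start_factor) * (step / warmup_steps)
    # Placeholder: the real schedule continues with its decay rule from here.
    return 1.0
```

At `step = 0` the multiplier equals `warmup_start_factor`, and it reaches `1.0` once `step` equals `warmup_steps`.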

codecov · 2026-01-09T08:25:49Z

Codecov Report

❌ Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 81.93%. Comparing base (fe1662d) to head (186262c).
⚠️ Report is 6 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| deepmd/pt/train/training.py | 87.50% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5134      +/-   ##
==========================================
- Coverage   82.15%   81.93%   -0.22%     
==========================================
  Files         709      712       +3     
  Lines       72468    72900     +432     
  Branches     3616     3617       +1     
==========================================
+ Hits        59535    59733     +198     
- Misses      11769    12003     +234     
  Partials     1164     1164

feat(pt): add warmup_ratio setting to set warmup steps conveniently

4946566

Copilot AI review requested due to automatic review settings January 7, 2026 08:38

github-actions bot added the Python label Jan 7, 2026

Copilot started reviewing on behalf of OutisLi January 7, 2026 08:39

dosubot bot added the new feature label Jan 7, 2026

coderabbitai bot reviewed Jan 7, 2026

Copilot AI reviewed Jan 7, 2026

OutisLi added 2 commits January 7, 2026 17:27

add warmup_start_factor

636c9dd

fix

186262c

OutisLi requested review from iProzd and njzjz January 9, 2026 07:53

coderabbitai bot reviewed Jan 9, 2026

iProzd approved these changes Jan 9, 2026
