fix: remove noisy qwen2 vl nightly test loss check #1272
Merged
Signed-off-by: Terry Kong <terryk@nvidia.com>
📝 Walkthrough
Adjusted a VLM test suite shell script to modify metric checks passed to tests/check_metrics.py: removed the train/loss threshold at step 200, retaining only the train/reward threshold at step 200.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant CI as CI Runner
participant Sh as Test Suite (.sh)
participant CM as check_metrics.py
CI->>Sh: Execute VLM test suite
Sh->>CM: Invoke with metric checks
Note over Sh,CM: Change: Only reward@step200 checked (loss@step200 removed)
CM->>Sh: Return pass/fail based on reward condition
Sh->>CI: Exit code reflects result
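The effect of the change on the check list can be sketched as follows. This is a minimal illustration only, not the actual tests/check_metrics.py: the `run_checks` helper, the tuple format for checks, the metric values, and the 0.9 threshold are all hypothetical and chosen just to show that a single reward condition at step 200 now decides pass/fail.

```python
import operator

# Hypothetical metrics keyed by metric name, then by step.
# The values are illustrative and not taken from any real run.
metrics = {
    "train/reward": {200: 0.92},
    "train/loss": {200: 0.031},
}

# Comparison operators supported by this sketch.
OPS = {"<": operator.lt, ">": operator.gt}


def run_checks(metrics, checks):
    """Evaluate (name, step, op, threshold) checks and return (description, passed) pairs."""
    results = []
    for name, step, op, threshold in checks:
        value = metrics[name][step]
        passed = OPS[op](value, threshold)
        results.append((f"{name}@{step} {op} {threshold}", passed))
    return results


# After this PR, only the reward check at step 200 remains;
# the train/loss check at step 200 is no longer in the list.
# The 0.9 threshold is an assumed placeholder.
checks_after = [("train/reward", 200, ">", 0.9)]

results = run_checks(metrics, checks_after)
all_passed = all(passed for _, passed in results)
```

With the loss entry gone from the list, a hovering train/loss value can no longer flip the result of the nightly.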
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~6 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (4 passed)
chtruong814 approved these changes on Oct 5, 2025
PrinsYin pushed a commit to PrinsYin/RL that referenced this pull request on Nov 30, 2025
Signed-off-by: Terry Kong <terryk@nvidia.com>
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request on Feb 21, 2026
Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
What does this PR do?
The train/loss value checked by this test follows no particular trend and just hovers, which makes the loss check noisy. In general, we shouldn't use train/loss in GRPO nightlies:
https://wandb.ai/nvidia/nemo-rl/panel/952wk61kd?nw=u4eso5k9jwb