Conversation

@chtruong814
Contributor

chtruong814 commented Jan 5, 2026

beep boop [🤖]: Hi @guyueh1 👋,

we've cherry picked #1703 into  for you! 🚀

Please review and approve this cherry-pick at your convenience!

Summary by CodeRabbit

  • Chores
    • Updated internal dependency reference to latest version.

@github-actions

github-actions bot commented Jan 5, 2026

✅ Submodule Fast-Forward Check Results

Check based on commit: 98c9d6b (PR #1714 from cherry-pick-1703-r0.5.0)

✅ Submodules that are properly updated:

Megatron-LM: ✅ PR branch is ahead of r0.5.0 branch (fast-forward)

All submodule changes look good! ✨

@coderabbitai
Contributor

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

This PR updates the submodule reference for Megatron-LM from commit 25a62ed to b73ae5c. No functional or API changes are introduced; only the tracked submodule commit hash is modified.

Changes

Cohort / File(s) | Summary
Submodule reference: 3rdparty/Megatron-LM-workspace/Megatron-LM | Updated submodule commit pointer from 25a62edf77b5130f888328ca8119d6a76117cf23 to b73ae5cdab9d409fcface2b2f3c375710abe6911
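
The walkthrough cites abbreviated hashes (25a62ed, b73ae5c) while the Changes entry lists the full 40-character hashes. A minimal sketch of how one might confirm that the two forms agree; the helper name `matches_abbrev` is hypothetical and is not part of git or of this repository.

```python
import re

# Full submodule commit hashes as listed in the Changes entry above.
OLD_FULL = "25a62edf77b5130f888328ca8119d6a76117cf23"
NEW_FULL = "b73ae5cdab9d409fcface2b2f3c375710abe6911"


def matches_abbrev(full_sha: str, short_sha: str) -> bool:
    """Return True if short_sha is a hex prefix of the 40-character full_sha.

    Hypothetical helper for illustration only.
    """
    full = full_sha.strip().lower()
    short = short_sha.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{40}", full):
        raise ValueError(f"not a full 40-character hex hash: {full_sha!r}")
    # Require at least 4 hex characters so that trivial prefixes are rejected.
    if not re.fullmatch(r"[0-9a-f]{4,40}", short):
        return False
    return full.startswith(short)


if __name__ == "__main__":
    print(matches_abbrev(OLD_FULL, "25a62ed"))
    print(matches_abbrev(NEW_FULL, "b73ae5c"))
```

A prefix match only shows that the abbreviation is consistent with the full hash. It does not show that the abbreviation is unambiguous inside the submodule's history.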

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Suggested labels

GB200, r0.5.0

Suggested reviewers

  • guyueh1
  • terrykong

Pre-merge checks

✅ Passed checks (4 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately reflects the cherry-pick operation into r0.5.0 and references the original fix for GB200 checkpoint save issues.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Test Results For Major Changes | ✅ Passed | PR only updates submodule reference hash with no functional or public API changes, qualifying as minor changes.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e7a2f7 and 98c9d6b.

📒 Files selected for processing (1)
  • 3rdparty/Megatron-LM-workspace/Megatron-LM
🧰 Additional context used
📓 Path-based instructions (1)
!(**/tests/**|**/test_*.py|**/test_*.sh)

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year

Files:

  • 3rdparty/Megatron-LM-workspace/Megatron-LM
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Lint check
  • GitHub Check: Lint check
  • GitHub Check: Lint check
  • GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (1)
3rdparty/Megatron-LM-workspace/Megatron-LM (1)

1-1: Verify the submodule commit and cherry-pick completeness.

This PR cherry-picks PR #1703 (GB200 single-thread checkpoint save fix) into the r0.5.0 release branch. The only visible change is the submodule commit hash update. Please confirm:

  1. The target commit (b73ae5c) in the Megatron-LM submodule contains the GB200 checkpoint-save fix described in PR #1703 ("fix: on GB200 use single-thread checkpoint save to avoid Cpu OOM").
  2. No additional changes to the main repository are required to activate or support this fix.
  3. The cherry-pick applies cleanly without conflicts.

You can verify the commit details by checking the Megatron-LM submodule's commit log.
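
Part of this verification can be scripted with git itself. The sketch below assumes a local checkout with the submodule initialized and `git` on the PATH. The function names are hypothetical; the underlying commands (`git rev-parse HEAD`, `git merge-base --is-ancestor`, `git log --oneline`) are standard git. It checks which commit the submodule points at and whether the old pointer is an ancestor of the new one. It does not by itself prove that the fix is present, which still means reading the listed commits.

```python
import subprocess
from typing import List

# Submodule path as it appears in the PR's list of changed files.
SUBMODULE_PATH = "3rdparty/Megatron-LM-workspace/Megatron-LM"


def head_cmd(repo: str) -> List[str]:
    """Command that prints the commit currently checked out in `repo`."""
    return ["git", "-C", repo, "rev-parse", "HEAD"]


def is_ancestor_cmd(repo: str, old: str, new: str) -> List[str]:
    """Command that exits 0 if `old` is an ancestor of `new`, 1 if it is not."""
    return ["git", "-C", repo, "merge-base", "--is-ancestor", old, new]


def log_range_cmd(repo: str, old: str, new: str) -> List[str]:
    """Command that lists commits reachable from `new` but not from `old`."""
    return ["git", "-C", repo, "log", "--oneline", f"{old}..{new}"]


def submodule_head(repo: str = SUBMODULE_PATH) -> str:
    """Return the full hash of the submodule's checked-out commit."""
    result = subprocess.run(
        head_cmd(repo), check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def is_fast_forward(old: str, new: str, repo: str = SUBMODULE_PATH) -> bool:
    """Return True if moving the pointer from `old` to `new` is a fast-forward."""
    result = subprocess.run(is_ancestor_cmd(repo, old, new))
    if result.returncode not in (0, 1):
        # Any other exit status means git itself failed (e.g. unknown commit).
        raise RuntimeError(f"git exited with status {result.returncode}")
    return result.returncode == 0


if __name__ == "__main__":
    old, new = "25a62ed", "b73ae5c"  # abbreviated hashes from the walkthrough
    print("submodule HEAD:", submodule_head())
    print("fast-forward:", is_fast_forward(old, new))
    # Show the commits brought in by the pointer update for manual review.
    subprocess.run(log_range_cmd(SUBMODULE_PATH, old, new), check=True)
```

Run it from the root of the main repository so that the relative submodule path resolves.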


@terrykong added the CI:docs (Run doctest) label Jan 5, 2026
@terrykong added the CI:L1 (Run doctests, unit tests, and functional tests) label and removed the CI:docs (Run doctest) label Jan 5, 2026
@terrykong enabled auto-merge (squash) January 5, 2026 18:12
@terrykong merged commit edfc23d into r0.5.0 Jan 5, 2026
79 of 86 checks passed
@terrykong deleted the cherry-pick-1703-r0.5.0 branch January 5, 2026 23:39
Labels

cherry-pick, CI:L1 (Run doctests, unit tests, and functional tests), Run CICD

4 participants