
Update LLAMA3 70B LoRa Base Configs for GB300, GB200 and H100 #2265

Open
rhmukundan wants to merge 2 commits into main from rmukundan/llama3_lora_baseconfigs_change

Conversation

@rhmukundan (Contributor) commented Feb 6, 2026

Summary by CodeRabbit

  • Chores
    • Updated Llama 3 70B LoRA workload configurations for GB300, GB200, and H100 hardware platforms
    • Adjusted parallelism parameters (tensor, pipeline, and context) across model variants
    • Modified recomputation settings for H100 configurations

@rhmukundan rhmukundan self-assigned this Feb 6, 2026

copy-pr-bot bot commented Feb 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

This change modifies Llama3 70B LoRA configuration settings for GB300, GB200, and H100 hardware platforms. Updates include adjusting tensor/pipeline/context parallelism dimensions and recomputation parameters across base configs and their variants through replace() calls.
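
As background for the parallelism changes described above, the sketch below shows how the usual Megatron-style dimensions compose: the product of the tensor, pipeline, and context parallel sizes must divide the world size, and the remainder is the data-parallel degree. The helper function and the example numbers are illustrative only and are not taken from this PR.

```python
# Sanity check for a Megatron-style parallelism layout: world_size = TP * PP * CP * DP.
# The function name and the example values are placeholders, not values from this PR.
def data_parallel_size(world_size: int, tp: int, pp: int, cp: int) -> int:
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by TP*PP*CP={model_parallel}"
        )
    return world_size // model_parallel


if __name__ == "__main__":
    # Example: 64 GPUs with TP=4, PP=4, CP=2 leaves a data-parallel degree of 2.
    print(data_parallel_size(64, tp=4, pp=4, cp=2))
```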

Changes

Cohort / File(s): Llama3 LoRA Performance Config Updates (scripts/performance/configs/llama/llama3_workload_base_configs.py)
Summary: Modified parallelism dimensions (tensor_model_parallel_size, pipeline_model_parallel_size, context_parallel_size, virtual_pipeline_model_parallel_size) for the GB300, GB200, and H100 base configs. Updated the public aliases to derive from the refactored bases via replace() calls. Adjusted recompute_num_layers for H100_BF16_V1 from 2 to 1.
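
The "public aliases derived from refactored bases via replace()" pattern mentioned above roughly corresponds to the minimal sketch below. It assumes the configs are frozen dataclasses; the class name, the private base name, and all numeric values except the recompute_num_layers change (2 to 1 for H100_BF16_V1, as stated in the summary) are hypothetical.

```python
# Minimal sketch of a base config plus replace()-derived aliases.
# Field names come from the walkthrough; everything else is a placeholder.
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LoraWorkloadConfig:  # hypothetical class name
    tensor_model_parallel_size: int = 1
    pipeline_model_parallel_size: int = 1
    context_parallel_size: int = 1
    virtual_pipeline_model_parallel_size: Optional[int] = None
    recompute_num_layers: Optional[int] = None


# Hypothetical refactored base for one platform; the values are illustrative.
_H100_BF16_BASE = LoraWorkloadConfig(
    tensor_model_parallel_size=4,
    pipeline_model_parallel_size=2,
    context_parallel_size=1,
    recompute_num_layers=2,
)

# Public alias overrides only the fields that differ from the base.
H100_BF16_V1 = replace(_H100_BF16_BASE, recompute_num_layers=1)
```

Keeping the shared values in a single base and deriving variants with replace() means a platform-wide change only has to be made in one place, which is presumably the motivation for the refactoring described in the walkthrough.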

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

Run CICD, performance

Suggested reviewers

  • malay-nagda
  • erhoo82
  • thomasdhc
🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Test Results For Major Changes (⚠️ Warning): PR contains major configuration changes for LLAMA3 70B LoRA model parallelism across multiple platforms but lacks comprehensive testing information, performance metrics, and validation details in the PR description. Resolution: add test results validating the configurations, before-and-after performance metrics for each platform, the batch sizes used, and specify which configurations have been validated.

✅ Passed checks (3 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title accurately and specifically describes the main change: updating LLAMA3 70B LoRa base configurations for three hardware platforms (GB300, GB200, H100).
  • Docstring Coverage (✅ Passed): No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@rhmukundan rhmukundan force-pushed the rmukundan/llama3_lora_baseconfigs_change branch from 6b69798 to 31d7bcf on February 6, 2026 at 20:58
@rhmukundan (Contributor, Author) commented:

/ok to test 16aa00e

@rhmukundan rhmukundan enabled auto-merge (squash) on February 6, 2026 at 20:58

Labels: None yet
Projects: None yet
1 participant