Updated amax_sync test to set const weights based on rank #451
Signed-off-by: Kinjal Patel <[email protected]>
Walkthrough

Test initialization was changed to set per-parameter, rank-dependent weight values (excluding embeddings) and to create a rank-scaled prompt_tokens tensor.

Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/gpu/torch/quantization/plugins/test_megatron.py (1)
691-691: Consider improving prompt_tokens variation across ranks.

The current formula `(torch.ones(...) * 0.05 + rank * 0.5).long()` produces identical values for consecutive rank pairs (ranks 0-1 get 0, ranks 2-3 get 1, etc.). While this doesn't break the test (since weights already differ), you could improve variation:

```diff
- prompt_tokens = (torch.ones((2, model.max_sequence_length)) * 0.05 + rank * 0.5).cuda().long()
+ prompt_tokens = (torch.ones((2, model.max_sequence_length)) * rank).cuda().long()
```

This creates unique token indices for each rank: 0, 1, 2, 3, etc., providing better per-rank distinction.
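The rank-pair collision the reviewer describes can be checked with a quick pure-Python stand-in for the tensor expression (`int()` truncates toward zero, as `.long()` does for positive values):

```python
# Truncating 0.05 + rank * 0.5 to an integer only advances every two ranks,
# so consecutive rank pairs receive identical token values.
values = [int(0.05 + rank * 0.5) for rank in range(6)]
print(values)  # [0, 0, 1, 1, 2, 2]
```

With the suggested `* rank` formula instead, each rank gets its own integer.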
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/gpu/torch/quantization/plugins/test_megatron.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/gpu/torch/quantization/plugins/test_megatron.py (2)
modelopt/torch/utils/distributed.py (2)
- rank (68-72)
- rank (200-202)

modelopt/torch/quantization/src/tensor_quant_mx.cu (4)
- cuda (75-101)
- cuda (75-76)
- cuda (105-137)
- cuda (105-105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (1)
tests/gpu/torch/quantization/plugins/test_megatron.py (1)
677-690: Excellent approach to create rank-dependent weight values.

The weight initialization logic correctly:
- Filters for trainable weight matrices (excluding embeddings and biases)
- Assigns distinct constant values per rank and parameter index
- Ensures different amax values across ranks for proper synchronization testing
The filtering conditions are appropriate, and the rank-dependent formula `0.1 + (rank * 0.5) + (weight_idx * 0.05)` effectively creates the needed variation.
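As a sketch of the initialization pattern the review describes (the helper name and the exact filtering predicates here are assumptions, not the test's actual code):

```python
import torch

def set_rank_dependent_weights(model, rank):
    """Fill each non-embedding weight matrix with a constant that depends on
    the rank and on the parameter's index, so every rank yields a different
    amax before synchronization."""
    weight_idx = 0
    for name, param in model.named_parameters():
        # Skip embeddings and biases; only touch trainable weight matrices.
        if "embedding" in name or not name.endswith("weight") or param.dim() < 2:
            continue
        with torch.no_grad():
            param.fill_(0.1 + (rank * 0.5) + (weight_idx * 0.05))
        weight_idx += 1

# Tiny stand-in model (the real test uses a Megatron model):
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 2))
set_rank_dependent_weights(model, rank=1)
```

On rank 1 the first weight matrix becomes all 0.6 and the second all 0.65, so the amax differs both across parameters and across ranks.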
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #451      +/-   ##
==========================================
+ Coverage   73.38%   73.39%   +0.01%
==========================================
  Files         180      180
  Lines       17942    17976      +34
==========================================
+ Hits        13166    13194      +28
- Misses       4776     4782       +6
```

☔ View full report in Codecov by Sentry.
Signed-off-by: Kinjal Patel <[email protected]>
What does this PR do?
Bug fix
Overview:
For the amax sync test, set the weights explicitly to a constant value based on each rank, which should ensure different amax values without synchronization.
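A minimal illustration of why per-rank constant weights guarantee distinct amax values (the helper below is hypothetical, not part of the PR):

```python
import torch

def amax_for_rank(rank):
    # Each rank fills its weights with a different constant, so the
    # max absolute value (amax) differs across ranks before any sync.
    weight = torch.full((4, 4), 0.1 + rank * 0.5)
    return weight.abs().max().item()

print([amax_for_rank(r) for r in range(4)])
```

Ranks 0-3 see amax values of roughly 0.1, 0.6, 1.1, and 1.6, all distinct, so a correct amax sync is observable in the test.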
Usage
Testing
Ran the above-mentioned pytest.
Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit
New Features
Tests