Remove unnecessary Tracing use in Minitron Search Space generation #405
Conversation
Walkthrough

Consolidates Megatron GPT-specific APIs and removes the Megatron trace plugin; adds a DynamicSpace conversion workflow for MCore Minitron with convert/restore entrypoints, and updates tests and trace plugin imports accordingly.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Search as Searcher / Mode runner
    participant Convert as convert_mcore_minitron
    participant Helper as _convert_model_to_dynamic_space
    participant DM as DMRegistry
    participant Model as nn.Module
    User->>Search: start(mode=MCoreMinitron, model, config)
    Search->>Convert: call convert_mcore_minitron(model, config)
    Convert->>Helper: _convert_model_to_dynamic_space(model, config?)
    Helper->>DM: request DMRegistry conversion / mark models
    DM-->>Helper: returns DynamicSpace
    Helper->>Helper: validate configurability (or raise ApplyModeError)
    Helper-->>Convert: DynamicSpace + metadata
    Convert-->>Search: (model, metadata)
    Note over Search,DM: Subsequent step: dynamic_space.export(DMRegistry) → export pipeline
```
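The convert flow in the diagram can be sketched in plain Python. All classes and functions below are simplified, hypothetical stand-ins for the modelopt types (the real `convert_mcore_minitron`, `DynamicSpace`, and `ApplyModeError` live in modelopt and have richer signatures); this only illustrates the shape of the convert step and the metadata it returns.

```python
class ApplyModeError(RuntimeError):
    """Raised when a mode cannot be applied to the given model (stand-in)."""


class DynamicSpace:
    """Toy stand-in holding the searchable hyperparameters of a converted model."""

    def __init__(self, hparams):
        self._hparams = dict(hparams)

    def config(self):
        # Snapshot of the current subnet configuration.
        return dict(self._hparams)


def _convert_model_to_dynamic_space(model, config=None):
    # Directly convert the top-level model instead of tracing recursively.
    hparams = getattr(model, "searchable_hparams", {})
    if not hparams:
        # Validation step from the diagram: fail fast if nothing is configurable.
        raise ApplyModeError("model has no configurable hyperparameters")
    return DynamicSpace(hparams)


def convert_mcore_minitron(model, config=None):
    space = _convert_model_to_dynamic_space(model, config)
    # Capture the current subnet config in metadata so restore can re-apply it.
    metadata = {"subnet_config": space.config()}
    return model, metadata


class ToyGPT:
    """Hypothetical top-level GPT model with illustrative hyperparameters."""

    searchable_hparams = {"num_layers": 12, "hidden_size": 768}


model, metadata = convert_mcore_minitron(ToyGPT())
```

The key design point mirrored here is that the mode returns `(model, metadata)`, with the subnet configuration captured in metadata rather than in a traced graph.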
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Signed-off-by: Keval Morabia <[email protected]>
Force-pushed from 00f4c5a to ce1c306
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- modelopt/torch/nas/plugins/megatron.py (2 hunks)
- modelopt/torch/prune/plugins/mcore_minitron.py (4 hunks)
- modelopt/torch/trace/plugins/__init__.py (1 hunks)
- modelopt/torch/trace/plugins/megatron.py (0 hunks)
- tests/gpu/torch/nas/plugins/test_megatron_gpt_dynamic_modules.py (3 hunks)
- tests/gpu/torch/nas/plugins/test_megatron_mamba_dynamic_modules.py (3 hunks)
- tests/unit/torch/trace/test_symbol.py (0 hunks)
💤 Files with no reviewable changes (2)
- modelopt/torch/trace/plugins/megatron.py
- tests/unit/torch/trace/test_symbol.py
🧰 Additional context used
🧬 Code graph analysis (3)
modelopt/torch/prune/plugins/mcore_minitron.py (6)
- modelopt/torch/nas/plugins/megatron.py (3): `_DynamicMCoreLanguageModel` (1150-1389), `convert` (272-297), `convert` (382-405)
- modelopt/torch/nas/utils.py (1): `get_subnet_config` (166-176)
- modelopt/torch/opt/dynamic.py (3): `config` (1265-1278), `DynamicSpace` (1127-1368), `convert_to_dynamic` (1138-1187)
- modelopt/torch/opt/config.py (2): `ModeloptBaseConfig` (59-147), `keys` (132-134)
- modelopt/torch/opt/conversion.py (1): `ApplyModeError` (314-315)
- modelopt/torch/opt/mode.py (1): `ModeDescriptor` (56-259)

tests/gpu/torch/nas/plugins/test_megatron_mamba_dynamic_modules.py (3)
- modelopt/torch/prune/plugins/mcore_minitron.py (1): `_convert_model_to_dynamic_space` (235-248)
- modelopt/torch/nas/search_space.py (1): `sort_parameters` (131-170)
- modelopt/torch/opt/utils.py (1): `named_hparams` (62-66)

tests/gpu/torch/nas/plugins/test_megatron_gpt_dynamic_modules.py (2)
- modelopt/torch/prune/plugins/mcore_minitron.py (1): `_convert_model_to_dynamic_space` (235-248)
- modelopt/torch/nas/search_space.py (2): `sort_parameters` (131-170), `export` (172-174)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (10)
modelopt/torch/prune/plugins/mcore_minitron.py (5)
30-30: LGTM! The import additions support the new dynamic-space conversion workflow and properly expose the updated public API.
Also applies to: 34-40, 45-54, 73-73
235-248: LGTM! The function correctly implements the simplified dynamic-space conversion workflow by directly converting top-level language models (GPT/Mamba) without recursive tracing. The validation check ensures only models with configurable hyperparameters proceed.
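The validation check praised here follows a fail-fast pattern. The sketch below is a hypothetical, stand-alone illustration of that guard — `validate_configurable` and this `ApplyModeError` are illustrative names, not the actual modelopt implementation:

```python
class ApplyModeError(RuntimeError):
    """Stand-in for modelopt's error raised when a mode cannot be applied."""


def validate_configurable(named_hparams):
    """Fail fast if conversion produced no searchable hyperparameters."""
    hparams = dict(named_hparams)
    if not hparams:
        raise ApplyModeError(
            "model has no configurable hyperparameters after conversion"
        )
    return hparams


# A model with at least one configurable hyperparameter passes validation.
valid = validate_configurable([("hidden_size", 768)])

# An empty hyperparameter set is rejected before search begins.
try:
    validate_configurable([])
    raised = False
except ApplyModeError:
    raised = True
```

Raising at conversion time, rather than failing later during search, keeps the error close to its cause.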
251-264: LGTM! The convert entrypoint correctly creates the dynamic space and captures the current subnet configuration in metadata for potential restoration later.
267-271: LGTM! The restore entrypoint re-applies the mode conversion, which is appropriate when the conversion is idempotent and model state is restored separately via checkpoint loading mechanisms.
292-315: LGTM! The mode descriptor properties correctly implement the ModeDescriptor interface and wire up the new convert/restore entrypoints. The next_modes and export_mode properties properly define the mode's compatibility and export behavior.
modelopt/torch/trace/plugins/__init__.py (1)
18-21: LGTM! The centralized `import_plugin` utility simplifies plugin loading and aligns with the removal of the Megatron trace plugin, which is no longer needed after switching to direct dynamic-space conversion.
modelopt/torch/nas/plugins/megatron.py (1)
16-16: LGTM! The API consolidation to `drop_mcore_language_model_layers` and removal of the deprecated GPT-specific variant simplifies the public interface while supporting both GPT and Mamba models.
Also applies to: 100-100
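A centralized plugin-import utility like the one mentioned for `trace/plugins/__init__.py` is commonly a small context manager that tolerates missing optional dependencies. The version below is a simplified sketch, not the exact modelopt helper or its signature:

```python
import contextlib
import importlib


@contextlib.contextmanager
def import_plugin(module_name):
    """Yield the imported module, or None if its dependency is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Optional plugin dependency not installed: degrade gracefully.
        module = None
    yield module


# A module that exists loads normally...
with import_plugin("json") as mod:
    loaded = mod is not None

# ...while a missing optional dependency does not raise.
with import_plugin("definitely_not_installed_xyz") as mod:
    missing_handled = mod is None
```

Centralizing this pattern means each plugin package's `__init__.py` can register optional backends in one line instead of repeating try/except blocks.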
tests/gpu/torch/nas/plugins/test_megatron_gpt_dynamic_modules.py (2)
51-53: LGTM! The added imports support the new dynamic-space conversion workflow.
182-182: LGTM! The test correctly adopts the new dynamic-space conversion workflow:
- Uses `_convert_model_to_dynamic_space` for conversion
- Calls the `mtn.utils.sort_parameters` utility function
- Uses `dynamic_space` methods for querying hparams and exporting

Also applies to: 192-193, 196-197, 203-203
tests/gpu/torch/nas/plugins/test_megatron_mamba_dynamic_modules.py (1)
48-48: LGTM! The test adopts the new dynamic-space conversion workflow with `_convert_model_to_dynamic_space` and uses `dynamic_space.named_hparams` to query hyperparameters.
Also applies to: 166-166, 180-180
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```
@@           Coverage Diff            @@
##             main     #405    +/-   ##
========================================
- Coverage   73.79%   73.79%   -0.01%
========================================
  Files         171      171
  Lines       17591    17591
========================================
- Hits        12982    12981       -1
- Misses       4609     4610       +1
```

☔ View full report in Codecov by Sentry.
looks great!!
What does this PR do?
Type of change: Implementation improvement
Overview: Removes the unnecessary complexity of the FastNAS/AutoNAS automatic recursive tracer and instead converts the top-level GPT/Mamba model directly to a DynamicModule. There is no change in user-facing features; only the internal implementation is improved.
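The simplification described above — one registry lookup on the top-level model instead of a recursive trace over every submodule — can be illustrated with a toy registry. Everything here (`DMRegistry`, `ToyGPT`, the hyperparameter values) is a hypothetical stand-in for the real modelopt machinery:

```python
class DMRegistry:
    """Toy registry mapping model classes to their dynamic wrappers."""

    _wrappers = {}

    @classmethod
    def register(cls, model_cls):
        def decorator(wrapper_cls):
            cls._wrappers[model_cls] = wrapper_cls
            return wrapper_cls
        return decorator

    @classmethod
    def convert(cls, model):
        # A single lookup keyed on the top-level model class -- no
        # recursive tracing over submodules is needed.
        return cls._wrappers[type(model)](model)


class ToyGPT:
    """Stand-in for a top-level Megatron-core GPT model."""


@DMRegistry.register(ToyGPT)
class DynamicToyGPT:
    """Stand-in dynamic wrapper exposing searchable hyperparameters."""

    def __init__(self, model):
        self.model = model
        self.hparams = {"num_layers": 12, "hidden_size": 768}


dynamic = DMRegistry.convert(ToyGPT())
```

Because the supported top-level classes (GPT/Mamba) are known in advance, registering them directly removes the need for a generic tracer while keeping the same user-facing behavior.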
Testing
Before your PR is "Ready for review"
Summary by CodeRabbit
New Features
Refactor
Tests