
Conversation


@kevalmorabia97 kevalmorabia97 commented Oct 10, 2025

What does this PR do?

  • Fix Minitron Megatron-LM sharded modelopt state restore when `subnet_config` is not present: we now skip re-conversion to the Minitron search space (which forces TP=1) during restore
  • Move NAS export-mode logic from `autonas.py` to `conversion.py`, since it applies to all NAS algorithms, not just AutoNAS. Also rename the mode from `export` to `export_nas`
  • Update the Megatron-LM pruning example README with pruning guidelines and an uneven PP command
  • Bring the Minitron import one level up: instead of `mtp.plugins.mcore_minitron.*` we can now write `mtp.mcore_minitron.*`, which reads more cleanly (see the sketch below)
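
A minimal sketch of the resulting import paths (assuming Megatron-Core is installed so the conditional plugin import succeeds; nothing below is copied from the PR diff):

```python
import modelopt.torch.prune as mtp

# Before this PR: the plugin had to be reached through the plugins subpackage.
fn_deep = mtp.plugins.mcore_minitron.drop_mcore_language_model_layers

# After this PR: the plugin module is also re-exported one level up.
fn_flat = mtp.mcore_minitron.drop_mcore_language_model_layers

assert fn_deep is fn_flat  # both paths resolve to the same function
```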

Summary by CodeRabbit

  • New Features

    • Improved plugin loading for pruning integrations.
    • NAS export workflow expanded for richer export/restore of subnet configurations and calibration; export route renamed to a NAS-specific "export_nas" path.
  • Documentation

    • Expanded pruning guide with getting-started link, depth-pruning example (e.g., 36→24 layers), output path clarification, and tip for uneven pipeline-parallel sizing.
  • Tests

    • Updated tests and inference checks to exercise the new export_nas flow and pruning behavior.


coderabbitai bot commented Oct 10, 2025

Walkthrough

Replaces the legacy "export" path with a new "export_nas" export workflow, adds NAS export primitives (ExportConfig, export_searchspace, restore_export), rewires mode descriptors, conditionally loads the mcore_minitron plugin, expands plugin exports, updates docs, and adapts tests to the new "export_nas" name and behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Docs: Megatron-LM pruning README**<br>`examples/megatron-lm/README.md` | Adds pruning docs: link to the pruning getting-started guide and guidelines, a depth-pruning example (36→24 layers) with an updated save path, and a TIP for uneven pipeline-parallel sizing via `MLM_EXTRA_ARGS`. Minor ancillary text edits. |
| **Prune package init: conditional plugin import**<br>`modelopt/torch/prune/__init__.py` | Adds `import_plugin` and conditionally imports `plugins.mcore_minitron` inside `with import_plugin("mcore_minitron", verbose=False)`; preserves existing wildcard imports. |
| **Minitron plugin: public API & mode rename**<br>`modelopt/torch/prune/plugins/mcore_minitron.py` | Updates `__all__` to export `SUPPORTED_HPARAMS` and `drop_mcore_language_model_layers` (re-exported from `modelopt.torch.nas.plugins.megatron`), makes `restore_mcore_minitron` a no-op, and renames mode strings to use `export_nas`. |
| **NAS conversion: new export-NAS workflow & APIs**<br>`modelopt/torch/nas/conversion.py` | Introduces `ExportConfig`, `ExportNASModeDescriptor`, `export_searchspace`, `restore_export`, metadata handling, and helpers (`PatchManager`/`SearchSpace`/`get_subnet_config`), and routes `export()` through `"export_nas"` (see the sketch after this table). Updates `__all__`. |
| **AutoNAS: mode rewiring & constants**<br>`modelopt/torch/nas/autonas.py` | Removes the prior `ExportConfig`/`ExportModeDescriptor`/export helpers; updates `AutoNASModeDescriptor` to reference `export_nas`; adds the `MODELOPT_QUEUE_MAXLEN` and `MODELOPT_BN_CALIB_ITERS` constants; updates `__all__`. |
| **NAS utils: constant removal**<br>`modelopt/torch/nas/utils.py` | Removes the top-level constants `MODELOPT_QUEUE_MAXLEN` and `MODELOPT_BN_CALIB_ITERS` (moved to `autonas.py`). |
| **FastNAS / Prune mode rename**<br>`modelopt/torch/prune/fastnas.py` | Replaces `"export"` with `"export_nas"` in `next_modes` and `export_mode`. |
| **Registry doc comment**<br>`modelopt/torch/nas/registry.py` | Updates a doc comment to refer specifically to the NAS registry; no functional change. |
| **Tests: rename export mode & adjust pruning test**<br>`tests/gpu/torch/prune/plugins/test_mcore_gpt_minitron_pruning.py`, `tests/unit/torch/nas/test_nas.py`, `tests/unit/torch/opt/test_chaining.py`, `tests/unit/torch/nas/plugins/test_hf_nas_save_restore.py` | Replaces `"export"` with `"export_nas"` across tests; updates sampling/export-related tests; modifies the pruning test to capture and reload the `state_dict`, run real inference via `run_mcore_inference`, and assert outputs match after the rerun. |
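
A rough sketch of the subnet-config round-trip these helpers enable (helper locations come from the code-graph section further down; exact signatures, and `model` itself, are assumptions for illustration):

```python
from modelopt.torch.nas.utils import get_subnet_config, select

# Capture the currently selected subnet as a plain config object...
subnet_config = get_subnet_config(model)

# ...so a later restore can re-apply the same selection instead of
# re-running conversion/search, e.g. after reloading sharded modelopt state.
select(model, subnet_config)
```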

Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant User
    participant App
    participant ModeMgr as ModeDescriptor
    participant NASConv as nas.conversion
    participant Model

    Note over ModeMgr: Legacy flow (before change)
    User->>App: apply_mode(..., mode="export")
    App->>ModeMgr: select "export"
    ModeMgr->>Model: legacy export/convert
    Model-->>User: exported model

    Note over ModeMgr: New NAS export flow (this change)
    User->>App: apply_mode(..., mode="export_nas")
    App->>ModeMgr: select "export_nas"
    ModeMgr->>NASConv: export_searchspace(model, ExportConfig)
    NASConv->>Model: in-place subnet export + optional BN calibration
    NASConv-->>ModeMgr: metadata (subnet_config, patches)
    ModeMgr-->>User: exported model + metadata
```
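
In code, the high-level entry point is unchanged; a sketch of the new route (assuming `model` was previously converted to a NAS search space via `mtn.convert(...)`):

```python
import modelopt.torch.nas as mtn

# export() now routes through the "export_nas" mode shown above: it exports
# the selected subnet in place (optionally running BN calibration first) and
# records subnet_config metadata so the export can be restored later.
model = mtn.export(model)
```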

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I hopped through modes and names today,
Swapped "export" for a NASy way.
Plugins peek if doors align,
Subnets saved and layers fine.
A rabbit cheers — hop, prune, hooray! 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title concisely captures the two primary changes in this pull request by referencing the NAS export refactoring and the new behavior to skip conversion during Minitron restore. It aligns directly with the PR objectives and uses clear terminology without extraneous detail. This makes it immediately understandable to reviewers scanning the project history. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 82.14%, which is sufficient. The required threshold is 80.00%. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
examples/megatron-lm/README.md (1)

136-139: Consider showing a complete usage example.

The TIP explains uneven PP configuration but doesn't demonstrate how to integrate MLM_EXTRA_ARGS into the pruning command shown above (lines 129-134). Consider adding a concrete example:

````diff
+
+Example with uneven PP:
+```sh
+PP=4 \
+TARGET_NUM_LAYERS=24 \
+HF_MODEL_CKPT=<pretrained_model_name_or_path> \
+MLM_MODEL_SAVE=Qwen3-8B-Pruned \
+MLM_EXTRA_ARGS="--decoder-first-pipeline-num-layers 7 --decoder-last-pipeline-num-layers 5" \
+bash megatron-lm/examples/post_training/modelopt/prune.sh qwen/Qwen3-8B
+```
````
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6dffcd0 and b6f831f.

📒 Files selected for processing (3)
  • examples/megatron-lm/README.md (2 hunks)
  • modelopt/torch/prune/__init__.py (1 hunks)
  • modelopt/torch/prune/plugins/mcore_minitron.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
modelopt/torch/prune/plugins/mcore_minitron.py (1)
modelopt/torch/nas/plugins/megatron.py (1)
  • drop_mcore_language_model_layers (1392-1456)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (6)
examples/megatron-lm/README.md (2)

113-113: LGTM!

Good addition of links to the pruning getting started section and guidelines. This helps users find the detailed documentation they need.


126-134: Verify the path change is intentional.

The example now uses a relative path `Qwen3-8B-Pruned` instead of the absolute path `/tmp/Qwen3-8B-DPruned`. This changes the behavior:

  • Relative path: saves to `$PWD/Qwen3-8B-Pruned`
  • Absolute path: saves to `/tmp/Qwen3-8B-Pruned`

Ensure this change aligns with user expectations and is consistent with other examples in the file (e.g., lines 61, 67, 88, 94 use the `/tmp/` prefix).

modelopt/torch/prune/__init__.py (2)

24-24: LGTM!

Good addition of the import_plugin utility to support conditional plugin loading.


29-30: LGTM!

The conditional plugin loading pattern is appropriate for the mcore_minitron plugin. This allows the plugin to be loaded only when its dependencies are available, preventing import errors when Megatron-Core is not installed. The verbose=False flag suppresses unnecessary logging during import.
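
A paraphrased sketch of the conditional-import pattern being described (the helper's import path and the neighboring wildcard imports are assumptions, not copied from the file):

```python
# modelopt/torch/prune/__init__.py (sketch)
from modelopt.torch.utils import import_plugin  # assumed helper location

from .fastnas import *  # existing wildcard imports are preserved
from .pruning import *

with import_plugin("mcore_minitron", verbose=False):
    # Imported only when Megatron-Core and its dependencies are available,
    # so environments without Megatron don't fail at import time.
    from .plugins import mcore_minitron
```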

modelopt/torch/prune/plugins/mcore_minitron.py (2)

40-40: LGTM!

Good addition of the drop_mcore_language_model_layers import. This helper function is now available for re-export, making it accessible to users at the plugin level.


74-80: LGTM!

The expanded __all__ makes the plugin's public API clearer and aligns with the PR objective to bring minitron imports one level up. Users can now access:

  • SUPPORTED_HPARAMS: For discovering supported pruning hyperparameters
  • Configuration and descriptor classes for the pruning mode
  • drop_mcore_language_model_layers: Helper function for manual layer dropping

This enables imports like from modelopt.torch.prune.plugins.mcore_minitron import drop_mcore_language_model_layers or through the parent package as from modelopt.torch.prune import mcore_minitron (with mcore_minitron.drop_mcore_language_model_layers).


codecov bot commented Oct 10, 2025

Codecov Report

❌ Patch coverage is 96.77419% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.38%. Comparing base (5b02483) to head (7a4394e).
⚠️ Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| `modelopt/torch/nas/conversion.py` | 95.91% | 2 Missing ⚠️ |
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main     #424      +/-   ##
==========================================
+ Coverage   73.36%   73.38%   +0.01%     
==========================================
  Files         180      180              
  Lines       17919    17934      +15     
==========================================
+ Hits        13147    13160      +13     
- Misses       4772     4774       +2     
```

☔ View full report in Codecov by Sentry.

@kevalmorabia97 kevalmorabia97 changed the title [Minor] Pruning doc update + bring minitron import one level up NAS export refactor + skip conversion on minitron restore Oct 11, 2025
@kevalmorabia97 kevalmorabia97 enabled auto-merge (squash) October 11, 2025 08:37
@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/pruning-doc-update-2 branch from 1016d82 to 7a4394e on October 11, 2025 08:50

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1016d82 and 7a4394e.

📒 Files selected for processing (10)
  • modelopt/torch/nas/autonas.py (3 hunks)
  • modelopt/torch/nas/conversion.py (3 hunks)
  • modelopt/torch/nas/registry.py (1 hunks)
  • modelopt/torch/nas/utils.py (0 hunks)
  • modelopt/torch/prune/fastnas.py (1 hunks)
  • modelopt/torch/prune/plugins/mcore_minitron.py (4 hunks)
  • tests/gpu/torch/prune/plugins/test_mcore_gpt_minitron_pruning.py (3 hunks)
  • tests/unit/torch/nas/plugins/test_hf_nas_save_restore.py (1 hunks)
  • tests/unit/torch/nas/test_nas.py (3 hunks)
  • tests/unit/torch/opt/test_chaining.py (3 hunks)
💤 Files with no reviewable changes (1)
  • modelopt/torch/nas/utils.py
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/torch/nas/plugins/test_hf_nas_save_restore.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • modelopt/torch/prune/fastnas.py
  • modelopt/torch/nas/registry.py
  • tests/unit/torch/nas/test_nas.py
🧰 Additional context used
🧬 Code graph analysis (4)
modelopt/torch/nas/autonas.py (5)
modelopt/torch/opt/config.py (1)
  • get_kwargs_for_create_model_with_rules (322-383)
modelopt/torch/nas/search_space.py (1)
  • generate_search_space (199-260)
modelopt/torch/nas/utils.py (3)
  • get_subnet_config (160-170)
  • sample (131-142)
  • select (145-157)
modelopt/torch/prune/fastnas.py (2)
  • sample (141-144)
  • export_mode (349-351)
modelopt/torch/prune/plugins/mcore_minitron.py (1)
  • export_mode (305-307)
modelopt/torch/prune/plugins/mcore_minitron.py (3)
modelopt/torch/nas/plugins/megatron.py (1)
  • drop_mcore_language_model_layers (1392-1456)
modelopt/torch/nas/autonas.py (1)
  • export_mode (680-682)
modelopt/torch/prune/fastnas.py (1)
  • export_mode (349-351)
modelopt/torch/nas/conversion.py (6)
modelopt/torch/opt/config.py (2)
  • ModeloptBaseConfig (59-147)
  • ModeloptField (50-53)
modelopt/torch/opt/conversion.py (2)
  • ApplyModeError (314-315)
  • apply_mode (342-429)
modelopt/torch/opt/mode.py (2)
  • ModeDescriptor (56-259)
  • _ModeRegistryCls (267-344)
modelopt/torch/utils/network.py (2)
  • compare_dict (423-427)
  • unwrap_model (430-454)
modelopt/torch/nas/search_space.py (1)
  • SearchSpace (38-196)
modelopt/torch/nas/utils.py (2)
  • get_subnet_config (160-170)
  • select (145-157)
tests/gpu/torch/prune/plugins/test_mcore_gpt_minitron_pruning.py (1)
tests/_test_utils/torch_dist/plugins/megatron_common.py (1)
  • run_mcore_inference (326-379)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality

@kevalmorabia97 kevalmorabia97 merged commit 9e64f81 into main Oct 11, 2025
27 checks passed
@kevalmorabia97 kevalmorabia97 deleted the kmorabia/pruning-doc-update-2 branch October 11, 2025 10:21