
Commit b6f831f

[Minor] Pruning doc update + bring minitron import to mtp.* instead of mtp.plugins.*
Signed-off-by: Keval Morabia <[email protected]>
1 parent: 6dffcd0

3 files changed: +21, -2 lines

examples/megatron-lm/README.md

Lines changed: 9 additions & 1 deletion
@@ -110,6 +110,8 @@ Coming soon ...
 
 ### ⭐ Pruning
 
+Check out the pruning [getting started section](../pruning/README.md#getting-started) and [guidelines](../pruning/README.md#pruning-guidelines) in the pruning README for configuring pruning parameters.
+
 Pruning is supported for GPT and Mamba models in Pipeline Parallel mode. Available pruning options are:
 
 - `TARGET_FFN_HIDDEN_SIZE`
@@ -121,14 +123,20 @@ Pruning is supported for GPT and Mamba models in Pipeline Parallel mode. Availab
 - `TARGET_NUM_LAYERS`
 - `LAYERS_TO_DROP` (comma separated, 1-indexed list of layer numbers to directly drop)
 
+Example for depth pruning Qwen3-8B from 36 to 24 layers:
+
 ```sh
 PP=1 \
 TARGET_NUM_LAYERS=24 \
 HF_MODEL_CKPT=<pretrained_model_name_or_path> \
-MLM_MODEL_SAVE=/tmp/Qwen3-8B-DPruned \
+MLM_MODEL_SAVE=Qwen3-8B-Pruned \
 bash megatron-lm/examples/post_training/modelopt/prune.sh qwen/Qwen3-8B
 ```
 
+> [!TIP]
+> If the number of layers in the model is not divisible by the pipeline parallel size (PP), you can configure uneven
+> PP by setting `MLM_EXTRA_ARGS="--decoder-first-pipeline-num-layers <X> --decoder-last-pipeline-num-layers <Y>"`.
+
 ## Learn More About Configuration
 
 For simplicity, we use `shell` scripts and variables as arguments. Each script has at least 1 positional

modelopt/torch/prune/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -21,6 +21,10 @@
 
 # nas is a required - so let's check if it's available
 import modelopt.torch.nas
+from modelopt.torch.utils import import_plugin
 
 from . import fastnas, gradnas, plugins
 from .pruning import *
+
+with import_plugin("mcore_minitron", verbose=False):
+    from .plugins import mcore_minitron
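
With this change, the Minitron pruning plugin module is bound directly on `modelopt.torch.prune` (guarded by `import_plugin`, so it only loads when Megatron-Core is available). A minimal sketch of the resulting import paths, assuming the Megatron-Core dependency is installed so the guarded import succeeds; the previous `mtp.plugins.*` path referenced in the commit message is shown for comparison:

```python
# Minimal sketch of the access pattern enabled by this change; assumes the
# Megatron-Core dependency is installed so the guarded plugin import succeeds.
import modelopt.torch.prune as mtp

# New: the plugin module is reachable directly on the mtp.* namespace.
new_style = mtp.mcore_minitron.MCoreMinitronSearcher

# Previous access path referenced in the commit message (mtp.plugins.*).
old_style = mtp.plugins.mcore_minitron.MCoreMinitronSearcher

assert new_style is old_style  # both names refer to the same class
```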

modelopt/torch/prune/plugins/mcore_minitron.py

Lines changed: 8 additions & 1 deletion
@@ -37,6 +37,7 @@
     HAS_MAMBA,
     _DynamicMCoreLanguageModel,
     SUPPORTED_MODELS,
+    drop_mcore_language_model_layers,
 )
 # isort: on
 
@@ -70,7 +71,13 @@
     "num_layers",
 }
 
-__all__ = ["MCoreMinitronConfig", "MCoreMinitronModeDescriptor", "MCoreMinitronSearcher"]
+__all__ = [
+    "SUPPORTED_HPARAMS",
+    "MCoreMinitronConfig",
+    "MCoreMinitronModeDescriptor",
+    "MCoreMinitronSearcher",
+    "drop_mcore_language_model_layers",
+]
 
 
 class MCoreMinitronSearcher(BaseSearcher):
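
Since `drop_mcore_language_model_layers` is now exported in `__all__` (and the module is reachable as `mtp.mcore_minitron` per the `__init__.py` change above), it can be called directly for manual depth pruning. A hedged sketch follows, assuming the function takes the model plus a 1-indexed list of layer numbers mirroring the `LAYERS_TO_DROP` option documented in the README; the exact signature is not confirmed by this diff, so check the function's docstring.

```python
# Hedged usage sketch: the (model, layers_to_drop) signature is an assumption
# based on the LAYERS_TO_DROP description in the README, not a confirmed API.
from modelopt.torch.prune.plugins.mcore_minitron import drop_mcore_language_model_layers


def drop_last_layers_example(model):
    """Directly drop the last four layers of a 36-layer model (1-indexed layer numbers)."""
    drop_mcore_language_model_layers(model, layers_to_drop=[33, 34, 35, 36])
    return model
```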
