CHANGELOG.rst (1 addition, 1 deletion)
@@ -6,7 +6,7 @@ Model Optimizer Changelog (Linux)

 **New Features**

-- Add MoE pruning support for ``num_moe_experts`` and ``moe_shared_expert_intermediate_size`` in Minitron pruning (``mcore_minitron``).
+- Add MoE (e.g. Qwen3-30B-A3B) pruning support for ``num_moe_experts`` and ``moe_shared_expert_intermediate_size`` parameters in Minitron pruning (``mcore_minitron``).
 Checkout pruning [getting started section](../pruning/README.md#getting-started) and [guidelines](../pruning/README.md#pruning-guidelines) for configuring pruning parameters in the pruning README.

-Pruning is supported for GPT and Mamba models in Pipeline Parallel mode. Available pruning options are:
+Pruning is supported for GPT and Mamba models in Pipeline Parallel mode. Available pruning dimensions are:

 - `TARGET_FFN_HIDDEN_SIZE`
 - `TARGET_HIDDEN_SIZE`
 - `TARGET_NUM_ATTENTION_HEADS`
 - `TARGET_NUM_QUERY_GROUPS`
 - `TARGET_MAMBA_NUM_HEADS`
 - `TARGET_MAMBA_HEAD_DIM`
+- `TARGET_NUM_MOE_EXPERTS`
+- `TARGET_MOE_SHARED_EXPERT_INTERMEDIATE_SIZE`
 - `TARGET_NUM_LAYERS`
 - `LAYERS_TO_DROP` (comma separated, 1-indexed list of layer numbers to directly drop)
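For orientation only, a minimal Python sketch of how the two new MoE dimensions from this PR might be passed to Minitron pruning through an `export_config`. The `mtp.prune` call, its argument names, the placeholder `model`/`forward_loop` objects, and the target values are assumptions based on the pruning README referenced above, not content of this PR; it is also assumed (not confirmed here) that the `TARGET_*` options listed above map onto the corresponding `export_config` keys.

```python
# Hypothetical sketch (not from this PR): pruning an MoE model such as
# Qwen3-30B-A3B with Minitron ("mcore_minitron") pruning, including the two
# new dimensions added here. The mtp.prune signature and the export_config
# key names are assumptions; see the pruning README for authoritative usage.
import modelopt.torch.prune as mtp

export_config = {
    "ffn_hidden_size": 4096,                      # pre-existing dimension (example value)
    "num_moe_experts": 64,                        # new: prune the number of routed experts
    "moe_shared_expert_intermediate_size": 2048,  # new: prune the shared-expert FFN size
}

# `model` is assumed to be an already-loaded Megatron-Core GPT/MoE model and
# `forward_loop` a callable that runs a small calibration dataset through it
# to score prunable components.
pruned_model, _ = mtp.prune(
    model,
    mode="mcore_minitron",
    constraints={"export_config": export_config},
    dummy_input=None,  # assumed unused for Megatron-Core models
    config={"forward_loop": forward_loop},
)
```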