Commit a113bea
[OMNIML-2917] handle lm_head and other un-quantized modules correctly (#504)
## What does this PR do?
**Type of change:**
Bug fix.
**Overview:**
This is change set 2 of the work on OMNIML-2917.
It contains two related changes:
1. When only the `language_model` submodule is quantized, quantization is
now correctly disabled for all other modules, so nothing needs to be
hard-coded.
2. When exporting a quantized model to the HF unified format, the
exclusion of "lm_head" used to be hard-coded. With change set 1, where the
full model is used for export config generation, lm_head is naturally
excluded if it is not quantized. The hard-coded lm_head entry in the
exclusion list is therefore removed.
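The two changes can be illustrated with a minimal, self-contained sketch. It does not use ModelOpt's actual API: `ToyModule`, `disable_outside`, and `derive_exclude_list` are hypothetical names introduced only for illustration, and the module names are made up. The idea shown is that once quantization is disabled for everything outside `language_model`, the export exclusion list can be derived from the model's actual state instead of a hard-coded "lm_head" entry.

```python
from dataclasses import dataclass


@dataclass
class ToyModule:
    """Hypothetical stand-in for a layer; `quantized` mimics an enabled quantizer."""
    name: str
    quantized: bool = True


def disable_outside(modules, prefix):
    """Change 1 (sketch): disable quantization for every module not under `prefix`."""
    for m in modules:
        if not (m.name == prefix or m.name.startswith(prefix + ".")):
            m.quantized = False


def derive_exclude_list(modules):
    """Change 2 (sketch): build the exclusion list from modules that are not quantized."""
    return sorted(m.name for m in modules if not m.quantized)


# Illustrative module names only; a real model has many more.
modules = [
    ToyModule("vision_tower.encoder.proj"),
    ToyModule("language_model.layers.0.mlp"),
    ToyModule("language_model.layers.0.self_attn"),
    ToyModule("lm_head"),
]

disable_outside(modules, "language_model")
exclude = derive_exclude_list(modules)
print(exclude)  # lm_head appears without being hard-coded
```

In this sketch, `lm_head` lands in the exclusion list only because it sits outside `language_model`; if it were quantized, it would not be excluded.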
## Testing
Correctly exported Llama 3.1 70B, Qwen3 VL MoE, Nemotron Super, Llama4
Scout, NVIDIA-Nemotron-Nano-12B-v2-VL-BF16
---------
Signed-off-by: Shengliang Xu <[email protected]>

1 parent 479f729 commit a113bea
File tree
4 files changed, +41 -50 lines changed
- examples/llm_ptq
- modelopt/torch/export
Per-file hunks (file names and changed line content are not preserved):

| File | Hunks | Added | Removed |
|---|---|---|---|
| 1 | `@@ -318,17 +318,23 @@`, `@@ -493,10 +499,10 @@` | 15 | 9 |
| 2 | `@@ -14,6 +14,8 @@`, `@@ -111,8 +113,8 @@`, `@@ -122,36 +124,20 @@` | 14 | 28 |
| 3 | `@@ -17,6 +17,7 @@`, `@@ -1121,16 +1122,19 @@` | 9 | 5 |
| 4 | `@@ -155,11 +155,12 @@`, `@@ -472,7 +473,6 @@`, `@@ -491,7 +491,6 @@`, `@@ -525,10 +524,6 @@` | 3 | 8 |