Commit 815309e
authored
fix(cloud.infer): reduce Qwen3-MoE export OOM risk (#821)
Summary
- Keep `use_onnx_subfunctions` disabled by default in
`QEfficient.cloud.infer`
- Provide explicit opt-in via `--use-onnx-subfunctions` only
- Remove `--no-use-onnx-subfunctions`
- Update infer unit tests for explicit-enable and default-disabled
behavior
- Update quick-start and text-generation docs to reflect explicit opt-in
behavior
Why
- Align infer behavior with reviewer feedback to keep defaults unchanged
and avoid model-specific auto-enable behavior.
Fixes
- Fixes #702
Validation
- `python -m py_compile QEfficient/cloud/infer.py
tests/cloud/test_infer.py`
- `ruff check QEfficient/cloud/infer.py tests/cloud/test_infer.py`
- `pytest -q tests/cloud/test_infer.py -m "not on_qaic"` (2 passed, 5
deselected)
---------
Signed-off-by: jd316 <jd316biswas@gmail.com>1 parent 3d0d663 commit 815309e
File tree
4 files changed
+72
-8
lines changed- QEfficient/cloud
- docs/source
- examples/text_generation
- tests/cloud
4 files changed
+72
-8
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
139 | 139 | | |
140 | 140 | | |
141 | 141 | | |
| 142 | + | |
142 | 143 | | |
143 | 144 | | |
144 | 145 | | |
| |||
205 | 206 | | |
206 | 207 | | |
207 | 208 | | |
| 209 | + | |
| 210 | + | |
208 | 211 | | |
209 | 212 | | |
210 | 213 | | |
| |||
231 | 234 | | |
232 | 235 | | |
233 | 236 | | |
234 | | - | |
235 | | - | |
236 | | - | |
237 | | - | |
238 | | - | |
239 | | - | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
240 | 241 | | |
241 | 242 | | |
242 | 243 | | |
| |||
280 | 281 | | |
281 | 282 | | |
282 | 283 | | |
| 284 | + | |
283 | 285 | | |
284 | 286 | | |
285 | 287 | | |
| |||
382 | 384 | | |
383 | 385 | | |
384 | 386 | | |
| 387 | + | |
| 388 | + | |
| 389 | + | |
| 390 | + | |
| 391 | + | |
| 392 | + | |
| 393 | + | |
| 394 | + | |
385 | 395 | | |
386 | 396 | | |
387 | 397 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
111 | 111 | | |
112 | 112 | | |
113 | 113 | | |
| 114 | + | |
114 | 115 | | |
115 | 116 | | |
116 | 117 | | |
117 | 118 | | |
118 | 119 | | |
119 | 120 | | |
120 | 121 | | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
121 | 127 | | |
122 | 128 | | |
123 | 129 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
115 | 115 | | |
116 | 116 | | |
117 | 117 | | |
| 118 | + | |
118 | 119 | | |
119 | 120 | | |
120 | 121 | | |
| |||
216 | 217 | | |
217 | 218 | | |
218 | 219 | | |
| 220 | + | |
219 | 221 | | |
220 | 222 | | |
221 | 223 | | |
| |||
312 | 314 | | |
313 | 315 | | |
314 | 316 | | |
315 | | - | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
| 8 | + | |
| 9 | + | |
8 | 10 | | |
9 | 11 | | |
10 | 12 | | |
11 | 13 | | |
12 | 14 | | |
13 | 15 | | |
14 | 16 | | |
15 | | - | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
16 | 24 | | |
17 | 25 | | |
18 | 26 | | |
| |||
99 | 107 | | |
100 | 108 | | |
101 | 109 | | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
0 commit comments