Commit 8c00879
fix: increase graph nodes for Megrez-MoE warmup
Megrez-MoE creates many intermediate tensors during MoE FFN construction:
- sigmoid, add, reshape (3x), get_rows, sum_rows, div, view_2d, mul_mat operations
- ggml_top_k internally calls ggml_argsort + ggml_view_4d (2 more tensors per layer)
- Each of 30 MoE layers creates ~35 intermediate tensors during graph construction
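The tally above can be sketched as a small C++ helper. This is only an illustration built from the figures in this commit message: the op names and the `3x` multiplier come from the list, while counting every other op once per layer is an assumption, and the names `k_named_ops`, `named_ops_per_layer`, and `estimate_intermediate_tensors` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Ops named in the commit message and how often each appears per MoE layer.
// A count of 1 is an assumption wherever the message gives no multiplier.
static const std::vector<std::pair<std::string, int>> k_named_ops = {
    {"sigmoid", 1}, {"add", 1},  {"reshape", 3}, {"get_rows", 1},
    {"sum_rows", 1}, {"div", 1}, {"view_2d", 1}, {"mul_mat", 1},
    {"argsort", 1}, {"view_4d", 1}, // the two tensors created inside ggml_top_k
};

// Sum of the per-layer counts listed above.
static int named_ops_per_layer() {
    int total = 0;
    for (const auto & op : k_named_ops) {
        total += op.second;
    }
    return total;
}

// Whole-model estimate: number of MoE layers times tensors per layer.
static int estimate_intermediate_tensors(int n_layers, int per_layer) {
    assert(n_layers >= 0 && per_layer >= 0);
    return n_layers * per_layer;
}
```

Under these assumptions, `named_ops_per_layer()` returns 12, so the explicitly listed ops account for only part of the ~35 tensors per layer, and `estimate_intermediate_tensors(30, 35)` returns 1050.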
During warmup, the graph is built 3 times with different batch sizes, requiring
sufficient memory pool space for all intermediate tensors.
Add 4096 nodes of overhead for LLM_ARCH_MEGREZ_MOE to accommodate these intermediate
tensors (30 layers × ~35 tensors/layer ≈ 1050 nodes; 4096 leaves roughly a 4x safety margin).
This fixes the 'not enough space in the context's memory pool' error during warmup,
allowing Megrez-MoE to work without the --no-warmup flag.
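A minimal sketch of what an architecture-specific node budget could look like follows. It is not the code from this commit: the enum `arch`, the function `graph_max_nodes`, and the baseline formula (a floor of 1024 or 8 nodes per model tensor) are assumptions made for illustration. Only the 4096-node overhead is taken from the message above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the architecture enum; only the values needed here.
enum class arch { other, megrez_moe };

// Extra graph nodes reserved for Megrez-MoE, as stated in the commit message.
constexpr uint32_t k_megrez_moe_extra_nodes = 4096;

// Assumed baseline: a fixed floor or a multiple of the model's tensor count,
// whichever is larger. The floor and the factor are illustrative values.
uint32_t graph_max_nodes(arch a, uint32_t n_tensors) {
    assert(n_tensors <= UINT32_MAX / 8u); // guard the multiplication against overflow
    uint32_t max_nodes = std::max<uint32_t>(1024u, 8u * n_tensors);
    if (a == arch::megrez_moe) {
        // Headroom for the intermediate tensors created while building the MoE FFN.
        max_nodes += k_megrez_moe_extra_nodes;
    }
    return max_nodes;
}
```

With these assumed values, a model with 100 tensors gets a budget of 1024 nodes for other architectures and 5120 for Megrez-MoE. Keeping the overhead behind an architecture check is what leaves other models unaffected.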
Tested:
- All 39 tests pass
- Megrez-MoE works with warmup enabled (no crashes)
- Other models (e.g., Gemma-2) are unaffected
- Verified with outputs up to 100 tokens

1 parent 256414a commit 8c00879
1 file changed: 15 additions, 1 deletion (original line 1365 replaced by new lines 1365-1379)