Commit 256414a
feat: Add Megrez-MoE architecture support
Implements complete support for Megrez-MoE (Mixture of Experts) models:
- Add LLM_ARCH_MEGREZ_MOE architecture enum and mappings
- Implement build_mergez_moe_ffn() with sigmoid+bias gating
- Add llm_build_megrez_moe class for full model graph construction
- Support 31-layer architecture (layer 0: dense FFN, layers 1-30: MoE)
- Implement expert sharing pattern with 64 experts, 6 used per token, 4 shared
- Load all model hyperparameters and 372 tensors correctly
- Configure NEOX RoPE type for proper positional encoding
Tested with Megrez2-3x7B-A3B_Q4_K_M.gguf model.
All 39 llama.cpp tests pass successfully.
Output verified to match infinigence/llama.cpp reference implementation.
Note: Use --no-warmup flag to avoid warmup memory allocation issue.1 parent 1eeb523 commit 256414a
File tree
5 files changed
+426
-0
lines changed- src
5 files changed
+426
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
98 | 98 | | |
99 | 99 | | |
100 | 100 | | |
| 101 | + | |
101 | 102 | | |
102 | 103 | | |
103 | 104 | | |
| |||
2185 | 2186 | | |
2186 | 2187 | | |
2187 | 2188 | | |
| 2189 | + | |
| 2190 | + | |
| 2191 | + | |
| 2192 | + | |
| 2193 | + | |
| 2194 | + | |
| 2195 | + | |
| 2196 | + | |
| 2197 | + | |
| 2198 | + | |
| 2199 | + | |
| 2200 | + | |
| 2201 | + | |
| 2202 | + | |
| 2203 | + | |
| 2204 | + | |
| 2205 | + | |
| 2206 | + | |
| 2207 | + | |
| 2208 | + | |
| 2209 | + | |
| 2210 | + | |
| 2211 | + | |
| 2212 | + | |
| 2213 | + | |
2188 | 2214 | | |
2189 | 2215 | | |
2190 | 2216 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
102 | 102 | | |
103 | 103 | | |
104 | 104 | | |
| 105 | + | |
105 | 106 | | |
106 | 107 | | |
107 | 108 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1063 | 1063 | | |
1064 | 1064 | | |
1065 | 1065 | | |
| 1066 | + | |
| 1067 | + | |
| 1068 | + | |
| 1069 | + | |
| 1070 | + | |
| 1071 | + | |
| 1072 | + | |
| 1073 | + | |
| 1074 | + | |
| 1075 | + | |
| 1076 | + | |
| 1077 | + | |
| 1078 | + | |
| 1079 | + | |
| 1080 | + | |
| 1081 | + | |
| 1082 | + | |
| 1083 | + | |
| 1084 | + | |
| 1085 | + | |
| 1086 | + | |
| 1087 | + | |
| 1088 | + | |
| 1089 | + | |
| 1090 | + | |
| 1091 | + | |
| 1092 | + | |
| 1093 | + | |
| 1094 | + | |
| 1095 | + | |
| 1096 | + | |
| 1097 | + | |
| 1098 | + | |
| 1099 | + | |
| 1100 | + | |
| 1101 | + | |
| 1102 | + | |
| 1103 | + | |
| 1104 | + | |
| 1105 | + | |
| 1106 | + | |
| 1107 | + | |
| 1108 | + | |
| 1109 | + | |
| 1110 | + | |
| 1111 | + | |
| 1112 | + | |
| 1113 | + | |
| 1114 | + | |
| 1115 | + | |
| 1116 | + | |
| 1117 | + | |
| 1118 | + | |
| 1119 | + | |
| 1120 | + | |
| 1121 | + | |
| 1122 | + | |
| 1123 | + | |
| 1124 | + | |
| 1125 | + | |
| 1126 | + | |
| 1127 | + | |
| 1128 | + | |
| 1129 | + | |
| 1130 | + | |
| 1131 | + | |
| 1132 | + | |
| 1133 | + | |
| 1134 | + | |
| 1135 | + | |
| 1136 | + | |
| 1137 | + | |
| 1138 | + | |
| 1139 | + | |
| 1140 | + | |
| 1141 | + | |
| 1142 | + | |
| 1143 | + | |
| 1144 | + | |
| 1145 | + | |
| 1146 | + | |
| 1147 | + | |
| 1148 | + | |
| 1149 | + | |
| 1150 | + | |
| 1151 | + | |
| 1152 | + | |
| 1153 | + | |
| 1154 | + | |
| 1155 | + | |
| 1156 | + | |
| 1157 | + | |
| 1158 | + | |
| 1159 | + | |
1066 | 1160 | | |
1067 | 1161 | | |
1068 | 1162 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
667 | 667 | | |
668 | 668 | | |
669 | 669 | | |
| 670 | + | |
| 671 | + | |
| 672 | + | |
| 673 | + | |
| 674 | + | |
| 675 | + | |
| 676 | + | |
| 677 | + | |
| 678 | + | |
| 679 | + | |
| 680 | + | |
| 681 | + | |
| 682 | + | |
670 | 683 | | |
671 | 684 | | |
672 | 685 | | |
| |||
0 commit comments