Commit fb54971
Fix HF -> Torchtitan Expert Conversion Sorting Bug (#1918)
The `expert_num` is a string, so `sorted_expert_ids = sorted(experts.keys())` sorts lexicographically instead of numerically for Deepseek and Qwen3 (for example, "10" sorts before "2").
As a result, converting from Hugging Face currently produces wrongly ordered experts. Roundtripping a state dict with more than 10 experts catches this bug.
Fix: cast to int, as the type signature intended.
1 parent b206439 commit fb54971
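The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the torchtitan code: the helper names `sort_expert_ids_buggy` and `sort_expert_ids_fixed` and the placeholder dictionary values are hypothetical, and only the `sorted(experts.keys())` expression comes from the commit message.

```python
def sort_expert_ids_buggy(experts):
    # String keys compare character by character, so "10" lands before "2".
    return sorted(experts.keys())


def sort_expert_ids_fixed(experts):
    # Casting each key to int restores numeric order (assumed shape of the fix).
    return sorted(int(expert_id) for expert_id in experts)


# Hypothetical stand-in for a mapping of expert number -> weights, 12 experts.
experts = {str(i): f"weights_{i}" for i in range(12)}

buggy_order = sort_expert_ids_buggy(experts)
fixed_order = sort_expert_ids_fixed(experts)

print(buggy_order)
print(fixed_order)
```

With 10 or fewer experts every key is a single digit and both orderings agree, which is why the bug only surfaces with more than 10 experts.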
File tree
2 files changed (+2, -2 lines)
- torchtitan/models
  - deepseek_v3/model
  - qwen3/model
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 171 | 171 | |
| 172 | 172 | |
| 173 | 173 | |
| 174 | | - |
| | 174 | + |
| 175 | 175 | |
| 176 | 176 | |
| 177 | 177 | |

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 131 | 131 | |
| 132 | 132 | |
| 133 | 133 | |
| 134 | | - |
| | 134 | + |
| 135 | 135 | |
| 136 | 136 | |
| 137 | 137 | |