3 files changed: +6 -0 lines changed
 |[swift/Qwen3-235B-A22B-AWQ](https://modelscope.cn/models/swift/Qwen3-235B-A22B-AWQ)|qwen3_moe|qwen3|transformers>=4.51|✘|-|[cognitivecomputations/Qwen3-235B-A22B-AWQ](https://huggingface.co/cognitivecomputations/Qwen3-235B-A22B-AWQ)|
 |[Qwen/Qwen3-235B-A22B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507)|qwen3_moe|qwen3|transformers>=4.51|✔|-|[Qwen/Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507)|
 |[Qwen/Qwen3-235B-A22B-Instruct-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8)|qwen3_moe|qwen3|transformers>=4.51|✘|-|[Qwen/Qwen3-235B-A22B-Instruct-2507-FP8](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8)|
+|[Qwen/Qwen3-235B-A22B-Thinking-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Thinking-2507)|qwen3_moe|qwen3|transformers>=4.51|✔|-|[Qwen/Qwen3-235B-A22B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507)|
+|[Qwen/Qwen3-235B-A22B-Thinking-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Thinking-2507-FP8)|qwen3_moe|qwen3|transformers>=4.51|✘|-|[Qwen/Qwen3-235B-A22B-Thinking-2507-FP8](https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507-FP8)|
 |[swift/Qwen3-235B-A22B-Instruct-2507-AWQ](https://modelscope.cn/models/swift/Qwen3-235B-A22B-Instruct-2507-AWQ)|qwen3_moe|qwen3|transformers>=4.51|✘|-|-|
 |[Qwen/Qwen3-Coder-480B-A35B-Instruct](https://modelscope.cn/models/Qwen/Qwen3-Coder-480B-A35B-Instruct)|qwen3_moe|qwen3|transformers>=4.51|✔|coding|[Qwen/Qwen3-Coder-480B-A35B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct)|
 |[Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8](https://modelscope.cn/models/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8)|qwen3_moe|qwen3|transformers>=4.51|✘|coding|[Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8)|
@@ -214,6 +214,8 @@ The table below introduces the models integrated with ms-swift:
 |[swift/Qwen3-235B-A22B-AWQ](https://modelscope.cn/models/swift/Qwen3-235B-A22B-AWQ)|qwen3_moe|qwen3|transformers>=4.51|✘|-|[cognitivecomputations/Qwen3-235B-A22B-AWQ](https://huggingface.co/cognitivecomputations/Qwen3-235B-A22B-AWQ)|
 |[Qwen/Qwen3-235B-A22B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507)|qwen3_moe|qwen3|transformers>=4.51|✔|-|[Qwen/Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507)|
 |[Qwen/Qwen3-235B-A22B-Instruct-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8)|qwen3_moe|qwen3|transformers>=4.51|✘|-|[Qwen/Qwen3-235B-A22B-Instruct-2507-FP8](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8)|
+|[Qwen/Qwen3-235B-A22B-Thinking-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Thinking-2507)|qwen3_moe|qwen3|transformers>=4.51|✔|-|[Qwen/Qwen3-235B-A22B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507)|
+|[Qwen/Qwen3-235B-A22B-Thinking-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Thinking-2507-FP8)|qwen3_moe|qwen3|transformers>=4.51|✘|-|[Qwen/Qwen3-235B-A22B-Thinking-2507-FP8](https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507-FP8)|
 |[swift/Qwen3-235B-A22B-Instruct-2507-AWQ](https://modelscope.cn/models/swift/Qwen3-235B-A22B-Instruct-2507-AWQ)|qwen3_moe|qwen3|transformers>=4.51|✘|-|-|
 |[Qwen/Qwen3-Coder-480B-A35B-Instruct](https://modelscope.cn/models/Qwen/Qwen3-Coder-480B-A35B-Instruct)|qwen3_moe|qwen3|transformers>=4.51|✔|coding|[Qwen/Qwen3-Coder-480B-A35B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct)|
 |[Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8](https://modelscope.cn/models/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8)|qwen3_moe|qwen3|transformers>=4.51|✘|coding|[Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8)|
@@ -555,6 +555,8 @@ def _get_cast_dtype(self) -> torch.dtype:
             ModelGroup([
                 Model('Qwen/Qwen3-235B-A22B-Instruct-2507', 'Qwen/Qwen3-235B-A22B-Instruct-2507'),
                 Model('Qwen/Qwen3-235B-A22B-Instruct-2507-FP8', 'Qwen/Qwen3-235B-A22B-Instruct-2507-FP8'),
+                Model('Qwen/Qwen3-235B-A22B-Thinking-2507', 'Qwen/Qwen3-235B-A22B-Thinking-2507'),
+                Model('Qwen/Qwen3-235B-A22B-Thinking-2507-FP8', 'Qwen/Qwen3-235B-A22B-Thinking-2507-FP8'),
                 # awq
                 Model('swift/Qwen3-235B-A22B-Instruct-2507-AWQ'),
             ]),
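The registration hunk above can be read alongside the table rows with a small standalone sketch. It does not use the ms-swift API: the `Model` and `ModelGroup` classes below are hypothetical stand-ins, and the reading that the first argument is the ModelScope ID and the optional second argument is the Hugging Face ID is an assumption inferred from the table, where the AWQ row has `-` in the Hugging Face column and its `Model(...)` call takes a single argument.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in, not the ms-swift class.

    Assumption: first argument is the ModelScope ID, second is the
    optional Hugging Face ID.
    """
    ms_model_id: str
    hf_model_id: Optional[str] = None


@dataclass
class ModelGroup:
    """Hypothetical stand-in: a list of related checkpoints."""
    models: List[Model] = field(default_factory=list)


# Mirrors the group in the hunk after the two Thinking-2507 entries are added.
group = ModelGroup([
    Model('Qwen/Qwen3-235B-A22B-Instruct-2507', 'Qwen/Qwen3-235B-A22B-Instruct-2507'),
    Model('Qwen/Qwen3-235B-A22B-Instruct-2507-FP8', 'Qwen/Qwen3-235B-A22B-Instruct-2507-FP8'),
    Model('Qwen/Qwen3-235B-A22B-Thinking-2507', 'Qwen/Qwen3-235B-A22B-Thinking-2507'),
    Model('Qwen/Qwen3-235B-A22B-Thinking-2507-FP8', 'Qwen/Qwen3-235B-A22B-Thinking-2507-FP8'),
    # awq
    Model('swift/Qwen3-235B-A22B-Instruct-2507-AWQ'),
])


def hf_column(model: Model) -> str:
    """Render the Hugging Face cell the way the docs table does ('-' when absent)."""
    return model.hf_model_id or '-'


def missing_hf(model_group: ModelGroup) -> List[str]:
    """List ModelScope IDs that have no Hugging Face counterpart."""
    return [m.ms_model_id for m in model_group.models if m.hf_model_id is None]


if __name__ == '__main__':
    for m in group.models:
        print(f'{m.ms_model_id} -> {hf_column(m)}')
```

Under these assumptions, a check like `missing_hf(group)` gives a quick way to confirm that each registered entry matches its row in both docs tables.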