3 files changed: +5 -0 lines changed

@@ -211,6 +211,7 @@
 |[Qwen/Qwen3Guard-Gen-8B](https://modelscope.cn/models/Qwen/Qwen3Guard-Gen-8B)|qwen3_guard|qwen3_guard|transformers>=4.51|✘|-|[Qwen/Qwen3Guard-Gen-8B](https://huggingface.co/Qwen/Qwen3Guard-Gen-8B)|
 |[Qwen/Qwen3-4B-Thinking-2507](https://modelscope.cn/models/Qwen/Qwen3-4B-Thinking-2507)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)|
 |[Qwen/Qwen3-4B-Thinking-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-4B-Thinking-2507-FP8)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Qwen/Qwen3-4B-Thinking-2507-FP8](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507-FP8)|
+|[iic/QwenLong-L1.5-30B-A3B](https://modelscope.cn/models/iic/QwenLong-L1.5-30B-A3B)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B](https://huggingface.co/Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B)|
 |[Qwen/Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-30B-A3B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507)|
 |[Qwen/Qwen3-30B-A3B-Instruct-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-30B-A3B-Instruct-2507-FP8](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8)|
 |[Qwen/Qwen3-235B-A22B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507)|
@@ -212,6 +212,7 @@ The table below introduces the models integrated with ms-swift:
 |[Qwen/Qwen3Guard-Gen-8B](https://modelscope.cn/models/Qwen/Qwen3Guard-Gen-8B)|qwen3_guard|qwen3_guard|transformers>=4.51|✘|-|[Qwen/Qwen3Guard-Gen-8B](https://huggingface.co/Qwen/Qwen3Guard-Gen-8B)|
 |[Qwen/Qwen3-4B-Thinking-2507](https://modelscope.cn/models/Qwen/Qwen3-4B-Thinking-2507)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)|
 |[Qwen/Qwen3-4B-Thinking-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-4B-Thinking-2507-FP8)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Qwen/Qwen3-4B-Thinking-2507-FP8](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507-FP8)|
+|[iic/QwenLong-L1.5-30B-A3B](https://modelscope.cn/models/iic/QwenLong-L1.5-30B-A3B)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B](https://huggingface.co/Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B)|
 |[Qwen/Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-30B-A3B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507)|
 |[Qwen/Qwen3-30B-A3B-Instruct-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-30B-A3B-Instruct-2507-FP8](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8)|
 |[Qwen/Qwen3-235B-A22B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507)|
@@ -592,6 +592,9 @@ def _get_cast_dtype(self) -> torch.dtype:
                 Model('Qwen/Qwen3-4B-Thinking-2507', 'Qwen/Qwen3-4B-Thinking-2507'),
                 Model('Qwen/Qwen3-4B-Thinking-2507-FP8', 'Qwen/Qwen3-4B-Thinking-2507-FP8'),
             ]),
+            ModelGroup([
+                Model('iic/QwenLong-L1.5-30B-A3B', 'Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B'),
+            ]),
         ],
         TemplateType.qwen3_thinking,
         get_model_tokenizer_with_flash_attn,
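
The registry hunk above pairs each ModelScope ID with its Hugging Face counterpart, matching the two link columns in the documentation tables. A minimal sketch of that pairing follows. It uses stand-in `Model` and `ModelGroup` dataclasses rather than the real ms-swift classes, and the field names `ms_model_id` and `hf_model_id`, the `QWEN3_THINKING_GROUPS` list, and the `to_hf_id` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    # Stand-in for the Model(...) entries in the diff (not the ms-swift class).
    # Assumption: first ID is the ModelScope ID, second is the Hugging Face ID,
    # as suggested by the two link columns in the documentation table.
    ms_model_id: str
    hf_model_id: Optional[str] = None


@dataclass
class ModelGroup:
    # Stand-in for ModelGroup([...]): a list of related checkpoints.
    models: List[Model] = field(default_factory=list)


# Hypothetical mirror of the groups visible in the hunk after the change.
QWEN3_THINKING_GROUPS = [
    ModelGroup([
        Model('Qwen/Qwen3-4B-Thinking-2507', 'Qwen/Qwen3-4B-Thinking-2507'),
        Model('Qwen/Qwen3-4B-Thinking-2507-FP8', 'Qwen/Qwen3-4B-Thinking-2507-FP8'),
    ]),
    ModelGroup([
        Model('iic/QwenLong-L1.5-30B-A3B', 'Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B'),
    ]),
]


def to_hf_id(ms_model_id: str, groups: List[ModelGroup]) -> Optional[str]:
    """Return the Hugging Face ID paired with a ModelScope ID, or None if absent."""
    for group in groups:
        for model in group.models:
            if model.ms_model_id == ms_model_id:
                return model.hf_model_id
    return None


if __name__ == '__main__':
    print(to_hf_id('iic/QwenLong-L1.5-30B-A3B', QWEN3_THINKING_GROUPS))
```

The new entry is the only one shown here whose two IDs differ (`iic/...` on ModelScope, `Tongyi-Zhiwen/...` on Hugging Face), which is why the documentation row links to two different organization names.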