Skip to content

Commit b47f451

Browse files
authored
[model] support qwenlong L1.5 (#7237)
1 parent 76795cb commit b47f451

File tree

3 files changed

+5
-0
lines changed

3 files changed

+5
-0
lines changed

docs/source/Instruction/Supported-models-and-datasets.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -211,6 +211,7 @@
211211
|[Qwen/Qwen3Guard-Gen-8B](https://modelscope.cn/models/Qwen/Qwen3Guard-Gen-8B)|qwen3_guard|qwen3_guard|transformers>=4.51|✘|-|[Qwen/Qwen3Guard-Gen-8B](https://huggingface.co/Qwen/Qwen3Guard-Gen-8B)|
212212
|[Qwen/Qwen3-4B-Thinking-2507](https://modelscope.cn/models/Qwen/Qwen3-4B-Thinking-2507)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)|
213213
|[Qwen/Qwen3-4B-Thinking-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-4B-Thinking-2507-FP8)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Qwen/Qwen3-4B-Thinking-2507-FP8](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507-FP8)|
214+
|[iic/QwenLong-L1.5-30B-A3B](https://modelscope.cn/models/iic/QwenLong-L1.5-30B-A3B)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B](https://huggingface.co/Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B)|
214215
|[Qwen/Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-30B-A3B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507)|
215216
|[Qwen/Qwen3-30B-A3B-Instruct-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-30B-A3B-Instruct-2507-FP8](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8)|
216217
|[Qwen/Qwen3-235B-A22B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507)|

docs/source_en/Instruction/Supported-models-and-datasets.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -212,6 +212,7 @@ The table below introduces the models integrated with ms-swift:
212212
|[Qwen/Qwen3Guard-Gen-8B](https://modelscope.cn/models/Qwen/Qwen3Guard-Gen-8B)|qwen3_guard|qwen3_guard|transformers>=4.51|✘|-|[Qwen/Qwen3Guard-Gen-8B](https://huggingface.co/Qwen/Qwen3Guard-Gen-8B)|
213213
|[Qwen/Qwen3-4B-Thinking-2507](https://modelscope.cn/models/Qwen/Qwen3-4B-Thinking-2507)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)|
214214
|[Qwen/Qwen3-4B-Thinking-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-4B-Thinking-2507-FP8)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Qwen/Qwen3-4B-Thinking-2507-FP8](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507-FP8)|
215+
|[iic/QwenLong-L1.5-30B-A3B](https://modelscope.cn/models/iic/QwenLong-L1.5-30B-A3B)|qwen3_thinking|qwen3_thinking|transformers>=4.51|✔|-|[Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B](https://huggingface.co/Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B)|
215216
|[Qwen/Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-30B-A3B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507)|
216217
|[Qwen/Qwen3-30B-A3B-Instruct-2507-FP8](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-30B-A3B-Instruct-2507-FP8](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8)|
217218
|[Qwen/Qwen3-235B-A22B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507)|qwen3_nothinking|qwen3_nothinking|transformers>=4.51|✔|-|[Qwen/Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507)|

swift/llm/model/model/qwen.py

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -592,6 +592,9 @@ def _get_cast_dtype(self) -> torch.dtype:
592592
Model('Qwen/Qwen3-4B-Thinking-2507', 'Qwen/Qwen3-4B-Thinking-2507'),
593593
Model('Qwen/Qwen3-4B-Thinking-2507-FP8', 'Qwen/Qwen3-4B-Thinking-2507-FP8'),
594594
]),
595+
ModelGroup([
596+
Model('iic/QwenLong-L1.5-30B-A3B', 'Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B'),
597+
]),
595598
],
596599
TemplateType.qwen3_thinking,
597600
get_model_tokenizer_with_flash_attn,

0 commit comments

Comments
 (0)