@@ -709,6 +709,10 @@ The table below introduces the models integrated with ms-swift:
 |[Qwen/Qwen3-Omni-30B-A3B-Captioner](https://modelscope.cn/models/Qwen/Qwen3-Omni-30B-A3B-Captioner)|qwen3_omni|qwen3_omni|transformers>=4.57.dev0, soundfile, decord, qwen_omni_utils|✔|vision, video, audio|[Qwen/Qwen3-Omni-30B-A3B-Captioner](https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Captioner)|
 |[Qwen/Qwen2-Audio-7B-Instruct](https://modelscope.cn/models/Qwen/Qwen2-Audio-7B-Instruct)|qwen2_audio|qwen2_audio|transformers>=4.45,<4.49, librosa|✘|audio|[Qwen/Qwen2-Audio-7B-Instruct](https://huggingface.co/Qwen/Qwen2-Audio-7B-Instruct)|
 |[Qwen/Qwen2-Audio-7B](https://modelscope.cn/models/Qwen/Qwen2-Audio-7B)|qwen2_audio|qwen2_audio|transformers>=4.45,<4.49, librosa|✘|audio|[Qwen/Qwen2-Audio-7B](https://huggingface.co/Qwen/Qwen2-Audio-7B)|
+|[Qwen/Qwen3-VL-2B-Instruct](https://modelscope.cn/models/Qwen/Qwen3-VL-2B-Instruct)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct)|
+|[Qwen/Qwen3-VL-2B-Thinking](https://modelscope.cn/models/Qwen/Qwen3-VL-2B-Thinking)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-2B-Thinking](https://huggingface.co/Qwen/Qwen3-VL-2B-Thinking)|
+|[Qwen/Qwen3-VL-2B-Instruct-FP8](https://modelscope.cn/models/Qwen/Qwen3-VL-2B-Instruct-FP8)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✘|vision, video|[Qwen/Qwen3-VL-2B-Instruct-FP8](https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct-FP8)|
+|[Qwen/Qwen3-VL-2B-Thinking-FP8](https://modelscope.cn/models/Qwen/Qwen3-VL-2B-Thinking-FP8)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✘|vision, video|[Qwen/Qwen3-VL-2B-Thinking-FP8](https://huggingface.co/Qwen/Qwen3-VL-2B-Thinking-FP8)|
 |[Qwen/Qwen3-VL-4B-Instruct](https://modelscope.cn/models/Qwen/Qwen3-VL-4B-Instruct)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct)|
 |[Qwen/Qwen3-VL-4B-Thinking](https://modelscope.cn/models/Qwen/Qwen3-VL-4B-Thinking)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-4B-Thinking](https://huggingface.co/Qwen/Qwen3-VL-4B-Thinking)|
 |[Qwen/Qwen3-VL-4B-Instruct-FP8](https://modelscope.cn/models/Qwen/Qwen3-VL-4B-Instruct-FP8)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✘|vision, video|[Qwen/Qwen3-VL-4B-Instruct-FP8](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct-FP8)|
@@ -717,6 +721,10 @@ The table below introduces the models integrated with ms-swift:
 |[Qwen/Qwen3-VL-8B-Thinking](https://modelscope.cn/models/Qwen/Qwen3-VL-8B-Thinking)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-8B-Thinking](https://huggingface.co/Qwen/Qwen3-VL-8B-Thinking)|
 |[Qwen/Qwen3-VL-8B-Instruct-FP8](https://modelscope.cn/models/Qwen/Qwen3-VL-8B-Instruct-FP8)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✘|vision, video|[Qwen/Qwen3-VL-8B-Instruct-FP8](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct-FP8)|
 |[Qwen/Qwen3-VL-8B-Thinking-FP8](https://modelscope.cn/models/Qwen/Qwen3-VL-8B-Thinking-FP8)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✘|vision, video|[Qwen/Qwen3-VL-8B-Thinking-FP8](https://huggingface.co/Qwen/Qwen3-VL-8B-Thinking-FP8)|
+|[Qwen/Qwen3-VL-32B-Instruct](https://modelscope.cn/models/Qwen/Qwen3-VL-32B-Instruct)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-32B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct)|
+|[Qwen/Qwen3-VL-32B-Thinking](https://modelscope.cn/models/Qwen/Qwen3-VL-32B-Thinking)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-32B-Thinking](https://huggingface.co/Qwen/Qwen3-VL-32B-Thinking)|
+|[Qwen/Qwen3-VL-32B-Instruct-FP8](https://modelscope.cn/models/Qwen/Qwen3-VL-32B-Instruct-FP8)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✘|vision, video|[Qwen/Qwen3-VL-32B-Instruct-FP8](https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct-FP8)|
+|[Qwen/Qwen3-VL-32B-Thinking-FP8](https://modelscope.cn/models/Qwen/Qwen3-VL-32B-Thinking-FP8)|qwen3_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✘|vision, video|[Qwen/Qwen3-VL-32B-Thinking-FP8](https://huggingface.co/Qwen/Qwen3-VL-32B-Thinking-FP8)|
 |[Qwen/Qwen3-VL-30B-A3B-Instruct](https://modelscope.cn/models/Qwen/Qwen3-VL-30B-A3B-Instruct)|qwen3_moe_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct)|
 |[Qwen/Qwen3-VL-30B-A3B-Thinking](https://modelscope.cn/models/Qwen/Qwen3-VL-30B-A3B-Thinking)|qwen3_moe_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✔|vision, video|[Qwen/Qwen3-VL-30B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Thinking)|
 |[Qwen/Qwen3-VL-30B-A3B-Instruct-FP8](https://modelscope.cn/models/Qwen/Qwen3-VL-30B-A3B-Instruct-FP8)|qwen3_moe_vl|qwen3_vl|transformers>=4.57, qwen_vl_utils>=0.0.14, decord|✘|vision, video|[Qwen/Qwen3-VL-30B-A3B-Instruct-FP8](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct-FP8)|
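
Each added row maps a ModelScope checkpoint to its ms-swift model type, chat template, and extra Python requirements, mirrored by a Hugging Face model ID in the last column. As a quick illustration of one of the new entries, the sketch below runs Qwen/Qwen3-VL-2B-Instruct through ms-swift's Python inference API. The class names (`PtEngine`, `InferRequest`, `RequestConfig`), the `images` field, and the response shape follow the ms-swift 3.x documentation rather than this diff, so treat them as assumptions and verify against your installed version.

```python
# Minimal sketch (not part of this diff): inference with a newly listed checkpoint
# via ms-swift's Python API. Per the table row, this also needs
# transformers>=4.57, qwen_vl_utils>=0.0.14 and decord installed.
from swift.llm import PtEngine, InferRequest, RequestConfig

# Load the model with the default PyTorch/transformers backend; ms-swift resolves
# the qwen3_vl model type and template listed in the table from the model ID.
engine = PtEngine('Qwen/Qwen3-VL-2B-Instruct')

# One multimodal request; the '<image>' placeholder plus an `images` list of
# paths/URLs follows ms-swift's multimodal examples (assumed here) --
# replace 'my_image.jpg' with a real file.
request = InferRequest(
    messages=[{'role': 'user', 'content': '<image>Describe this image in one sentence.'}],
    images=['my_image.jpg'],
)

# Greedy decoding, capped at 128 new tokens.
responses = engine.infer([request], RequestConfig(max_tokens=128, temperature=0.0))
print(responses[0].choices[0].message.content)
```

The same checkpoint should also be reachable from the CLI with `swift infer --model Qwen/Qwen3-VL-2B-Instruct`, which uses the identical qwen3_vl template.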