Skip to content
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion docs/source/Instruction/Supported-models-and-datasets.md
Original file line number Diff line number Diff line change
Expand Up @@ -647,6 +647,8 @@
|[PaddlePaddle/ERNIE-4.5-21B-A3B-Thinking](https://modelscope.cn/models/PaddlePaddle/ERNIE-4.5-21B-A3B-Thinking)|ernie_thinking|ernie_thinking|-|✔|-|[baidu/ERNIE-4.5-21B-A3B-Thinking](https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking)|
|[meituan-longcat/LongCat-Flash-Chat](https://modelscope.cn/models/meituan-longcat/LongCat-Flash-Chat)|longchat|longchat|transformers>=4.54,<4.56|&#x2718;|-|[meituan-longcat/LongCat-Flash-Chat](https://huggingface.co/meituan-longcat/LongCat-Flash-Chat)|
|[meituan-longcat/LongCat-Flash-Chat-FP8](https://modelscope.cn/models/meituan-longcat/LongCat-Flash-Chat-FP8)|longchat|longchat|transformers>=4.54,<4.56|&#x2718;|-|[meituan-longcat/LongCat-Flash-Chat-FP8](https://huggingface.co/meituan-longcat/LongCat-Flash-Chat-FP8)|
|[XiaomiMiMo/MiMo-V2-Flash](https://modelscope.cn/models/XiaomiMiMo/MiMo-V2-Flash)|mimo_v2|mimo_v2|-|&#x2718;|-|[XiaomiMiMo/MiMo-V2-Flash](https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash)|
|[XiaomiMiMo/MiMo-V2-Flash-Base](https://modelscope.cn/models/XiaomiMiMo/MiMo-V2-Flash-Base)|mimo_v2|mimo_v2|-|&#x2718;|-|[XiaomiMiMo/MiMo-V2-Flash-Base](https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash-Base)|
|[answerdotai/ModernBERT-base](https://modelscope.cn/models/answerdotai/ModernBERT-base)|modern_bert|dummy|transformers>=4.48|&#x2718;|bert|[answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)|
|[answerdotai/ModernBERT-large](https://modelscope.cn/models/answerdotai/ModernBERT-large)|modern_bert|dummy|transformers>=4.48|&#x2718;|bert|[answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large)|
|[iic/gte-modernbert-base](https://modelscope.cn/models/iic/gte-modernbert-base)|modern_bert_gte|dummy|transformers>=4.48|&#x2718;|bert, embedding|[Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base)|
Expand Down Expand Up @@ -1053,8 +1055,8 @@
|[mistralai/Ministral-3-8B-Reasoning-2512](https://modelscope.cn/models/mistralai/Ministral-3-8B-Reasoning-2512)|mistral_2512_thinking|mistral_2512_thinking|transformers>=5.0.0.dev0, mistral-common>=1.8.6|&#x2718;|vision|[mistralai/Ministral-3-8B-Reasoning-2512](https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512)|
|[mistralai/Ministral-3-14B-Reasoning-2512](https://modelscope.cn/models/mistralai/Ministral-3-14B-Reasoning-2512)|mistral_2512_thinking|mistral_2512_thinking|transformers>=5.0.0.dev0, mistral-common>=1.8.6|&#x2718;|vision|[mistralai/Ministral-3-14B-Reasoning-2512](https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512)|
|[PaddlePaddle/PaddleOCR-VL](https://modelscope.cn/models/PaddlePaddle/PaddleOCR-VL)|paddle_ocr|paddle_ocr|-|&#x2718;|-|[PaddlePaddle/PaddleOCR-VL](https://huggingface.co/PaddlePaddle/PaddleOCR-VL)|
|[Tencent-Hunyuan/HunyuanOCR](https://modelscope.cn/models/Tencent-Hunyuan/HunyuanOCR)|hunyuan_ocr|hunyuan_ocr|transformers>=4.49.0|&#x2718;|-|[tencent/HunyuanOCR](https://huggingface.co/tencent/HunyuanOCR)|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The Tags for HunyuanOCR was changed from vision to -. Since OCR is a vision-related task, it would be more accurate to keep the vision tag for better discoverability and consistency.

Suggested change
|[Tencent-Hunyuan/HunyuanOCR](https://modelscope.cn/models/Tencent-Hunyuan/HunyuanOCR)|hunyuan_ocr|hunyuan_ocr|transformers>=4.49.0|&#x2718;|-|[tencent/HunyuanOCR](https://huggingface.co/tencent/HunyuanOCR)|
|[Tencent-Hunyuan/HunyuanOCR](https://modelscope.cn/models/Tencent-Hunyuan/HunyuanOCR)|hunyuan_ocr|hunyuan_ocr|transformers>=4.49.0|&#x2718;|vision|[tencent/HunyuanOCR](https://huggingface.co/tencent/HunyuanOCR)|

|[JinaAI/jina-reranker-m0](https://modelscope.cn/models/JinaAI/jina-reranker-m0)|jina_reranker_m0|jina_reranker_m0|-|&#x2718;|reranker, vision|[JinaAI/jina-reranker-m0](https://huggingface.co/JinaAI/jina-reranker-m0)|
|[Tencent-Hunyuan/HunyuanOCR](https://modelscope.cn/models/Tencent-Hunyuan/HunyuanOCR)|hunyuan_ocr|hunyuan_ocr|transformers>=4.49.0|&#x2718;|vision|[tencent/HunyuanOCR](https://huggingface.co/tencent/HunyuanOCR)|


## 数据集
Expand Down
4 changes: 3 additions & 1 deletion docs/source_en/Instruction/Supported-models-and-datasets.md
Original file line number Diff line number Diff line change
Expand Up @@ -647,6 +647,8 @@ The table below introduces the models integrated with ms-swift:
|[PaddlePaddle/ERNIE-4.5-21B-A3B-Thinking](https://modelscope.cn/models/PaddlePaddle/ERNIE-4.5-21B-A3B-Thinking)|ernie_thinking|ernie_thinking|-|&#x2714;|-|[baidu/ERNIE-4.5-21B-A3B-Thinking](https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking)|
|[meituan-longcat/LongCat-Flash-Chat](https://modelscope.cn/models/meituan-longcat/LongCat-Flash-Chat)|longchat|longchat|transformers>=4.54,<4.56|&#x2718;|-|[meituan-longcat/LongCat-Flash-Chat](https://huggingface.co/meituan-longcat/LongCat-Flash-Chat)|
|[meituan-longcat/LongCat-Flash-Chat-FP8](https://modelscope.cn/models/meituan-longcat/LongCat-Flash-Chat-FP8)|longchat|longchat|transformers>=4.54,<4.56|&#x2718;|-|[meituan-longcat/LongCat-Flash-Chat-FP8](https://huggingface.co/meituan-longcat/LongCat-Flash-Chat-FP8)|
|[XiaomiMiMo/MiMo-V2-Flash](https://modelscope.cn/models/XiaomiMiMo/MiMo-V2-Flash)|mimo_v2|mimo_v2|-|&#x2718;|-|[XiaomiMiMo/MiMo-V2-Flash](https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash)|
|[XiaomiMiMo/MiMo-V2-Flash-Base](https://modelscope.cn/models/XiaomiMiMo/MiMo-V2-Flash-Base)|mimo_v2|mimo_v2|-|&#x2718;|-|[XiaomiMiMo/MiMo-V2-Flash-Base](https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash-Base)|
|[answerdotai/ModernBERT-base](https://modelscope.cn/models/answerdotai/ModernBERT-base)|modern_bert|dummy|transformers>=4.48|&#x2718;|bert|[answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)|
|[answerdotai/ModernBERT-large](https://modelscope.cn/models/answerdotai/ModernBERT-large)|modern_bert|dummy|transformers>=4.48|&#x2718;|bert|[answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large)|
|[iic/gte-modernbert-base](https://modelscope.cn/models/iic/gte-modernbert-base)|modern_bert_gte|dummy|transformers>=4.48|&#x2718;|bert, embedding|[Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base)|
Expand Down Expand Up @@ -1053,8 +1055,8 @@ The table below introduces the models integrated with ms-swift:
|[mistralai/Ministral-3-8B-Reasoning-2512](https://modelscope.cn/models/mistralai/Ministral-3-8B-Reasoning-2512)|mistral_2512_thinking|mistral_2512_thinking|transformers>=5.0.0.dev0, mistral-common>=1.8.6|&#x2718;|vision|[mistralai/Ministral-3-8B-Reasoning-2512](https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512)|
|[mistralai/Ministral-3-14B-Reasoning-2512](https://modelscope.cn/models/mistralai/Ministral-3-14B-Reasoning-2512)|mistral_2512_thinking|mistral_2512_thinking|transformers>=5.0.0.dev0, mistral-common>=1.8.6|&#x2718;|vision|[mistralai/Ministral-3-14B-Reasoning-2512](https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512)|
|[PaddlePaddle/PaddleOCR-VL](https://modelscope.cn/models/PaddlePaddle/PaddleOCR-VL)|paddle_ocr|paddle_ocr|-|&#x2718;|-|[PaddlePaddle/PaddleOCR-VL](https://huggingface.co/PaddlePaddle/PaddleOCR-VL)|
|[Tencent-Hunyuan/HunyuanOCR](https://modelscope.cn/models/Tencent-Hunyuan/HunyuanOCR)|hunyuan_ocr|hunyuan_ocr|transformers>=4.49.0|&#x2718;|-|[tencent/HunyuanOCR](https://huggingface.co/tencent/HunyuanOCR)|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The Tags for HunyuanOCR was changed from vision to -. Since OCR is a vision-related task, it would be more accurate to keep the vision tag for better discoverability and consistency.

Suggested change
|[Tencent-Hunyuan/HunyuanOCR](https://modelscope.cn/models/Tencent-Hunyuan/HunyuanOCR)|hunyuan_ocr|hunyuan_ocr|transformers>=4.49.0|&#x2718;|-|[tencent/HunyuanOCR](https://huggingface.co/tencent/HunyuanOCR)|
|[Tencent-Hunyuan/HunyuanOCR](https://modelscope.cn/models/Tencent-Hunyuan/HunyuanOCR)|hunyuan_ocr|hunyuan_ocr|transformers>=4.49.0|&#x2718;|vision|[tencent/HunyuanOCR](https://huggingface.co/tencent/HunyuanOCR)|

|[JinaAI/jina-reranker-m0](https://modelscope.cn/models/JinaAI/jina-reranker-m0)|jina_reranker_m0|jina_reranker_m0|-|&#x2718;|reranker, vision|[JinaAI/jina-reranker-m0](https://huggingface.co/JinaAI/jina-reranker-m0)|
|[Tencent-Hunyuan/HunyuanOCR](https://modelscope.cn/models/Tencent-Hunyuan/HunyuanOCR)|hunyuan_ocr|hunyuan_ocr|transformers>=4.49.0|&#x2718;|vision|[tencent/HunyuanOCR](https://huggingface.co/tencent/HunyuanOCR)|


## Datasets
Expand Down
1 change: 1 addition & 0 deletions swift/llm/model/constant.py
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,7 @@ class LLMModelType:
gemma_emb = 'gemma_emb'
ernie_thinking = 'ernie_thinking'
longchat = 'longchat'
mimo_v2 = 'mimo_v2'


class BertModelType:
Expand Down
14 changes: 14 additions & 0 deletions swift/llm/model/model/llm.py
Original file line number Diff line number Diff line change
Expand Up @@ -397,3 +397,17 @@ def get_model_tokenizer_yuan(model_dir: str,
get_model_tokenizer_with_flash_attn,
architectures=['BailingMoeV2ForCausalLM'],
))

register_model(
ModelMeta(
LLMModelType.mimo_v2,
[
ModelGroup([
Model('XiaomiMiMo/MiMo-V2-Flash', 'XiaomiMiMo/MiMo-V2-Flash'),
Model('XiaomiMiMo/MiMo-V2-Flash-Base', 'XiaomiMiMo/MiMo-V2-Flash-Base'),
])
],
TemplateType.mimo_v2,
get_model_tokenizer_with_flash_attn,
architectures=['MiMoV2FlashForCausalLM'],
))
1 change: 1 addition & 0 deletions swift/llm/template/constant.py
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,7 @@ class LLMTemplateType:
ernie = 'ernie'
ernie_thinking = 'ernie_thinking'
longchat = 'longchat'
mimo_v2 = 'mimo_v2'

aya = 'aya'
c4ai = 'c4ai'
Expand Down
7 changes: 7 additions & 0 deletions swift/llm/template/template/llm.py
Original file line number Diff line number Diff line change
Expand Up @@ -411,3 +411,10 @@ class GptOssTemplateMeta(TemplateMeta):
suffix=['<|endoftext|>'],
response_prefix='<think>\n',
))

register_template(
ChatmlTemplateMeta(
LLMTemplateType.mimo_v2,
default_system='You are MiMo, a helpful AI assistant engineered by Xiaomi.',
response_prefix='<think></think>',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The response_prefix is set to <think></think>, which is an empty tag pair. This is inconsistent with other "thinking" templates like ring2 and gpt_oss which use <think>\n to allow the model to generate thoughts. An empty tag might not be the intended behavior. If the model is not expected to generate thoughts, it might be better to use <think>\n</think>\n to represent an empty thinking step explicitly. If it is expected to generate thoughts, <think>\n would be more appropriate.

Suggested change
response_prefix='<think></think>',
response_prefix='<think>\n</think>\n',

))