
Commit 1c51628

[model] support ring2 ling2 (#5830)
1 parent 172803a commit 1c51628

10 files changed: +79 additions, −5 deletions

docs/source/Instruction/支持的模型和数据集.md

Lines changed: 3 additions & 0 deletions
@@ -552,6 +552,9 @@
 |[inclusionAI/Ling-plus](https://modelscope.cn/models/inclusionAI/Ling-plus)|ling|ling|-|✘|-|[inclusionAI/Ling-plus](https://huggingface.co/inclusionAI/Ling-plus)|
 |[inclusionAI/Ling-lite-base](https://modelscope.cn/models/inclusionAI/Ling-lite-base)|ling|ling|-|✘|-|[inclusionAI/Ling-lite-base](https://huggingface.co/inclusionAI/Ling-lite-base)|
 |[inclusionAI/Ling-plus-base](https://modelscope.cn/models/inclusionAI/Ling-plus-base)|ling|ling|-|✘|-|[inclusionAI/Ling-plus-base](https://huggingface.co/inclusionAI/Ling-plus-base)|
+|[inclusionAI/Ling-mini-2.0](https://modelscope.cn/models/inclusionAI/Ling-mini-2.0)|ling2|ling2|-|✘|-|[inclusionAI/Ling-mini-2.0](https://huggingface.co/inclusionAI/Ling-mini-2.0)|
+|[inclusionAI/Ling-mini-base-2.0](https://modelscope.cn/models/inclusionAI/Ling-mini-base-2.0)|ling2|ling2|-|✘|-|[inclusionAI/Ling-mini-base-2.0](https://huggingface.co/inclusionAI/Ling-mini-base-2.0)|
+|[inclusionAI/Ring-mini-2.0](https://modelscope.cn/models/inclusionAI/Ring-mini-2.0)|ring2|ring2|-|✘|-|[inclusionAI/Ring-mini-2.0](https://huggingface.co/inclusionAI/Ring-mini-2.0)|
 |[IEITYuan/Yuan2.0-2B-hf](https://modelscope.cn/models/IEITYuan/Yuan2.0-2B-hf)|yuan2|yuan|-|✘|-|[IEITYuan/Yuan2-2B-hf](https://huggingface.co/IEITYuan/Yuan2-2B-hf)|
 |[IEITYuan/Yuan2.0-51B-hf](https://modelscope.cn/models/IEITYuan/Yuan2.0-51B-hf)|yuan2|yuan|-|✘|-|[IEITYuan/Yuan2-51B-hf](https://huggingface.co/IEITYuan/Yuan2-51B-hf)|
 |[IEITYuan/Yuan2.0-102B-hf](https://modelscope.cn/models/IEITYuan/Yuan2.0-102B-hf)|yuan2|yuan|-|✘|-|[IEITYuan/Yuan2-102B-hf](https://huggingface.co/IEITYuan/Yuan2-102B-hf)|

docs/source/Megatron-SWIFT/快速开始.md

Lines changed: 1 addition & 1 deletion
@@ -32,6 +32,7 @@ export MODELSCOPE_CACHE='/xxx/shared'
 
 # Megatron-LM
 # The training module in the dependency library Megatron-LM is git cloned and installed by swift. You can also point the environment variable `MEGATRON_LM_PATH` to the path of an already downloaded repo (for offline environments, use the [core_r0.13.0 branch](https://github.com/NVIDIA/Megatron-LM/tree/core_r0.13.0)).
+git clone --branch core_r0.13.0 https://github.com/NVIDIA/Megatron-LM.git
 export MEGATRON_LM_PATH='/xxx/Megatron-LM'
 ```

@@ -56,7 +57,6 @@ modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu2
 | modelscope | >=1.23 | | |
 | peft | >=0.11,<0.18 | | LoRA |
 | trl | >=0.15,<0.21 | | RLHF |
-| deepspeed | >=0.14 | 0.16.9 | |
 
 
 ## Quick Start Example

docs/source_en/Instruction/Supported-models-and-datasets.md

Lines changed: 3 additions & 0 deletions
@@ -552,6 +552,9 @@ The table below introduces the models integrated with ms-swift:
 |[inclusionAI/Ling-plus](https://modelscope.cn/models/inclusionAI/Ling-plus)|ling|ling|-|✘|-|[inclusionAI/Ling-plus](https://huggingface.co/inclusionAI/Ling-plus)|
 |[inclusionAI/Ling-lite-base](https://modelscope.cn/models/inclusionAI/Ling-lite-base)|ling|ling|-|✘|-|[inclusionAI/Ling-lite-base](https://huggingface.co/inclusionAI/Ling-lite-base)|
 |[inclusionAI/Ling-plus-base](https://modelscope.cn/models/inclusionAI/Ling-plus-base)|ling|ling|-|✘|-|[inclusionAI/Ling-plus-base](https://huggingface.co/inclusionAI/Ling-plus-base)|
+|[inclusionAI/Ling-mini-2.0](https://modelscope.cn/models/inclusionAI/Ling-mini-2.0)|ling2|ling2|-|✘|-|[inclusionAI/Ling-mini-2.0](https://huggingface.co/inclusionAI/Ling-mini-2.0)|
+|[inclusionAI/Ling-mini-base-2.0](https://modelscope.cn/models/inclusionAI/Ling-mini-base-2.0)|ling2|ling2|-|✘|-|[inclusionAI/Ling-mini-base-2.0](https://huggingface.co/inclusionAI/Ling-mini-base-2.0)|
+|[inclusionAI/Ring-mini-2.0](https://modelscope.cn/models/inclusionAI/Ring-mini-2.0)|ring2|ring2|-|✘|-|[inclusionAI/Ring-mini-2.0](https://huggingface.co/inclusionAI/Ring-mini-2.0)|
 |[IEITYuan/Yuan2.0-2B-hf](https://modelscope.cn/models/IEITYuan/Yuan2.0-2B-hf)|yuan2|yuan|-|✘|-|[IEITYuan/Yuan2-2B-hf](https://huggingface.co/IEITYuan/Yuan2-2B-hf)|
 |[IEITYuan/Yuan2.0-51B-hf](https://modelscope.cn/models/IEITYuan/Yuan2.0-51B-hf)|yuan2|yuan|-|✘|-|[IEITYuan/Yuan2-51B-hf](https://huggingface.co/IEITYuan/Yuan2-51B-hf)|
 |[IEITYuan/Yuan2.0-102B-hf](https://modelscope.cn/models/IEITYuan/Yuan2.0-102B-hf)|yuan2|yuan|-|✘|-|[IEITYuan/Yuan2-102B-hf](https://huggingface.co/IEITYuan/Yuan2-102B-hf)|

docs/source_en/Megatron-SWIFT/Quick-start.md

Lines changed: 1 addition & 1 deletion
@@ -32,6 +32,7 @@ export MODELSCOPE_CACHE='/xxx/shared'
 
 # Megatron-LM
 # The training module in the dependent library Megatron-LM will be cloned and installed by swift via `git clone`. Alternatively, you can use the environment variable `MEGATRON_LM_PATH` to point to the path of an already downloaded repository (in offline environments, use the [core_r0.13.0 branch](https://github.com/NVIDIA/Megatron-LM/tree/core_r0.13.0)).
+git clone --branch core_r0.13.0 https://github.com/NVIDIA/Megatron-LM.git
 export MEGATRON_LM_PATH='/xxx/Megatron-LM'
 ```

@@ -57,7 +58,6 @@ Recommended Operating Environment:
 | modelscope | >=1.23 | | |
 | peft | >=0.11,<0.18 | | LoRA |
 | trl | >=0.15,<0.21 | | RLHF |
-| deepspeed | >=0.14 | 0.16.9 | |
 
 
 ## Quick Start Example

swift/llm/model/constant.py

Lines changed: 2 additions & 0 deletions
@@ -113,6 +113,8 @@ class LLMModelType:
     skywork_o1 = 'skywork_o1'
 
     ling = 'ling'
+    ling2 = 'ling2'
+    ring2 = 'ring2'
     yuan2 = 'yuan2'
     orion = 'orion'
     xverse = 'xverse'

swift/llm/model/model/llm.py

Lines changed: 25 additions & 0 deletions
@@ -420,3 +420,28 @@ def forward(self, **kwargs):
         architectures=['LongcatFlashForCausalLM'],
         requires=['transformers>=4.54,<4.56'],
     ))
+
+register_model(
+    ModelMeta(
+        LLMModelType.ling2,
+        [
+            ModelGroup([
+                Model('inclusionAI/Ling-mini-2.0', 'inclusionAI/Ling-mini-2.0'),
+                Model('inclusionAI/Ling-mini-base-2.0', 'inclusionAI/Ling-mini-base-2.0'),
+            ])
+        ],
+        TemplateType.ling2,
+        get_model_tokenizer_with_flash_attn,
+        architectures=['BailingMoeV2ForCausalLM'],
+    ))
+
+register_model(
+    ModelMeta(
+        LLMModelType.ring2,
+        [ModelGroup([
+            Model('inclusionAI/Ring-mini-2.0', 'inclusionAI/Ring-mini-2.0'),
+        ])],
+        TemplateType.ring2,
+        get_model_tokenizer_with_flash_attn,
+        architectures=['BailingMoeV2ForCausalLM'],
+    ))
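
The registrations above map a model-type key to its ModelScope/Hugging Face model IDs, a chat template, and a loader. As a minimal standalone sketch of that registry pattern (the classes here are simplified stand-ins, not swift's actual `ModelMeta`/`register_model`, which carry many more fields):

```python
from dataclasses import dataclass
from typing import Dict, List

# Simplified stand-ins for swift's registration machinery; this only
# illustrates how a model-type key resolves to model IDs and a template.
@dataclass
class ModelMeta:
    model_type: str
    model_ids: List[str]
    template: str

MODEL_REGISTRY: Dict[str, ModelMeta] = {}

def register_model(meta: ModelMeta) -> None:
    # Later lookups (e.g. when loading a model) key off model_type.
    MODEL_REGISTRY[meta.model_type] = meta

register_model(ModelMeta('ling2',
                         ['inclusionAI/Ling-mini-2.0',
                          'inclusionAI/Ling-mini-base-2.0'],
                         'ling2'))
register_model(ModelMeta('ring2', ['inclusionAI/Ring-mini-2.0'], 'ring2'))

print(MODEL_REGISTRY['ring2'].template)
```

Both new model types reuse `get_model_tokenizer_with_flash_attn` and the `BailingMoeV2ForCausalLM` architecture; they differ only in template (`ling2` vs `ring2`).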

swift/llm/template/base.py

Lines changed: 2 additions & 2 deletions
@@ -1130,8 +1130,8 @@ def _swift_encode(self, inputs: StdTemplateInputs):
         query_role, query = query_message['role'], query_message['content']
         response_role, response = response_message['role'], response_message['content']
         # TODO: Optimize the Template mechanism.
-        assert query_role in {'user', 'tool'}, f'query_role: {query_role}'
-        assert response_role in {'assistant'}, f'response_role: {response_role}'
+        assert query_role in {'user', 'tool'}, f'query_role: "{query_role}"'
+        assert response_role in {'assistant'}, f'response_role: "{response_role}"'
         if query_role == 'tool':
             prompt = query
             query = ''

swift/llm/template/constant.py

Lines changed: 2 additions & 0 deletions
@@ -85,6 +85,8 @@ class LLMTemplateType:
     phi4 = 'phi4'
 
     ling = 'ling'
+    ling2 = 'ling2'
+    ring2 = 'ring2'
     yuan = 'yuan'
     xverse = 'xverse'
     bluelm = 'bluelm'

swift/llm/template/template/llm.py

Lines changed: 21 additions & 0 deletions
@@ -357,3 +357,24 @@ class GptOssTemplateMeta(TemplateMeta):
         chat_sep=['</longcat_s>'],
         suffix=['</longcat_s>'],
     ))
+
+register_template(
+    TemplateMeta(
+        LLMTemplateType.ling2,
+        prefix=['<role>SYSTEM</role>detailed thinking off<|role_end|>'],
+        system_prefix=['<role>SYSTEM</role>{{SYSTEM}}\ndetailed thinking off<|role_end|>'],
+        prompt=['<role>HUMAN</role>{{QUERY}}<|role_end|><role>ASSISTANT</role>'],
+        chat_sep=['<|role_end|>'],
+        suffix=['<|role_end|>'],
+    ))
+
+register_template(
+    TemplateMeta(
+        LLMTemplateType.ring2,
+        prefix=[],
+        system_prefix=['<role>SYSTEM</role>{{SYSTEM}}'],
+        prompt=['<role>HUMAN</role>{{QUERY}}<role>ASSISTANT</role>'],
+        chat_sep=[],
+        suffix=['<|endoftext|>'],
+        response_prefix='<think>\n',
+    ))
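
The template fields above can be read as string-concatenation rules: `prefix` opens the conversation, `prompt` wraps each user turn, and `chat_sep` joins completed rounds. A rough sketch of how a ling2 prompt would render, using only the literal pieces from the registration (the real `TemplateMeta` also handles system messages, tools, suffix handling, and more):

```python
# Literal pieces taken from the ling2 template registration.
PREFIX = '<role>SYSTEM</role>detailed thinking off<|role_end|>'
PROMPT = '<role>HUMAN</role>{query}<|role_end|><role>ASSISTANT</role>'
CHAT_SEP = '<|role_end|>'

def render_ling2(turns):
    """turns: list of (query, response); a None response means
    the prompt ends ready for the assistant to generate."""
    out = PREFIX
    for i, (query, response) in enumerate(turns):
        if i > 0:
            out += CHAT_SEP  # separate completed rounds
        out += PROMPT.format(query=query)
        if response is not None:
            out += response
    return out

print(render_ling2([('Hi', None)]))
```

By contrast, the ring2 template has an empty `prefix`, ends turns with `<|endoftext|>`, and sets `response_prefix='<think>\n'`, so generation always begins inside a thinking block.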

tests/test_align/test_template/test_llm.py

Lines changed: 19 additions & 1 deletion
@@ -639,6 +639,22 @@ def test_qwen3_next():
     assert res == res2, f'res: {res}, res2: {res2}'
 
 
+def test_ring2():
+    pt_engine = PtEngine('inclusionAI/Ring-mini-2.0')
+    response = _infer_model(pt_engine)
+    pt_engine.default_template.template_backend = 'jinja'
+    response2 = _infer_model(pt_engine)
+    assert response == response2
+
+
+def test_ling2():
+    pt_engine = PtEngine('inclusionAI/Ling-mini-2.0')
+    response = _infer_model(pt_engine)
+    pt_engine.default_template.template_backend = 'jinja'
+    response2 = _infer_model(pt_engine)
+    assert response == response2
+
+
 if __name__ == '__main__':
     from swift.llm import PtEngine, RequestConfig
     from swift.utils import get_logger, seed_everything
@@ -686,4 +702,6 @@ def test_qwen3_next():
     # test_glm4_5()
     # test_devstral()
     # test_gpt_oss()
-    test_qwen3_next()
+    # test_qwen3_next()
+    test_ring2()
+    test_ling2()