Skip to content

Commit 4947d5b

Browse files
committed
support cognitivecomputations/DeepSeek-R1-0528-AWQ (#4537)
1 parent 4650715 commit 4947d5b

File tree

3 files changed

+3
-0
lines changed

3 files changed

+3
-0
lines changed

docs/source/Instruction/支持的模型和数据集.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -403,6 +403,7 @@
403403
|[deepseek-ai/DeepSeek-R1-Zero](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Zero)|deepseek_r1|deepseek_r1|transformers>=4.39.3|✘|-|[deepseek-ai/DeepSeek-R1-Zero](https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero)|
404404
|[deepseek-ai/DeepSeek-R1-0528](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-0528)|deepseek_r1|deepseek_r1|transformers>=4.39.3|✘|-|[deepseek-ai/DeepSeek-R1-0528](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528)|
405405
|[cognitivecomputations/DeepSeek-R1-awq](https://modelscope.cn/models/cognitivecomputations/DeepSeek-R1-awq)|deepseek_r1|deepseek_r1|transformers>=4.39.3|✘|-|[cognitivecomputations/DeepSeek-R1-AWQ](https://huggingface.co/cognitivecomputations/DeepSeek-R1-AWQ)|
406+
|[cognitivecomputations/DeepSeek-R1-0528-AWQ](https://modelscope.cn/models/cognitivecomputations/DeepSeek-R1-0528-AWQ)|deepseek_r1|deepseek_r1|transformers>=4.39.3|✘|-|[cognitivecomputations/DeepSeek-R1-0528-AWQ](https://huggingface.co/cognitivecomputations/DeepSeek-R1-0528-AWQ)|
406407
|[deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)|deepseek_r1_distill|deepseek_r1|transformers>=4.37|✔|-|[deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)|
407408
|[deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)|deepseek_r1_distill|deepseek_r1|transformers>=4.37|✔|-|[deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)|
408409
|[deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)|deepseek_r1_distill|deepseek_r1|transformers>=4.37|✔|-|[deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)|

docs/source_en/Instruction/Supported-models-and-datasets.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -403,6 +403,7 @@ The table below introduces the models integrated with ms-swift:
403403
|[deepseek-ai/DeepSeek-R1-Zero](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Zero)|deepseek_r1|deepseek_r1|transformers>=4.39.3|✘|-|[deepseek-ai/DeepSeek-R1-Zero](https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero)|
404404
|[deepseek-ai/DeepSeek-R1-0528](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-0528)|deepseek_r1|deepseek_r1|transformers>=4.39.3|✘|-|[deepseek-ai/DeepSeek-R1-0528](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528)|
405405
|[cognitivecomputations/DeepSeek-R1-awq](https://modelscope.cn/models/cognitivecomputations/DeepSeek-R1-awq)|deepseek_r1|deepseek_r1|transformers>=4.39.3|✘|-|[cognitivecomputations/DeepSeek-R1-AWQ](https://huggingface.co/cognitivecomputations/DeepSeek-R1-AWQ)|
406+
|[cognitivecomputations/DeepSeek-R1-0528-AWQ](https://modelscope.cn/models/cognitivecomputations/DeepSeek-R1-0528-AWQ)|deepseek_r1|deepseek_r1|transformers>=4.39.3|✘|-|[cognitivecomputations/DeepSeek-R1-0528-AWQ](https://huggingface.co/cognitivecomputations/DeepSeek-R1-0528-AWQ)|
406407
|[deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)|deepseek_r1_distill|deepseek_r1|transformers>=4.37|✔|-|[deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)|
407408
|[deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)|deepseek_r1_distill|deepseek_r1|transformers>=4.37|✔|-|[deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)|
408409
|[deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)|deepseek_r1_distill|deepseek_r1|transformers>=4.37|✔|-|[deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)|

swift/llm/model/model/deepseek.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -255,6 +255,7 @@ def get_model_tokenizer_deepseek_vl2(model_dir: str, *args, **kwargs):
255255
]),
256256
ModelGroup([
257257
Model('cognitivecomputations/DeepSeek-R1-awq', 'cognitivecomputations/DeepSeek-R1-AWQ'),
258+
Model('cognitivecomputations/DeepSeek-R1-0528-AWQ', 'cognitivecomputations/DeepSeek-R1-0528-AWQ'),
258259
])
259260
],
260261
TemplateType.deepseek_r1,

0 commit comments

Comments
 (0)