README.md: 3 additions & 1 deletion
@@ -45,6 +45,8 @@ Additionally, we are expanding capabilities for other modalities. Currently, we
SWIFT has rich documentation for users; please check [here](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM).
## 🎉 News
+ - 2024.05.22: Supports the DeepSeek-V2-Lite series models; the model_type values are `deepseek-v2-lite` and `deepseek-v2-lite-chat`.
+ - 2024.05.22: Supports the TeleChat-12B-v2 model and its quantized version; the model_type values are `telechat-12b-v2` and `telechat-12b-v2-gptq-int4`.
- 🔥2024.05.21: Inference and fine-tuning support for MiniCPM-Llama3-V-2_5 is now available. For more details, please refer to [minicpm-v-2.5 Best Practice](docs/source/Multi-Modal/minicpm-v-2.5最佳实践.md).
- 🔥2024.05.20: Support for inference and fine-tuning of cogvlm2-llama3-chinese-chat-19B and cogvlm2-llama3-chat-19B. You can refer to [cogvlm2 Best Practice](docs/source_en/Multi-Modal/cogvlm2-best-practice.md).
- 🔥2024.05.17: Support for peft 0.11.0, along with 3 new tuners: `BOFT`, `Vera`, and `Pissa`. Use `--sft_type boft/vera` to enable BOFT or Vera, and use `--init_lora_weights pissa` with `--sft_type lora` to enable Pissa.
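As a minimal sketch of how these newly announced options might fit together on the command line (the dataset names below are illustrative placeholders; check the SWIFT documentation linked above for the exact flags your version supports):

```shell
# Fine-tune the newly added DeepSeek-V2-Lite chat model with LoRA,
# initializing the adapter weights via Pissa as described above.
swift sft \
    --model_type deepseek-v2-lite-chat \
    --sft_type lora \
    --init_lora_weights pissa \
    --dataset alpaca-zh

# Use the BOFT tuner instead of LoRA (Vera works the same way via --sft_type vera).
swift sft \
    --model_type telechat-12b-v2 \
    --sft_type boft \
    --dataset alpaca-en
```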
@@ -482,7 +484,7 @@ The complete list of supported models and datasets can be found at [Supported Mo
| Mistral<br>Mixtral |[Mistral series models](https://github.com/mistralai/mistral-src)| English | 7B-22B | base model<br>instruct model<br>MoE model |
| Yi<br>Yi1.5 |[01AI's YI series models](https://github.com/01-ai)| Chinese<br>English | 6B-34B<br>including quantized | base model<br>chat model<br>long text model |
| InternLM<br>InternLM2<br>InternLM2-Math |[Pujiang AI Lab InternLM series models](https://github.com/InternLM/InternLM)| Chinese<br>English | 1.8B-20B | base model<br>chat model<br>math model |
- | DeepSeek<br>DeepSeek-MoE<br>DeepSeek-Coder<br>DeepSeek-Math |[DeepSeek series models](https://github.com/deepseek-ai)| Chinese<br>English | 1.3B-236B | base model<br>chat model<br>MoE model<br>code model<br>math model |
+ | DeepSeek<br>DeepSeek-MoE<br>DeepSeek-Coder<br>DeepSeek-Math<br>DeepSeek-V2 |[DeepSeek series models](https://github.com/deepseek-ai)| Chinese<br>English | 1.3B-236B | base model<br>chat model<br>MoE model<br>code model<br>math model |
| MAMBA |[MAMBA state-space model](https://github.com/state-spaces/mamba)| English | 130M-2.8B | base model |
| Gemma |[Google Gemma series models](https://github.com/google/gemma_pytorch)| English | 2B-7B | base model<br>instruct model |
| MiniCPM |[OpenBmB MiniCPM series models](https://github.com/OpenBMB/MiniCPM)| Chinese<br>English | 2B-3B | chat model<br>MoE model |