Skip to content

Commit 480b4eb

Browse files
committed
Merge branch 'main' into release/2.0
2 parents 032c3a8 + 22f0f05 commit 480b4eb

File tree

18 files changed

+423
-32
lines changed

18 files changed

+423
-32
lines changed

README.md

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,8 @@ To facilitate use by users unfamiliar with deep learning, we provide a Gradio we
3939
Additionally, we are expanding capabilities for other modalities. Currently, we support full-parameter training and LoRA training for AnimateDiff.
4040

4141
## 🎉 News
42+
- 2024.04.22: Support for inference, fine-tuning, and deployment of **chinese-llama-alpaca-2** series models. This includes:chinese-llama-2-1.3b, chinese-llama-2-7b, chinese-llama-2-13b, chinese-alpaca-2-1.3b, chinese-alpaca-2-7b and chinese-alpaca-2-13b along with their corresponding 16k and 64k long text versions.
43+
- 2024.04.22: Support for inference and fine-tuning of Llama3 GPTQ-Int4, GPTQ-Int8, and AWQ series models. Support for inference and fine-tuning of chatglm3-6b-128k, Openbuddy-Llama3.
4244
- 2024.04.20: Support for inference, fine-tuning, and deployment of **Atom** series models. This includes: Atom-7B and Atom-7B-Chat. use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/atom_7b_chat/lora/sft.sh) to train.
4345
- 2024.04.19: Support for single-card, DDP, ZeRO2, and ZeRO3 training and inference with NPU, please refer to [NPU Inference and Fine-tuning Best Practices](docs/source_en/LLM/NPU-best-practice.md).
4446
- 2024.04.19: Support for inference, fine-tuning, and deployment of **Llama3** series models. This includes: Llama-3-8B, Llama-3-8B-Instruct, Llama-3-70B, and Llama-3-70B-Instruct. use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/llama3_8b_instruct/lora/sft.sh) to train.
@@ -441,7 +443,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
441443
| Yuan2 | [Langchao Yuan series models](https://github.com/IEIT-Yuan) | Chinese<br>English | 2B-102B | instruct model |
442444
| XVerse | [XVerse series models](https://github.com/xverse-ai) | Chinese<br>English | 7B-65B | base model<br>chat model<br>long text model<br>MoE model |
443445
| LLaMA2 | [LLaMA2 series models](https://github.com/facebookresearch/llama) | English | 7B-70B<br>including quantized versions | base model<br>chat model |
444-
| LLaMA3 | [LLaMA3 series models](https://github.com/meta-llama/llama3) | English | 8B-70B | base model<br>chat model |
446+
| LLaMA3 | [LLaMA3 series models](https://github.com/meta-llama/llama3) | English | 8B-70B<br>including quantized versions | base model<br>chat model |
445447
| Mistral<br>Mixtral | [Mistral series models](https://github.com/mistralai/mistral-src) | English | 7B-22B | base model<br>instruct model<br>MoE model |
446448
| YI | [01AI's YI series models](https://github.com/01-ai) | Chinese<br>English | 6B-34B<br>including quantized | base model<br>chat model<br>long text model |
447449
| InternLM<br>InternLM2<br>InternLM2-Math | [Pujiang AI Lab InternLM series models](https://github.com/InternLM/InternLM) | Chinese<br>English | 1.8B-20B | base model<br>chat model<br>math model |
@@ -468,6 +470,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
468470
| c4ai-command-r | [c4ai](https://cohere.com/command) | Multilingual | 35B-104B | chat model |
469471
| WizardLM2 | [WizardLM2 series models](https://github.com/nlpxucan/WizardLM) | English | 7B-8x22B<br>including quantized versions | chat model<br>MoE model |
470472
| Atom | [Atom](https://github.com/LlamaFamily/Llama-Chinese) | Chinese | 7B| base model<br>chat model|
473+
| Chinese-LLaMA-Alpaca-2 | [Chinese-LLaMA-Alpaca-2](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | Chinese | 1.3B-13B| base model<br>chat model<br>long text model|
471474

472475
#### MLLMs
473476

@@ -564,8 +567,8 @@ make docs
564567
| Document Name |
565568
| ------------------------------------------------------------ |
566569
| [Command Line Arguments](docs/source_en/LLM/Command-line-parameters.md) |
567-
| [Customizing New Models and Datasets](docs/source_en/LLM/Customization.md) |
568570
| [Supported Models and Datasets List](docs/source_en/LLM/Supported-models-datasets.md) |
571+
| [Customizing New Models and Datasets](docs/source_en/LLM/Customization.md) |
569572
| [Runtime Speed and Memory Benchmark](docs/source_en/LLM/Benchmark.md) |
570573

571574

README_CN.md

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -40,6 +40,8 @@ SWIFT支持近**200种LLM和MLLM**(多模态大模型)的训练、推理、
4040
此外,我们也在拓展其他模态的能力,目前我们支持了AnimateDiff的全参数训练和LoRA训练。
4141

4242
## 🎉 新闻
43+
- 2024.04.22: 支持**chinese-llama-alpaca-2**系列模型的推理与微调和部署等. 包括:chinese-llama-2-1.3b, chinese-llama-2-7b, chinese-llama-2-13b, chinese-alpaca-2-1.3b, chinese-alpaca-2-7b和chinese-alpaca-2-13b以及对应的16k和64k长文本模型.
44+
- 2024.04.22: 支持Llama3 GPTQ-Int4, GPTQ-Int8, AWQ系列模型的推理与微调. 支持chatglm3-6b-128k, Openbuddy-llama3的推理与微调.
4345
- 2024.04.20: 支持**Atom**系列模型的推理, 微调和部署等. 包括: Atom-7B and Atom-7B-Chat. 使用[这个脚本](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/atom_7b_chat/lora/sft.sh)来开始训练!
4446
- 2024.04.19: 支持NPU的单卡、DDP、ZeRO2和ZeRO3的训练与推理, 可以查看[NPU推理与微调最佳实践](docs/source/LLM/NPU推理与微调最佳实践.md).
4547
- 2024.04.19: 支持**Llama3**系列模型的推理, 微调和部署等. 包括: Llama-3-8B, Llama-3-8B-Instruct, Llama-3-70B, Llama-3-70B-Instruct. 使用[这个脚本](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/llama3_8b_instruct/lora/sft.sh)开始训练叭!
@@ -438,7 +440,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
438440
| Yuan2 | [浪潮源系列模型](https://github.com/IEIT-Yuan) | 中文<br>英文 | 2B-102B | instruct模型 |
439441
| XVerse | [元象系列模型](https://github.com/xverse-ai) | 中文<br>英文 | 7B-65B | base模型<br>chat模型<br>长文本模型<br>MoE模型 | |
440442
| LLaMA2 | [LLaMA2系列模型](https://github.com/facebookresearch/llama) | 英文 | 7B-70B<br>包含量化版本 | base模型<br>chat模型 |
441-
| LLaMA3 | [LLaMA3系列模型](https://github.com/meta-llama/llama3) | 英文 | 8B-70B | base模型<br>chat模型 |
443+
| LLaMA3 | [LLaMA3系列模型](https://github.com/meta-llama/llama3) | 英文 | 8B-70B<br>包含量化版本 | base模型<br>chat模型 |
442444
| Mistral<br>Mixtral | [Mistral系列模型](https://github.com/mistralai/mistral-src) | 英文 | 7B-8x22B | base模型<br>instruct模型<br>MoE模型 |
443445
| YI | [01AI的YI系列模型](https://github.com/01-ai) | 中文<br>英文 | 6B-34B<br>包含量化版本 | base模型<br>chat模型<br>长文本模型 |
444446
| InternLM<br>InternLM2<br>InternLM2-Math | [浦江实验室书生浦语系列模型](https://github.com/InternLM/InternLM) | 中文<br>英文 | 1.8B-20B | base模型<br>chat模型<br>数学模型 |
@@ -465,6 +467,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
465467
| c4ai-command-r | [c4ai](https://cohere.com/command) | 多语种 | 35B-104B | chat模型 |
466468
| WizardLM2 | [WizardLM2系列模型](https://github.com/nlpxucan/WizardLM) | 多语种 | 7B-8x22B<br>包含量化版本 | chat模型<br>MoE模型 |
467469
| Atom | [Atom](https://github.com/LlamaFamily/Llama-Chinese) | 中文 | 7B| base模型<br>chat模型|
470+
| Chinese-LLaMA-Alpaca-2 | [Chinese-LLaMA-Alpaca-2](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 中文 | 1.3B-13B| base模型<br>chat模型<br>长文本模型|
468471

469472

470473
#### 多模态大模型
@@ -564,8 +567,8 @@ make docs
564567
| 文档名称 |
565568
| ------------------------------------------------------------ |
566569
| [命令行参数](https://github.com/modelscope/swift/blob/main/docs/source/LLM/%E5%91%BD%E4%BB%A4%E8%A1%8C%E5%8F%82%E6%95%B0.md) |
567-
| [自定义新模型和数据集](https://github.com/modelscope/swift/blob/main/docs/source/LLM/%E8%87%AA%E5%AE%9A%E4%B9%89%E4%B8%8E%E6%8B%93%E5%B1%95.md) |
568570
| [支持的模型和数据集列表](https://github.com/modelscope/swift/blob/main/docs/source/LLM/%E6%94%AF%E6%8C%81%E7%9A%84%E6%A8%A1%E5%9E%8B%E5%92%8C%E6%95%B0%E6%8D%AE%E9%9B%86.md) |
571+
| [自定义新模型和数据集](https://github.com/modelscope/swift/blob/main/docs/source/LLM/%E8%87%AA%E5%AE%9A%E4%B9%89%E4%B8%8E%E6%8B%93%E5%B1%95.md) |
569572
| [运行速度与显存Benchmark](https://github.com/modelscope/swift/blob/main/docs/source/LLM/Benchmark.md) |
570573
| [HuggingFace生态兼容](https://github.com/modelscope/swift/blob/main/docs/source/LLM/HuggingFace%E7%94%9F%E6%80%81%E5%85%BC%E5%AE%B9.md) |
571574

docs/source/LLM/Agent微调最佳实践.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -440,7 +440,7 @@ print()
440440
441441
442442
### 微调
443-
`dataset`换为`ms-agent-for-agentfabric``ms-agent-for-agentfabric-default`
443+
`dataset`换为`ms-agent-for-agentfabric-default``ms-agent-for-agentfabric-addition`
444444
```shell
445445
# Experimental environment: 8GPU
446446
nproc_per_node=8

docs/source/LLM/LLM量化文档.md

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,9 @@ GPU设备: A10, 3090, V100, A100均可.
1414
# 设置pip全局镜像 (加速下载)
1515
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
1616
# 安装ms-swift
17-
pip install 'ms-swift[llm]' -U
17+
git clone https://github.com/modelscope/swift.git
18+
cd swift
19+
pip install -e '.[llm]'
1820

1921
# 使用awq量化:
2022
# autoawq和cuda版本有对应关系,请按照`https://github.com/casper-hansen/AutoAWQ`选择版本
@@ -209,6 +211,14 @@ curl http://localhost:8000/v1/chat/completions \
209211
假设你使用lora微调了qwen1half-4b-chat, 模型权重目录为: `output/qwen1half-4b-chat/vx-xxx/checkpoint-xxx`.
210212

211213
```shell
214+
# 推送原始量化模型
215+
CUDA_VISIBLE_DEVICES=0 swift export \
216+
--model_type qwen1half-7b-chat \
217+
--model_id_or_path qwen1half-7b-chat-gptq-int4 \
218+
--push_to_hub true \
219+
--hub_model_id qwen1half-7b-chat-gptq-int4 \
220+
--hub_token '<your-sdk-token>'
221+
212222
# 推送lora增量模型
213223
CUDA_VISIBLE_DEVICES=0 swift export \
214224
--ckpt_dir output/qwen1half-4b-chat/vx-xxx/checkpoint-xxx \

0 commit comments

Comments
 (0)