Commit 46cd9d3

Feat 1028 (#122)
1 parent 97d7cd9 commit 46cd9d3

146 files changed (+924, -230 lines changed)

README.md

Lines changed: 43 additions & 3 deletions
@@ -51,9 +51,48 @@ Users can check the [documentation of Swift](docs/source/GetStarted/Introduction
 ## LLM SFT Example
 Press [this link](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm) to view the detail documentation of these examples.

+### Basic Usage
+```bash
+git clone https://github.com/modelscope/swift.git
+cd swift
+pip install .[llm]
+```
+
+```python
+# Experimental environment: A10, 3090, A100, ...
+# 16GB GPU memory
+import os
+os.environ['CUDA_VISIBLE_DEVICES'] = '0'
+
+import torch
+
+from swift.llm import DatasetName, InferArguments, ModelType, SftArguments
+from swift.llm.run import infer_main, sft_main
+
+model_type = ModelType.qwen_7b_chat_int4
+sft_args = SftArguments(
+    model_type=model_type,
+    eval_steps=50,
+    train_dataset_sample=2000,
+    dataset=[DatasetName.leetcode_python_en],
+    output_dir='output',
+    gradient_checkpointing=True)
+best_ckpt_dir = sft_main(sft_args)
+print(f'best_ckpt_dir: {best_ckpt_dir}')
+torch.cuda.empty_cache()
+infer_args = InferArguments(
+    model_type=sft_args.model_type,
+    ckpt_dir=best_ckpt_dir,
+    dataset=sft_args.dataset,
+    stream=True,
+    show_dataset_sample=5)
+infer_main(infer_args)
+```
+
+
 ### Features
 - Supported SFT Methods: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full(full parameter fine-tuning)
-- Supported Features: quantization, DDP, model parallelism, gradient checkpointing, gradient accumulation, pushing to modelscope hub, custom datasets, multimodal and agent SFT, mutli-round chat, ...
+- Supported Features: quantization, DDP, model parallelism, gradient checkpointing, pushing to modelscope hub, custom datasets, multimodal and agent SFT, mutli-round chat, ...
 - Supported Models:
   - 🔥 qwen series: [qwen-7b](https://modelscope.cn/models/qwen/Qwen-7B/summary), [qwen-7b-chat](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary), [qwen-14b](https://modelscope.cn/models/qwen/Qwen-14B/summary), [qwen-14b-chat](https://modelscope.cn/models/qwen/Qwen-14B-Chat/summary), [qwen-7b-chat-int4](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary), [qwen-14b-chat-int4](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary), [qwen-7b-chat-int8](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int8/summary), [qwen-14b-chat-int8](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int8/summary)
   - 🔥 qwen-vl series: [qwen-vl](https://modelscope.cn/models/qwen/Qwen-VL/summary), [qwen-vl-chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary), [qwen-vl-chat-int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)
@@ -65,6 +104,7 @@ Press [this link](https://github.com/modelscope/swift/tree/main/examples/pytorch
   - xverse series: [xverse-7b](https://modelscope.cn/models/xverse/XVERSE-7B/summary), [xverse-7b-chat](https://modelscope.cn/models/xverse/XVERSE-7B-Chat/summary), [xverse-13b](https://modelscope.cn/models/xverse/XVERSE-13B/summary), [xverse-13b-chat](https://modelscope.cn/models/xverse/XVERSE-13B-Chat/summary)
   - mistral series: [mistral-7b](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-v0.1/summary), [mistral-7b-chat](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-Instruct-v0.1/summary)
   - ziya series: [ziya2-13b](https://modelscope.cn/models/Fengshenbang/Ziya2-13B-Base/summary), [ziya2-13b-chat](https://modelscope.cn/models/Fengshenbang/Ziya2-13B-Chat/summary)
+  - skywork series: [skywork-13b](https://modelscope.cn/models/skywork/Skywork-13B-base/summary), [skywork-13b-chat](https://modelscope.cn/models/skywork/Skywork-13B-chat/summary)
   - other: [polylm-13b](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary), [seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)
 - Supported Datasets:
   - NLP:
@@ -81,8 +121,8 @@ Press [this link](https://github.com/modelscope/swift/tree/main/examples/pytorch
   - Multi-Modal: 🔥[coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
   - Custom Dataset
 - Supported Templates:
-  - Text Generation: default-generation, chatglm2-generation
-  - Chat: chatml(qwen), baichuan, chatglm2, chatglm3, llama, openbuddy-llama, default, internlm, xverse
+  - Text Generation: default-generation, chatglm-generation
+  - Chat: chatml(qwen), baichuan, chatglm2, chatglm3, llama, openbuddy-llama, default, internlm, xverse, skywork


 # Installation
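
As a quick consistency check on the README.md diff above, the net line change implied by its three hunk headers should equal additions minus deletions (43 - 3 = 40). The sketch below is illustrative only and is not part of the commit. The helper name `hunk_net_change` is hypothetical, and the code assumes nothing beyond the standard unified-diff header form `@@ -old_start,old_len +new_start,new_len @@`.

```python
import re

# Standard unified-diff hunk header; each length is optional and defaults to 1.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def hunk_net_change(header: str) -> int:
    """Return new_len - old_len for one hunk header (hypothetical helper)."""
    match = HUNK_RE.match(header)
    if match is None:
        raise ValueError(f"not a hunk header: {header!r}")
    old_len = int(match.group(2)) if match.group(2) is not None else 1
    new_len = int(match.group(4)) if match.group(4) is not None else 1
    return new_len - old_len


# The three hunk headers shown for README.md in this commit.
readme_hunks = [
    "@@ -51,9 +51,48 @@",
    "@@ -65,6 +104,7 @@",
    "@@ -81,8 +121,8 @@",
]

net_change = sum(hunk_net_change(h) for h in readme_hunks)
print(net_change)  # 40, matching 43 additions - 3 deletions
```

The per-hunk values are 39, 1, and 0: the first hunk adds the Basic Usage block and swaps one feature line, the second adds the skywork entry, and the third replaces two template lines with two new ones.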

README_CN.md

Lines changed: 43 additions & 3 deletions
@@ -49,9 +49,48 @@ SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible
 ## LLM Fine-Tuning Examples
 See [here](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm) for the LLM fine-tuning usage documentation.

+### Basic Usage
+```bash
+git clone https://github.com/modelscope/swift.git
+cd swift
+pip install .[llm]
+```
+
+```python
+# Experimental environment: A10, 3090, A100, ...
+# 16GB GPU memory
+import os
+os.environ['CUDA_VISIBLE_DEVICES'] = '0'
+
+import torch
+
+from swift.llm import DatasetName, InferArguments, ModelType, SftArguments
+from swift.llm.run import infer_main, sft_main
+
+model_type = ModelType.qwen_7b_chat_int4
+sft_args = SftArguments(
+    model_type=model_type,
+    eval_steps=50,
+    train_dataset_sample=2000,
+    dataset=[DatasetName.leetcode_python_en],
+    output_dir='output',
+    gradient_checkpointing=True)
+best_ckpt_dir = sft_main(sft_args)
+print(f'best_ckpt_dir: {best_ckpt_dir}')
+torch.cuda.empty_cache()
+infer_args = InferArguments(
+    model_type=sft_args.model_type,
+    ckpt_dir=best_ckpt_dir,
+    dataset=sft_args.dataset,
+    stream=True,
+    show_dataset_sample=5)
+infer_main(infer_args)
+```
+
+
 ### Features
 - Supported SFT methods: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full-parameter fine-tuning
-- Supported features: model quantization, DDP, model parallelism, gradient checkpointing, gradient accumulation, pushing to ModelScope Hub, custom datasets, multimodal and Agent SFT, multi-round chat, ...
+- Supported features: model quantization, DDP, model parallelism, gradient checkpointing, pushing to ModelScope Hub, custom datasets, multimodal and Agent SFT, multi-round chat, ...
 - Supported models
   - 🔥 qwen series: [qwen-7b](https://modelscope.cn/models/qwen/Qwen-7B/summary), [qwen-7b-chat](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary), [qwen-14b](https://modelscope.cn/models/qwen/Qwen-14B/summary), [qwen-14b-chat](https://modelscope.cn/models/qwen/Qwen-14B-Chat/summary), [qwen-7b-chat-int4](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary), [qwen-14b-chat-int4](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary), [qwen-7b-chat-int8](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int8/summary), [qwen-14b-chat-int8](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int8/summary)
   - 🔥 qwen-vl series: [qwen-vl](https://modelscope.cn/models/qwen/Qwen-VL/summary), [qwen-vl-chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary), [qwen-vl-chat-int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)
@@ -63,6 +102,7 @@ SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible
   - xverse series: [xverse-7b](https://modelscope.cn/models/xverse/XVERSE-7B/summary), [xverse-7b-chat](https://modelscope.cn/models/xverse/XVERSE-7B-Chat/summary), [xverse-13b](https://modelscope.cn/models/xverse/XVERSE-13B/summary), [xverse-13b-chat](https://modelscope.cn/models/xverse/XVERSE-13B-Chat/summary)
   - mistral series: [mistral-7b](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-v0.1/summary), [mistral-7b-chat](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-Instruct-v0.1/summary)
   - ziya series: [ziya2-13b](https://modelscope.cn/models/Fengshenbang/Ziya2-13B-Base/summary), [ziya2-13b-chat](https://modelscope.cn/models/Fengshenbang/Ziya2-13B-Chat/summary)
+  - skywork series: [skywork-13b](https://modelscope.cn/models/skywork/Skywork-13B-base/summary), [skywork-13b-chat](https://modelscope.cn/models/skywork/Skywork-13B-chat/summary)
   - other: [polylm-13b](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary), [seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)
 - Supported datasets:
   - NLP:
@@ -79,8 +119,8 @@ SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible
   - Multi-modal: 🔥[coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
   - Custom datasets
 - Supported chat templates:
-  - Text generation: default-generation, chatglm2-generation
-  - Chat: chatml(qwen), baichuan, chatglm2, chatglm3, llama, openbuddy-llama, default, internlm, xverse
+  - Text generation: default-generation, chatglm-generation
+  - Chat: chatml(qwen), baichuan, chatglm2, chatglm3, llama, openbuddy-llama, default, internlm, xverse, skywork


 # Installation
