
Commit 3d6400b

add readme cn (#34)
1 parent 585f6a1 commit 3d6400b

File tree

14 files changed, +426 -82 lines changed


.gitignore

Lines changed: 2 additions & 2 deletions
```diff
@@ -124,12 +124,12 @@ replace.sh
 result.png
 result.jpg
 result.mp4
+runs/
+*.out
 
 # Pytorch
 *.pth
 *.pt
 
 # ast template
 ast_index_file.py
-
-runs/
```

README.md

Lines changed: 52 additions & 46 deletions
```diff
@@ -1,10 +1,16 @@
+<h1>SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)</h1>
+
 <p align="center">
 <br>
 <img src="https://modelscope.oss-cn-beijing.aliyuncs.com/modelscope.gif" width="400"/>
 <br>
-<h1>SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)</h1>
 <p>
 
+<p align="center">
+<a href="https://modelscope.cn/home">Modelscope Hub</a>
+<br>
+<a href="README_CN.md">中文</a>&nbsp | &nbspEnglish
+</p>
 
 # Introduction
 
```

````diff
@@ -25,11 +31,41 @@ Key features:
 ## LLM SFT Example
 [code link](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm)
 
-1. supported sft method: lora, qlora, full, ...
-2. supported models: [**qwen-7b**](https://github.com/QwenLM/Qwen-7B), baichuan-7b, baichuan-13b, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-13b, llama2-70b, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b, ...
+1. supported sft methods: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full (full-parameter fine-tuning), ...
+2. supported models: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B), qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL), baichuan-7b, baichuan-13b, baichuan-13b-chat, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b
 3. supported features: quantization, ddp, model parallelism (device map), gradient checkpointing, gradient accumulation steps, push to modelscope hub, custom datasets, ...
-4. supported datasets: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, ...
+4. supported datasets: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, coco-en
+5. supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy_llama, default
+
+# Installation
+
+SWIFT runs in a Python environment. Please make sure your Python version is 3.8 or higher.
+
+Install SWIFT with the `pip` command:
+
+```shell
+pip install ms-swift -U
+```
+
+To install SWIFT from source, run:
+
+```shell
+git clone https://github.com/modelscope/swift.git
+cd swift
+pip install -e .
+```
 
+If you install from source, remember to install the requirements as well:
+```shell
+pip install -r requirements/framework.txt
+```
+
+SWIFT requires torch>=1.13.
+
+We also recommend using SWIFT inside our docker image:
+```shell
+docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
+```
 
 # Getting Started
 
````
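A quick way to verify the installation steps added in this hunk is an import check. A minimal sketch: the import name `swift` matches the examples later in this README, while the package exposing `__version__` is an assumption:

```python
# The PyPI package is "ms-swift", but the import name is "swift",
# as used throughout the README examples.
import swift

# __version__ being exported is an assumption; fall back gracefully.
print(getattr(swift, "__version__", "version attribute not found"))
```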

```diff
@@ -104,26 +140,26 @@ model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
 
 # init lora tuner config
 lora_config = LoRAConfig(
-    r=10,  # the rank of the LoRA module
-    target_modules=['query', 'key', 'value'],  # the modules to be replaced with the end of the module name
-    merge_weights=False  # whether to merge weights
+    r=10,                                      # the rank of the LoRA module
+    target_modules=['query', 'key', 'value'],  # the modules to be replaced, matched by the end of the module name
+    merge_weights=False                        # whether to merge weights
 )
 
 # init adapter tuner config
 adapter_config = AdapterConfig(
-    dim=768,  # the dimension of the hidden states
-    hidden_pos=0,  # the position of the hidden state to passed into the adapter
-    target_modules=r'.*attention.output.dense$',  # the modules to be replaced with regular expression
-    adapter_length=10  # the length of the adapter length
+    dim=768,                                      # the dimension of the hidden states
+    hidden_pos=0,                                 # the position of the hidden state passed into the adapter
+    target_modules=r'.*attention.output.dense$',  # the modules to be replaced, matched by regular expression
+    adapter_length=10                             # the length of the adapter
 )
 
 # init prompt tuner config
 prompt_config = PromptConfig(
-    dim=768,  # the dimension of the hidden states
-    target_modules=r'.*layer\.\d+$',  # the modules to be replaced with regular expression
-    embedding_pos=0,  # the position of the embedding tensor
-    prompt_length=10,  # the length of the prompt tokens
-    attach_front=False  # Whether prompt is attached in front of the embedding
+    dim=768,                          # the dimension of the hidden states
+    target_modules=r'.*layer\.\d+$',  # the modules to be replaced, matched by regular expression
+    embedding_pos=0,                  # the position of the embedding tensor
+    prompt_length=10,                 # the length of the prompt tokens
+    attach_front=False                # whether the prompt is attached in front of the embedding
 )
 
 # create model with swift. In practice, you can use any of these tuners or a combination of them.
```
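The hunk cuts off at the comment above; in the full README the three configs are then combined in a single call, as the matching example in README_CN.md below also shows:

```python
# combine the three tuner configs on one model; the dict keys become the tuner names
model = Swift.prepare_model(model, {"lora_tuner": lora_config, "adapter_tuner": adapter_config, "prompt_tuner": prompt_config})

# inspect what is actually trainable after wrapping
model.get_trainable_parameters()
```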
````diff
@@ -179,36 +215,6 @@ output
 
 The config/weights stored in the output dir are the config and weights of `extra_state_keys`. This differs from Peft, which stores the weights and config of the `default` tuner.
 
-# Installation
-
-SWIFT is running in Python environment. Please make sure your python version is higher than 3.8.
-
-Please install SWIFT by the `pip` command:
-
-```shell
-pip install swift -U
-```
-
-If you want to install SWIFT by source code, please run:
-
-```shell
-git clone https://github.com/modelscope/swift.git
-cd swift
-pip install -e .
-```
-
-If you are using source code, please remember install requirements by:
-```shell
-pip install -r requirements/framework.txt
-```
-
-SWIFT requires torch>=1.13.
-
-We also recommend to use SWIFT in our docker image:
-```shell
-docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
-```
-
 
 # Learn More
 
````
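The context line above describes what lands in the output dir. Here is a minimal save/reload round trip under the README's own API; the toy model, the target-module names, and loading from a local path (rather than the hub id the README demonstrates) are all assumptions:

```python
from torch import nn

from swift import Swift, LoRAConfig

# a toy stand-in for a real base model; module names '0' and '2' are the
# Linear layers, used here as LoRA target-name suffixes (an assumption)
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))

# naming the tuner "default" places its weights under output/default/
model = Swift.prepare_model(model, {'default': LoRAConfig(target_modules=['0', '2'])})
model.save_pretrained('./output')

# reload onto a fresh copy of the base model; a local path here instead
# of a modelscope hub id is an assumption, hedged above
base = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
restored = Swift.from_pretrained(base, './output')
```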

README_CN.md

Lines changed: 221 additions & 0 deletions
````diff
@@ -0,0 +1,221 @@
+<h1>SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)</h1>
+
+<p align="center">
+<br>
+<img src="https://modelscope.oss-cn-beijing.aliyuncs.com/modelscope.gif" width="400"/>
+<br>
+<p>
+
+<p align="center">
+<a href="https://modelscope.cn/home">ModelScope Community</a>
+<br>
+中文&nbsp | &nbsp<a href="README.md">English</a>
+</p>
+
+# Introduction
+SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framework designed to facilitate lightweight model fine-tuning. It integrates implementations of various efficient fine-tuning methods that are parameter-efficient, memory-efficient, and time-efficient. SWIFT integrates seamlessly into the ModelScope ecosystem and can fine-tune a wide range of models, with a primary focus on LLMs and vision models. In addition, SWIFT is fully compatible with [Peft](https://github.com/huggingface/peft), so users can fine-tune ModelScope models through the familiar Peft interface.
+
+Currently supported methods (the list keeps growing):
+
+1. LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/abs/2106.09685)
+2. Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751)
+3. Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
+4. All tuners provided by [Peft](https://github.com/huggingface/peft).
+
+Key features:
+1. By integrating the ModelScope library, models can easily be obtained via a model id.
+2. Tuners provided by SWIFT can be combined, so that multiple tuners can be explored on one model for the best result.
+
+## LLM SFT Example
+[code link](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm)
+
+1. supported sft methods: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full-parameter fine-tuning, ...
+2. supported models: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B), qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL), baichuan-7b, baichuan-13b, baichuan-13b-chat, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b
+3. supported features: quantization, DDP, model parallelism (device_map), gradient checkpointing, gradient accumulation, push to modelscope hub, custom datasets, ...
+4. supported datasets: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, coco-en
+5. supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy_llama, default
+
+# Installation
+
+SWIFT runs in a Python environment. Please make sure your Python version is 3.8 or higher.
+
+Install SWIFT with the pip command:
+
+```shell
+pip install ms-swift -U
+```
+
+To install SWIFT from source, run:
+
+```shell
+git clone https://github.com/modelscope/swift.git
+cd swift
+pip install -e .
+```
+
+If you install from source, remember to install the required dependencies:
+```shell
+pip install -r requirements/framework.txt
+```
+
+SWIFT requires torch>=1.13.
+
+We also recommend using SWIFT inside our docker image:
+```shell
+docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
+```
+
+# Getting Started
+SWIFT supports multiple tuners, including those provided by [Peft](https://github.com/huggingface/peft). To use these tuners, simply call:
+```python
+from swift import Swift
+model = Swift.prepare_model(model, config, extra_state_keys=['...'])
+```
+The code snippet above randomly initializes the tuner. The input model is an instance of torch.nn.Module, config is a subclass instance of SwiftConfig or PeftConfig, and extra_state_keys lists extra module weights (such as a linear head) to be trained and stored in the output dir.
+
+You can combine multiple tuners with:
+```python
+from swift import Swift, LoRAConfig, PromptConfig
+model = Swift.prepare_model(model, {'lora': LoRAConfig(...), 'prompt': PromptConfig(...)})
+```
+
+After fine-tuning, you can call save_pretrained and push_to_hub:
+
+```python
+from swift import push_to_hub
+model.save_pretrained('some-output-folder')
+push_to_hub('my-group/some-repo-id-modelscope', 'some-output-folder', token='some-ms-token')
+```
+Here `my-group/some-repo-id-modelscope` is the model-id on the Hub, and `some-ms-token` is the token used for uploading.
+
+Use the model-id for later inference:
+
+```python
+from swift import Swift
+model = Swift.from_pretrained(model, 'my-group/some-repo-id-modelscope')
+```
+
+Here is a runnable example:
+
+```python
+import os
+import tempfile
+
+# install modelscope via `pip install modelscope`
+from modelscope import Model
+
+from swift import LoRAConfig, SwiftModel, Swift, push_to_hub
+
+tmp_dir = tempfile.TemporaryDirectory().name
+if not os.path.exists(tmp_dir):
+    os.makedirs(tmp_dir)
+
+
+model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
+lora_config = LoRAConfig(target_modules=['q_proj', 'k_proj', 'v_proj'])
+model: SwiftModel = Swift.prepare_model(model, lora_config)
+# do some fine-tuning here
+model.save_pretrained(tmp_dir)
+
+push_to_hub('my-group/swift_llama2', output_dir=tmp_dir)
+model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
+model = SwiftModel.from_pretrained(model, 'my-group/swift_llama2', device_map='auto')
+```
+
+Here is an example that creates a model with the transformers library and fine-tunes it efficiently with SWIFT.
+
+```python
+from swift import Swift, LoRAConfig, AdapterConfig, PromptConfig
+from transformers import AutoModelForImageClassification
+
+# init vit model
+model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
+
+# init lora tuner config
+lora_config = LoRAConfig(
+    r=10,                                      # the rank of the LoRA module
+    target_modules=['query', 'key', 'value'],  # the modules to be replaced, matched by the end of the module name
+    merge_weights=False                        # whether to merge weights
+)
+
+# init adapter tuner config
+adapter_config = AdapterConfig(
+    dim=768,                                      # the dimension of the hidden states
+    hidden_pos=0,                                 # the position of the hidden state passed into the adapter
+    target_modules=r'.*attention.output.dense$',  # the modules to be replaced, matched by regular expression
+    adapter_length=10                             # the length of the adapter
+)
+
+# init prompt tuner config
+prompt_config = PromptConfig(
+    dim=768,                          # the dimension of the hidden states
+    target_modules=r'.*layer\.\d+$',  # the modules to be replaced, matched by regular expression
+    embedding_pos=0,                  # the position of the embedding tensor
+    prompt_length=10,                 # the length of the prompt tokens
+    attach_front=False                # whether the prompt is attached in front of the embedding
+)
+
+# create model with swift. In practice, you can use any one of these tuners or a combination of them.
+model = Swift.prepare_model(model, {"lora_tuner": lora_config, "adapter_tuner": adapter_config, "prompt_tuner": prompt_config})
+
+# get the trainable parameters of the model.
+model.get_trainable_parameters()
+# 'trainable params: 838,776 || all params: 87,406,432 || trainable%: 0.9596273189597764'
+```
+
+You can use the features provided by Peft in SWIFT:
+
+```python
+from swift import LoraConfig, Swift
+from peft import TaskType
+lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
+model_wrapped = Swift.prepare_model(model, lora_config)
+
+# or use from_pretrained to load weights from the modelscope hub
+model_wrapped = Swift.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
+```
+
+Or:
+
+```python
+from swift import LoraConfig, get_peft_model, PeftModel
+from peft import TaskType
+lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
+model_wrapped = get_peft_model(model, lora_config)
+
+# or use from_pretrained to load weights from the modelscope hub
+model_wrapped = PeftModel.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
+```
+
+The saving strategy of Swift tuners differs slightly from that of Peft tuners. You can name a Swift tuner with:
+
+```python
+model = Swift.prepare_model(model, {'default': LoRAConfig(...)})
+model.save_pretrained('./output')
+```
+
+In the output dir, you will get a directory structure like this:
+
+```text
+output
+    |-- default
+        |-- adapter_config.json
+        |-- adapter_model.bin
+    |-- adapter_config.json
+    |-- adapter_model.bin
+```
+
+The config/weights stored in the output dir are the config and weights of extra_state_keys. This differs from Peft, which stores the weights and config of the default tuner.
+
+
+# Learn More
+
+- [ModelScope library](https://github.com/modelscope/modelscope/)
+
+  The ModelScope library is the model library of the ModelScope project, which hosts a large number of popular models.
+
+- [Contribute your own model to ModelScope](https://modelscope.cn/docs/ModelScope%E6%A8%A1%E5%9E%8B%E6%8E%A5%E5%85%A5%E6%B5%81%E7%A8%8B%E6%A6%82%E8%A7%88)
+
+# License
+
+This project is licensed under the [Apache License (Version 2.0)](https://github.com/modelscope/modelscope/blob/master/LICENSE).
````
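As a recap of the two loading paths this README describes, here is a minimal side-by-side sketch; `model` stands for a base torch.nn.Module built as in the examples above, and the hub id is the README's own placeholder:

```python
from swift import Swift, PeftModel

# Swift-style loading: restores Swift tuners plus any extra_state_keys
model_swift = Swift.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')

# Peft-style loading: restores the "default" tuner the way Peft stores it
model_peft = PeftModel.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
```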

examples/pytorch/llm/README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -16,7 +16,7 @@
 
 ## Features
 1. supported sft method: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full(full parameter fine tuning), ...
-2. supported models: [**qwen-7b**](https://github.com/QwenLM/Qwen-7B), qwen-7b-chat, qwen-vl, **qwen-vl-chat**, baichuan-7b, baichuan-13b, baichuan-13b-chat, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b
+2. supported models: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B), qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL), baichuan-7b, baichuan-13b, baichuan-13b-chat, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b
 3. supported feature: quantization, ddp, model parallelism(device map), gradient checkpoint, gradient accumulation steps, push to modelscope hub, custom datasets, ...
 4. supported datasets: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, coco-en
 5. supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy_llama, default
```
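For readers unfamiliar with the template names in item 5, chatml (used by the qwen models) wraps each turn in im_start/im_end markers. A minimal sketch; the system prompt below is an illustrative assumption, not taken from this repo:

```python
# render a single-turn chatml prompt; the marker tokens follow the
# published qwen chat format, the default system text is an assumption
def render_chatml(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(render_chatml("You are a helpful assistant.", "What is SWIFT?"))
```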
