Commit c445b33

update readme (#175)
1 parent 78445b4 commit c445b33
File tree

17 files changed: +1168 -1746 lines changed

README.md

Lines changed: 33 additions & 169 deletions
@@ -1,4 +1,4 @@
-<h1>SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)</h1>
+# SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning)

<p align="center">
    <br>
@@ -12,8 +12,24 @@
<a href="README_CN.md">中文</a>&nbsp | &nbspEnglish
</p>

-# 📖 Introduction
+<p align="center">
+<img src="https://img.shields.io/badge/python-%E2%89%A53.8-5be.svg">
+<img src="https://img.shields.io/badge/pytorch-%E2%89%A51.12%20%7C%20%E2%89%A52.0-orange.svg">
+<a href="https://github.com/modelscope/modelscope/"><img src="https://img.shields.io/badge/modelscope-%E2%89%A51.9.3-5D91D4.svg"></a>
+<a href="https://github.com/modelscope/swift/"><img src="https://img.shields.io/badge/ms--swift-Build from source-6FEBB9.svg"></a>
+</p>

+## 📖 Table of Contents
+- [Introduction](#-introduction)
+- [News](#-news)
+- [LLM Training and Inference Example](#-llm-training-and-inference-example)
+- [Installation](#-installation)
+- [Getting Started](#-getting-started)
+- [Learn More](#-learn-more)
+- [License](#-license)
+- [Contact Us](#-contact-us)
+
+## 📝 Introduction
SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framework designed to facilitate lightweight model fine-tuning and inference. It integrates implementations of various efficient fine-tuning methods, embracing approaches that are parameter-efficient, memory-efficient, and time-efficient. SWIFT integrates seamlessly into the ModelScope ecosystem and offers the capability to fine-tune various models, with a primary emphasis on LLMs and vision models. Additionally, SWIFT is fully compatible with [PEFT](https://github.com/huggingface/peft), enabling users to leverage the familiar PEFT interface to fine-tune ModelScope models.
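To make the PEFT-compatibility claim concrete, here is a minimal editorial sketch (an addition by the editor, not part of this commit). It loads a supported model with the `get_model_tokenizer` helper that appears in the examples deleted later in this diff, then attaches a LoRA adapter through PEFT's standard `LoraConfig` and `get_peft_model` entry points; the `target_modules` value is an assumption for Qwen-style attention blocks.

```python
# Editorial sketch (not from the README): using the familiar PEFT interface
# on a model loaded through swift.
import torch
from peft import LoraConfig, get_peft_model
from swift.llm import ModelType, get_model_tokenizer

# load a ModelScope model with swift's helper (shown later in this diff)
model, tokenizer = get_model_tokenizer(ModelType.qwen_7b_chat, torch.bfloat16,
                                       {'device_map': 'auto'})
# assumption: 'c_attn' is the attention projection in Qwen-style blocks
lora_config = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.05,
                         target_modules=['c_attn'])
model = get_peft_model(model, lora_config)  # standard PEFT entry point
model.print_trainable_parameters()
```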

Currently supported approaches (and counting):
@@ -40,7 +56,7 @@ Key features:
Users can check the [documentation of SWIFT](docs/source/GetStarted/快速使用.md) for detailed tutorials.


-### 🎉 News
+## 🎉 News
- 🔥 2023.11.24: Support for **yi-34b-chat** and **codefuse-codellama-34b-chat**: the corresponding shell scripts can be found in [yi_34b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/yi_34b_chat) and [codefuse_codellama_34b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/codefuse_codellama_34b_chat).
- 🔥 2023.11.18: Support for the **tongyi-finance-14b** series of models: tongyi-finance-14b, tongyi-finance-14b-chat, tongyi-finance-14b-chat-int4. The corresponding shell script can be found in [tongyi_finance_14b_chat_int4](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/tongyi_finance_14b_chat_int4).
- 🔥 2023.11.16: Added **flash attn** support for more models: the qwen series, qwen-vl series, llama series, openbuddy series, mistral series, yi series, and ziya series. Please use the `use_flash_attn` parameter.
@@ -66,8 +82,14 @@ Users can check the [documentation of SWIFT](docs/source/GetStarted/快速使用
- 2023.9.3: Supported the **baichuan2** model series: baichuan2-7b, baichuan2-7b-chat, baichuan2-13b, baichuan2-13b-chat.


-## ✨ LLM SFT Example
-Users can refer to the [LLM fine-tuning documentation](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm) for more detailed information.
+## ✨ LLM Training and Inference Example
+### Simple Usage
+- Quickly run inference on an LLM: see the [LLM Inference Documentation](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM推理文档.md).
+- Rapidly fine-tune an LLM, run inference on it, and build a Web-UI: see the [LLM Fine-tuning Documentation](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM微调文档.md).
+- View the models and datasets supported by SWIFT: see [Supported Models and Datasets](https://github.com/modelscope/swift/blob/main/docs/source/LLM/支持的模型和数据集.md).
+- Extend and customize models, datasets, and dialogue templates: see [Customization and Extension](https://github.com/modelscope/swift/blob/main/docs/source/LLM/自定义和拓展.md).
+- Check the command-line hyperparameters for fine-tuning and inference: see [Command-Line Hyperparameters](https://github.com/modelscope/swift/blob/main/docs/source/LLM/命令行超参数.md).
+

### Features
- Supported SFT methods: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full (full-parameter fine-tuning)
@@ -93,7 +115,7 @@ Users can refer to the [LLM fine-tuning documentation](https://github.com/models
  - NLP:
    - General: 🔥[alpaca-en](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary)(gpt4), 🔥[alpaca-zh](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh/summary)(gpt4), [multi-alpaca-all](https://www.modelscope.cn/datasets/damo/nlp_polylm_multialpaca_sft/summary), [instinwild-en](https://www.modelscope.cn/datasets/wyj123456/instinwild/summary), [instinwild-zh](https://www.modelscope.cn/datasets/wyj123456/instinwild/summary), [cot-en](https://www.modelscope.cn/datasets/YorickHe/CoT/summary), [cot-zh](https://www.modelscope.cn/datasets/YorickHe/CoT/summary), [firefly-all-zh](https://www.modelscope.cn/datasets/wyj123456/firefly/summary), [instruct-en](https://www.modelscope.cn/datasets/wyj123456/instruct/summary), [gpt4all-en](https://www.modelscope.cn/datasets/wyj123456/GPT4all/summary), [sharegpt-en](https://www.modelscope.cn/datasets/huangjintao/sharegpt/summary), [sharegpt-zh](https://www.modelscope.cn/datasets/huangjintao/sharegpt/summary)
    - Agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), 🔥[damo-agent-mini-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), 🔥[agent-instruct-all-en](https://modelscope.cn/datasets/ZhipuAI/AgentInstruct/summary)
-    - Coding: [code-alpaca-en](https://www.modelscope.cn/datasets/wyj123456/code_alpaca_en/summary), [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), 🔥[leetcode-python-en](https://modelscope.cn/datasets/AI-ModelScope/leetcode-solutions-python/summary)
+    - Coding: [code-alpaca-en](https://www.modelscope.cn/datasets/wyj123456/code_alpaca_en/summary), [codefuse-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), 🔥[leetcode-python-en](https://modelscope.cn/datasets/AI-ModelScope/leetcode-solutions-python/summary)
    - Medical: [medical-en](https://www.modelscope.cn/datasets/huangjintao/medical_zh/summary), [medical-zh](https://www.modelscope.cn/datasets/huangjintao/medical_zh/summary), [medical-mini-zh](https://www.modelscope.cn/datasets/huangjintao/medical_zh/summary)
    - Law: 🔥[lawyer-llama-zh](https://modelscope.cn/datasets/AI-ModelScope/lawyer_llama_data/summary), [tigerbot-law-zh](https://modelscope.cn/datasets/AI-ModelScope/tigerbot-law-plugin/summary)
    - Math: 🔥[blossom-math-zh](https://modelscope.cn/datasets/AI-ModelScope/blossom-math-v2/summary), [school-math-zh](https://modelscope.cn/datasets/AI-ModelScope/school_math_0.25M/summary)
@@ -108,165 +130,7 @@ Users can refer to the [LLM fine-tuning documentation](https://github.com/models
  - Chat: default, chatml(qwen), baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, xverse, ziya, skywork, bluelm


-### Basic Usage
-Quickly fine-tune an LLM, run inference on it, and build a Web-UI.
-
-For more shell startup scripts, see [Run SFT and Inference](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm#-run-sft-and-inference).
-
-```bash
-git clone https://github.com/modelscope/swift.git
-cd swift
-pip install -e .
-```
-
-
-#### Run using Python
-```python
-# Experimental environment: A10, 3090, A100, ...
-# 20GB GPU memory
-import os
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-import torch
-
-from swift.llm import (
-    DatasetName, InferArguments, ModelType, SftArguments
-)
-from swift.llm.run import infer_main, sft_main, web_ui_main
-
-model_type = ModelType.qwen_7b_chat
-sft_args = SftArguments(
-    model_type=model_type,
-    eval_steps=50,
-    train_dataset_sample=2000,
-    dataset=[DatasetName.blossom_math_zh],
-    output_dir='output',
-    gradient_checkpointing=True)
-result = sft_main(sft_args)
-best_model_checkpoint = result['best_model_checkpoint']
-print(f'best_model_checkpoint: {best_model_checkpoint}')
-torch.cuda.empty_cache()
-
-infer_args = InferArguments(
-    ckpt_dir=best_model_checkpoint,
-    load_args_from_ckpt_dir=True,
-    stream=True,
-    show_dataset_sample=5)
-result = infer_main(infer_args)
-print(f'result: {result}')
-torch.cuda.empty_cache()
-
-web_ui_main(infer_args)
-```
-
-**Single-Sample Inference**:
-
-Inference using LoRA **incremental** weights:
-```python
-import os
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-from swift.llm import (
-    get_model_tokenizer, get_template, inference, ModelType, get_default_template_type
-)
-from swift.tuners import Swift
-import torch
-
-model_dir = 'vx_xxx/checkpoint-100'
-model_type = ModelType.qwen_7b_chat
-template_type = get_default_template_type(model_type)
-
-model, tokenizer = get_model_tokenizer(model_type, torch.bfloat16, {'device_map': 'auto'})
-
-model = Swift.from_pretrained(model, model_dir, inference_mode=True)
-template = get_template(template_type, tokenizer)
-query = 'xxxxxx'
-response, history = inference(model, template, query)
-print(f'response: {response}')
-print(f'history: {history}')
-```
-
-Inference using LoRA **merged** complete weights:
-```python
-import os
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-from swift.llm import (
-    get_model_tokenizer, get_template, inference, ModelType, get_default_template_type
-)
-import torch
-
-model_dir = 'vx_xxx/checkpoint-100-merged'
-model_type = ModelType.qwen_7b_chat
-template_type = get_default_template_type(model_type)
-
-model, tokenizer = get_model_tokenizer(model_type, torch.bfloat16, {'device_map': 'auto'},
-                                       model_dir=model_dir)
-
-template = get_template(template_type, tokenizer)
-query = 'xxxxxx'
-response, history = inference(model, template, query)
-print(f'response: {response}')
-print(f'history: {history}')
-```
-
-#### Run using Swift CLI
-**SFT**:
-```bash
-# Experimental environment: A10, 3090, A100, ...
-# 20GB GPU memory
-CUDA_VISIBLE_DEVICES=0 \
-swift sft \
-    --model_id_or_path qwen/Qwen-7B-Chat \
-    --dataset blossom-math-zh \
-    --output_dir output
-
-# Using DDP
-# Experimental environment: 2 * 3090
-# 2 * 23GB GPU memory
-CUDA_VISIBLE_DEVICES=0,1 \
-NPROC_PER_NODE=2 \
-swift sft \
-    --model_id_or_path qwen/Qwen-7B-Chat \
-    --dataset blossom-math-zh \
-    --output_dir output
-
-# Using a custom dataset
-CUDA_VISIBLE_DEVICES=0 \
-swift sft \
-    --model_id_or_path qwen/Qwen-7B-Chat \
-    --custom_train_dataset_path chatml.jsonl \
-    --output_dir output
-```
-
-**Inference**:
-```bash
-# Original Model
-CUDA_VISIBLE_DEVICES=0 swift infer --model_id_or_path qwen/Qwen-7B-Chat --dataset blossom-math-zh
-
-# Fine-tuned Model
-CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'xxx/vx_xxx/checkpoint-xxx'
-
-# Merge LoRA incremental weights and perform inference
-swift merge-lora --ckpt_dir 'xxx/vx_xxx/checkpoint-xxx'
-CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'xxx/vx_xxx/checkpoint-xxx-merged'
-```
-
-**Web-UI**:
-```bash
-# Original Model
-CUDA_VISIBLE_DEVICES=0 swift web-ui --model_id_or_path qwen/Qwen-7B-Chat
-
-# Fine-tuned Model
-CUDA_VISIBLE_DEVICES=0 swift web-ui --ckpt_dir 'xxx/vx_xxx/checkpoint-xxx'
-
-# Merge LoRA incremental weights and use the web UI
-swift merge-lora --ckpt_dir 'xxx/vx_xxx/checkpoint-xxx'
-CUDA_VISIBLE_DEVICES=0 swift web-ui --ckpt_dir 'xxx/vx_xxx/checkpoint-xxx-merged'
-```
-
-
-# 🛠️ Installation
+## 🛠️ Installation

SWIFT runs in a Python environment. Please make sure your Python version is 3.8 or higher.
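The concrete install commands fall outside this hunk's context window. Judging from the Basic Usage section deleted above and the "Build from source" badge, the source install is presumably:

```bash
# Editorial sketch: the source install copied from the deleted section above;
# the "Build from source" badge suggests this was the intended route.
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e .
```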

@@ -292,7 +156,7 @@ SWIFT requires torch>=1.13.
docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.8.0-py38-torch2.0.1-tf2.13.0-1.9.1
```

-# 🚀 Getting Started
+## 🚀 Getting Started

SWIFT supports multiple tuners, as well as tuners provided by [PEFT](https://github.com/huggingface/peft). To use these tuners, simply call:
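The code block that follows this colon lies outside the diff's context window. As an editorial sketch (an assumption, not content restored from the commit), the tuner call presumably resembles the following, with `Swift.prepare_model` and `LoRAConfig` from the `swift` package and a hypothetical ModelScope model id:

```python
# Editorial sketch (assumed API): attaching a swift tuner to a ModelScope model.
from modelscope import Model
from swift import LoRAConfig, Swift

# hypothetical model id; substitute any supported ModelScope model
model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
# assumption: LLaMA-style attention projections as the LoRA targets
lora_config = LoRAConfig(target_modules=['q_proj', 'k_proj', 'v_proj'])
model = Swift.prepare_model(model, lora_config)  # wraps the model with the tuner
```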

@@ -430,20 +294,20 @@ output
The config and weights stored in the output dir are those of `extra_state_keys`. This differs from PEFT, which stores the weights and config of the `default` tuner.

432296

433-
# 🔍 Learn More
297+
## 🔍 Learn More
434298

435299
- [ModelScope library](https://github.com/modelscope/modelscope/)
436300

437301
ModelScope Library is the model library of ModelScope project, which contains a large number of popular models.
438302

439303
- [Contribute your own model to ModelScope](https://modelscope.cn/docs/ModelScope%E6%A8%A1%E5%9E%8B%E6%8E%A5%E5%85%A5%E6%B5%81%E7%A8%8B%E6%A6%82%E8%A7%88)
440304

441-
# License
305+
## License
442306

443307
This project is licensed under the [Apache License (Version 2.0)](https://github.com/modelscope/modelscope/blob/master/LICENSE).
444308

445309

446-
# Contact Us
310+
## Contact Us
447311
You can contact and communicate with us by joining our WeChat Group:
448312

449313
<p align="left">
