Commit 56b331f
Replace with loralib and add unload lora interface (#83)
1 parent e63669e commit 56b331f

File tree

30 files changed: +264 −548 lines changed

examples/pytorch/llm/README.md

Lines changed: 8 additions & 8 deletions
````diff
@@ -28,13 +28,13 @@
 3. supported features: quantization, DDP, model parallelism(device map), gradient checkpointing, gradient accumulation, pushing to modelscope hub, custom datasets, multimodal and agent SFT, multi-round chat, ...
 4. supported datasets:
    1. NLP: [alpaca-en](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary)(gpt4), [alpaca-zh](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh/summary)(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, [jd-zh](https://modelscope.cn/datasets/DAMO_NLP/jd/summary), [dureader-robust-zh](https://modelscope.cn/datasets/modelscope/DuReader_robust-QG/summary), medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh, [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), [advertise-gen](https://modelscope.cn/datasets/lvjianjin/AdvertiseGen/summary)
-   2. agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), damo-agent-mini-zh
-   3. multi-modal: [coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
-   4. other: [cls-fudan-news-zh](https://modelscope.cn/datasets/damo/zh_cls_fudan-news/files), [ner-jave-zh](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)
+   2. Agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), damo-agent-mini-zh
+   3. Multi-Modal: [coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
+   4. Other: [cls-fudan-news-zh](https://modelscope.cn/datasets/damo/zh_cls_fudan-news/files), [ner-jave-zh](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)
 5. supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy-llama, default, default-generation
 
 ## Prepare the Environment
-Experimental environment: V100, A10, 3090, A100, ... (V100 does not support bf16, quantization)
+Experimental environment: V100, A10, 3090, A100, ...
 ```bash
 # Installing miniconda
 wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
@@ -74,7 +74,7 @@ cd swift/examples/pytorch/llm
 # If you want to push weights into modelscope hub during training, you need to set '--push_to_hub true'.
 # Recommended experimental environment: A100
 bash scripts/qwen_7b_chat/lora/sft.sh
-bash scripts/qwen_7b_chat/lora/merge_lora_and_infer.sh
+bash scripts/qwen_7b_chat/lora/infer.sh
 
 # sft(lora+ddp) and infer qwen-7b-chat, requires 2*38GB GPU memory.
 # Recommended experimental environment: A100
@@ -88,12 +88,12 @@ bash scripts/qwen_7b_chat/lora_mp_ddp/infer.sh
 
 # sft(qlora) and infer qwen-7b-chat, requires 12GB GPU memory.
 # If you want to use quantization, you need to `pip install bitsandbytes -U`
-# Recommended experimental environment: A10, 3090
+# Recommended experimental environment: V100, A10, 3090
 bash scripts/qwen_7b_chat/qlora/sft.sh
-bash scripts/qwen_7b_chat/qlora/merge_lora_and_infer.sh
+bash scripts/qwen_7b_chat/qlora/infer.sh
 
 # sft(qlora+ddp) and infer qwen-7b-chat, requires 2*14GB GPU memory.
-# Recommended experimental environment: A10, 3090
+# Recommended experimental environment: V100, A10, 3090
 bash scripts/qwen_7b_chat/qlora_ddp/sft.sh
 bash scripts/qwen_7b_chat/qlora_ddp/infer.sh
 
````

examples/pytorch/llm/README_CN.md

Lines changed: 6 additions & 6 deletions
````diff
@@ -29,13 +29,13 @@
 3. 支持的特性: 模型量化, DDP, 模型并行(device_map), gradient checkpointing, 梯度累加, 支持推送ModelScope Hub, 自定义数据集, 多模态和Agent SFT, 多轮对话, ...
 4. 支持的数据集:
    1. NLP: [alpaca-en](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary)(gpt4), [alpaca-zh](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh/summary)(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, [jd-zh](https://modelscope.cn/datasets/DAMO_NLP/jd/summary), [dureader-robust-zh](https://modelscope.cn/datasets/modelscope/DuReader_robust-QG/summary), medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh, [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), [advertise-gen](https://modelscope.cn/datasets/lvjianjin/AdvertiseGen/summary)
-   2. agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), damo-agent-mini-zh
+   2. Agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), damo-agent-mini-zh
    3. 多模态: [coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
    4. 其他: [cls-fudan-news-zh](https://modelscope.cn/datasets/damo/zh_cls_fudan-news/files), [ner-jave-zh](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)
 5. 支持的对话模板: chatml(qwen), baichuan, chatglm2, llama, openbuddy-llama, default, default-generation
 
 ## 准备实验环境
-实验环境: V100, A10, 3090, A100均可. (V100不支持bf16, 量化)
+实验环境: V100, A10, 3090, A100均可.
 ```bash
 # 安装miniconda
 wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
@@ -76,7 +76,7 @@ cd swift/examples/pytorch/llm
 # 如果你想在训练时, 将权重push到modelscope hub中, 你需要设置`--push_to_hub true`.
 # 推荐的实验环境: A100
 bash scripts/qwen_7b_chat/lora/sft.sh
-bash scripts/qwen_7b_chat/lora/merge_lora_and_infer.sh
+bash scripts/qwen_7b_chat/lora/infer.sh
 
 # 微调(lora+ddp)+推理 qwen-7b-chat, 需要2卡*38GB显存.
 # 推荐的实验环境: A100
@@ -90,12 +90,12 @@ bash scripts/qwen_7b_chat/lora_mp_ddp/infer.sh
 
 # 微调(qlora)+推理 qwen-7b-chat, 需要12GB显存.
 # 如果你想要使用量化, 你需要`pip install bitsandbytes -U`
-# 推荐的实验环境: 3090, A10
+# 推荐的实验环境: V100, 3090, A10
 bash scripts/qwen_7b_chat/qlora/sft.sh
-bash scripts/qwen_7b_chat/qlora/merge_lora_and_infer.sh
+bash scripts/qwen_7b_chat/qlora/infer.sh
 
 # 微调(qlora+ddp)+推理 qwen-7b-chat, 需要2卡*14GB显存.
-# 推荐的实验环境: 3090, A10
+# 推荐的实验环境: V100, 3090, A10
 bash scripts/qwen_7b_chat/qlora_ddp/sft.sh
 bash scripts/qwen_7b_chat/qlora_ddp/infer.sh
 
````

```diff
@@ -1,10 +1,10 @@
 CUDA_VISIBLE_DEVICES=0 \
 python src/llm_infer.py \
-    --model_type baichuan2-7b-chat \
+    --model_type baichuan2-7b \
     --sft_type lora \
-    --template_type baichuan \
+    --template_type default \
     --dtype bf16 \
-    --ckpt_dir "output/baichuan2-7b-chat/vx_xxx/checkpoint-xxx" \
+    --ckpt_dir "output/baichuan2-7b/vx_xxx/checkpoint-xxx" \
     --eval_human false \
     --dataset advertise-gen \
     --max_length 2048 \
@@ -15,3 +15,4 @@ python src/llm_infer.py \
     --top_k 20 \
     --top_p 0.9 \
     --do_sample true \
+    --merge_lora_and_save false \
```

examples/pytorch/llm/scripts/baichuan2_7b_chat/qlora/sft.sh renamed to examples/pytorch/llm/scripts/baichuan2_7b/qlora/sft.sh

Lines changed: 3 additions & 3 deletions
```diff
@@ -2,9 +2,9 @@
 # 12GB GPU memory
 CUDA_VISIBLE_DEVICES=0 \
 python src/llm_sft.py \
-    --model_type baichuan2-7b-chat \
+    --model_type baichuan2-7b \
     --sft_type lora \
-    --template_type baichuan \
+    --template_type default \
     --dtype bf16 \
     --output_dir output \
     --dataset advertise-gen \
@@ -29,6 +29,6 @@ python src/llm_sft.py \
     --save_total_limit 2 \
     --logging_steps 10 \
     --push_to_hub false \
-    --hub_model_id baichuan2-7b-chat-qlora \
+    --hub_model_id baichuan2-7b-qlora \
     --hub_private_repo true \
     --hub_token 'your-sdk-token' \
```

examples/pytorch/llm/scripts/baichuan2_7b_chat/lora_ddp/infer.sh

Lines changed: 1 addition & 0 deletions
```diff
@@ -13,3 +13,4 @@ python src/llm_infer.py \
     --top_k 20 \
     --top_p 0.9 \
     --do_sample true \
+    --merge_lora_and_save false \
```

examples/pytorch/llm/scripts/baichuan2_7b_chat/lora_ddp/sft.sh

Lines changed: 2 additions & 2 deletions
```diff
@@ -1,5 +1,5 @@
 # Experimental environment: 2 * A100
-# 2 * 44GB GPU memory
+# 2 * 30GB GPU memory
 nproc_per_node=2
 CUDA_VISIBLE_DEVICES=0,1 \
 torchrun \
@@ -19,7 +19,7 @@ torchrun \
     --lora_rank 8 \
     --lora_alpha 32 \
     --lora_dropout_p 0. \
-    --lora_target_modules W_pack o_proj \
+    --lora_target_modules ALL \
     --gradient_checkpointing false \
     --batch_size 1 \
     --weight_decay 0. \
```
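The change above swaps an explicit module list (`W_pack o_proj`) for the keyword `ALL`. As a hedged illustration only (not SWIFT's actual code; the function and module names here are hypothetical), a trainer could expand such a spec by matching it against the linear-layer names discovered in the model:

```python
def resolve_target_modules(spec, module_names):
    """Expand a --lora_target_modules spec into concrete module names.

    spec: the literal string "ALL", or an explicit list like ["W_pack", "o_proj"].
    module_names: linear-layer names found in the model (toy input here).
    """
    if spec == "ALL":
        # "ALL" selects every discovered linear layer.
        return sorted(module_names)
    # Otherwise keep only the explicitly requested names.
    return sorted(n for n in module_names if n in spec)

# Toy names loosely modeled on a Baichuan-style block (illustrative only).
names = {"W_pack", "o_proj", "gate_proj", "down_proj", "up_proj"}
print(resolve_target_modules(["W_pack", "o_proj"], names))
print(resolve_target_modules("ALL", names))
```

Applying LoRA to all linear layers trades a small amount of extra adapter memory for broader coverage, which is consistent with the memory-estimate comment changing in the same diff.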

examples/pytorch/llm/scripts/internlm_20b/lora_ddp/infer.sh

Lines changed: 1 addition & 0 deletions
```diff
@@ -13,3 +13,4 @@ python src/llm_infer.py \
     --top_k 20 \
     --top_p 0.9 \
     --do_sample true \
+    --merge_lora_and_save false \
```

examples/pytorch/llm/scripts/internlm_20b/qlora/infer.sh

Lines changed: 1 addition & 0 deletions
```diff
@@ -15,3 +15,4 @@ python src/llm_infer.py \
     --top_k 20 \
     --top_p 0.9 \
     --do_sample true \
+    --merge_lora_and_save false \
```

examples/pytorch/llm/scripts/internlm_20b_chat/lora_ddp/infer.sh

Lines changed: 1 addition & 0 deletions
```diff
@@ -13,3 +13,4 @@ python src/llm_infer.py \
     --top_k 20 \
     --top_p 0.9 \
     --do_sample true \
+    --merge_lora_and_save false \
```

examples/pytorch/llm/scripts/qwen_7b/lora_ddp/infer.sh

Lines changed: 2 additions & 0 deletions
```diff
@@ -1,3 +1,4 @@
+# If you want to merge LoRA weights, please set merge_lora_and_save to true.
 CUDA_VISIBLE_DEVICES=0 \
 python src/llm_infer.py \
     --model_type qwen-7b \
@@ -14,3 +15,4 @@ python src/llm_infer.py \
     --top_k 20 \
     --top_p 0.9 \
     --do_sample true \
+    --merge_lora_and_save false \
```
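The commit title mentions an "unload lora" interface, and the new `--merge_lora_and_save` flag controls whether the adapter is folded into the base weights. As a toy arithmetic sketch (not SWIFT's or loralib's actual implementation): a LoRA adapter contributes a low-rank delta `(alpha / r) * B @ A` to a weight matrix, so merging adds that delta and unloading subtracts the same delta, restoring the base weights:

```python
def matmul(X, Y):
    # Naive matrix multiply, enough for a toy-sized sketch.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def apply_delta(W, A, B, alpha, r, sign=+1):
    # LoRA delta is (alpha / r) * B @ A; sign=+1 merges it, sign=-1 unloads it.
    BA = matmul(B, A)
    s = sign * alpha / r
    return [[w + s * d for w, d in zip(rw, rd)] for rw, rd in zip(W, BA)]

# Toy shapes: base W is 2x3, rank-1 adapter with A (1x3) and B (2x1).
W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
A = [[0.5, -1.0, 2.0]]
B = [[2.0], [-4.0]]
alpha, r = 32, 1

merged = apply_delta(W, A, B, alpha, r, sign=+1)        # merge adapter into base
restored = apply_delta(merged, A, B, alpha, r, sign=-1) # unload it again
assert restored == W  # exact for these values: the base weights come back
```

This is why a separate `merge_lora_and_infer.sh` script becomes unnecessary: the same `infer.sh` can run with the adapter attached, and merging becomes an opt-in flag rather than a distinct pipeline.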
