Commit 6fd1c17

Feat 0919 (#78)
1 parent: 1a8819b

19 files changed: +321 additions, −433 deletions

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -124,7 +124,7 @@ replace.sh
 result.png
 result.jpg
 result.mp4
-runs/
+output/
 *.out

 # Pytorch

README.md

Lines changed: 9 additions & 9 deletions
@@ -40,18 +40,18 @@ Users can check the [documentation of Swift](./docs/Get Started/1.Introduction.m
 2. supported models:
    1. qwen series: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B)
    2. qwen-vl series: qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL)
-   3. baichuan series: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, baichuan2-7b-chat, baichuan2-13b, baichuan2-13b-chat
-   4. chatglm2 series: chatglm2-6b, chatglm2-6b-32k
-   5. llama series: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat
-   6. openbuddy-llama series: openbuddy-llama2-13b, openbuddy-llama-65b, openbuddy-llama2-70b
-   7. internlm series: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k
-   8. other: polylm-13b, seqgpt-560m
+   3. baichuan series: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, [baichuan2-7b-chat](https://modelscope.cn/models/baichuan-inc/Baichuan2-7B-Chat/summary), baichuan2-13b, baichuan2-13b-chat
+   4. chatglm2 series: [chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary), chatglm2-6b-32k
+   5. llama series: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, [llama2-70b-chat](https://modelscope.cn/models/modelscope/Llama-2-70b-chat-ms/summary)
+   6. openbuddy-llama series: openbuddy-llama2-13b, openbuddy-llama-65b, [openbuddy-llama2-70b](https://modelscope.cn/models/OpenBuddy/openbuddy-llama2-70b-v10.1-bf16/summary)
+   7. internlm series: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k, [internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary), [internlm-20b-chat](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b-chat/summary)
+   8. other: [polylm-13b](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary), [seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)
 3. supported features: quantization, DDP, model parallelism(device map), gradient checkpointing, gradient accumulation, pushing to modelscope hub, custom datasets, multimodal and agent SFT, mutli-round chat, ...
 4. supported datasets:
-   1. NLP: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, jd-zh, dureader-robust-zh, medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh
+   1. NLP: [alpaca-en](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary)(gpt4), [alpaca-zh](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh/summary)(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, [jd-zh](https://modelscope.cn/datasets/DAMO_NLP/jd/summary), [dureader-robust-zh](https://modelscope.cn/datasets/modelscope/DuReader_robust-QG/summary), medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh, [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), [advertise-gen](https://modelscope.cn/datasets/lvjianjin/AdvertiseGen/summary)
    2. agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), damo-agent-mini-zh
-   3. multi-modal: coco-en
-   4. other: cls-fudan-news-zh, ner-jave-zh
+   3. multi-modal: [coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
+   4. other: [cls-fudan-news-zh](https://modelscope.cn/datasets/damo/zh_cls_fudan-news/files), [ner-jave-zh](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)
 5. supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy-llama, default, default-generation

 # Installation

README_CN.md

Lines changed: 9 additions & 9 deletions
@@ -38,18 +38,18 @@ SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)是一个可扩展
 2. 支持的模型:
    1. qwen 系列: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B)
    2. qwen-vl 系列: qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL)
-   3. baichuan 系列: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, baichuan2-7b-chat, baichuan2-13b, baichuan2-13b-chat
-   4. chatglm2 系列: chatglm2-6b, chatglm2-6b-32k
-   5. llama 系列: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat
-   6. openbuddy-llama 系列: openbuddy-llama2-13b, openbuddy-llama-65b, openbuddy-llama2-70b
-   7. internlm 系列: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k
-   8. other: polylm-13b, seqgpt-560m
+   3. baichuan 系列: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, [baichuan2-7b-chat](https://modelscope.cn/models/baichuan-inc/Baichuan2-7B-Chat/summary), baichuan2-13b, baichuan2-13b-chat
+   4. chatglm2 系列: [chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary), chatglm2-6b-32k
+   5. llama 系列: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, [llama2-70b-chat](https://modelscope.cn/models/modelscope/Llama-2-70b-chat-ms/summary)
+   6. openbuddy-llama 系列: openbuddy-llama2-13b, openbuddy-llama-65b, [openbuddy-llama2-70b](https://modelscope.cn/models/OpenBuddy/openbuddy-llama2-70b-v10.1-bf16/summary)
+   7. internlm 系列: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k, [internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary), [internlm-20b-chat](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b-chat/summary)
+   8. other: [polylm-13b](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary), [seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)
 3. 支持的特性: 模型量化, DDP, 模型并行(device_map), gradient checkpointing, 梯度累加, 支持推送ModelScope Hub, 自定义数据集, 多模态和Agent SFT, 多轮对话, ...
 4. 支持的数据集:
-   1. NLP: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, jd-zh, dureader-robust-zh, medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh
+   1. NLP: [alpaca-en](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary)(gpt4), [alpaca-zh](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh/summary)(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, [jd-zh](https://modelscope.cn/datasets/DAMO_NLP/jd/summary), [dureader-robust-zh](https://modelscope.cn/datasets/modelscope/DuReader_robust-QG/summary), medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh, [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), [advertise-gen](https://modelscope.cn/datasets/lvjianjin/AdvertiseGen/summary)
    2. agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), damo-agent-mini-zh
-   3. 多模态: coco-en
-   4. 其他: cls-fudan-news-zh, ner-jave-zh
+   3. 多模态: [coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
+   4. 其他: [cls-fudan-news-zh](https://modelscope.cn/datasets/damo/zh_cls_fudan-news/files), [ner-jave-zh](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)
 5. 支持的对话模板: chatml(qwen), baichuan, chatglm2, llama, openbuddy-llama, default, default-generation

 # 安装

examples/pytorch/llm/README.md

Lines changed: 12 additions & 13 deletions
@@ -15,22 +15,22 @@
 </p>

 ## Features
-1. supported SFT methods: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full(full parameter fine-tuning)
+1. supported SFT methods: [LoRA](https://arxiv.org/abs/2106.09685), [QLoRA](https://arxiv.org/abs/2305.14314), full(full parameter fine-tuning)
 2. supported models:
    1. qwen series: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B)
    2. qwen-vl series: qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL)
-   3. baichuan series: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, baichuan2-7b-chat, baichuan2-13b, baichuan2-13b-chat
-   4. chatglm2 series: chatglm2-6b, chatglm2-6b-32k
-   5. llama series: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat
-   6. openbuddy-llama series: openbuddy-llama2-13b, openbuddy-llama-65b, openbuddy-llama2-70b
-   7. internlm series: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k, internlm-20b, internlm-20-chat
-   8. other: polylm-13b, seqgpt-560m
+   3. baichuan series: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, [baichuan2-7b-chat](https://modelscope.cn/models/baichuan-inc/Baichuan2-7B-Chat/summary), baichuan2-13b, baichuan2-13b-chat
+   4. chatglm2 series: [chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary), chatglm2-6b-32k
+   5. llama series: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, [llama2-70b-chat](https://modelscope.cn/models/modelscope/Llama-2-70b-chat-ms/summary)
+   6. openbuddy-llama series: openbuddy-llama2-13b, openbuddy-llama-65b, [openbuddy-llama2-70b](https://modelscope.cn/models/OpenBuddy/openbuddy-llama2-70b-v10.1-bf16/summary)
+   7. internlm series: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k, [internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary), [internlm-20b-chat](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b-chat/summary)
+   8. other: [polylm-13b](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary), [seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)
 3. supported features: quantization, DDP, model parallelism(device map), gradient checkpointing, gradient accumulation, pushing to modelscope hub, custom datasets, multimodal and agent SFT, mutli-round chat, ...
 4. supported datasets:
-   1. NLP: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, jd-zh, dureader-robust-zh, medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh, code-python-zh, advertise-gen
+   1. NLP: [alpaca-en](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary)(gpt4), [alpaca-zh](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh/summary)(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, [jd-zh](https://modelscope.cn/datasets/DAMO_NLP/jd/summary), [dureader-robust-zh](https://modelscope.cn/datasets/modelscope/DuReader_robust-QG/summary), medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh, [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), [advertise-gen](https://modelscope.cn/datasets/lvjianjin/AdvertiseGen/summary)
    2. agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), damo-agent-mini-zh
-   3. multi-modal: coco-en
-   4. other: cls-fudan-news-zh, ner-jave-zh
+   3. multi-modal: [coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
+   4. other: [cls-fudan-news-zh](https://modelscope.cn/datasets/damo/zh_cls_fudan-news/files), [ner-jave-zh](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)
 5. supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy-llama, default, default-generation

 ## Prepare the Environment
@@ -53,13 +53,11 @@ pip install matplotlib scikit-learn tqdm tensorboard -U
 pip install transformers datasets -U
 pip install accelerate transformers_stream_generator -U

-pip install ms-swift modelscope -U
+pip install modelscope -U
 # Recommended installation from source code for faster bug fixes
 git clone https://github.com/modelscope/swift.git
 cd swift
-pip install -r requirements.txt
 pip install .
-# same as modelscope...(git clone ...)
 ```

 ## Run SFT and Inference
@@ -108,6 +106,7 @@ bash scripts/qwen_7b_chat/full_mp/infer.sh
 # Recommended experimental environment: A100
 bash scripts/qwen_7b_chat/full_mp_ddp/sft.sh
 bash scripts/qwen_7b_chat/full_mp_ddp/infer.sh
+
 # For more scripts, please see `scripts/` folder.
 ```


examples/pytorch/llm/README_CN.md

Lines changed: 12 additions & 14 deletions
@@ -16,22 +16,22 @@


 ## 特性
-1. 支持的SFT方法: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), 全参数微调
+1. 支持的SFT方法: [LoRA](https://arxiv.org/abs/2106.09685), [QLoRA](https://arxiv.org/abs/2305.14314), 全参数微调
 2. 支持的模型:
    1. qwen 系列: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B)
    2. qwen-vl 系列: qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL)
-   3. baichuan 系列: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, baichuan2-7b-chat, baichuan2-13b, baichuan2-13b-chat
-   4. chatglm2 系列: chatglm2-6b, chatglm2-6b-32k
-   5. llama 系列: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat
-   6. openbuddy-llama 系列: openbuddy-llama2-13b, openbuddy-llama-65b, openbuddy-llama2-70b
-   7. internlm 系列: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k, internlm-20b, internlm-20-chat
-   8. other: polylm-13b, seqgpt-560m
+   3. baichuan 系列: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, [baichuan2-7b-chat](https://modelscope.cn/models/baichuan-inc/Baichuan2-7B-Chat/summary), baichuan2-13b, baichuan2-13b-chat
+   4. chatglm2 系列: [chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary), chatglm2-6b-32k
+   5. llama 系列: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, [llama2-70b-chat](https://modelscope.cn/models/modelscope/Llama-2-70b-chat-ms/summary)
+   6. openbuddy-llama 系列: openbuddy-llama2-13b, openbuddy-llama-65b, [openbuddy-llama2-70b](https://modelscope.cn/models/OpenBuddy/openbuddy-llama2-70b-v10.1-bf16/summary)
+   7. internlm 系列: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k, [internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary), [internlm-20b-chat](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b-chat/summary)
+   8. other: [polylm-13b](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary), [seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)
 3. 支持的特性: 模型量化, DDP, 模型并行(device_map), gradient checkpointing, 梯度累加, 支持推送ModelScope Hub, 自定义数据集, 多模态和Agent SFT, 多轮对话, ...
 4. 支持的数据集:
-   1. NLP: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, jd-zh, dureader-robust-zh, medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh, code-python-zh, advertise-gen
+   1. NLP: [alpaca-en](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary)(gpt4), [alpaca-zh](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh/summary)(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, [jd-zh](https://modelscope.cn/datasets/DAMO_NLP/jd/summary), [dureader-robust-zh](https://modelscope.cn/datasets/modelscope/DuReader_robust-QG/summary), medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh, [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), [advertise-gen](https://modelscope.cn/datasets/lvjianjin/AdvertiseGen/summary)
    2. agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), damo-agent-mini-zh
-   3. 多模态: coco-en
-   4. 其他: cls-fudan-news-zh, ner-jave-zh
+   3. 多模态: [coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
+   4. 其他: [cls-fudan-news-zh](https://modelscope.cn/datasets/damo/zh_cls_fudan-news/files), [ner-jave-zh](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)
 5. 支持的对话模板: chatml(qwen), baichuan, chatglm2, llama, openbuddy-llama, default, default-generation

 ## 准备实验环境
@@ -55,13 +55,11 @@ pip install matplotlib scikit-learn tqdm tensorboard -U
 pip install transformers datasets -U
 pip install accelerate transformers_stream_generator -U

-pip install ms-swift modelscope -U
-# 推荐从源码安装swift和modelscope, 这具有更多的特性和更快的bug修复
+pip install modelscope -U
+# 推荐从源码安装swift, 这具有更多的特性和更快的bug修复
 git clone https://github.com/modelscope/swift.git
 cd swift
-pip install -r requirements.txt
 pip install .
-# modelscope类似...(git clone ...)
 ```

 ## 微调和推理

examples/pytorch/llm/src/llm_infer.py

Lines changed: 5 additions & 6 deletions
@@ -1,16 +1,15 @@
 # Copyright (c) Alibaba, Inc. and its affiliates.
 import os
-# os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-from dataclasses import dataclass, field
-from typing import Optional

+# os.environ['CUDA_VISIBLE_DEVICES'] = '0'
 import torch
 from transformers import BitsAndBytesConfig, GenerationConfig, TextStreamer
 from utils import (InferArguments, get_dataset, get_model_tokenizer,
-                   get_preprocess, inference, show_layers)
+                   get_preprocess)

 from swift import Swift, get_logger
-from swift.utils import parse_args, print_model_info, seed_everything
+from swift.utils import (inference, parse_args, print_model_info,
+                         seed_everything, show_layers)

 logger = get_logger()

@@ -42,7 +41,7 @@ def llm_infer(args: InferArguments) -> None:
     model, tokenizer = get_model_tokenizer(
         args.model_type, torch_dtype=args.torch_dtype, **kwargs)

-    # ### Preparing lora
+    # ### Preparing LoRA
     if args.sft_type == 'lora':
         model = Swift.from_pretrained(
             model, args.ckpt_dir, inference_mode=True)
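
This hunk is a pure relocation: `inference` and `show_layers` move from the example-local `utils` package into `swift.utils`, and the unused `dataclasses`/`typing` imports are dropped. Below is a minimal sketch of what a caller looks like after the move; only the import locations are confirmed by the diff, while the call signatures of `get_model_tokenizer` and `inference` are assumptions for illustration.

```python
# Sketch only: import locations follow this commit; the call signatures of
# get_model_tokenizer and inference are assumed, not shown in the diff.
import torch

from utils import get_model_tokenizer  # example-local helper package
from swift.utils import inference, seed_everything, show_layers

seed_everything(42)  # fixed seed for reproducibility; the value is arbitrary

# Assumed to return a (model, tokenizer) pair for a registered model type.
model, tokenizer = get_model_tokenizer('qwen-7b-chat', torch_dtype=torch.bfloat16)
show_layers(model)  # logs the module structure of the loaded model

# Assumed call shape: generate a reply for a single prompt.
reply = inference('Who are you?', model, tokenizer)
```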

examples/pytorch/llm/src/llm_sft.py

Lines changed: 9 additions & 11 deletions
@@ -2,26 +2,23 @@
 import inspect
 import os
 # os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-from dataclasses import dataclass, field
 from functools import partial
-from typing import List, Optional

 import json
 import numpy as np
 import torch
-import torch.distributed as dist
 from transformers import BitsAndBytesConfig, GenerationConfig
-from utils import (SftArguments, broadcast_string, check_json_format,
-                   compute_nlg_metrics, dataset_map, find_all_linear_for_lora,
-                   get_dataset, get_dist_setting, get_model_tokenizer,
-                   get_preprocess, is_ddp_plus_mp, is_dist, is_master,
-                   plot_images, show_layers, sort_by_max_length)
+from utils import (SftArguments, dataset_map, get_dataset, get_model_tokenizer,
+                   get_preprocess)

 from swift import (LoraConfig, LoRAConfig, Seq2SeqTrainer,
                    Seq2SeqTrainingArguments, Swift, get_logger)
-from swift.utils import (add_version_to_work_dir, parse_args, print_model_info,
-                         seed_everything)
-from swift.utils.llm_utils import data_collate_fn, print_example, stat_dataset
+from swift.utils import (add_version_to_work_dir, broadcast_string,
+                         check_json_format, compute_nlg_metrics,
+                         data_collate_fn, get_dist_setting, is_ddp_plus_mp,
+                         is_dist, is_master, parse_args, plot_images,
+                         print_example, print_model_info, seed_everything,
+                         show_layers, sort_by_max_length, stat_dataset)

 logger = get_logger()

@@ -56,6 +53,7 @@ def llm_sft(args: SftArguments) -> None:
     model, tokenizer = get_model_tokenizer(
         args.model_type, torch_dtype=args.torch_dtype, **kwargs)

+    # ### Preparing LoRA
     if args.resume_from_ckpt is None:
         if args.sft_type == 'lora':
             if 'ALL' in args.lora_target_modules:
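
The added `# ### Preparing LoRA` marker labels the branch that wraps the base model with LoRA adapters before training. Below is a minimal sketch of that pattern using the `LoRAConfig` and `Swift` symbols this file already imports; the constructor arguments and the `Swift.prepare_model` call are assumptions about the swift API of this period, not content of the diff.

```python
# Sketch of the LoRA-preparation branch; the hyperparameter values and the
# Swift.prepare_model call are assumptions, not taken from this commit.
from swift import LoRAConfig, Swift

lora_config = LoRAConfig(
    r=8,                # adapter rank (illustrative value)
    lora_alpha=32,      # scaling factor (illustrative value)
    lora_dropout=0.1,   # dropout applied on the adapter path
    target_modules=['q_proj', 'k_proj', 'v_proj'],  # depends on the base model
)
# Wraps the base model with trainable low-rank adapters; the frozen base
# weights stay untouched, so only the adapter parameters are optimized.
model = Swift.prepare_model(model, lora_config)
```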
