Commit a30d235

Fix bug 1206 (#202)
1 parent 7ba16a0 commit a30d235

9 files changed, with 31 additions and 28 deletions

docs/source/LLM/LLM微调文档.md

Lines changed: 1 addition & 1 deletion
@@ -116,7 +116,7 @@ cd examples/pytorch/llm
 - If you want to use quantization based on **auto_gptq**, first install the [auto_gptq](https://github.com/PanQiWei/AutoGPTQ) build that matches your CUDA version: `pip install auto_gptq -U`.
 > For models that can use auto_gptq, see [LLM supported models](https://github.com/modelscope/swift/blob/main/docs/source/LLM/支持的模型和数据集.md#模型). We recommend auto_gptq over bnb.
 - If you want to use deepspeed, run `pip install deepspeed -U`. deepspeed can **save GPU memory**, but may slightly reduce training speed.
-- If your training involves knowledge editing, for example [self-cognition fine-tuning](https://github.com/modelscope/swift/blob/main/docs/source/LLM/自我认知微调最佳实践.md), you also need to add LoRA to the MLP layers, otherwise results may be poor. You can simply pass `--lora_target_modules ALL` to add LoRA to all linear layers (qkvo, mlp), which usually gives the best results.
+- If your training involves **knowledge editing**, for example [self-cognition fine-tuning](https://github.com/modelscope/swift/blob/main/docs/source/LLM/自我认知微调最佳实践.md), you also need to add LoRA to the MLP layers, otherwise results may be poor. You can simply pass `--lora_target_modules ALL` to add LoRA to all linear layers (qkvo, mlp), **which usually gives the best results**.
 - If you use an older GPU such as the **V100**, set `--dtype AUTO` or `--dtype fp16`, because it does not support bf16.
 - If your machine has a high-performance GPU such as the A100 and you use a qwen-series model, we recommend installing [**flash-attn**](https://github.com/Dao-AILab/flash-attention), which speeds up training and inference and reduces GPU memory usage (A10, 3090, V100 and similar GPUs do not support training with flash-attn). For models that support flash-attn, see [LLM supported models](https://github.com/modelscope/swift/blob/main/docs/source/LLM/支持的模型和数据集.md#模型)
 - If you want to do **continued pre-training** or **multi-turn dialogue**, see [Customization and Extension](https://github.com/modelscope/swift/blob/main/docs/source/LLM/自定义与拓展.md#注册数据集的方式)

docs/source/LLM/命令行参数.md

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@
 - `--bnb_4bit_comp_dtype`: With 4-bit quantization, the weights must be dequantized during the model's forward and backward passes. This parameter specifies the torch_dtype after dequantization. Default is `'AUTO'`, which follows `dtype`. Options: 'fp16', 'bf16', 'fp32'. Has no effect when quantization_bit is 0.
 - `--bnb_4bit_quant_type`: Quantization method for 4-bit quantization. Default is `'nf4'`. Options: 'nf4', 'fp4'. Has no effect when quantization_bit is 0.
 - `--bnb_4bit_use_double_quant`: Whether to enable double quantization for 4-bit quantization. Default is `True`. Has no effect when quantization_bit is 0.
-- `--lora_target_modules`: Specifies the LoRA modules. Default is `None`. If lora_target_modules is None, or `'DEFAULT'` or `'AUTO'` is passed, the `lora_target_modules` entry in `MODEL_MAPPING` is looked up by `model_type` (qkv by default). If `ALL` is passed, every Linear layer (excluding the head) becomes a LoRA module. Only takes effect when `sft_type` is 'lora'.
+- `--lora_target_modules`: Specifies the LoRA modules. Default is `None`. If lora_target_modules is None, or `'DEFAULT'` or `'AUTO'` is passed, the `lora_target_modules` entry in `MODEL_MAPPING` is looked up by `model_type` (qkv by default). If `ALL` is passed, every Linear layer (excluding the head) becomes a LoRA module. If memory allows, setting it to 'ALL' is recommended. Only takes effect when `sft_type` is 'lora'.
 - `--lora_rank`: Default is `8`. Only takes effect when `sft_type` is 'lora'.
 - `--lora_alpha`: Default is `32`. Only takes effect when `sft_type` is 'lora'.
 - `--lora_dropout_p`: Default is `0.05`. Only takes effect when `sft_type` is 'lora'.
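
The resolution rule described for `--lora_target_modules` can be sketched roughly as follows. This is only an illustration under stated assumptions: `DEFAULT_TARGETS`, `resolve_lora_targets`, the `lm_head` name, and passing the option as a list are all hypothetical and are not taken from the swift source.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the per-model defaults kept in MODEL_MAPPING.
DEFAULT_TARGETS: Dict[str, List[str]] = {
    'qwen-7b-chat': ['c_attn'],
    'chatglm2-6b': ['query_key_value'],
}


def resolve_lora_targets(model_type: str,
                         requested: Optional[List[str]],
                         linear_names: List[str],
                         head_name: str = 'lm_head') -> List[str]:
    """Return the module names that LoRA should wrap.

    `linear_names` is assumed to list every Linear layer found in the model.
    """
    # None, DEFAULT and AUTO all fall back to the per-model default.
    if requested is None or requested in (['DEFAULT'], ['AUTO']):
        return list(DEFAULT_TARGETS[model_type])
    # ALL selects every Linear layer except the output head.
    if requested == ['ALL']:
        return [name for name in linear_names if name != head_name]
    # Anything else is treated as an explicit list of module names.
    return list(requested)


linear = ['c_attn', 'c_proj', 'w1', 'w2', 'lm_head']
print(resolve_lora_targets('qwen-7b-chat', None, linear))
print(resolve_lora_targets('qwen-7b-chat', ['ALL'], linear))
```

Choosing `ALL` wraps the MLP projections as well as attention, which is why the documentation ties it to available memory.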

docs/source/LLM/支持的模型和数据集.md

Lines changed: 10 additions & 10 deletions
@@ -15,23 +15,23 @@
 | --------- | -------- | --------------------------- | ---------------- | ------------------ | -------- |
 |qwen-1_8b|[qwen/Qwen-1_8B](https://modelscope.cn/models/qwen/Qwen-1_8B/summary)|c_attn|default-generation|✔||
 |qwen-1_8b-chat|[qwen/Qwen-1_8B-Chat](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat/summary)|c_attn|chatml|✔||
-|qwen-1_8b-chat-int4|[qwen/Qwen-1_8B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
-|qwen-1_8b-chat-int8|[qwen/Qwen-1_8B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat-Int8/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
+|qwen-1_8b-chat-int4|[qwen/Qwen-1_8B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
+|qwen-1_8b-chat-int8|[qwen/Qwen-1_8B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat-Int8/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
 |qwen-7b|[qwen/Qwen-7B](https://modelscope.cn/models/qwen/Qwen-7B/summary)|c_attn|default-generation|✔||
 |qwen-7b-chat|[qwen/Qwen-7B-Chat](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary)|c_attn|chatml|✔||
-|qwen-7b-chat-int4|[qwen/Qwen-7B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
-|qwen-7b-chat-int8|[qwen/Qwen-7B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int8/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
+|qwen-7b-chat-int4|[qwen/Qwen-7B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
+|qwen-7b-chat-int8|[qwen/Qwen-7B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int8/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
 |qwen-14b|[qwen/Qwen-14B](https://modelscope.cn/models/qwen/Qwen-14B/summary)|c_attn|default-generation|✔||
 |qwen-14b-chat|[qwen/Qwen-14B-Chat](https://modelscope.cn/models/qwen/Qwen-14B-Chat/summary)|c_attn|chatml|✔||
-|qwen-14b-chat-int4|[qwen/Qwen-14B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
-|qwen-14b-chat-int8|[qwen/Qwen-14B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int8/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
+|qwen-14b-chat-int4|[qwen/Qwen-14B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
+|qwen-14b-chat-int8|[qwen/Qwen-14B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int8/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
 |qwen-72b|[qwen/Qwen-72B](https://modelscope.cn/models/qwen/Qwen-72B/summary)|c_attn|default-generation|✔||
 |qwen-72b-chat|[qwen/Qwen-72B-Chat](https://modelscope.cn/models/qwen/Qwen-72B-Chat/summary)|c_attn|chatml|✔||
-|qwen-72b-chat-int4|[qwen/Qwen-72B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-72B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
-|qwen-72b-chat-int8|[qwen/Qwen-72B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-72B-Chat-Int8/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
+|qwen-72b-chat-int4|[qwen/Qwen-72B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-72B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
+|qwen-72b-chat-int8|[qwen/Qwen-72B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-72B-Chat-Int8/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
 |qwen-vl|[qwen/Qwen-VL](https://modelscope.cn/models/qwen/Qwen-VL/summary)|c_attn|default-generation|✔||
 |qwen-vl-chat|[qwen/Qwen-VL-Chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary)|c_attn|chatml|✔||
-|qwen-vl-chat-int4|[qwen/Qwen-VL-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
+|qwen-vl-chat-int4|[qwen/Qwen-VL-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
 |qwen-audio|[qwen/Qwen-Audio](https://modelscope.cn/models/qwen/Qwen-Audio/summary)|c_attn|default-generation|✔||
 |qwen-audio-chat|[qwen/Qwen-Audio-Chat](https://modelscope.cn/models/qwen/Qwen-Audio-Chat/summary)|c_attn|chatml|✔||
 |chatglm2-6b|[ZhipuAI/chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary)|query_key_value|chatglm2|✘||
@@ -87,7 +87,7 @@
 |seqgpt-560m|[damo/nlp_seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)|query_key_value|default-generation|✘||
 |tongyi-finance-14b|[TongyiFinance/Tongyi-Finance-14B](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B/summary)|c_attn|default-generation|✔||
 |tongyi-finance-14b-chat|[TongyiFinance/Tongyi-Finance-14B-Chat](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat/summary)|c_attn|chatml|✔||
-|tongyi-finance-14b-chat-int4|[TongyiFinance/Tongyi-Finance-14B-Chat-Int4](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.4.2|
+|tongyi-finance-14b-chat-int4|[TongyiFinance/Tongyi-Finance-14B-Chat-Int4](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat-Int4/summary)|c_attn|chatml|✔|auto_gptq>=0.5|
 |codefuse-codellama-34b-chat|[codefuse-ai/CodeFuse-CodeLlama-34B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B/summary)|q_proj, k_proj, v_proj|codefuse-codellama|✔||

docs/source/LLM/自我认知微调最佳实践.md

Lines changed: 0 additions & 1 deletion
@@ -176,7 +176,6 @@ swift sft \
     --self_cognition_sample 500 \
     --model_name 小黄 'Xiao Huang' \
     --model_author 魔搭 ModelScope \
-    --gradient_accumulation_steps 8 \
 ```

 ## Inference After Fine-tuning

swift/llm/infer.py

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@
 import datetime as dt
 import os
 import shutil
-from typing import Tuple
+from typing import Literal, Tuple

 import json
 import torch
@@ -158,8 +158,8 @@ def llm_infer(args: InferArguments) -> None:
     if args.save_result and args.ckpt_dir is not None:
         time = dt.datetime.now().strftime('%Y%m%d-%H%M%S')
         jsonl_path = os.path.join(args.ckpt_dir, f'infer_result_{time}.jsonl')
-    input_mode: Literal['S', 'M'] = 'S'
     if args.eval_human:
+        input_mode: Literal['S', 'M'] = 'S'
         logger.info('Input `exit` to exit the conversation.')
         logger.info('Input `multi-line` to switch to multi-line input mode.')
         if template.support_multi_round:

swift/llm/utils/argument.py

Lines changed: 2 additions & 1 deletion
@@ -296,6 +296,7 @@ class InferArguments:
     ckpt_dir: Optional[str] = field(
         default=None, metadata={'help': '/path/to/your/vx_xxx/checkpoint-xxx'})
     load_args_from_ckpt_dir: bool = True
+    load_dataset_config: bool = True
     eval_human: bool = False  # False: eval val_dataset

     seed: int = 42
@@ -609,7 +610,7 @@ def load_from_ckpt_dir(args: InferArguments) -> None:
         'bnb_4bit_comp_dtype', 'bnb_4bit_quant_type',
         'bnb_4bit_use_double_quant'
     ]
-    if not args.eval_human:
+    if not args.eval_human and args.load_dataset_config:
         imported_keys += [
             'dataset', 'dataset_seed', 'dataset_test_ratio',
             'check_dataset_strategy', 'custom_train_dataset_path',
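
The effect of the new `load_dataset_config` flag can be sketched in isolation. The snippet below is a simplified illustration, not the code from `argument.py`: `InferArgsSketch`, `keys_to_import`, and `load_from_saved` are hypothetical names, and only two keys are modelled.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class InferArgsSketch:
    """Hypothetical, trimmed-down stand-in for InferArguments."""
    eval_human: bool = False
    load_dataset_config: bool = True
    dtype: str = 'AUTO'
    dataset: str = ''


def keys_to_import(args: InferArgsSketch) -> List[str]:
    """Pick which saved training arguments should overwrite the inference ones."""
    keys = ['dtype']  # model-related keys are always restored
    # Dataset keys are restored only for dataset evaluation, and only when
    # the caller has not opted out through load_dataset_config.
    if not args.eval_human and args.load_dataset_config:
        keys += ['dataset']
    return keys


def load_from_saved(args: InferArgsSketch, saved: Dict[str, Any]) -> None:
    """Copy the selected keys from the saved training config onto args."""
    for key in keys_to_import(args):
        if key in saved:
            setattr(args, key, saved[key])


saved_config = {'dtype': 'fp16', 'dataset': 'train-set'}
args = InferArgsSketch(load_dataset_config=False, dataset='my-eval-set')
load_from_saved(args, saved_config)
print(args.dtype, args.dataset)
```

With the flag set to `False`, the dataset chosen on the command line is kept while the model settings still come from the checkpoint directory.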

swift/llm/utils/model.py

Lines changed: 11 additions & 11 deletions
@@ -796,7 +796,7 @@ def fix_qwen_inplace_bug(model) -> None:
                     *args, **kwargs).clone()
             else:
                 __old_forward = first_drop.forward
-                first_drop.forwad = lambda *args, **kwargs: __old_forward(
+                first_drop.forward = lambda *args, **kwargs: __old_forward(
                     *args, **kwargs).clone()
             first_drop.__old_forward = __old_forward
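
Why the `forwad` typo went unnoticed can be shown with a small stand-alone sketch. Python lets you assign any attribute name on an object, so the misspelled assignment raised no error and simply left the original `forward` in place. The class and helper below are hypothetical, and `list(...)` stands in for the `.clone()` call.

```python
class FirstDrop:
    """Tiny stand-in for the patched module (not torch.nn.Dropout)."""

    def forward(self, values):
        return values  # returns the very same object, like an in-place op


def patch_with_copy(module, attr_name: str) -> None:
    """Wrap module.forward so it returns a copy, storing it under attr_name."""
    old_forward = module.forward

    def patched(*args, **kwargs):
        return list(old_forward(*args, **kwargs))

    setattr(module, attr_name, patched)


data = [1, 2, 3]

typo = FirstDrop()
patch_with_copy(typo, 'forwad')  # misspelled: creates an unused attribute
typo_is_same = typo.forward(data) is data

fixed = FirstDrop()
patch_with_copy(fixed, 'forward')  # correct name: the wrapper is now called
fixed_is_same = fixed.forward(data) is data

print(typo_is_same, fixed_is_same)
```

In the misspelled case the caller still receives the original object, so the copy that the patch was meant to add never happens.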

@@ -882,7 +882,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-1_8B-Chat-Int8',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 8},
     support_flash_attn=True)
@@ -891,7 +891,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-1_8B-Chat-Int4',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 4},
     support_flash_attn=True)
@@ -900,7 +900,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-72B-Chat-Int8',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 8},
     support_flash_attn=True)
@@ -909,7 +909,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-72B-Chat-Int4',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 4},
     support_flash_attn=True)
@@ -918,7 +918,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'TongyiFinance/Tongyi-Finance-14B-Chat-Int4',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 4},
     support_flash_attn=True)
@@ -927,7 +927,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-VL-Chat-Int4',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     support_flash_attn=True,
     function_kwargs={
@@ -939,7 +939,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-14B-Chat-Int8',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 8},
     support_flash_attn=True)
@@ -948,7 +948,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-7B-Chat-Int8',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 8},
     support_flash_attn=True)
@@ -957,7 +957,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-14B-Chat-Int4',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 4},
     support_flash_attn=True)
@@ -966,7 +966,7 @@ def get_model_tokenizer_qwen_audio(model_dir: str,
     'qwen/Qwen-7B-Chat-Int4',
     LoRATM.qwen,
     TemplateType.chatml,
-    requires=['auto_gptq>=0.4.2'],
+    requires=['auto_gptq>=0.5'],
     torch_dtype=torch.float16,
     function_kwargs={'bits': 4},
     support_flash_attn=True)
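
To make the raised minimum concrete, here is a minimal sketch of checking a version string against a `name>=x.y` requirement such as `auto_gptq>=0.5`. It is not how swift validates `requires`; `parse_version` and `satisfies` are hypothetical helpers, and only the `>=` operator with plain numeric versions is handled.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn the leading numeric part of a version, e.g. '0.4.2', into (0, 4, 2)."""
    match = re.match(r'\d+(?:\.\d+)*', text.strip())
    if match is None:
        raise ValueError(f'not a numeric version: {text!r}')
    return tuple(int(part) for part in match.group().split('.'))


def pad(version: Tuple[int, ...], width: int) -> Tuple[int, ...]:
    """Right-pad with zeros so that (0, 5) compares equal to (0, 5, 0)."""
    return version + (0,) * (width - len(version))


def satisfies(installed: str, requirement: str) -> bool:
    """Check an installed version against a 'name>=minimum' requirement."""
    _name, _, minimum = requirement.partition('>=')
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    return pad(have, width) >= pad(need, width)


print(satisfies('0.4.2', 'auto_gptq>=0.5'))
print(satisfies('0.5.1', 'auto_gptq>=0.5'))
```

An installation that met the old `auto_gptq>=0.4.2` bound, such as 0.4.2 itself, no longer meets the new one and would need upgrading.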

swift/llm/utils/utils.py

Lines changed: 3 additions & 0 deletions
@@ -180,6 +180,9 @@ def dataset_map(
         if audio_info is not None:
             audio_info.pop('input_audios', None)
         data.append(d)
+    if len(data) == 0:
+        logger.info('len(dataset): 0')
+        return None
     return LLMDataset(data)
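
The empty-result guard added to `dataset_map` follows a common pattern: return `None` instead of wrapping an empty list, so callers can test for "no data" explicitly. A minimal sketch, assuming a hypothetical `map_records` helper that returns a plain list rather than an `LLMDataset`, and assuming records mapped to `None` are skipped:

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Record = Dict[str, Any]


def map_records(
        records: Iterable[Record],
        map_func: Callable[[Record], Optional[Record]],
) -> Optional[List[Record]]:
    """Apply map_func to each record; return None when nothing survives."""
    data: List[Record] = []
    for record in records:
        mapped = map_func(record)
        if mapped is None:  # assumption: None marks a record to drop
            continue
        data.append(mapped)
    if len(data) == 0:
        return None  # mirrors the new early return in the hunk above
    return data


result = map_records([], lambda record: record)
if result is None:
    print('no samples to evaluate')
```

Callers then need an explicit `is None` check before using the result.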

tests/llm/test_run.py

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ def test_vl_audio(self):
                 train_dataset_sample=200,
                 dataset=[dataset],
                 output_dir=output_dir,
-                gradient_checkpointing=False)
+                gradient_checkpointing=True)
             output = sft_main(sft_args)
             print(output)
             best_model_checkpoint = output['best_model_checkpoint']

0 commit comments
