
Commit a95da87

new feature: save_infer_result_to_jsonl (#163)
1 parent fdf1962 commit a95da87
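
The feature named in the title, `save_infer_result_to_jsonl`, is not itself visible in the hunks rendered below, but the JSONL pattern it implies is simple: one JSON object per inference result, one object per line. A minimal sketch of that pattern, assuming hypothetical field names (`query`, `response`) and output path; this is illustrative, not the repo's actual implementation:

```python
import json

# Hypothetical result records; swift's real schema may differ.
results = [
    {"query": "What is 1 + 1?", "response": "2"},
    {"query": "Name the capital of France.", "response": "Paris"},
]

# JSONL: one JSON object per line. Appending keeps earlier runs intact,
# and readers can stream the file line by line.
with open("infer_result.jsonl", "a", encoding="utf-8") as f:
    for record in results:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```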

File tree: 24 files changed, +301 −166 lines


README.md

Lines changed: 41 additions & 39 deletions
Large diffs are not rendered by default.

README_CN.md

Lines changed: 40 additions & 39 deletions
Large diffs are not rendered by default.

examples/pytorch/llm/README.md

Lines changed: 5 additions & 5 deletions
```diff
@@ -50,7 +50,7 @@
 - Multi-Modal: 🔥[coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
 - Custom Dataset
 - Supported Templates:
-  - Text Generation: default-generation, chatglm-generation
+  - Text Generation: default-generation, default-generation-bos, chatglm-generation
 - Chat: default, chatml(qwen), baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, xverse, ziya, skywork, bluelm

@@ -111,7 +111,7 @@ infer_args = InferArguments(
     ckpt_dir=best_ckpt_dir,
     load_args_from_ckpt_dir=True,
     stream=True,
-    show_dataset_sample=5)
+    val_dataset_sample=5)
 infer_main(infer_args)
 torch.cuda.empty_cache()
 web_ui_main(infer_args)
@@ -208,12 +208,12 @@ bash scripts/qwen_7b_chat/lora_ddp_ds/infer.sh
 bash scripts/qwen_7b_chat/lora_mp_ddp/sft.sh
 bash scripts/qwen_7b_chat/lora_mp_ddp/infer.sh

-# sft(full+mp) and infer qwen-7b-chat, requires 2 * 75GB GPU memory.
+# sft(full+mp) and infer qwen-7b-chat, requires 2 * 55GB GPU memory.
 # Recommended experimental environment: A100
 bash scripts/qwen_7b_chat/full_mp/sft.sh
 bash scripts/qwen_7b_chat/full_mp/infer.sh

-# sft(full+mp+ddp) and infer qwen-7b-chat, requires 4 * 75GB GPU memory.
+# sft(full+mp+ddp) and infer qwen-7b-chat, requires 4 * 55GB GPU memory.
 # Recommended experimental environment: A100
 bash scripts/qwen_7b_chat/full_mp_ddp/sft.sh
 bash scripts/qwen_7b_chat/full_mp_ddp/infer.sh
@@ -594,7 +594,7 @@ The template initialization function retrieves the complete chat template based
 - `--dataset`: Default value is `'blossom-math-zh'`. For specific parameter details, please refer to the `sft.sh Command Line Arguments`. This parameter only takes effect when `eval_human` is set to False.
 - `--dataset_seed`: Default value is `42`. For specific parameter details, please refer to the `sft.sh Command Line Arguments`. This parameter only takes effect when `eval_human` is set to False.
 - `--dataset_test_ratio`: Default value is `0.01`. For specific parameter details, please refer to the `sft.sh Command Line Arguments`. This parameter only takes effect when `eval_human` is set to False.
-- `--show_dataset_sample`: Indicates the number of samples from the validation set to evaluate and display. Default value is `10`. This parameter only takes effect when `eval_human` is set to False.
+- `--val_dataset_sample`: Indicates the number of samples from the validation set to evaluate and display. Default value is `10`. This parameter only takes effect when `eval_human` is set to False.
 - `--system`: Default value is `'you are a helpful assistant!'`. For specific parameter details, please refer to the `sft.sh Command Line Arguments`.
 - `--max_length`: Default value is `2048`. For specific parameter details, please refer to the `sft.sh Command Line Arguments`.
 - `--check_dataset_strategy`: The default value is `'none'`. For specific parameter details, please refer to the `sft.sh Command Line Arguments`.
```
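
With the rename applied, the quick-start snippet that the hunk above patches would read roughly as follows. A sketch assuming the `swift.llm` import path and a placeholder checkpoint directory; only the argument names visible in the diff are confirmed:

```python
# Import path and checkpoint path are assumptions for illustration.
from swift.llm import InferArguments, infer_main

best_ckpt_dir = "output/qwen-7b-chat/vx-xxx/checkpoint-xxx"  # placeholder

infer_args = InferArguments(
    ckpt_dir=best_ckpt_dir,
    load_args_from_ckpt_dir=True,  # restore the arguments saved at training time
    stream=True,                   # stream tokens during generation
    val_dataset_sample=5)          # renamed from show_dataset_sample in this commit
infer_main(infer_args)
```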

examples/pytorch/llm/README_CN.md

Lines changed: 5 additions & 5 deletions
```diff
@@ -50,7 +50,7 @@
 - Multi-Modal: 🔥[coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
 - Custom datasets
 - Supported chat templates:
-  - Text generation: default-generation, chatglm-generation
+  - Text generation: default-generation, default-generation-bos, chatglm-generation
 - Chat: default, chatml(qwen), baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, xverse, ziya, skywork, bluelm

 ## 🛠️ Prepare the Experimental Environment
@@ -110,7 +110,7 @@ infer_args = InferArguments(
     ckpt_dir=best_ckpt_dir,
     load_args_from_ckpt_dir=True,
     stream=True,
-    show_dataset_sample=5)
+    val_dataset_sample=5)
 infer_main(infer_args)
 torch.cuda.empty_cache()
 web_ui_main(infer_args)
@@ -207,12 +207,12 @@ bash scripts/qwen_7b_chat/lora_ddp_ds/infer.sh
 bash scripts/qwen_7b_chat/lora_mp_ddp/sft.sh
 bash scripts/qwen_7b_chat/lora_mp_ddp/infer.sh

-# Fine-tune (full+mp) and infer qwen-7b-chat, requires 2 * 75GB GPU memory.
+# Fine-tune (full+mp) and infer qwen-7b-chat, requires 2 * 55GB GPU memory.
 # Recommended experimental environment: A100
 bash scripts/qwen_7b_chat/full_mp/sft.sh
 bash scripts/qwen_7b_chat/full_mp/infer.sh

-# Fine-tune (full+mp+ddp) and infer qwen-7b-chat, requires 4 * 75GB GPU memory.
+# Fine-tune (full+mp+ddp) and infer qwen-7b-chat, requires 4 * 55GB GPU memory.
 # Recommended experimental environment: A100
 bash scripts/qwen_7b_chat/full_mp_ddp/sft.sh
 bash scripts/qwen_7b_chat/full_mp_ddp/infer.sh
@@ -597,7 +597,7 @@ if __name__ == '__main__':
 - `--dataset`: Default value is `'blossom-math-zh'`. See `sft.sh Command Line Arguments` for details. This parameter only takes effect when `eval_human` is set to False.
 - `--dataset_seed`: Default value is `42`. See `sft.sh Command Line Arguments` for details. This parameter only takes effect when `eval_human` is set to False.
 - `--dataset_test_ratio`: Default value is `0.01`. See `sft.sh Command Line Arguments` for details. This parameter only takes effect when `eval_human` is set to False.
-- `--show_dataset_sample`: The number of validation-set samples to evaluate and display. Default value is `10`. This parameter only takes effect when `eval_human` is set to False.
+- `--val_dataset_sample`: The number of validation-set samples to evaluate and display. Default value is `10`. This parameter only takes effect when `eval_human` is set to False.
 - `--system`: Default value is `'you are a helpful assistant!'`. See `sft.sh Command Line Arguments` for details.
 - `--max_length`: Default value is `2048`. See `sft.sh Command Line Arguments` for details.
 - `--check_dataset_strategy`: Default value is `'none'`. See `sft.sh Command Line Arguments` for details.
```

examples/pytorch/llm/scripts/baichuan2_7b_chat/lora_ddp/sft.sh

Lines changed: 2 additions & 2 deletions
```diff
@@ -1,5 +1,5 @@
 # Experimental environment: 2 * A100
-# 2 * 30GB GPU memory
+# 2 * 28GB GPU memory
 nproc_per_node=2

 PYTHONPATH=../../.. \
@@ -25,7 +25,7 @@ torchrun \
     --lora_alpha 32 \
     --lora_dropout_p 0.05 \
     --lora_target_modules ALL \
-    --gradient_checkpointing false \
+    --gradient_checkpointing true \
     --batch_size 1 \
     --weight_decay 0.01 \
     --learning_rate 1e-4 \
```
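
Several scripts in this commit flip `--gradient_checkpointing` from `false` to `true`, which is what lowers the per-GPU memory figures in the header comments (here 30GB to 28GB). A minimal sketch of what such a flag typically enables on a Hugging Face model; the model id is illustrative, and the flag's exact wiring inside swift is not shown in this diff:

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative model; any causal LM that supports checkpointing behaves the same.
model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan2-7B-Chat",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True)

# Drop intermediate activations in the forward pass and recompute them
# during backward: slower steps in exchange for lower activation memory.
model.gradient_checkpointing_enable()
model.config.use_cache = False  # the generation KV-cache is incompatible with checkpointing
```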

examples/pytorch/llm/scripts/internlm_20b/lora_ddp/sft.sh

Lines changed: 5 additions & 4 deletions
```diff
@@ -1,4 +1,5 @@
-# Experimental environment: A100
+# Experimental environment: 2 * A100
+# 2 * 56GB GPU memory
 nproc_per_node=2

 PYTHONPATH=../../.. \
@@ -11,7 +12,7 @@ torchrun \
     --model_revision master \
     --sft_type lora \
     --tuner_backend swift \
-    --template_type default-generation \
+    --template_type default-generation-bos \
     --dtype AUTO \
     --output_dir output \
     --ddp_backend nccl \
@@ -23,8 +24,8 @@ torchrun \
     --lora_rank 8 \
     --lora_alpha 32 \
     --lora_dropout_p 0.05 \
-    --lora_target_modules q_proj k_proj v_proj \
-    --gradient_checkpointing false \
+    --lora_target_modules DEFAULT \
+    --gradient_checkpointing true \
     --batch_size 1 \
     --weight_decay 0.01 \
     --learning_rate 1e-4 \
```
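
This script also swaps `default-generation` for the new `default-generation-bos` template. Judging by the name alone, the variant prepends the tokenizer's BOS token before the prompt; the sketch below shows that construction and is illustrative, not swift's actual template code:

```python
from transformers import AutoTokenizer

# Model id matches this script's target; the tokenizer usage is illustrative.
tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm-20b", trust_remote_code=True)

prompt = "Write a short product description for a waterproof jacket."
body_ids = tokenizer(prompt, add_special_tokens=False).input_ids

# default-generation-bos: BOS token first, then the raw prompt tokens.
input_ids = [tokenizer.bos_token_id] + body_ids
```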

examples/pytorch/llm/scripts/internlm_20b/qlora/sft.sh

Lines changed: 1 addition & 1 deletion
```diff
@@ -7,7 +7,7 @@ python llm_sft.py \
     --model_revision master \
     --sft_type lora \
     --tuner_backend swift \
-    --template_type default-generation \
+    --template_type default-generation-bos \
     --dtype AUTO \
     --output_dir output \
     --dataset advertise-gen-zh \
```

examples/pytorch/llm/scripts/qwen_7b_chat/full_mp/sft.sh

Lines changed: 2 additions & 4 deletions
```diff
@@ -1,7 +1,5 @@
 # Experimental environment: 2 * A100
-# 2 * 75GB GPU memory (use flash_attn)
-# You need to install flash_attn or set gradient_checkpointing to True,
-# otherwise it may result in an OOM (Out of Memory) error.
+# 2 * 55GB GPU memory (use flash_attn)
 PYTHONPATH=../../.. \
 CUDA_VISIBLE_DEVICES=0,1 \
 python llm_sft.py \
@@ -16,7 +14,7 @@ python llm_sft.py \
     --num_train_epochs 1 \
     --max_length 8192 \
     --check_dataset_strategy warning \
-    --gradient_checkpointing false \
+    --gradient_checkpointing true \
     --batch_size 1 \
     --weight_decay 0.01 \
     --learning_rate 2e-5 \
```

examples/pytorch/llm/scripts/qwen_7b_chat/full_mp_ddp/sft.sh

Lines changed: 2 additions & 4 deletions
```diff
@@ -1,7 +1,5 @@
 # Experimental environment: 4 * A100
-# 4 * 75GB GPU memory (use flash_attn)
-# You need to install flash_attn or set gradient_checkpointing to True,
-# otherwise it may result in an OOM (Out of Memory) error.
+# 4 * 55GB GPU memory (use flash_attn)
 nproc_per_node=2

 PYTHONPATH=../../.. \
@@ -21,7 +19,7 @@ torchrun \
     --num_train_epochs 1 \
     --max_length 8192 \
     --check_dataset_strategy warning \
-    --gradient_checkpointing false \
+    --gradient_checkpointing true \
     --batch_size 1 \
     --weight_decay 0.01 \
     --learning_rate 2e-5 \
```

examples/pytorch/llm/scripts/skywork_13b/qlora/sft.sh

Lines changed: 1 addition & 1 deletion
```diff
@@ -7,7 +7,7 @@ python llm_sft.py \
     --model_revision master \
     --sft_type lora \
     --tuner_backend swift \
-    --template_type default-generation \
+    --template_type default-generation-bos \
     --dtype AUTO \
     --output_dir output \
     --dataset advertise-gen-zh \
```
