Commit b441875

update template (#286)
1 parent 4d4cd4e commit b441875

36 files changed: +86 -82 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -170,7 +170,7 @@ Users can check the [documentation of SWIFT](docs/source/GetStarted/快速使用
 - Custom Dataset
 - Supported Templates:
 - Text Generation: default-generation, default-generation-bos, chatglm-generation
-- Chat: default, chatml, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, yi, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek
+- Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, yi, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek

 ## 🔥SCEdit

README_CN.md

Lines changed: 1 addition & 1 deletion
@@ -168,7 +168,7 @@ SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible
 - Custom Dataset
 - Supported chat templates:
 - Text Generation: default-generation, default-generation-bos, chatglm-generation
-- Chat: default, chatml, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, yi, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek
+- Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, yi, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek

 ## 🔥SCEdit

docs/source/LLM/LLM推理文档.md

Lines changed: 5 additions & 5 deletions
@@ -39,7 +39,7 @@ from swift.utils import seed_everything

 model_type = ModelType.qwen_7b_chat
 template_type = get_default_template_type(model_type)
-print(f'template_type: {template_type}') # template_type: chatml
+print(f'template_type: {template_type}') # template_type: qwen


 kwargs = {}
@@ -101,7 +101,7 @@ from swift.utils import seed_everything

 model_type = ModelType.qwen_7b_chat_int4
 template_type = get_default_template_type(model_type)
-print(f'template_type: {template_type}') # template_type: chatml
+print(f'template_type: {template_type}') # template_type: qwen

 model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})

@@ -179,7 +179,7 @@ from swift.utils import seed_everything

 model_type = ModelType.qwen_7b_chat
 template_type = get_default_template_type(model_type)
-print(f'template_type: {template_type}') # template_type: chatml
+print(f'template_type: {template_type}') # template_type: qwen

 model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})

@@ -220,7 +220,7 @@ from swift.utils import seed_everything

 model_type = ModelType.qwen_vl_chat
 template_type = get_default_template_type(model_type)
-print(f'template_type: {template_type}') # template_type: chatml
+print(f'template_type: {template_type}') # template_type: qwen

 model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})

@@ -262,7 +262,7 @@ from swift.utils import seed_everything

 model_type = ModelType.qwen_audio_chat
 template_type = get_default_template_type(model_type)
-print(f'template_type: {template_type}') # template_type: chatml
+print(f'template_type: {template_type}') # template_type: qwen

 model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})
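
The hunks above change the default template name reported for the Qwen chat models from `chatml` to `qwen`. For user code that still stores the old name, a small compatibility shim could look like the sketch below. Everything in it (`LEGACY_TEMPLATE_ALIASES`, `normalize_template_type`) is hypothetical and not part of SWIFT; it also assumes the old name should simply map to the new one, which this diff does not confirm.

```python
# Hypothetical compatibility shim; these names are illustrative, not SWIFT APIs.
# Assumption: configs written before this commit may still say 'chatml'.
LEGACY_TEMPLATE_ALIASES = {"chatml": "qwen"}


def normalize_template_type(name: str) -> str:
    """Return the current template name for a possibly outdated one."""
    # Unknown or already-current names pass through unchanged.
    return LEGACY_TEMPLATE_ALIASES.get(name, name)
```

Calling `normalize_template_type` on a stored value before passing it on keeps older configs working without editing them by hand.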

docs/source/LLM/支持的模型和数据集.md

Lines changed: 17 additions & 17 deletions
@@ -15,26 +15,26 @@
 | Model Type | Model ID | Default Lora Target Modules | Default Template | Support Flash Attn | Support VLLM | Requires |
 | --------- | -------- | --------------------------- | ---------------- | ------------------ | ------------ | -------- |
 |qwen-1_8b|[qwen/Qwen-1_8B](https://modelscope.cn/models/qwen/Qwen-1_8B/summary)|c_attn|default-generation|✔|✔||
-|qwen-1_8b-chat|[qwen/Qwen-1_8B-Chat](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat/summary)|c_attn|chatml|✔|✔||
-|qwen-1_8b-chat-int4|[qwen/Qwen-1_8B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat-Int4/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
-|qwen-1_8b-chat-int8|[qwen/Qwen-1_8B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat-Int8/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
+|qwen-1_8b-chat|[qwen/Qwen-1_8B-Chat](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat/summary)|c_attn|qwen|✔|✔||
+|qwen-1_8b-chat-int4|[qwen/Qwen-1_8B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat-Int4/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
+|qwen-1_8b-chat-int8|[qwen/Qwen-1_8B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat-Int8/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
 |qwen-7b|[qwen/Qwen-7B](https://modelscope.cn/models/qwen/Qwen-7B/summary)|c_attn|default-generation|✔|✔||
-|qwen-7b-chat|[qwen/Qwen-7B-Chat](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary)|c_attn|chatml|✔|✔||
-|qwen-7b-chat-int4|[qwen/Qwen-7B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
-|qwen-7b-chat-int8|[qwen/Qwen-7B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int8/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
+|qwen-7b-chat|[qwen/Qwen-7B-Chat](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary)|c_attn|qwen|✔|✔||
+|qwen-7b-chat-int4|[qwen/Qwen-7B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
+|qwen-7b-chat-int8|[qwen/Qwen-7B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int8/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
 |qwen-14b|[qwen/Qwen-14B](https://modelscope.cn/models/qwen/Qwen-14B/summary)|c_attn|default-generation|✔|✔||
-|qwen-14b-chat|[qwen/Qwen-14B-Chat](https://modelscope.cn/models/qwen/Qwen-14B-Chat/summary)|c_attn|chatml|✔|✔||
-|qwen-14b-chat-int4|[qwen/Qwen-14B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
-|qwen-14b-chat-int8|[qwen/Qwen-14B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int8/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
+|qwen-14b-chat|[qwen/Qwen-14B-Chat](https://modelscope.cn/models/qwen/Qwen-14B-Chat/summary)|c_attn|qwen|✔|✔||
+|qwen-14b-chat-int4|[qwen/Qwen-14B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
+|qwen-14b-chat-int8|[qwen/Qwen-14B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int8/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
 |qwen-72b|[qwen/Qwen-72B](https://modelscope.cn/models/qwen/Qwen-72B/summary)|c_attn|default-generation|✔|✔||
-|qwen-72b-chat|[qwen/Qwen-72B-Chat](https://modelscope.cn/models/qwen/Qwen-72B-Chat/summary)|c_attn|chatml|✔|✔||
-|qwen-72b-chat-int4|[qwen/Qwen-72B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-72B-Chat-Int4/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
-|qwen-72b-chat-int8|[qwen/Qwen-72B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-72B-Chat-Int8/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
+|qwen-72b-chat|[qwen/Qwen-72B-Chat](https://modelscope.cn/models/qwen/Qwen-72B-Chat/summary)|c_attn|qwen|✔|✔||
+|qwen-72b-chat-int4|[qwen/Qwen-72B-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-72B-Chat-Int4/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
+|qwen-72b-chat-int8|[qwen/Qwen-72B-Chat-Int8](https://modelscope.cn/models/qwen/Qwen-72B-Chat-Int8/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
 |qwen-vl|[qwen/Qwen-VL](https://modelscope.cn/models/qwen/Qwen-VL/summary)|c_attn|default-generation|✔|✘||
-|qwen-vl-chat|[qwen/Qwen-VL-Chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary)|c_attn|chatml|✔|✘||
-|qwen-vl-chat-int4|[qwen/Qwen-VL-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
+|qwen-vl-chat|[qwen/Qwen-VL-Chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary)|c_attn|qwen|✔|✘||
+|qwen-vl-chat-int4|[qwen/Qwen-VL-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
 |qwen-audio|[qwen/Qwen-Audio](https://modelscope.cn/models/qwen/Qwen-Audio/summary)|c_attn|default-generation|✔|✘||
-|qwen-audio-chat|[qwen/Qwen-Audio-Chat](https://modelscope.cn/models/qwen/Qwen-Audio-Chat/summary)|c_attn|chatml|✔|✘||
+|qwen-audio-chat|[qwen/Qwen-Audio-Chat](https://modelscope.cn/models/qwen/Qwen-Audio-Chat/summary)|c_attn|qwen|✔|✘||
 |chatglm2-6b|[ZhipuAI/chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary)|query_key_value|chatglm2|✘|✔||
 |chatglm2-6b-32k|[ZhipuAI/chatglm2-6b-32k](https://modelscope.cn/models/ZhipuAI/chatglm2-6b-32k/summary)|query_key_value|chatglm2|✘|✔||
 |chatglm3-6b-base|[ZhipuAI/chatglm3-6b-base](https://modelscope.cn/models/ZhipuAI/chatglm3-6b-base/summary)|query_key_value|chatglm-generation|✘|✔||
@@ -100,8 +100,8 @@
 |polylm-13b|[damo/nlp_polylm_13b_text_generation](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary)|c_attn|default-generation|✘|✘||
 |seqgpt-560m|[damo/nlp_seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)|query_key_value|default-generation|✘|✔||
 |tongyi-finance-14b|[TongyiFinance/Tongyi-Finance-14B](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B/summary)|c_attn|default-generation|✔|✔||
-|tongyi-finance-14b-chat|[TongyiFinance/Tongyi-Finance-14B-Chat](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat/summary)|c_attn|chatml|✔|✔||
-|tongyi-finance-14b-chat-int4|[TongyiFinance/Tongyi-Finance-14B-Chat-Int4](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat-Int4/summary)|c_attn|chatml|✔|✘|auto_gptq>=0.5|
+|tongyi-finance-14b-chat|[TongyiFinance/Tongyi-Finance-14B-Chat](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat/summary)|c_attn|qwen|✔|✔||
+|tongyi-finance-14b-chat-int4|[TongyiFinance/Tongyi-Finance-14B-Chat-Int4](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat-Int4/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
 |codefuse-codellama-34b-chat|[codefuse-ai/CodeFuse-CodeLlama-34B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B/summary)|q_proj, k_proj, v_proj|codefuse-codellama|✔|✔||
 |deepseek-coder-1_3b|[deepseek-ai/deepseek-coder-1.3b-base](https://modelscope.cn/models/deepseek-ai/deepseek-coder-1.3b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✔||
 |deepseek-coder-1_3b-chat|[deepseek-ai/deepseek-coder-1.3b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-coder-1.3b-instruct/summary)|q_proj, k_proj, v_proj|deepseek-coder|✔|✔||

examples/pytorch/llm/scripts/qwen_14b_chat/lora_ddp_ds/sft.sh

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ torchrun \
 --model_revision master \
 --sft_type lora \
 --tuner_backend swift \
---template_type chatml \
+--template_type qwen \
 --dtype AUTO \
 --output_dir output \
 --ddp_backend nccl \

examples/pytorch/llm/scripts/qwen_14b_chat/qlora/sft.sh

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ python llm_sft.py \
 --model_revision master \
 --sft_type lora \
 --tuner_backend swift \
---template_type chatml \
+--template_type qwen \
 --dtype AUTO \
 --output_dir output \
 --dataset blossom-math-zh \

examples/pytorch/llm/scripts/qwen_14b_chat/qlora_ddp_ds/sft.sh

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ torchrun \
 --model_revision master \
 --sft_type lora \
 --tuner_backend swift \
---template_type chatml \
+--template_type qwen \
 --dtype AUTO \
 --output_dir output \
 --ddp_backend nccl \

examples/pytorch/llm/scripts/qwen_14b_chat_int4/qlora/sft.sh

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ python llm_sft.py \
 --model_revision master \
 --sft_type lora \
 --tuner_backend swift \
---template_type chatml \
+--template_type qwen \
 --dtype fp16 \
 --output_dir output \
 --dataset leetcode-python-en \

examples/pytorch/llm/scripts/qwen_14b_chat_int4/qlora_ddp_ds/sft.sh

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ torchrun \
 --model_revision master \
 --sft_type lora \
 --tuner_backend swift \
---template_type chatml \
+--template_type qwen \
 --dtype fp16 \
 --output_dir output \
 --ddp_backend nccl \

examples/pytorch/llm/scripts/qwen_14b_chat_int8/qlora/sft.sh

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ python llm_sft.py \
 --model_revision master \
 --sft_type lora \
 --tuner_backend swift \
---template_type chatml \
+--template_type qwen \
 --dtype fp16 \
 --output_dir output \
 --dataset blossom-math-zh \
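
The script hunks above all make the same one-line change to the `--template_type` flag. Anyone maintaining their own copies of these scripts could apply the same rename with a short helper such as the sketch below. `migrate_template_flag` is a hypothetical name, not something shipped with SWIFT, and the `--template_type=chatml` form is handled only as a precaution; the scripts in this diff use the space-separated form.

```python
import re

# Hypothetical helper, not part of SWIFT.
# Matches `--template_type chatml` or `--template_type=chatml`, but not
# longer names that merely start with "chatml" (e.g. "chatml-x").
_OLD_FLAG = re.compile(r"(--template_type(?:\s+|=))chatml(?![\w-])")


def migrate_template_flag(script_text: str) -> str:
    """Rewrite the old `chatml` template name to `qwen` in script text."""
    # \1 keeps the flag and its original separator untouched.
    return _OLD_FLAG.sub(r"\1qwen", script_text)
```

The function works on text rather than files, so it can be tried on a string before being applied to a script read from disk.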
