
Commit fac1035

update qwen2 (#355)
1 parent 82a5ae7 commit fac1035

6 files changed: +148 -27 lines changed

README.md

Lines changed: 4 additions & 4 deletions
@@ -74,6 +74,8 @@ Users can check the [documentation of SWIFT](docs/source/GetStarted/快速使用
 - 🔥2024.1.12: Support **deepseek-moe** series: deepseek-moe-16b, [deepseek-moe-16b-chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/deepseek_moe_16b_chat).
 - 🔥2024.1.4: Support for **VLLM deployment**, compatible with the **OpenAI API** style. For more details, please refer to [VLLM Inference Acceleration and Deployment](https://github.com/modelscope/swift/blob/main/docs/source/LLM/VLLM推理加速与部署.md#部署).
 - 2024.1.4: Update [Benchmark](https://github.com/modelscope/swift/blob/main/docs/source/LLM/Benchmark.md) to facilitate viewing the training speed and GPU memory required for different models.
+<details><summary>More</summary>
+
 - 🔥 2023.12.29: Support web-ui for training and inference; use `swift web-ui` after installing ms-swift.
 - 🔥 2023.12.29: Support DPO RLHF (Reinforcement Learning from Human Feedback) and two datasets for this task: AI-ModelScope/stack-exchange-paired and AI-ModelScope/hh-rlhf. Check [this documentation](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E4%BA%BA%E7%B1%BB%E5%AF%B9%E9%BD%90%E8%AE%AD%E7%BB%83%E6%96%87%E6%A1%A3.md) to start training!
 - 🔥 2023.12.28: Support SCEdit! This framework can easily reduce memory usage in training and inference and replace ControlNet in controllable image generation scenarios; see the following chapter for details.
@@ -87,8 +89,6 @@ Users can check the [documentation of SWIFT](docs/source/GetStarted/快速使用
 - 2023.12.7: Support [Multi-Node DDP training](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E5%BE%AE%E8%B0%83%E6%96%87%E6%A1%A3.md#%E4%BD%BF%E7%94%A8cli).
 - 2023.12.4: Supported models: zephyr-7b-beta-chat, openbuddy-zephyr-7b-chat. Supported datasets: hc3-zh, hc3-en.
 - 🔥 2023.12.2: [Best Practices for Self-cognition Fine-tuning](https://github.com/modelscope/swift/blob/main/docs/source/LLM/自我认知微调最佳实践.md), **10 minutes of self-cognition fine-tuning for an LLM**, creating an LLM that is specific to you.
-<details><summary>More</summary>
-
 - 🔥 2023.11.30: Support training and inference for the **qwen-1_8b**, **qwen-72b**, and **qwen-audio** model series. The corresponding shell scripts can be viewed at [qwen_1_8b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_1_8b_chat), [qwen_72b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat), [qwen_audio_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat).
 - 🔥 2023.11.29: Support training and inference for **AnimateDiff**.
 - 🔥 2023.11.24: Support for **yi-34b-chat** and **codefuse-codellama-34b-chat**: the corresponding shell scripts can be found in [yi_34b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/yi_34b_chat), [codefuse_codellama_34b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/codefuse_codellama_34b_chat).
@@ -218,7 +218,7 @@ app_ui_main(infer_args)
 - [zephyr](https://github.com/huggingface/alignment-handbook) series: zephyr-7b-beta-chat.
 - [ziya](https://github.com/IDEA-CCNL/Fengshenbang-LM) series: ziya2-13b, ziya2-13b-chat.
 - [skywork](https://github.com/SkyworkAI/Skywork) series: skywork-13b, skywork-13b-chat.
-- other: [polylm-13b](https://github.com/DAMO-NLP-MT/PolyLM), [seqgpt-560m](https://github.com/Alibaba-NLP/SeqGPT), [sus-34b-chat](https://github.com/SUSTech-IDEA/SUS-Chat).
+- other: [polylm-13b](https://github.com/DAMO-NLP-MT/PolyLM), [seqgpt-560m](https://github.com/Alibaba-NLP/SeqGPT), [sus-34b-chat](https://github.com/SUSTech-IDEA/SUS-Chat), [openbmb-minicpm-2b](https://github.com/OpenBMB/CPM-Bee).
 - Financial:
 - [tongyi-finance](https://github.com/QwenLM/Qwen) series: tongyi-finance-14b, tongyi-finance-14b-chat, tongyi-finance-14b-chat-int4.
 - Coding:
@@ -248,7 +248,7 @@ app_ui_main(infer_args)
 - Custom Dataset
 - Supported Templates:
 - Text Generation: default-generation, default-generation-bos, chatglm-generation.
-- Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, internlm2, yi, yuan, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek, deepseek-coder, codefuse-codellama, codefuse, cogagent-chat, cogagent-instruct, yi-vl, internlm-xcomposer2.
+- Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, internlm2, yi, yuan, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek, deepseek-coder, codefuse-codellama, codefuse, cogagent-chat, cogagent-instruct, yi-vl, internlm-xcomposer2, openbmb.
 
 
 ## 🔥SCEdit

README_CN.md

Lines changed: 4 additions & 4 deletions
@@ -72,6 +72,8 @@ SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible
 - 🔥2024.1.12: Support the **deepseek-moe** series: deepseek-moe-16b, [deepseek-moe-16b-chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/deepseek_moe_16b_chat).
 - 🔥2024.1.4: Support **VLLM deployment**, compatible with the **OpenAI API** style; see [VLLM Inference Acceleration and Deployment](https://github.com/modelscope/swift/blob/main/docs/source/LLM/VLLM推理加速与部署.md#部署) for details.
 - 2024.1.4: Update the [Benchmark](https://github.com/modelscope/swift/blob/main/docs/source/LLM/Benchmark.md) for easy viewing of the training speed and GPU memory required by different models.
+<details><summary>More</summary>
+
 - 🔥 2023.12.29: Support web-ui for SFT training and inference; run `swift web-ui` after installing ms-swift.
 - 🔥 2023.12.29: Support DPO RLHF (Reinforcement Learning from Human Feedback) and two datasets for this task: AI-ModelScope/stack-exchange-paired and AI-ModelScope/hh-rlhf. See the [documentation](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E4%BA%BA%E7%B1%BB%E5%AF%B9%E9%BD%90%E8%AE%AD%E7%BB%83%E6%96%87%E6%A1%A3.md) to start training!
 - 🔥 2023.12.28: Support SCEdit! This tuner significantly reduces GPU memory usage in the U-Net and enables low-memory controllable image generation (replacing ControlNet); read the chapter below for details.
@@ -85,8 +87,6 @@ SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible
 - 2023.12.7: Support [Multi-Node DDP training](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E5%BE%AE%E8%B0%83%E6%96%87%E6%A1%A3.md#%E4%BD%BF%E7%94%A8cli).
 - 2023.12.5: Supported models: zephyr-7b-beta-chat, openbuddy-zephyr-7b-chat. Supported datasets: hc3-zh, hc3-en.
 - 🔥 2023.12.2: [Best Practices for Self-cognition Fine-tuning](https://github.com/modelscope/swift/blob/main/docs/source/LLM/自我认知微调最佳实践.md): **self-cognition fine-tuning for an LLM in 10 minutes**, creating an LLM that is uniquely yours.
-<details><summary>More</summary>
-
 - 🔥 2023.11.30: Support training and inference for the **qwen-1_8b**, **qwen-72b**, and **qwen-audio** model series. The corresponding shell scripts can be found at [qwen_1_8b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_1_8b_chat), [qwen_72b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat), [qwen_audio_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat).
 - 🔥 2023.11.29: Support training and inference for **AnimateDiff**.
 - 🔥 2023.11.24: Support the **yi-34b-chat** and **codefuse-codellama-34b-chat** models. The corresponding shell scripts can be found at [yi_34b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/yi_34b_chat), [codefuse_codellama_34b_chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/codefuse_codellama_34b_chat).
@@ -218,7 +218,7 @@ app_ui_main(infer_args)
 - [zephyr](https://github.com/huggingface/alignment-handbook) series: zephyr-7b-beta-chat.
 - [ziya](https://github.com/IDEA-CCNL/Fengshenbang-LM) series: ziya2-13b, ziya2-13b-chat.
 - [skywork](https://github.com/SkyworkAI/Skywork) series: skywork-13b, skywork-13b-chat.
-- other: [polylm-13b](https://github.com/DAMO-NLP-MT/PolyLM), [seqgpt-560m](https://github.com/Alibaba-NLP/SeqGPT), [sus-34b-chat](https://github.com/SUSTech-IDEA/SUS-Chat).
+- other: [polylm-13b](https://github.com/DAMO-NLP-MT/PolyLM), [seqgpt-560m](https://github.com/Alibaba-NLP/SeqGPT), [sus-34b-chat](https://github.com/SUSTech-IDEA/SUS-Chat), [openbmb-minicpm-2b](https://github.com/OpenBMB/CPM-Bee).
 - Financial:
 - [tongyi-finance](https://github.com/QwenLM/Qwen) series: tongyi-finance-14b, tongyi-finance-14b-chat, tongyi-finance-14b-chat-int4.
 - Coding:
@@ -248,7 +248,7 @@ app_ui_main(infer_args)
 - Custom datasets
 - Supported templates:
 - Text generation: default-generation, default-generation-bos, chatglm-generation.
-- Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, internlm2, yi, yuan, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek, deepseek-coder, codefuse-codellama, codefuse, cogagent-chat, cogagent-instruct, yi-vl, internlm-xcomposer2.
+- Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, internlm2, yi, yuan, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek, deepseek-coder, codefuse-codellama, codefuse, cogagent-chat, cogagent-instruct, yi-vl, internlm-xcomposer2, openbmb.
 
 
 ## 🔥SCEdit

docs/source/LLM/支持的模型和数据集.md

Lines changed: 2 additions & 1 deletion
@@ -122,9 +122,10 @@
 |skywork-13b|[skywork/Skywork-13B-base](https://modelscope.cn/models/skywork/Skywork-13B-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|&#x2718;|&#x2718;||
 |skywork-13b-chat|[skywork/Skywork-13B-chat](https://modelscope.cn/models/skywork/Skywork-13B-chat/summary)|q_proj, k_proj, v_proj|skywork|&#x2718;|&#x2718;||
 |zephyr-7b-beta-chat|[modelscope/zephyr-7b-beta](https://modelscope.cn/models/modelscope/zephyr-7b-beta/summary)|q_proj, k_proj, v_proj|zephyr|&#x2714;|&#x2714;|transformers>=4.34|
-|sus-34b-chat|[SUSTC/SUS-Chat-34B](https://modelscope.cn/models/SUSTC/SUS-Chat-34B/summary)|q_proj, k_proj, v_proj|sus|&#x2714;|&#x2714;||
 |polylm-13b|[damo/nlp_polylm_13b_text_generation](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary)|c_attn|default-generation|&#x2718;|&#x2718;||
 |seqgpt-560m|[damo/nlp_seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)|query_key_value|default-generation|&#x2718;|&#x2714;||
+|openbmb-minicpm-2b|[OpenBMB/miniCPM-bf16](https://modelscope.cn/models/OpenBMB/miniCPM-bf16/summary)|q_proj, k_proj, v_proj|openbmb|&#x2714;|&#x2718;||
+|sus-34b-chat|[SUSTC/SUS-Chat-34B](https://modelscope.cn/models/SUSTC/SUS-Chat-34B/summary)|q_proj, k_proj, v_proj|sus|&#x2714;|&#x2714;||
 |tongyi-finance-14b|[TongyiFinance/Tongyi-Finance-14B](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B/summary)|c_attn|default-generation|&#x2714;|&#x2714;||
 |tongyi-finance-14b-chat|[TongyiFinance/Tongyi-Finance-14B-Chat](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat/summary)|c_attn|qwen|&#x2714;|&#x2714;||
 |tongyi-finance-14b-chat-int4|[TongyiFinance/Tongyi-Finance-14B-Chat-Int4](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat-Int4/summary)|c_attn|qwen|&#x2714;|&#x2718;|auto_gptq>=0.5|
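
The new row registers openbmb-minicpm-2b with LoRA target modules q_proj, k_proj, v_proj, the `openbmb` chat template, flash-attn support, and no vLLM support. A minimal sketch of exercising this entry through ms-swift's Python API follows; the `SftArguments`/`sft_main` entry points and the dataset name are assumptions for illustration, not part of this commit:

```python
# A minimal sketch, assuming swift.llm exposes SftArguments/sft_main;
# the dataset choice is hypothetical and only for illustration.
from swift.llm import SftArguments, sft_main

sft_args = SftArguments(
    model_type='openbmb-minicpm-2b',  # the table row added above
    sft_type='lora',                  # LoRA lands on q_proj/k_proj/v_proj per the row
    dataset=['alpaca-zh'])            # hypothetical dataset for illustration
result = sft_main(sft_args)
```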

swift/llm/utils/model.py

Lines changed: 126 additions & 17 deletions
@@ -51,6 +51,13 @@ class ModelType:
     qwen_72b_chat = 'qwen-72b-chat'
     qwen_72b_chat_int4 = 'qwen-72b-chat-int4'
     qwen_72b_chat_int8 = 'qwen-72b-chat-int8'
+    # qwen2
+    qwen2_beta_0_5b = 'qwen2-beta-0_5b'
+    qwen2_beta_1_8b = 'qwen2-beta-1_8b'
+    qwen2_beta_4b = 'qwen2-beta-4b'
+    qwen2_beta_7b = 'qwen2-beta-7b'
+    qwen2_beta_14b = 'qwen2-beta-14b'
+    qwen2_beta_72b = 'qwen2-beta-72b'
     # qwen-vl
     qwen_vl = 'qwen-vl'
     qwen_vl_chat = 'qwen-vl-chat'
@@ -165,11 +172,11 @@ class ModelType:
     skywork_13b_chat = 'skywork-13b-chat'
     # zephyr
     zephyr_7b_beta_chat = 'zephyr-7b-beta-chat'
-    # sus
-    sus_34b_chat = 'sus-34b-chat'
     # other
     polylm_13b = 'polylm-13b'
     seqgpt_560m = 'seqgpt-560m'
+    openbmb_minicpm_2b = 'openbmb-minicpm-2b'
+    sus_34b_chat = 'sus-34b-chat'
 
     # domain-specific
     # financial
@@ -210,6 +217,7 @@ class LoRATM(NamedTuple):
     chatglm = ['query_key_value']
     llama2 = ['q_proj', 'k_proj', 'v_proj']
     qwen = ['c_attn']
+    qwen2 = llama2
     polylm = ['c_attn']
     bloom = ['query_key_value']
     cogagent = [
@@ -492,8 +500,13 @@ def get_model_tokenizer_baichuan2_13b(model_dir: str,
     gradient_checkpointing = model_config.gradient_checkpointing
     if isinstance(gradient_checkpointing, (tuple, list)):
         model_config.gradient_checkpointing = gradient_checkpointing[0]
-    return get_model_tokenizer_baichuan2(model_dir, torch_dtype, model_kwargs,
-                                         load_model, model_config, **kwargs)
+    return get_model_tokenizer_baichuan2(
+        model_dir,
+        torch_dtype,
+        model_kwargs,
+        load_model,
+        model_config=model_config,
+        **kwargs)
 
 
 def patch_baichuan2_lm_head_forward(self, hidden_states: Tensor) -> Tensor:
@@ -527,9 +540,13 @@ def get_model_tokenizer_baichuan2(model_dir: str,
                                   load_model: bool = True,
                                   model_config=None,
                                   **kwargs):
-    model, tokenizer = get_model_tokenizer_from_repo(model_dir, torch_dtype,
-                                                     model_kwargs, load_model,
-                                                     model_config, **kwargs)
+    model, tokenizer = get_model_tokenizer_from_repo(
+        model_dir,
+        torch_dtype,
+        model_kwargs,
+        load_model,
+        model_config=model_config,
+        **kwargs)
     if model is not None:
         new_forward = MethodType(patch_baichuan2_lm_head_forward,
                                  model.lm_head)
@@ -669,6 +686,54 @@ def cross_entropy_forward(self, inputs: Tensor,
     return model, tokenizer
 
 
+@register_model(
+    ModelType.qwen2_beta_0_5b,
+    'qwen/Qwen2-beta-0_5B',
+    LoRATM.qwen2,
+    TemplateType.default_generation,
+    support_flash_attn=True,
+    support_vllm=True,
+    requires=['transformers>=4.37'])
+@register_model(
+    ModelType.qwen2_beta_1_8b,
+    'qwen/Qwen2-beta-1_8B',
+    LoRATM.qwen2,
+    TemplateType.default_generation,
+    support_flash_attn=True,
+    support_vllm=True,
+    requires=['transformers>=4.37'])
+@register_model(
+    ModelType.qwen2_beta_4b,
+    'qwen/Qwen2-beta-4B',
+    LoRATM.qwen2,
+    TemplateType.default_generation,
+    support_flash_attn=True,
+    support_vllm=True,
+    requires=['transformers>=4.37'])
+@register_model(
+    ModelType.qwen2_beta_7b,
+    'qwen/Qwen2-beta-7B',
+    LoRATM.qwen2,
+    TemplateType.default_generation,
+    support_flash_attn=True,
+    support_vllm=True,
+    requires=['transformers>=4.37'])
+@register_model(
+    ModelType.qwen2_beta_14b,
+    'qwen/Qwen2-beta-14B',
+    LoRATM.qwen2,
+    TemplateType.default_generation,
+    support_flash_attn=True,
+    support_vllm=True,
+    requires=['transformers>=4.37'])
+@register_model(
+    ModelType.qwen2_beta_72b,
+    'qwen/Qwen2-beta-72B',
+    LoRATM.qwen2,
+    TemplateType.default_generation,
+    support_flash_attn=True,
+    support_vllm=True,
+    requires=['transformers>=4.37'])
 @register_model(
     ModelType.deepseek_coder_1_3b,
     'deepseek-ai/deepseek-coder-1.3b-base',
@@ -916,8 +981,13 @@ def get_model_tokenizer_with_flash_attn(model_dir: str,
         model_config._attn_implementation = 'flash_attention_2'
     else:
         model_config._flash_attn_2_enabled = use_flash_attn
-    return get_model_tokenizer_from_repo(model_dir, torch_dtype, model_kwargs,
-                                         load_model, model_config, **kwargs)
+    return get_model_tokenizer_from_repo(
+        model_dir,
+        torch_dtype,
+        model_kwargs,
+        load_model,
+        model_config=model_config,
+        **kwargs)
 
 
 @register_model(
@@ -1116,9 +1186,13 @@ def get_model_tokenizer_llama2(model_dir: str,
     model_config = AutoConfig.from_pretrained(
         model_dir, trust_remote_code=True)
     model_config.pretraining_tp = 1
-    return get_model_tokenizer_with_flash_attn(model_dir, torch_dtype,
-                                               model_kwargs, load_model,
-                                               model_config, **kwargs)
+    return get_model_tokenizer_with_flash_attn(
+        model_dir,
+        torch_dtype,
+        model_kwargs,
+        load_model,
+        model_config=model_config,
+        **kwargs)
 
 
 @register_model(ModelType.polylm_13b, 'damo/nlp_polylm_13b_text_generation',
@@ -1169,9 +1243,13 @@ def get_model_tokenizer_qwen(model_dir: str,
     if use_flash_attn is None:
         use_flash_attn = 'auto'
     model_config.use_flash_attn = use_flash_attn
-    model, tokenizer = get_model_tokenizer_from_repo(model_dir, torch_dtype,
-                                                     model_kwargs, load_model,
-                                                     model_config, **kwargs)
+    model, tokenizer = get_model_tokenizer_from_repo(
+        model_dir,
+        torch_dtype,
+        model_kwargs,
+        load_model,
+        model_config=model_config,
+        **kwargs)
     try:
         # fix mp+ddp bug
         model.transformer.registered_causal_mask = model.transformer.registered_causal_mask.cuda(
@@ -1574,8 +1652,13 @@ def get_model_tokenizer_phi(model_dir: str,
         model_dir, trust_remote_code=True)
     use_flash_attn = kwargs.pop('use_flash_attn', False)
     model_config.flash_attn = use_flash_attn
-    return get_model_tokenizer_from_repo(model_dir, torch_dtype, model_kwargs,
-                                         load_model, model_config, **kwargs)
+    return get_model_tokenizer_from_repo(
+        model_dir,
+        torch_dtype,
+        model_kwargs,
+        load_model,
+        model_config=model_config,
+        **kwargs)
 
 
 @register_model(
@@ -1762,6 +1845,32 @@ def get_model_tokenizer_yi_vl(model_dir: str,
     return model, tokenizer
 
 
+@register_model(
+    ModelType.openbmb_minicpm_2b,
+    'OpenBMB/miniCPM-bf16',
+    LoRATM.llama2,
+    TemplateType.openbmb,
+    support_flash_attn=True,
+    support_gradient_checkpointing=False)
+def get_model_tokenizer_openbmb(model_dir: str,
+                                torch_dtype: Dtype,
+                                model_kwargs: Dict[str, Any],
+                                load_model: bool = True,
+                                **kwargs):
+    model_config = AutoConfig.from_pretrained(
+        model_dir, trust_remote_code=True)
+    use_flash_attn = kwargs.pop('use_flash_attn', False)
+    if use_flash_attn:
+        model_config._attn_implementation = 'flash_attention_2'
+    return get_model_tokenizer_from_repo(
+        model_dir,
+        torch_dtype,
+        model_kwargs,
+        load_model,
+        model_config=model_config,
+        **kwargs)
+
+
 def fix_transformers_upgrade(module: PreTrainedModel) -> None:
     # from 4.35, transformers changes its arguments of _set_gradient_checkpointing
     if version.parse(transformers.__version__) >= version.parse('4.35'):

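Beyond the six qwen2-beta registrations (default-generation template, LoRA targets shared with llama2, flash-attn and vLLM supported, transformers>=4.37 required), the model.py changes consistently switch the helpers to passing `model_config` by keyword. A minimal sketch of loading one of the new registrations follows; the usage is assumed from the repo's public `get_model_tokenizer` helper, which is not shown in this diff:

```python
# A minimal sketch, assuming swift.llm re-exports get_model_tokenizer and
# ModelType, and that transformers>=4.37 is installed as required above.
import torch
from swift.llm import ModelType, get_model_tokenizer

model, tokenizer = get_model_tokenizer(
    ModelType.qwen2_beta_0_5b,            # 'qwen2-beta-0_5b'
    torch_dtype=torch.bfloat16,
    model_kwargs={'device_map': 'auto'},
    use_flash_attn=True)                  # honored because support_flash_attn=True
```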
swift/llm/utils/template.py

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,7 @@ class TemplateType:
     cogagent_chat = 'cogagent-chat'
     cogagent_instruct = 'cogagent-instruct'
     orion = 'orion'
+    openbmb = 'openbmb'
     # compatibility. (Deprecated)
     chatml = 'chatml'
 
@@ -907,6 +908,10 @@ def data_collator(self,
     infer_media_type='dialogue',
     lazy_tokenize=True)
 
+register_template(
+    TemplateType.openbmb,
+    Template(['<s>{{SYSTEM}}'], ['<用户>{{QUERY}}<AI>'], [], ['</s>'], ''))
+
 
 def get_template(
     template_type: str,

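The new `openbmb` template wraps each user turn in `<用户>…<AI>` role markers, with a `<s>` prefix, a `</s>` suffix, and an empty default system text. An illustrative rendering under the `Template` definition above (the query and response strings here are made up):

```python
# Illustrative only: how the openbmb template registered above composes a turn.
#   prefix ['<s>{{SYSTEM}}'] -> '<s>' plus the (empty) default system string
#   prompt ['<用户>{{QUERY}}<AI>'] -> user query framed by role tokens
#   suffix ['</s>'] -> closes the assistant response
query = '你好'                               # made-up user query
prompt = f'<s><用户>{query}<AI>'             # text the model is asked to continue
target = prompt + '你好！很高兴见到你。</s>'   # made-up response plus suffix
```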
swift/llm/utils/vllm_utils.py

Lines changed: 7 additions & 1 deletion
@@ -77,7 +77,13 @@ def get_vllm_engine(model_type: str,
         destroy_model_parallel()
     except ImportError:
         pass
-    llm_engine = llm_engine_cls.from_engine_args(engine_args)
+    try:
+        llm_engine = llm_engine_cls.from_engine_args(engine_args)
+    except ValueError:
+        logger.warning(
+            f'The current version of VLLM does not support {model_type}. '
+            'Please upgrade VLLM or specify `--infer_backend pt`.')
+        raise
     llm_engine.engine_args = engine_args
     llm_engine.model_dir = model_dir
     llm_engine.model_type = model_type

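The try/except re-raises the original ValueError but logs an actionable hint first, since vLLM builds predating Qwen2 support fail inside `from_engine_args` with an unhelpful architecture error. A sketch of the caller-side fallback this enables (assumed usage; the import paths follow the file locations in this commit):

```python
# A minimal sketch, assuming these helpers are importable as below; falls back
# to the plain PyTorch/transformers path when vLLM rejects the architecture.
from swift.llm.utils.model import get_model_tokenizer
from swift.llm.utils.vllm_utils import get_vllm_engine

model_type = 'qwen2-beta-7b'
try:
    llm_engine = get_vllm_engine(model_type)  # raises ValueError on old vLLM
except ValueError:
    model, tokenizer = get_model_tokenizer(model_type)  # transformers backend
```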