Commit eea5463

Support yi vl (#345)

1 parent e7b6a6f

27 files changed: +919, -737 lines changed

README.md

Lines changed: 4 additions & 2 deletions
@@ -62,6 +62,7 @@ Users can check the [documentation of SWIFT](docs/source/GetStarted/快速使用.md)
 
 
 ## 🎉 News
+- 2024.1.26: Support [yi-vl-6b-chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/yi_vl_6b_chat), yi-vl-34b-chat.
 - 2024.1.24: Support codefuse-codegeex2-6b-chat, codefuse-qwen-14b-chat.
 - 2024.1.23: Support orion series: orion-14b, [orion-14b-chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/orion_14b_chat).
 - 2024.1.20: Support [xverse-13b-256k](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/xverse_13b_256k), xverse-65b-v2, xverse-65b-chat.
@@ -154,7 +155,8 @@ Here is a simple introduction of web-ui:
   - Multi-Modal:
     - qwen-vl series: qwen-vl, qwen-vl-chat, qwen-vl-chat-int4.
     - qwen-audio series: qwen-audio, qwen-audio-chat.
-    - cogagent series: cogagent-chat, cogagent-vqa.
+    - yi-vl series: yi-vl-6b-chat, yi-vl-34b-chat.
+    - cogagent series: cogagent-18b-chat, cogagent-18b-instruct.
   - General:
     - qwen series: qwen-1_8b, qwen-1_8b-chat, qwen-1_8b-chat-int4, qwen-1_8b-chat-int8, qwen-7b, qwen-7b-chat, qwen-7b-chat-int4, qwen-7b-chat-int8, qwen-14b, qwen-14b-chat, qwen-14b-chat-int4, qwen-14b-chat-int8, qwen-72b, qwen-72b-chat, qwen-72b-chat-int4, qwen-72b-chat-int8.
     - chatglm series: chatglm2-6b, chatglm2-6b-32k, chatglm3-6b-base, chatglm3-6b, chatglm3-6b-32k.
@@ -200,7 +202,7 @@ Here is a simple introduction of web-ui:
 - Custom Dataset
 - Supported Templates:
   - Text Generation: default-generation, default-generation-bos, chatglm-generation.
-  - Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, internlm2, yi, yuan, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek, deepseek-coder, codefuse-codellama, codefuse, cogagent.
+  - Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, internlm2, yi, yuan, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek, deepseek-coder, codefuse-codellama, codefuse, cogagent-chat, cogagent-instruct.
 
 
## 🔥SCEdit

README_CN.md

Lines changed: 4 additions & 2 deletions
@@ -60,6 +60,7 @@ SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible
 Users can check the [official SWIFT documentation](docs/source/GetStarted/快速使用.md) for details.
 
 ## 🎉 News
+- 2024.1.26: Support [yi-vl-6b-chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/yi_vl_6b_chat), yi-vl-34b-chat.
 - 2024.1.24: Support codefuse-codegeex2-6b-chat, codefuse-qwen-14b-chat.
 - 2024.1.23: Support the orion series: orion-14b, [orion-14b-chat](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/orion_14b_chat).
 - 2024.1.20: Support [xverse-13b-256k](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/xverse_13b_256k), xverse-65b-v2, xverse-65b-chat.
@@ -154,7 +155,8 @@ swift web-ui
   - Multi-Modal:
     - qwen-vl series: qwen-vl, qwen-vl-chat, qwen-vl-chat-int4.
     - qwen-audio series: qwen-audio, qwen-audio-chat.
-    - cogagent series: cogagent-chat, cogagent-vqa.
+    - yi-vl series: yi-vl-6b-chat, yi-vl-34b-chat.
+    - cogagent series: cogagent-18b-chat, cogagent-18b-instruct.
   - General:
     - qwen series: qwen-1_8b, qwen-1_8b-chat, qwen-1_8b-chat-int4, qwen-1_8b-chat-int8, qwen-7b, qwen-7b-chat, qwen-7b-chat-int4, qwen-7b-chat-int8, qwen-14b, qwen-14b-chat, qwen-14b-chat-int4, qwen-14b-chat-int8, qwen-72b, qwen-72b-chat, qwen-72b-chat-int4, qwen-72b-chat-int8.
     - chatglm series: chatglm2-6b, chatglm2-6b-32k, chatglm3-6b-base, chatglm3-6b, chatglm3-6b-32k.
@@ -200,7 +202,7 @@ swift web-ui
 - Custom Dataset
 - Supported Templates:
   - Text Generation: default-generation, default-generation-bos, chatglm-generation.
-  - Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, internlm2, yi, yuan, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek, deepseek-coder, codefuse-codellama, codefuse, cogagent.
+  - Chat: default, qwen, baichuan, chatglm2, chatglm3, llama, openbuddy, internlm, internlm2, yi, yuan, xverse, ziya, skywork, bluelm, zephyr, sus, deepseek, deepseek-coder, codefuse-codellama, codefuse, cogagent-chat, cogagent-instruct.
 
 
## 🔥SCEdit

docs/source/LLM/VLLM推理加速与部署.md

Lines changed: 25 additions & 0 deletions
@@ -249,6 +249,18 @@ CUDA_VISIBLE_DEVICES=0 swift deploy --model_type qwen-7b-chat
 
 **Client:**
 
+Test:
+```bash
+curl http://localhost:8000/v1/chat/completions \
+-H "Content-Type: application/json" \
+-d '{
+"model": "qwen-7b-chat",
+"messages": [{"role": "user", "content": "晚上睡不着觉怎么办?"}],
+"max_tokens": 256,
+"temperature": 0
+}'
+```
+
 Using swift:
 ```python
 from swift.llm import get_model_list_client, XRequestConfig, inference_client
@@ -340,6 +352,19 @@ CUDA_VISIBLE_DEVICES=0 swift deploy --model_type qwen-7b
 
 **Client:**
 
+Test:
+```bash
+curl http://localhost:8000/v1/completions \
+-H "Content-Type: application/json" \
+-d '{
+"model": "qwen-7b",
+"prompt": "浙江 -> 杭州\n安徽 -> 合肥\n四川 ->",
+"max_tokens": 32,
+"temperature": 0.1,
+"seed": 42
+}'
+```
+
 Using swift:
 ```python
 from swift.llm import get_model_list_client, XRequestConfig, inference_client
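Both `Using swift:` hunks above are cut off by the diff context right after the import line. For orientation only, here is a minimal sketch of how those imports are typically combined into a client call; the argument names, the `data[0].id` access, and the response fields are assumptions, not content of this commit:

```python
from swift.llm import get_model_list_client, XRequestConfig, inference_client

# Ask the deployed server which models it is serving.
model_list = get_model_list_client()
model_type = model_list.data[0].id  # assumed OpenAI-style model list

# Mirror the curl test above: deterministic decoding, 256-token cap.
request_config = XRequestConfig(max_tokens=256, temperature=0)
resp = inference_client(model_type, 'Hello!', request_config=request_config)
print(resp.choices[0].message.content)
```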

docs/source/LLM/命令行参数.md

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@
 - `--hub_token`: the SDK token needed for pushing. It can be obtained from [https://modelscope.cn/my/myaccesstoken](https://modelscope.cn/my/myaccesstoken). Defaults to `None`, i.e. it is read from the environment variable `MODELSCOPE_API_TOKEN`. This argument only takes effect when `push_to_hub` is set to True.
 - `--test_oom_error`: checks whether training will hit OOM. Defaults to `False`. If set to True, the training set is sorted in descending order of max_length to make OOM testing convenient. This argument is generally used for testing; set it with care.
 - `--disable_tqdm`: whether to disable tqdm, which is useful when launching the script with `nohup`. Defaults to `False`, i.e. tqdm is enabled.
-- `--lazy_tokenize`: defers encoding of the text, reducing preprocessing wait time and memory usage, which is useful when handling large datasets. Defaults to `False`, i.e. all text is preprocessed before `trainer.train()`.
+- `--lazy_tokenize`: if set to False, all text is preprocessed before `trainer.train()`. If set to True, text encoding is deferred, reducing preprocessing wait time and memory usage, which is useful when handling large datasets. Defaults to `None`, i.e. the value is chosen automatically based on the template type: LLM models usually get False, multi-modal models usually get True (to avoid excessive memory usage from loading images and audio).
 - `--preprocess_num_proc`: uses multiprocessing when preprocessing the dataset (tokenizing the text). Defaults to `1`. Like `lazy_tokenize`, it addresses slow preprocessing, but it cannot reduce memory usage, so for huge datasets `lazy_tokenize` is recommended instead. Recommended values: 4, 8. Note: with qwen-audio this argument is forced to 1, because qwen-audio's preprocessing function uses torch multiprocessing, which would cause compatibility problems.
 - `--use_flash_attn`: whether to use flash attn. Defaults to `None`. Installation steps for flash_attn can be found at [https://github.com/Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention). The models that support flash_attn are listed in [Supported Models](./支持的模型和数据集.md#模型).
 - `--ignore_args_error`: whether to ignore the Error raised by command-line argument mistakes. Defaults to `False`. Set it to True if you need to copy the code into a notebook to run it.
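To make the new `--lazy_tokenize` default concrete, here is an illustrative sketch (not part of this commit) of opting into deferred tokenization from the Python API; treat the `SftArguments`/`sft_main` wiring and the dataset choice as assumptions:

```python
from swift.llm import SftArguments, sft_main

# Hypothetical run: a multi-modal model, for which lazy_tokenize=None
# would auto-select True; we set it explicitly here for clarity.
args = SftArguments(
    model_type='yi-vl-6b-chat',
    dataset=['coco-mini-en'],
    lazy_tokenize=True)  # tokenize each sample on access, not up front
result = sft_main(args)
```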

docs/source/LLM/支持的模型和数据集.md

Lines changed: 6 additions & 4 deletions
@@ -33,8 +33,8 @@
 |qwen-vl|[qwen/Qwen-VL](https://modelscope.cn/models/qwen/Qwen-VL/summary)|c_attn|default-generation|✔|✘||
 |qwen-vl-chat|[qwen/Qwen-VL-Chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary)|c_attn|qwen|✔|✘||
 |qwen-vl-chat-int4|[qwen/Qwen-VL-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)|c_attn|qwen|✔|✘|auto_gptq>=0.5|
-|qwen-audio|[qwen/Qwen-Audio](https://modelscope.cn/models/qwen/Qwen-Audio/summary)|c_attn|default-generation|✔|✘||
-|qwen-audio-chat|[qwen/Qwen-Audio-Chat](https://modelscope.cn/models/qwen/Qwen-Audio-Chat/summary)|c_attn|qwen|✔|✘||
+|qwen-audio|[qwen/Qwen-Audio](https://modelscope.cn/models/qwen/Qwen-Audio/summary)|c_attn|qwen-audio-generation|✔|✘||
+|qwen-audio-chat|[qwen/Qwen-Audio-Chat](https://modelscope.cn/models/qwen/Qwen-Audio-Chat/summary)|c_attn|qwen-audio|✔|✘||
 |chatglm2-6b|[ZhipuAI/chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary)|query_key_value|chatglm2|✘|✔||
 |chatglm2-6b-32k|[ZhipuAI/chatglm2-6b-32k](https://modelscope.cn/models/ZhipuAI/chatglm2-6b-32k/summary)|query_key_value|chatglm2|✘|✔||
 |chatglm3-6b-base|[ZhipuAI/chatglm3-6b-base](https://modelscope.cn/models/ZhipuAI/chatglm3-6b-base/summary)|query_key_value|chatglm-generation|✘|✔||
@@ -53,6 +53,8 @@
 |yi-34b|[01ai/Yi-34B](https://modelscope.cn/models/01ai/Yi-34B/summary)|q_proj, k_proj, v_proj|default-generation|✔|✔||
 |yi-34b-200k|[01ai/Yi-34B-200K](https://modelscope.cn/models/01ai/Yi-34B-200K/summary)|q_proj, k_proj, v_proj|default-generation|✔|✔||
 |yi-34b-chat|[01ai/Yi-34B-Chat](https://modelscope.cn/models/01ai/Yi-34B-Chat/summary)|q_proj, k_proj, v_proj|yi|✔|✔||
+|yi-vl-6b-chat|[01ai/Yi-VL-6B](https://modelscope.cn/models/01ai/Yi-VL-6B/summary)|q_proj, k_proj, v_proj|yi-vl|✘|✘|transformers>=4.34|
+|yi-vl-34b-chat|[01ai/Yi-VL-34B](https://modelscope.cn/models/01ai/Yi-VL-34B/summary)|q_proj, k_proj, v_proj|yi-vl|✘|✘|transformers>=4.34|
 |internlm-7b|[Shanghai_AI_Laboratory/internlm-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-7b/summary)|q_proj, k_proj, v_proj|default-generation-bos|✘|✔||
 |internlm-7b-chat|[Shanghai_AI_Laboratory/internlm-chat-7b-v1_1](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-v1_1/summary)|q_proj, k_proj, v_proj|internlm|✘|✔||
 |internlm-7b-chat-8k|[Shanghai_AI_Laboratory/internlm-chat-7b-8k](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-8k/summary)|q_proj, k_proj, v_proj|internlm|✘|✔||
@@ -131,8 +133,8 @@
 |deepseek-coder-33b|[deepseek-ai/deepseek-coder-33b-base](https://modelscope.cn/models/deepseek-ai/deepseek-coder-33b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✔||
 |deepseek-coder-33b-instruct|[deepseek-ai/deepseek-coder-33b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-coder-33b-instruct/summary)|q_proj, k_proj, v_proj|deepseek-coder|✔|✔||
 |phi2-3b|[AI-ModelScope/phi-2](https://modelscope.cn/models/AI-ModelScope/phi-2/summary)|Wqkv|default-generation|✔|✔||
-|cogagent-chat|[ZhipuAI/cogagent-chat](https://modelscope.cn/models/ZhipuAI/cogagent-chat/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent|✘|✘||
-|cogagent-vqa|[ZhipuAI/cogagent-vqa](https://modelscope.cn/models/ZhipuAI/cogagent-vqa/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent|✘|✘||
+|cogagent-18b-chat|[ZhipuAI/cogagent-chat](https://modelscope.cn/models/ZhipuAI/cogagent-chat/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent-chat|✘|✘||
+|cogagent-18b-instruct|[ZhipuAI/cogagent-vqa](https://modelscope.cn/models/ZhipuAI/cogagent-vqa/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent-instruct|✘|✘||
 
 
 ## Datasets

docs/source/LLM/自定义与拓展.md

Lines changed: 11 additions & 4 deletions
@@ -125,12 +125,19 @@ AAAAA,BBBBB,CCCCC
 {"query": "AAAAA", "response": "BBBBB", "rejected_response": "CCCCC"}
 ```
 
-**CogAgent model**
+**CogAgent series**
 
 ```jsonl
-{"query": "55555", "response": "66666", "image": "some-local-image-path"}
-{"query": "eeeee", "response": "fffff", "history": [], "image": "some-http-image-path"}
-{"query": "EEEEE", "response": "FFFFF", "history": [["AAAAA", "BBBBB"], ["CCCCC", "DDDDD"]], "image": "some-local-image-path"}
+{"query": "55555", "response": "66666", "images": ["image_path"]}
+{"query": "eeeee", "response": "fffff", "history": [], "images": ["image_path"]}
+{"query": "EEEEE", "response": "FFFFF", "history": [["AAAAA", "BBBBB"], ["CCCCC", "DDDDD"]], "images": ["image_path"]}
+```
+
+**Yi-VL series**
+```jsonl
+{"query": "55555", "response": "66666", "images": ["image_path"]}
+{"query": "eeeee", "response": "fffff", "history": [], "images": ["image_path"]}
+{"query": "EEEEE", "response": "FFFFF", "history": [["AAAAA", "BBBBB"], ["CCCCC", "DDDDD"]], "images": ["image_path", "image_path2", "image_path3"]}
 ```
 
 The image field supports two kinds of values: local image files and http-accessible image urls.
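For illustration (not from the commit), a few lines of Python suffice to emit samples in this `images` format; the file name and sample contents are placeholders:

```python
import json

# Placeholder samples using the new multi-image `images` list field.
samples = [
    {'query': 'What is in the picture?', 'response': 'A cat on a sofa.',
     'history': [], 'images': ['cat.jpg']},
]
with open('train.jsonl', 'w', encoding='utf-8') as f:
    for sample in samples:
        # One JSON object per line; keep non-ASCII readable.
        f.write(json.dumps(sample, ensure_ascii=False) + '\n')
```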

docs/source/cources/data_processing.md

Lines changed: 1 addition & 1 deletion
@@ -140,7 +140,7 @@ template: Template = get_template(
     'qwen',
     tokenizer,
     max_length=256)
-resp = template.encode({'query': 'How are you?', "response": "I am fine"})
+resp = template.encode({'query': 'How are you?', "response": "I am fine"})[0]
 print(resp)
 # {'input_ids': [151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 4340, 525, 498, 30, 151645, 198, 151644, 77091, 198, 40, 1079, 6915, 151645], 'labels': [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 40, 1079, 6915, 151645]}
 ```
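The added `[0]`, together with the matching changes in `swift/llm/deploy.py` below, suggests that `template.encode` now returns a pair rather than a bare dict. A sketch of the assumed new contract follows, reusing the `template` from the example above; the name `tokenizer_kwargs` for the second element is an assumption:

```python
# Assumed shape: (inputs, tokenizer_kwargs). The first element holds
# 'input_ids' and 'labels'; the second carries extra tokenizer arguments
# (typically empty for pure-text templates).
inputs, tokenizer_kwargs = template.encode(
    {'query': 'How are you?', 'response': 'I am fine'})
print(inputs['input_ids'])
```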
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+# Experimental environment: V100, A10, 3090
+CUDA_VISIBLE_DEVICES=0 \
+swift infer \
+    --ckpt_dir "output/yi-vl-6b-chat/vx_xxx/checkpoint-xxx" \
+    --load_dataset_config true \
+    --max_length 2048 \
+    --use_flash_attn false \
+    --max_new_tokens 2048 \
+    --temperature 0.5 \
+    --top_p 0.7 \
+    --repetition_penalty 1. \
+    --do_sample true \
+    --merge_lora_and_save false \
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+# Experimental environment: V100, A10, 3090
+# 18GB GPU memory
+CUDA_VISIBLE_DEVICES=0 \
+swift sft \
+    --model_type yi-vl-6b-chat \
+    --sft_type lora \
+    --tuner_backend swift \
+    --template_type AUTO \
+    --dtype AUTO \
+    --output_dir output \
+    --dataset coco-mini-en \
+    --train_dataset_sample -1 \
+    --num_train_epochs 1 \
+    --max_length 2048 \
+    --check_dataset_strategy warning \
+    --lora_rank 8 \
+    --lora_alpha 32 \
+    --lora_dropout_p 0.05 \
+    --lora_target_modules DEFAULT \
+    --gradient_checkpointing true \
+    --batch_size 1 \
+    --weight_decay 0.01 \
+    --learning_rate 1e-4 \
+    --gradient_accumulation_steps 16 \
+    --max_grad_norm 0.5 \
+    --warmup_ratio 0.03 \
+    --eval_steps 100 \
+    --save_steps 100 \
+    --save_total_limit 2 \
+    --logging_steps 10 \
+    --use_flash_attn false \

swift/llm/deploy.py

Lines changed: 2 additions & 3 deletions
@@ -1,6 +1,5 @@
 # Copyright (c) Alibaba, Inc. and its affiliates.
 import time
-from copy import deepcopy
 from dataclasses import asdict
 from http import HTTPStatus
 from typing import List, Optional, Union
@@ -88,7 +87,7 @@ async def inference_vllm_async(request: Union[ChatCompletionRequest,
                 f'the model `{llm_engine.model_type}` is in text generation format. '
                 'Please use the `completions` API.')
         example = messages_to_history(request.messages)
-        input_ids = template.encode(example)['input_ids']
+        input_ids = template.encode(example)[0]['input_ids']
         request_id = f'chatcmpl-{random_uuid()}'
     else:
         if not is_generation_template(template.template_type):
@@ -97,7 +96,7 @@ async def inference_vllm_async(request: Union[ChatCompletionRequest,
                 f'The chat template `{template.template_type}` corresponding to '
                 f'the model `{llm_engine.model_type}` is in chat format. '
                 'Please use the `chat.completions` API.')
-        input_ids = template.encode({'query': request.prompt})['input_ids']
+        input_ids = template.encode({'query': request.prompt})[0]['input_ids']
         request_id = f'cmpl-{random_uuid()}'
 
     error_msg = await check_length(request, input_ids)
