
Commit 8a794ff

Support minicpm-v-v2_5-chat (#970)
1 parent 21f803d commit 8a794ff


53 files changed: +339 additions, −117 deletions

README.md

Lines changed: 4 additions & 3 deletions
@@ -45,7 +45,8 @@ Additionally, we are expanding capabilities for other modalities. Currently, we
 SWIFT has rich documentations for users, please check [here](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM).

 ## 🎉 News
-- 🔥2024.05.20: Support for inferencing and fine-tuning cogvlm2-llama3-chinese-chat-19B, cogvlm2-llama3-chat-19B. you can refer to [cogvlm2 Best Practices](docs/source_en/Multi-Modal/cogvlm2-best-practice.md).
+- 🔥2024.05.21: Inference and fine-tuning support for MiniCPM-Llama3-V-2_5 are now available. For more details, please refer to [minicpm-v-2.5 Best Practice](docs/source/Multi-Modal/minicpm-v-2.5最佳实践.md).
+- 🔥2024.05.20: Support for inferencing and fine-tuning cogvlm2-llama3-chinese-chat-19B, cogvlm2-llama3-chat-19B. you can refer to [cogvlm2 Best Practice](docs/source_en/Multi-Modal/cogvlm2-best-practice.md).
 - 🔥2024.05.17: Support peft=0.11.0. Meanwhile support 3 new tuners: `BOFT`, `Vera` and `Pissa`. use `--sft_type boft/vera` to use BOFT or Vera, use `--init_lora_weights pissa` with `--sft_type lora` to use Pissa.
 - 2024.05.16: Supports Llava-Next (Stronger) series models. For best practice, you can refer to [here](https://github.com/modelscope/swift/tree/main/docs/source_en/Multi-Modal/llava-best-practice.md).
 - 🔥2024.05.13: Support Yi-1.5 series models,use `--model_type yi-1_5-9b-chat` to begin!

@@ -61,7 +62,7 @@ SWIFT has rich documentations for users, please check [here](https://github.com/
 - 2024.04.22: Support for inference, fine-tuning, and deployment of **chinese-llama-alpaca-2** series models. This includes:chinese-llama-2-1.3b, chinese-llama-2-7b, chinese-llama-2-13b, chinese-alpaca-2-1.3b, chinese-alpaca-2-7b and chinese-alpaca-2-13b along with their corresponding 16k and 64k long text versions.
 - 2024.04.22: Support for inference and fine-tuning of Llama3 GPTQ-Int4, GPTQ-Int8, and AWQ series models. Support for inference and fine-tuning of chatglm3-6b-128k, Openbuddy-Llama3.
 - 2024.04.20: Support for inference, fine-tuning, and deployment of **Atom** series models. This includes: Atom-7B and Atom-7B-Chat. use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/atom_7b_chat/lora/sft.sh) to train.
-- 2024.04.19: Support for single-card, DDP, ZeRO2, and ZeRO3 training and inference with NPU, please refer to [NPU Inference and Fine-tuning Best Practices](docs/source_en/LLM/NPU-best-practice.md).
+- 2024.04.19: Support for single-card, DDP, ZeRO2, and ZeRO3 training and inference with NPU, please refer to [NPU Inference and Fine-tuning Best Practice](docs/source_en/LLM/NPU-best-practice.md).
 - 2024.04.19: Support for inference, fine-tuning, and deployment of **Llama3** series models. This includes: Llama-3-8B, Llama-3-8B-Instruct, Llama-3-70B, and Llama-3-70B-Instruct. use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/llama3_8b_instruct/lora/sft.sh) to train.
 <details><summary>More</summary>

@@ -517,7 +518,7 @@ The complete list of supported models and datasets can be found at [Supported Mo
 | YI-VL | [01AI's YI series vision models](https://github.com/01-ai) | Chinese<br>English | 6B-34B | chat model |
 | XComposer2 | [Pujiang AI Lab InternLM vision model](https://github.com/InternLM/InternLM) | Chinese<br>English | 7B | chat model |
 | DeepSeek-VL | [DeepSeek series vision models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-7B | chat model |
-| MiniCPM-V | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B | chat model |
+| MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2_5 | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B-9B | chat model |
 | CogVLM<br>CogVLM2<br>CogAgent | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 17B-19B | chat model |
 | Llava | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
 | Llava-Next | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 8B-110B | chat model |
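
A minimal sketch of trying the newly added model type from the CLI, assuming the `swift infer` conventions used elsewhere in these docs; everything beyond `--model_type` follows the usual defaults:

```sh
# Interactive inference with the model type this commit adds.
# Requires `timm` per the model table registered in this commit.
CUDA_VISIBLE_DEVICES=0 swift infer --model_type minicpm-v-v2_5-chat
```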

README_CN.md

Lines changed: 2 additions & 1 deletion
@@ -46,6 +46,7 @@ SWIFT supports training, inference, evaluation and deployment of nearly **200 LLMs and MLLMs** (multimodal large models)
 SWIFT has a rich documentation system; if you have any questions, please check [here](https://github.com/modelscope/swift/tree/main/docs/source/LLM).

 ## 🎉 News
+- 🔥2024.05.21: Support for inference and fine-tuning of MiniCPM-Llama3-V-2_5; see [minicpm-v-2.5最佳实践](docs/source/Multi-Modal/minicpm-v-2.5最佳实践.md).
 - 🔥2024.05.20: Support for inference and fine-tuning of cogvlm2-llama3-chinese-chat-19B and cogvlm2-llama3-chat-19B; see [cogvlm2最佳实践](docs/source/Multi-Modal/cogvlm2最佳实践.md).
 - 🔥2024.05.17: Support peft=0.11.0, along with three new tuners: `BOFT`, `Vera` and `Pissa`. Use `--sft_type boft/vera` to enable BOFT or Vera, and `--init_lora_weights pissa` together with `--sft_type lora` to use Pissa.
 - 2024.05.16: Support for the Llava-Next (Stronger) series models; best practice is available [here](https://github.com/modelscope/swift/tree/main/docs/source/Multi-Modal/llava最佳实践.md).

@@ -517,7 +518,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 | YI-VL | [01AI's YI series vision models](https://github.com/01-ai) | Chinese<br>English | 6B-34B | chat model |
 | XComposer2 | [Pujiang AI Lab InternLM vision model](https://github.com/InternLM/InternLM) | Chinese<br>English | 7B | chat model |
 | DeepSeek-VL | [DeepSeek series vision models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-7B | chat model |
-| MiniCPM-V | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B | chat model |
+| MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2_5 | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B-9B | chat model |
 | CogVLM<br>CogVLM2<br>CogAgent | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 17B-19B | chat model |
 | Llava | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
 | Llava-Next | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 8B-110B | chat model |

docs/source/LLM/LLM微调文档.md

Lines changed: 3 additions & 5 deletions
@@ -56,8 +56,7 @@ from swift.llm import (
 model_type = ModelType.qwen_7b_chat
 sft_args = SftArguments(
     model_type=model_type,
-    train_dataset_sample=2000,
-    dataset=[DatasetName.blossom_math_zh],
+    dataset=[f'{DatasetName.blossom_math_zh}#2000'],
     output_dir='output')
 result = sft_main(sft_args)
 best_model_checkpoint = result['best_model_checkpoint']

@@ -66,8 +65,7 @@ torch.cuda.empty_cache()

 infer_args = InferArguments(
     ckpt_dir=best_model_checkpoint,
-    load_dataset_config=True,
-    val_dataset_sample=10)
+    load_dataset_config=True)
 # merge_lora(infer_args, device_map='cpu')
 result = infer_main(infer_args)
 torch.cuda.empty_cache()

@@ -87,7 +85,7 @@ CUDA_VISIBLE_DEVICES=0 swift sft \
 # Use your own dataset
 CUDA_VISIBLE_DEVICES=0 swift sft \
     --model_id_or_path qwen/Qwen-7B-Chat \
-    --custom_train_dataset_path chatml.jsonl \
+    --dataset chatml.jsonl \
     --output_dir output \

 # Use DDP
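
The pattern behind the edits above: the separate sampling and custom-path arguments (`train_dataset_sample`, `val_dataset_sample`, `custom_train_dataset_path`) are folded into `--dataset`, which now takes a `#N` sampling suffix and accepts local files directly. A combined sketch, assuming the registered `blossom-math-zh` dataset id and a query/response JSONL schema; both the row contents and field names here are illustrative, see 自定义与拓展.md for the authoritative format:

```sh
# Hypothetical one-row custom dataset file.
cat > chatml.jsonl <<'JSONL'
{"query": "What is 7 * 6?", "response": "7 * 6 = 42."}
JSONL

# Mix a registered dataset, capped at 2000 samples via the new `#N`
# suffix, with the local file, instead of the removed
# --train_dataset_sample / --custom_train_dataset_path arguments.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_id_or_path qwen/Qwen-7B-Chat \
    --dataset blossom-math-zh#2000 chatml.jsonl \
    --output_dir output
```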

docs/source/LLM/LLM量化文档.md

Lines changed: 2 additions & 2 deletions
@@ -124,11 +124,11 @@ OMP_NUM_THREADS=14 CUDA_VISIBLE_DEVICES=0 swift export \
     --model_type qwen1half-7b-chat --quant_bits 4 \
     --dataset alpaca-zh alpaca-en sharegpt-gpt4-mini --quant_method gptq

-# awq: use a custom quantization dataset (the `--custom_val_dataset_path` argument is not used)
+# awq: use a custom quantization dataset
 # the same applies to gptq
 CUDA_VISIBLE_DEVICES=0 swift export \
     --model_type qwen1half-7b-chat --quant_bits 4 \
-    --custom_train_dataset_path xxx.jsonl \
+    --dataset xxx.jsonl \
     --quant_method awq

 # Run inference on the swift-quantized model
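
A usage note: since `--custom_train_dataset_path` gives way to `--dataset` here as well, the calibration file presumably follows the same rules as training data, including the `#N` sampling suffix. A sketch under that assumption, not verified for `swift export`:

```sh
# Assumption: the `#N` suffix caps the calibration set for export
# just as it does for swift sft; verify before relying on it.
CUDA_VISIBLE_DEVICES=0 swift export \
    --model_type qwen1half-7b-chat --quant_bits 4 \
    --dataset sharegpt-gpt4-mini#256 \
    --quant_method awq
```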

docs/source/LLM/VLLM推理加速与部署.md

Lines changed: 1 addition & 3 deletions
@@ -527,15 +527,13 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 \
 NPROC_PER_NODE=4 \
 swift sft \
     --model_type llama2-7b-chat \
-    --dataset sharegpt-gpt4-mini \
-    --train_dataset_sample 1000 \
+    --dataset self-cognition#500 sharegpt-gpt4-mini#1000 \
     --logging_steps 5 \
     --max_length 4096 \
     --learning_rate 5e-5 \
     --warmup_ratio 0.4 \
     --output_dir output \
     --lora_target_modules ALL \
-    --self_cognition_sample 500 \
     --model_name 小黄 'Xiao Huang' \
     --model_author 魔搭 ModelScope \
 ```

docs/source/LLM/支持的模型和数据集.md

Lines changed: 2 additions & 1 deletion
@@ -204,7 +204,8 @@
 |minicpm-2b-128k|[OpenBMB/MiniCPM-2B-128k](https://modelscope.cn/models/OpenBMB/MiniCPM-2B-128k/summary)|q_proj, k_proj, v_proj|chatml|&#x2714;|&#x2714;|transformers>=4.36.0|-|[openbmb/MiniCPM-2B-128k](https://huggingface.co/openbmb/MiniCPM-2B-128k)|
 |minicpm-moe-8x2b|[OpenBMB/MiniCPM-MoE-8x2B](https://modelscope.cn/models/OpenBMB/MiniCPM-MoE-8x2B/summary)|q_proj, k_proj, v_proj|minicpm|&#x2714;|&#x2714;|transformers>=4.36.0|-|[openbmb/MiniCPM-MoE-8x2B](https://huggingface.co/openbmb/MiniCPM-MoE-8x2B)|
 |minicpm-v-3b-chat|[OpenBMB/MiniCPM-V](https://modelscope.cn/models/OpenBMB/MiniCPM-V/summary)|q_proj, k_proj, v_proj|minicpm-v|&#x2714;|&#x2718;||-|[openbmb/MiniCPM-V](https://huggingface.co/openbmb/MiniCPM-V)|
-|minicpm-v-v2|[OpenBMB/MiniCPM-V-2](https://modelscope.cn/models/OpenBMB/MiniCPM-V-2/summary)|q_proj, k_proj, v_proj|minicpm-v|&#x2714;|&#x2718;|timm|-|[openbmb/MiniCPM-V-2](https://huggingface.co/openbmb/MiniCPM-V-2)|
+|minicpm-v-v2-chat|[OpenBMB/MiniCPM-V-2](https://modelscope.cn/models/OpenBMB/MiniCPM-V-2/summary)|q_proj, k_proj, v_proj|minicpm-v|&#x2714;|&#x2718;|timm|-|[openbmb/MiniCPM-V-2](https://huggingface.co/openbmb/MiniCPM-V-2)|
+|minicpm-v-v2_5-chat|[OpenBMB/MiniCPM-Llama3-V-2_5](https://modelscope.cn/models/OpenBMB/MiniCPM-Llama3-V-2_5/summary)|q_proj, k_proj, v_proj|minicpm-v-v2_5|&#x2714;|&#x2718;|timm|-|[openbmb/MiniCPM-Llama3-V-2_5](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5)|
 |openbuddy-llama2-13b-chat|[OpenBuddy/openbuddy-llama2-13b-v8.1-fp16](https://modelscope.cn/models/OpenBuddy/openbuddy-llama2-13b-v8.1-fp16/summary)|q_proj, k_proj, v_proj|openbuddy|&#x2714;|&#x2714;||-|[OpenBuddy/openbuddy-llama2-13b-v8.1-fp16](https://huggingface.co/OpenBuddy/openbuddy-llama2-13b-v8.1-fp16)|
 |openbuddy-llama3-8b-chat|[OpenBuddy/openbuddy-llama3-8b-v21.1-8k](https://modelscope.cn/models/OpenBuddy/openbuddy-llama3-8b-v21.1-8k/summary)|q_proj, k_proj, v_proj|openbuddy2|&#x2714;|&#x2714;||-|[OpenBuddy/openbuddy-llama3-8b-v21.1-8k](https://huggingface.co/OpenBuddy/openbuddy-llama3-8b-v21.1-8k)|
 |openbuddy-llama-65b-chat|[OpenBuddy/openbuddy-llama-65b-v8-bf16](https://modelscope.cn/models/OpenBuddy/openbuddy-llama-65b-v8-bf16/summary)|q_proj, k_proj, v_proj|openbuddy|&#x2714;|&#x2714;||-|[OpenBuddy/openbuddy-llama-65b-v8-bf16](https://huggingface.co/OpenBuddy/openbuddy-llama-65b-v8-bf16)|
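
With the `minicpm-v-v2_5-chat` entry registered above, fine-tuning should mirror the cogvlm2 example updated below, swapping in the new model type. A sketch, assuming the same image-caption dataset applies; memory requirements are not stated by this commit, and the table marks vLLM as unsupported for this model:

```sh
# LoRA fine-tuning sketch for the newly registered model type;
# LoRA defaults target q_proj, k_proj, v_proj per the table above.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type minicpm-v-v2_5-chat \
    --dataset coco-en-2-mini
```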

docs/source/Multi-Modal/cogvlm2最佳实践.md

Lines changed: 1 addition & 1 deletion
@@ -154,7 +154,7 @@ road:
 # 70GB GPU memory
 CUDA_VISIBLE_DEVICES=0 swift sft \
     --model_type cogvlm2-19b-chat \
-    --dataset coco-mini-en-2 \
+    --dataset coco-en-2-mini \
 ```

 [Custom datasets](../LLM/自定义与拓展.md#-推荐命令行参数的形式) support json and jsonl formats; below is an example of a custom dataset:

docs/source/Multi-Modal/cogvlm最佳实践.md

Lines changed: 1 addition & 1 deletion
@@ -133,7 +133,7 @@ road:
 # 50GB GPU memory
 CUDA_VISIBLE_DEVICES=0 swift sft \
     --model_type cogvlm-17b-chat \
-    --dataset coco-mini-en-2 \
+    --dataset coco-en-2-mini \
 ```

 [Custom datasets](../LLM/自定义与拓展.md#-推荐命令行参数的形式) support json and jsonl formats; below is an example of a custom dataset:

docs/source/Multi-Modal/deepseek-vl最佳实践.md

Lines changed: 2 additions & 2 deletions
@@ -160,7 +160,7 @@ LoRA fine-tuning:
 # 20GB GPU memory
 CUDA_VISIBLE_DEVICES=0 swift sft \
     --model_type deepseek-vl-7b-chat \
-    --dataset coco-mini-en \
+    --dataset coco-en-mini \
 ```

 Full-parameter fine-tuning:

@@ -169,7 +169,7 @@ CUDA_VISIBLE_DEVICES=0 swift sft \
 # 4 * 70GB GPU memory
 NPROC_PER_NODE=4 CUDA_VISIBLE_DEVICES=0,1,2,3 swift sft \
     --model_type deepseek-vl-7b-chat \
-    --dataset coco-mini-en \
+    --dataset coco-en-mini \
     --sft_type full \
     --use_flash_attn true \
     --deepspeed default-zero2

docs/source/Multi-Modal/index.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 4. [Deepseek-VL最佳实践](deepseek-vl最佳实践.md)
 5. [Yi-VL最佳实践.md](yi-vl最佳实践.md)
 6. [Internlm2-Xcomposers最佳实践](internlm-xcomposer2最佳实践.md)
-7. [MiniCPM-V最佳实践](minicpm-v最佳实践.md), [MiniCPM-V-2最佳实践](minicpm-v-2最佳实践.md)
+7. [MiniCPM-V最佳实践](minicpm-v最佳实践.md), [MiniCPM-V-2最佳实践](minicpm-v-2最佳实践.md), [MiniCPM-V-2.5最佳实践](minicpm-v-2.5最佳实践.md)
 8. [CogVLM最佳实践](cogvlm最佳实践.md), [CogVLM2最佳实践](cogvlm2最佳实践.md)
 9. [mPLUG-Owl2最佳实践](mplug-owl2最佳实践.md)
 10. [InternVL-Chat-V1.5最佳实践](internvl最佳实践.md)
