-- 2024.09.06: Support fine-tuning and inference for mplug-owl3. Use `swift infer --model_type mplug-owl3-7b-chat` to experience it.
+- 2024.09.06: Support fine-tuning and inference for mplug-owl3. Best practices can be found [here](https://github.com/modelscope/ms-swift/issues/1969).
 - 2024.09.05: Support for the minicpm3-4b model. Experience it using `swift infer --model_type minicpm3-4b`.
 - 2024.09.05: Support for the yi-coder series models. Experience it using `swift infer --model_type yi-coder-1_5b-chat`.
 - 🔥2024.08.30: Support for inference and fine-tuning of the qwen2-vl series models: qwen2-vl-2b-instruct, qwen2-vl-7b-instruct. The best practices can be found [here](docs/source_en/Multi-Modal/qwen2-vl-best-practice.md).
docs/source_en/Multi-Modal/index.md (5 additions & 5 deletions)
@@ -14,15 +14,15 @@ A single round of dialogue can contain multiple images (or no images):
 2.[Qwen-Audio Best Practice](qwen-audio-best-practice.md), [Qwen2-Audio Best Practice](https://github.com/modelscope/ms-swift/issues/1653)
 3.[Llava Best Practice](llava-best-practice.md), [LLava Video Best Practice](llava-video-best-practice.md)
 4.[InternVL Series Best Practice](internvl-best-practice.md)
-5.[Deepseek-VL Best Practice](deepseek-vl-best-practice.md)
-6.[Internlm2-Xcomposers Best Practice](internlm-xcomposer2-best-practice.md)
-7.[Phi3-Vision Best Practice](phi3-vision-best-practice.md), [Phi3.5-Vision Best Practice](https://github.com/modelscope/ms-swift/issues/1809).
-
+5.[MiniCPM-V Best Practice](minicpm-v-best-practice.md), [MiniCPM-V-2.6 Best Practice](https://github.com/modelscope/ms-swift/issues/1613)
+6.[Deepseek-VL Best Practice](deepseek-vl-best-practice.md)
+7.[Internlm2-Xcomposers Best Practice](internlm-xcomposer2-best-practice.md)
+8.[Phi3-Vision Best Practice](phi3-vision-best-practice.md), [Phi3.5-Vision Best Practice](https://github.com/modelscope/ms-swift/issues/1809).
+9.[mPLUG-Owl3 Best Practice](https://github.com/modelscope/ms-swift/issues/1969)

 A single round of dialogue can only contain one image:
 1.[Yi-VL Best Practice.md](yi-vl-best-practice.md)
 2.[Florence Best Practice.md](florence-best-pratice.md)

 The entire conversation revolves around one image.
 1.[CogVLM Best Practice](cogvlm-best-practice.md), [CogVLM2 Best Practice](cogvlm2-best-practice.md), [GLM4V Best Practice](glm4v-best-practice.md), [CogVLM2-Video Best Practice](cogvlm2-video-best-practice.md)
-2.[MiniCPM-V Best Practice](minicpm-v-best-practice.md), [MiniCPM-V-2.6 Best Practice](https://github.com/modelscope/ms-swift/issues/1613)