
Commit c6c1cdf

support LLava-Next (Stronger) model (#933)
1 parent 9cff868 commit c6c1cdf

File tree: 8 files changed (+119, -16 lines)

README.md

Lines changed: 2 additions & 0 deletions
@@ -41,6 +41,7 @@ Additionally, we are expanding capabilities for other modalities. Currently, we
 SWIFT has rich documentation for users; please check [here](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM).

 ## 🎉 News
+- 2024.05.16: Supports Llava-Next (Stronger) series models. For best practice, you can refer to [here](https://github.com/modelscope/swift/tree/main/docs/source_en/Multi-Modal/llava-best-practice.md).
 - 🔥2024.05.13: Support Yi-1.5 series models, use `--model_type yi-1_5-9b-chat` to begin!
 - 2024.05.11: Support for qlora training and quantized inference using [hqq](https://github.com/mobiusml/hqq) and [eetq](https://github.com/NetEase-FuXi/EETQ). For more information, see the [LLM Quantization Documentation](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM/LLM-quantization.md).
 - 2024.05.10: Support splitting a sequence across multiple GPUs to reduce memory usage. Use this feature with `pip install .[seq_parallel]`, then add `--sequence_parallel_size n` to your DDP script to begin!
@@ -514,6 +515,7 @@ The complete list of supported models and datasets can be found at [Supported Mo
 | MiniCPM-V | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B | chat model |
 | CogVLM<br>CogAgent | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | English | 17B-18B | chat model |
 | Llava | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
+| Llava-Next | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 8B-110B | chat model |
 | mPLUG-Owl | [mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl) | English | 11B | chat model |
 | InternVL | [InternVL](https://github.com/OpenGVLab/InternVL) | Chinese<br>English | 25.5B<br>including quantized version | chat model |
 | Llava-llama3 | [xtuner](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers) | English | 8B | chat model |
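For orientation, the new entries plug into SWIFT's regular entry points. A minimal sketch of trying one of the newly supported Llava-Next checkpoints (assuming the `infer_main`/`InferArguments` quick-start API from `swift.llm` and a single visible GPU; not part of this commit) might look like:

```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

# Hedged sketch: interactive inference with the `llama3-llava-next-8b`
# model type registered by this change. `infer_main` and `InferArguments`
# are assumed from SWIFT's quick-start API.
from swift.llm import InferArguments, infer_main

if __name__ == '__main__':
    infer_args = InferArguments(model_type='llama3-llava-next-8b')
    infer_main(infer_args)
```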

README_CN.md

Lines changed: 2 additions & 0 deletions
@@ -42,6 +42,7 @@ SWIFT supports training, inference, evaluation and deployment of nearly **200 LLMs and MLLMs** (multi-modal large models)
 SWIFT has rich documentation; if you run into problems, please check [here](https://github.com/modelscope/swift/tree/main/docs/source/LLM).

 ## 🎉 News
+- 2024.05.16: Support for the Llava-Next (Stronger) series models; the best practice can be found [here](https://github.com/modelscope/swift/tree/main/docs/source/Multi-Modal/llava最佳实践.md).
 - 🔥2024.05.13: Support for the Yi-1.5 series models; use `--model_type yi-1_5-9b-chat` and similar to get started.
 - 2024.05.11: Support qlora training and quantized inference with [hqq](https://github.com/mobiusml/hqq) and [eetq](https://github.com/NetEase-FuXi/EETQ); see the [LLM Quantization Documentation](https://github.com/modelscope/swift/tree/main/docs/source/LLM/LLM量化文档.md).
 - 2024.05.10: Support sequence parallelism. First install with `pip install .[seq_parallel]`, then add `--sequence_parallel_size n` in your DDP setup to use it!
@@ -514,6 +515,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 | MiniCPM-V | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B | chat model |
 | CogVLM<br>CogAgent | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | English | 17B-18B | chat model |
 | Llava | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
+| Llava-Next | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 8B-110B | chat model |
 | mPLUG-Owl | [mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl) | English | 11B | chat model |
 | InternVL | [InternVL](https://github.com/OpenGVLab/InternVL) | Chinese<br>English | 25.5B<br>including quantized version | chat model |
 | Llava-llama3 | [xtuner](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers) | English | 8B | chat model |

docs/source/LLM/支持的模型和数据集.md

Lines changed: 3 additions & 0 deletions
@@ -123,6 +123,9 @@
 |atom-7b-chat|[FlagAlpha/Atom-7B-Chat](https://modelscope.cn/models/FlagAlpha/Atom-7B-Chat/summary)|q_proj, k_proj, v_proj|atom|&#x2714;|&#x2714;||-|[FlagAlpha/Atom-7B-Chat](https://huggingface.co/FlagAlpha/Atom-7B-Chat)|
 |llava1d6-mistral-7b-instruct|[AI-ModelScope/llava-v1.6-mistral-7b](https://modelscope.cn/models/AI-ModelScope/llava-v1.6-mistral-7b/summary)|q_proj, k_proj, v_proj|llava-mistral-instruct|&#x2714;|&#x2718;|transformers>=4.34|multi-modal, vision|[liuhaotian/llava-v1.6-mistral-7b](https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b)|
 |llava1d6-yi-34b-instruct|[AI-ModelScope/llava-v1.6-34b](https://modelscope.cn/models/AI-ModelScope/llava-v1.6-34b/summary)|q_proj, k_proj, v_proj|llava-yi-instruct|&#x2714;|&#x2718;||multi-modal, vision|[liuhaotian/llava-v1.6-34b](https://huggingface.co/liuhaotian/llava-v1.6-34b)|
+|llama3-llava-next-8b|[AI-Modelscope/llama3-llava-next-8b](https://modelscope.cn/models/AI-Modelscope/llama3-llava-next-8b/summary)|q_proj, k_proj, v_proj|llama-llava-next|&#x2714;|&#x2718;||multi-modal, vision|[lmms-lab/llama3-llava-next-8b](https://huggingface.co/lmms-lab/llama3-llava-next-8b)|
+|llava-next-72b|[AI-Modelscope/llava-next-72b](https://modelscope.cn/models/AI-Modelscope/llava-next-72b/summary)|q_proj, k_proj, v_proj|llava-qwen-instruct|&#x2714;|&#x2718;||multi-modal, vision|[lmms-lab/llava-next-72b](https://huggingface.co/lmms-lab/llava-next-72b)|
+|llava-next-110b|[AI-Modelscope/llava-next-110b](https://modelscope.cn/models/AI-Modelscope/llava-next-110b/summary)|q_proj, k_proj, v_proj|llava-qwen-instruct|&#x2714;|&#x2718;||multi-modal, vision|[lmms-lab/llava-next-110b](https://huggingface.co/lmms-lab/llava-next-110b)|
 |yi-6b|[01ai/Yi-6B](https://modelscope.cn/models/01ai/Yi-6B/summary)|q_proj, k_proj, v_proj|default-generation|&#x2714;|&#x2714;||-|[01-ai/Yi-6B](https://huggingface.co/01-ai/Yi-6B)|
 |yi-6b-200k|[01ai/Yi-6B-200K](https://modelscope.cn/models/01ai/Yi-6B-200K/summary)|q_proj, k_proj, v_proj|default-generation|&#x2714;|&#x2714;||-|[01-ai/Yi-6B-200K](https://huggingface.co/01-ai/Yi-6B-200K)|
 |yi-6b-chat|[01ai/Yi-6B-Chat](https://modelscope.cn/models/01ai/Yi-6B-Chat/summary)|q_proj, k_proj, v_proj|yi|&#x2714;|&#x2714;||-|[01-ai/Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat)|

docs/source/Multi-Modal/llava最佳实践.md

Lines changed: 15 additions & 6 deletions
@@ -1,5 +1,16 @@
 
 # Llava Best Practices
+The models covered by this document:
+
+| model | model_type |
+|-------|------------|
+| [llava-v1.6-mistral-7b](https://modelscope.cn/models/AI-ModelScope/llava-v1.6-mistral-7b/summary) | llava1d6-mistral-7b-instruct |
+| [llava-v1.6-34b](https://www.modelscope.cn/models/AI-ModelScope/llava-v1.6-34b/summary) | llava1d6-yi-34b-instruct |
+| [llama3-llava-next-8b](https://modelscope.cn/models/AI-ModelScope/llama3-llava-next-8b/summary) | llama3-llava-next-8b |
+| [llava-next-72b](https://modelscope.cn/models/AI-ModelScope/llava-next-72b/summary) | llava-next-72b |
+| [llava-next-110b](https://modelscope.cn/models/AI-ModelScope/llava-next-110b/summary) | llava-next-110b |
+
+The following practice takes `llava-v1.6-mistral-7b` as an example; you can also switch to other models by specifying `--model_type`.
 
 ## Table of Contents
 - [Environment Setup](#环境准备)
@@ -16,10 +27,8 @@ pip install -e '.[llm]'
 ```
 
 ## Inference
-
-Inference with [llava1d6-mistral-7b-instruct](https://modelscope.cn/models/AI-ModelScope/llava-v1.6-mistral-7b/summary) and [llava1d6-yi-34b-instruct](https://www.modelscope.cn/models/AI-ModelScope/llava-v1.6-34b/summary):
 ```shell
-# Experimental environment: A10, 3090, V100...
+# Experimental environment: A100
 # 20GB GPU memory
 CUDA_VISIBLE_DEVICES=0 swift infer --model_type llava1d6-mistral-7b-instruct
 
@@ -110,7 +119,7 @@ from swift.llm import (
 from swift.utils import seed_everything
 import torch
 
-model_type = ModelType.llava1d6_mistral_7b_instruct # ModelType.llava1d6_yi_34b_instruct
+model_type = 'llava1d6-mistral-7b-instruct'
 template_type = get_default_template_type(model_type)
 print(f'template_type: {template_type}')
 
@@ -208,7 +217,7 @@ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 swift sft \
 ## Inference after Fine-tuning
 Direct inference:
 ```shell
-model_type="llava1d6-mistral-7b-instruct" # "llava1d6-yi-34b-instruct"
+model_type="llava1d6-mistral-7b-instruct"
 
 CUDA_VISIBLE_DEVICES=0 swift infer \
     --ckpt_dir output/${model_type}/vx-xxx/checkpoint-xxx \
@@ -217,7 +226,7 @@ CUDA_VISIBLE_DEVICES=0 swift infer \
 
 **merge-lora** and inference:
 ```shell
-model_type="llava1d6-mistral-7b-instruct" # "llava1d6-yi-34b-instruct"
+model_type="llava1d6-mistral-7b-instruct"
 CUDA_VISIBLE_DEVICES=0 swift export \
     --ckpt_dir "output/${model_type}/vx-xxx/checkpoint-xxx" \
     --merge_lora true

docs/source_en/LLM/Supported-models-datasets.md

Lines changed: 3 additions & 0 deletions
@@ -123,6 +123,9 @@ The table below introduces all models supported by SWIFT:
 |atom-7b-chat|[FlagAlpha/Atom-7B-Chat](https://modelscope.cn/models/FlagAlpha/Atom-7B-Chat/summary)|q_proj, k_proj, v_proj|atom|&#x2714;|&#x2714;||-|[FlagAlpha/Atom-7B-Chat](https://huggingface.co/FlagAlpha/Atom-7B-Chat)|
 |llava1d6-mistral-7b-instruct|[AI-ModelScope/llava-v1.6-mistral-7b](https://modelscope.cn/models/AI-ModelScope/llava-v1.6-mistral-7b/summary)|q_proj, k_proj, v_proj|llava-mistral-instruct|&#x2714;|&#x2718;|transformers>=4.34|multi-modal, vision|[liuhaotian/llava-v1.6-mistral-7b](https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b)|
 |llava1d6-yi-34b-instruct|[AI-ModelScope/llava-v1.6-34b](https://modelscope.cn/models/AI-ModelScope/llava-v1.6-34b/summary)|q_proj, k_proj, v_proj|llava-yi-instruct|&#x2714;|&#x2718;||multi-modal, vision|[liuhaotian/llava-v1.6-34b](https://huggingface.co/liuhaotian/llava-v1.6-34b)|
+|llama3-llava-next-8b|[AI-Modelscope/llama3-llava-next-8b](https://modelscope.cn/models/AI-Modelscope/llama3-llava-next-8b/summary)|q_proj, k_proj, v_proj|llama-llava-next|&#x2714;|&#x2718;||multi-modal, vision|[lmms-lab/llama3-llava-next-8b](https://huggingface.co/lmms-lab/llama3-llava-next-8b)|
+|llava-next-72b|[AI-Modelscope/llava-next-72b](https://modelscope.cn/models/AI-Modelscope/llava-next-72b/summary)|q_proj, k_proj, v_proj|llava-qwen-instruct|&#x2714;|&#x2718;||multi-modal, vision|[lmms-lab/llava-next-72b](https://huggingface.co/lmms-lab/llava-next-72b)|
+|llava-next-110b|[AI-Modelscope/llava-next-110b](https://modelscope.cn/models/AI-Modelscope/llava-next-110b/summary)|q_proj, k_proj, v_proj|llava-qwen-instruct|&#x2714;|&#x2718;||multi-modal, vision|[lmms-lab/llava-next-110b](https://huggingface.co/lmms-lab/llava-next-110b)|
 |yi-6b|[01ai/Yi-6B](https://modelscope.cn/models/01ai/Yi-6B/summary)|q_proj, k_proj, v_proj|default-generation|&#x2714;|&#x2714;||-|[01-ai/Yi-6B](https://huggingface.co/01-ai/Yi-6B)|
 |yi-6b-200k|[01ai/Yi-6B-200K](https://modelscope.cn/models/01ai/Yi-6B-200K/summary)|q_proj, k_proj, v_proj|default-generation|&#x2714;|&#x2714;||-|[01-ai/Yi-6B-200K](https://huggingface.co/01-ai/Yi-6B-200K)|
 |yi-6b-chat|[01ai/Yi-6B-Chat](https://modelscope.cn/models/01ai/Yi-6B-Chat/summary)|q_proj, k_proj, v_proj|yi|&#x2714;|&#x2714;||-|[01-ai/Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat)|

docs/source_en/Multi-Modal/llava-best-practice.md

Lines changed: 16 additions & 6 deletions
@@ -1,4 +1,16 @@
 # Llava Best Practices
+The document corresponds to the following models
+
+| model | model_type |
+|-------|------------|
+| [llava-v1.6-mistral-7b](https://modelscope.cn/models/AI-ModelScope/llava-v1.6-mistral-7b/summary) | llava1d6-mistral-7b-instruct |
+| [llava-v1.6-34b](https://www.modelscope.cn/models/AI-ModelScope/llava-v1.6-34b/summary) | llava1d6-yi-34b-instruct |
+| [llama3-llava-next-8b](https://modelscope.cn/models/AI-ModelScope/llama3-llava-next-8b/summary) | llama3-llava-next-8b |
+| [llava-next-72b](https://modelscope.cn/models/AI-ModelScope/llava-next-72b/summary) | llava-next-72b |
+| [llava-next-110b](https://modelscope.cn/models/AI-ModelScope/llava-next-110b/summary) | llava-next-110b |
+
+The following practices take `llava-v1.6-mistral-7b` as an example. You can also switch to other models by specifying the `--model_type`.
+
 
 ## Table of Contents
 - [Environment Setup](#environment-setup)
@@ -14,10 +26,8 @@ pip install -e '.[llm]'
 ```
 
 ## Inference
-
-Inference for [llava1d6-mistral-7b-instruct](https://modelscope.cn/models/AI-ModelScope/llava-v1.6-mistral-7b/summary) and [llava1d6-yi-34b-instruct](https://www.modelscope.cn/models/AI-ModelScope/llava-v1.6-34b/summary):
 ```shell
-# Experimental environment: A10, 3090, V100...
+# Experimental environment: A100
 # 20GB GPU memory
 CUDA_VISIBLE_DEVICES=0 swift infer --model_type llava1d6-mistral-7b-instruct
 
@@ -108,7 +118,7 @@ from swift.llm import (
 from swift.utils import seed_everything
 import torch
 
-model_type = ModelType.llava1d6_mistral_7b_instruct # ModelType.llava1d6_yi_34b_instruct
+model_type = 'llava1d6-mistral-7b-instruct'
 template_type = get_default_template_type(model_type)
 print(f'template_type: {template_type}')
 
@@ -205,15 +215,15 @@ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 swift sft \
 ## Inference after Fine-tuning
 Direct inference:
 ```shell
-model_type="llava1d6-mistral-7b-instruct" # "llava1d6-yi-34b-instruct"
+model_type="llava1d6-mistral-7b-instruct"
 CUDA_VISIBLE_DEVICES=0 swift infer \
     --ckpt_dir output/${model_type}/vx-xxx/checkpoint-xxx \
     --load_dataset_config true
 ```
 
 **merge-lora** and inference:
 ```shell
-model_type="llava1d6-mistral-7b-instruct" # "llava1d6-yi-34b-instruct"
+model_type="llava1d6-mistral-7b-instruct"
 CUDA_VISIBLE_DEVICES=0 swift export \
     --ckpt_dir "output/${model_type}/vx-xxx/checkpoint-xxx" \
     --merge_lora true
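Since the best-practice doc now switches models purely through `model_type`, a small sketch (using the `get_default_template_type` helper already shown in the doc's own Python snippet) can confirm which chat template each new identifier resolves to; the expected mapping follows the supported-models tables in this commit:

```python
# Hedged sketch: look up the default template for each Llava-Next model type
# added in this commit. `get_default_template_type` is the helper used in the
# best-practice snippet above.
from swift.llm import get_default_template_type

for model_type in ['llama3-llava-next-8b', 'llava-next-72b', 'llava-next-110b']:
    print(f'{model_type} -> {get_default_template_type(model_type)}')

# Per the supported-models tables, this should print 'llama-llava-next' for
# the 8B model and 'llava-qwen-instruct' for the 72B/110B models.
```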

swift/llm/utils/model.py

Lines changed: 42 additions & 4 deletions
@@ -160,6 +160,9 @@ class ModelType:
     # llava
     llava1d6_mistral_7b_instruct = 'llava1d6-mistral-7b-instruct'
     llava1d6_yi_34b_instruct = 'llava1d6-yi-34b-instruct'
+    llama3_llava_next_8b = 'llama3-llava-next-8b'
+    llava_next_72b = 'llava-next-72b'
+    llava_next_110b = 'llava-next-110b'
     # yi
     yi_6b = 'yi-6b'
     yi_6b_200k = 'yi-6b-200k'
@@ -3910,23 +3913,53 @@ def _new_generate(inputs=None, *args, **kwargs):
     function_kwargs={'llm_model_type': 'mistral'},
     tags=['multi-modal', 'vision'],
     hf_model_id='liuhaotian/llava-v1.6-mistral-7b')
+@register_model(
+    ModelType.llama3_llava_next_8b,
+    'AI-Modelscope/llama3-llava-next-8b',
+    LoRATM.llama2,
+    TemplateType.llama_llava_next,
+    support_flash_attn=True,
+    tags=['multi-modal', 'vision'],
+    function_kwargs={'llm_model_type': 'next_llama'},
+    hf_model_id='lmms-lab/llama3-llava-next-8b')
+@register_model(
+    ModelType.llava_next_72b,
+    'AI-Modelscope/llava-next-72b',
+    LoRATM.llama2,
+    TemplateType.llava_qwen_instruct,
+    support_flash_attn=True,
+    tags=['multi-modal', 'vision'],
+    function_kwargs={'llm_model_type': 'next_qwen'},
+    hf_model_id='lmms-lab/llava-next-72b')
+@register_model(
+    ModelType.llava_next_110b,
+    'AI-Modelscope/llava-next-110b',
+    LoRATM.llama2,
+    TemplateType.llava_qwen_instruct,
+    support_flash_attn=True,
+    tags=['multi-modal', 'vision'],
+    function_kwargs={'llm_model_type': 'next_qwen'},
+    hf_model_id='lmms-lab/llava-next-110b')
 def get_model_tokenizer_llava(model_dir: str,
                               torch_dtype: Dtype,
                               model_kwargs: Dict[str, Any],
                               load_model: bool = True,
                               **kwargs):
+    llm_model_type = kwargs.pop('llm_model_type')
     if 'local_repo_path' in kwargs:
-        local_repo_path = kwargs['local_repo_path']
+        repo_path = kwargs['local_repo_path']
+    elif 'next' in llm_model_type:
+        repo_path = 'https://github.com/LLaVA-VL/LLaVA-NeXT.git'
     else:
-        local_repo_path = _git_clone_github('https://github.com/haotian-liu/LLaVA.git')
+        repo_path = 'https://github.com/haotian-liu/LLaVA.git'
+    local_repo_path = _git_clone_github(repo_path)
     sys.path.append(os.path.join(local_repo_path))
 
-    llm_model_type = kwargs.pop('llm_model_type')
     if llm_model_type == 'mistral':
         from llava.model import LlavaMistralForCausalLM, LlavaMistralConfig
         model_config = LlavaMistralConfig.from_pretrained(model_dir)
         automodel_class = LlavaMistralForCausalLM
-    else:  # llama
+    elif 'llama' in llm_model_type:  # llama
         from llava.model import LlavaLlamaForCausalLM, LlavaConfig
         if not hasattr(LlavaLlamaForCausalLM, '__old_forward'):  # Avoid double patching
             forward = LlavaLlamaForCausalLM.forward
@@ -3940,6 +3973,11 @@ def _new_forward(*args, **kwargs):
             LlavaLlamaForCausalLM.forward = _new_forward
         model_config = LlavaConfig.from_pretrained(model_dir)
         automodel_class = LlavaLlamaForCausalLM
+    else:  # qwen
+        from llava.model import LlavaQwenForCausalLM
+        automodel_class = LlavaQwenForCausalLM
+        model_config = AutoConfig.from_pretrained(model_dir)
+
     model_config.mm_vision_tower = snapshot_download('AI-ModelScope/clip-vit-large-patch14-336')
     model, tokenizer = get_model_tokenizer_with_flash_attn(
         model_dir,
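To make the new routing easier to follow, here is a standalone restatement (plain Python, not part of the patch) of how the added `llm_model_type` values drive both the repo that gets cloned and the LLaVA model class that gets loaded:

```python
# Simplified restatement of the dispatch in get_model_tokenizer_llava above.
# Model classes are given as strings here; the real code imports them from the
# cloned llava package.
def pick_repo_and_class(llm_model_type: str) -> tuple:
    if 'next' in llm_model_type:            # 'next_llama', 'next_qwen' -> LLaVA-NeXT repo
        repo = 'https://github.com/LLaVA-VL/LLaVA-NeXT.git'
    else:                                    # 'mistral', 'llama' -> original LLaVA repo
        repo = 'https://github.com/haotian-liu/LLaVA.git'

    if llm_model_type == 'mistral':
        cls = 'LlavaMistralForCausalLM'
    elif 'llama' in llm_model_type:          # 'llama' and 'next_llama'
        cls = 'LlavaLlamaForCausalLM'
    else:                                    # 'next_qwen'
        cls = 'LlavaQwenForCausalLM'
    return repo, cls


assert pick_repo_and_class('next_qwen') == (
    'https://github.com/LLaVA-VL/LLaVA-NeXT.git', 'LlavaQwenForCausalLM')
assert pick_repo_and_class('mistral') == (
    'https://github.com/haotian-liu/LLaVA.git', 'LlavaMistralForCausalLM')
```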

swift/llm/utils/template.py

Lines changed: 36 additions & 0 deletions
@@ -38,6 +38,8 @@ class TemplateType:
     llava_mistral_instruct = 'llava-mistral-instruct'
     llava_yi_instruct = 'llava-yi-instruct'
     llava_llama_instruct = 'llava-llama-instruct'
+    llava_qwen_instruct = 'llava-qwen-instruct'
+    llama_llava_next = 'llama-llava-next'
     openbuddy = 'openbuddy'
     openbuddy2 = 'openbuddy2'
     internlm = 'internlm'
@@ -1060,6 +1062,40 @@ def data_collator(self, batch: List[Dict[str, Any]], padding_to: Optional[int] =
     lazy_tokenize=True)
 
 
+class LlamaLlavaNextTemplate(LLavaTemplate):
+    default_system = 'You are a helpful language and vision assistant. ' \
+                     'You are able to understand the visual content that the user provides, ' \
+                     'and assist the user with a variety of tasks using natural language.'
+
+    def __init__(self):
+        Template.__init__(self, [], [
+            '<|start_header_id|>user<|end_header_id|>\n\n', [-200],
+            '\n{{QUERY}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n'
+        ], ['<|eot_id|>'], ['<|eot_id|>'], self.default_system,
+                          ['<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{{SYSTEM}}'])
+
+
+register_template(
+    TemplateType.llama_llava_next,
+    LlamaLlavaNextTemplate(),
+    use_model=True,
+    infer_media_type='round',
+    lazy_tokenize=True)
+
+
+class LLavaQwenTemplate(LLavaTemplate):
+    llavayi_query_template = 'You are a helpful assistant'
+
+    def __init__(self):
+        Template.__init__(self, [], ['<|im_start|>user\n', [-200], '{{QUERY}}<|im_end|>\n<|im_start|>assistant\n'],
+                          ['<|im_end|>\n'], ['<|im_end|>'], self.llavayi_query_template,
+                          ['<|im_start|>system\n{{SYSTEM}}<|im_end|>\n'])
+
+
+register_template(
+    TemplateType.llava_qwen_instruct, LLavaQwenTemplate(), use_model=True, infer_media_type='round', lazy_tokenize=True)
+
+
 def _findall(token_list: List[int], token: int) -> List[int]:
     """Find the index of a token in the token_list."""
     res = []
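As an illustration of what the new `llava-qwen-instruct` template produces, the pieces above concatenate into a ChatML-style prompt. A rough text-level rendering (the real Template class operates on token ids and inserts the image placeholder token -200 where `<image>` appears below) could look like:

```python
# Illustration only: approximate text form of the new llava-qwen-instruct
# template for a single-turn query. The [-200] entry in the template is the
# image placeholder token; '<image>' stands in for it here.
IMAGE_PLACEHOLDER = '<image>'


def render_llava_qwen(query: str, system: str = 'You are a helpful assistant') -> str:
    return (f'<|im_start|>system\n{system}<|im_end|>\n'
            f'<|im_start|>user\n{IMAGE_PLACEHOLDER}{query}<|im_end|>\n'
            f'<|im_start|>assistant\n')


print(render_llava_qwen('How many sheep are in the picture?'))
```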

0 commit comments
