
[New Model]: Kimi-VL models are already supported on vllm, but I cannot run them directly #2579

@YuanCheng-coder


The model to consider.

Kimi-VL-A3B-Instruct

The closest model vllm already supports.

No response

What's your difficulty of supporting the model you want?

vllm-project/vllm#16387 is the support PR, but I cannot run the model directly. The error is below:

(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/kimi_vl.py", line 302, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.language_model = DeepseekV2Model(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/compilation/decorators.py", line 183, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 674, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.start_layer, self.end_layer, self.layers = make_layers(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/utils.py", line 640, in make_layers
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] [PPMissingLayer() for _ in range(start_layer)] + [
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/utils.py", line 641, in <listcomp>
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 676, in <lambda>
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] lambda prefix: DeepseekV2DecoderLayer(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 562, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.self_attn = attn_cls(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 475, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.mla_attn = Attention(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/attention/layer.py", line 175, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.impl = impl_cls(num_heads, head_size, scale, num_kv_heads,
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm-ascend/vllm_ascend/attention/mla_v1.py", line 455, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.rotary_emb = kwargs['rotary_emb']
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ~~~~~~^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] KeyError: 'rotary_emb'
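
For anyone triaging: the last frame shows the vllm-ascend MLA implementation reading `kwargs['rotary_emb']` directly, so the error appears whenever the caller does not forward a `rotary_emb` keyword. One possible cause, which I have not confirmed, is a mismatch between the vllm and vllm-ascend checkouts. The sketch below reproduces only the failure pattern in plain Python. `StrictImpl`, `build_impl`, and `try_build` are hypothetical names made up for illustration and are not the actual vLLM or vllm-ascend code.

```python
class StrictImpl:
    """Hypothetical stand-in for a backend impl that requires `rotary_emb`."""

    def __init__(self, num_heads, head_size, **kwargs):
        self.num_heads = num_heads
        self.head_size = head_size
        # Same access pattern as the last frame of the traceback:
        # a plain dict lookup raises KeyError if the key was not passed.
        self.rotary_emb = kwargs['rotary_emb']


def build_impl(impl_cls, **extra_kwargs):
    """Hypothetical caller that forwards only the kwargs it was given."""
    return impl_cls(16, 64, **extra_kwargs)


def try_build(**extra_kwargs):
    """Return "ok" on success, or a short description of the KeyError."""
    try:
        build_impl(StrictImpl, **extra_kwargs)
        return "ok"
    except KeyError as exc:
        return f"KeyError: {exc}"


if __name__ == "__main__":
    print(try_build())                     # caller omits rotary_emb
    print(try_build(rotary_emb=object()))  # caller forwards rotary_emb
```

If the real cause is similar, the fix would be on whichever side is out of date: either the caller forwards `rotary_emb`, or the implementation stops requiring it.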

Version of vllm and vllm-ascend

Image
