[New Model]: Kimi-VL models are already supported on vllm, but I cannot run them directly

### The model to consider.

[Kimi-VL-A3B-Instruct](https://huggingface.co/moonshotai/Kimi-VL-A3B-Instruct)


### The closest model vllm already supports.

_No response_

### What's your difficulty of supporting the model you want?

The support PR is https://github.com/vllm-project/vllm/pull/16387, but I cannot run the model directly. The error is below:

(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/model_executor/models/kimi_vl.py", line 302, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     self.language_model = DeepseekV2Model(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                           ^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/compilation/decorators.py", line 183, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 674, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     self.start_layer, self.end_layer, self.layers = make_layers(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                                                     ^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/model_executor/models/utils.py", line 640, in make_layers
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     [PPMissingLayer() for _ in range(start_layer)] + [
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                                                      ^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/model_executor/models/utils.py", line 641, in <listcomp>
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 676, in <lambda>
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     lambda prefix: DeepseekV2DecoderLayer(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                    ^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 562, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     self.self_attn = attn_cls(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                      ^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 475, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     self.mla_attn = Attention(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                     ^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm/vllm/attention/layer.py", line 175, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     self.impl = impl_cls(num_heads, head_size, scale, num_kv_heads,
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]   File "/vllm-workspace/main-vllm-ascend/vllm_ascend/attention/mla_v1.py", line 455, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]     self.rotary_emb = kwargs['rotary_emb']
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700]                       ~~~~~~^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] KeyError: 'rotary_emb'
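
Reading the traceback, the failure pattern appears to be that the attention impl in `vllm_ascend/attention/mla_v1.py` looks up `kwargs['rotary_emb']`, while the call path coming from `kimi_vl.py` through `deepseek_v2.py` and `vllm/attention/layer.py` does not supply that key. The following is only a minimal sketch of that pattern, not the real code: `SketchMLAImpl`, `build_impl`, and `try_build` are hypothetical stand-ins, and the numeric arguments are arbitrary.

```python
# Hypothetical stand-ins that mimic the shape of the failing call.
# These are NOT the real vllm or vllm-ascend classes.


class SketchMLAImpl:
    """Mimics a backend impl that requires 'rotary_emb' in its kwargs."""

    def __init__(self, num_heads, head_size, scale, num_kv_heads, **kwargs):
        self.num_heads = num_heads
        self.head_size = head_size
        self.scale = scale
        self.num_kv_heads = num_kv_heads
        # A plain dict lookup raises KeyError when the caller omits the key.
        self.rotary_emb = kwargs["rotary_emb"]


def build_impl(impl_cls, **extra_impl_args):
    """Mimics a layer that forwards only the extra args it was given."""
    return impl_cls(16, 64, 64 ** -0.5, 16, **extra_impl_args)


def try_build(**extra_impl_args):
    """Return 'ok' on success, or the KeyError text on failure."""
    try:
        build_impl(SketchMLAImpl, **extra_impl_args)
        return "ok"
    except KeyError as exc:
        return f"KeyError: {exc}"


missing = try_build()                     # caller does not pass rotary_emb
present = try_build(rotary_emb=object())  # caller passes a placeholder

print(missing)
print(present)
```

If this reading is right, the mismatch sits between what the upstream model code passes and what the Ascend backend expects, rather than in the Kimi-VL weights or configuration.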

Versions of vllm and vllm-ascend:

<img width="533" height="403" alt="Image" src="https://github.com/user-attachments/assets/b94496e6-ea3d-439c-b1fc-6a7e08ba486f" />





