Description
The model to consider.
The closest model vllm already supports.
No response
What's your difficulty of supporting the model you want?
vllm-project/vllm#16387 is the support PR, but I cannot run it directly. The error is below:
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/kimi_vl.py", line 302, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.language_model = DeepseekV2Model(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/compilation/decorators.py", line 183, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 674, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.start_layer, self.end_layer, self.layers = make_layers(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/utils.py", line 640, in make_layers
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] [PPMissingLayer() for _ in range(start_layer)] + [
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/utils.py", line 641, in <listcomp>
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 676, in <lambda>
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] lambda prefix: DeepseekV2DecoderLayer(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 562, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.self_attn = attn_cls(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/model_executor/models/deepseek_v2.py", line 475, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.mla_attn = Attention(
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm/vllm/attention/layer.py", line 175, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.impl = impl_cls(num_heads, head_size, scale, num_kv_heads,
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] File "/vllm-workspace/main-vllm-ascend/vllm_ascend/attention/mla_v1.py", line 455, in __init__
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] self.rotary_emb = kwargs['rotary_emb']
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] ~~~~~~^^^^^^^^^^^^^^
(EngineCore_0 pid=12216) ERROR 08-27 11:28:55 [core.py:700] KeyError: 'rotary_emb'
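For illustration only, the failure at the bottom of the trace looks like an implementation class that reads a required entry from `**kwargs` while the caller never supplies it. The sketch below reproduces that pattern in isolation. Every name in it (`StrictImpl`, `TolerantImpl`, `try_build`) is hypothetical and is not taken from vllm or vllm-ascend, and it is not a proposed fix; it only shows why `kwargs['rotary_emb']` raises when the key is absent.

```python
class StrictImpl:
    """Hypothetical stand-in for a backend impl that requires an extra kwarg."""

    def __init__(self, num_heads, head_size, **kwargs):
        self.num_heads = num_heads
        self.head_size = head_size
        # Same access pattern as the last frame of the trace:
        # raises KeyError when the caller did not pass rotary_emb.
        self.rotary_emb = kwargs["rotary_emb"]


class TolerantImpl:
    """Hypothetical variant that treats the extra kwarg as optional."""

    def __init__(self, num_heads, head_size, **kwargs):
        self.num_heads = num_heads
        self.head_size = head_size
        # dict.get returns None instead of raising when the key is missing.
        self.rotary_emb = kwargs.get("rotary_emb")


def try_build(impl_cls, **extra):
    """Construct impl_cls with fixed sizes and report what happened."""
    try:
        impl_cls(8, 64, **extra)
        return "ok"
    except KeyError as exc:
        return f"KeyError: {exc}"


if __name__ == "__main__":
    print(try_build(StrictImpl))                       # caller omits rotary_emb
    print(try_build(StrictImpl, rotary_emb=object()))  # caller supplies it
    print(try_build(TolerantImpl))                     # key treated as optional
```

Whether the right change here is for the caller to pass `rotary_emb` or for the implementation to tolerate its absence depends on how the MLA path is meant to be wired, which this sketch does not try to answer.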
Version of vllm and vllm-ascend
