[Bug]: Mistral Small 3.1 image inference is broken on 0.8.4

### Your current environment

<details>
<summary>The output of `python collect_env.py`</summary>

```text
Your output of `python collect_env.py` here
```

</details>


### 🐛 Describe the bug

I tested two releases: `vllm==0.8.3` works fine, but `vllm==0.8.4` fails on the image request below.

Server:
```bash
vllm serve nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8-dynamic --disable-log-requests
```

Client:
```python
from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"
client = OpenAI(api_key=openai_api_key, base_url=openai_api_base)
model_id = client.models.list().data[0].id

# Text inference
chat_response = client.chat.completions.create(
    model=model_id,
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Who are you?"},
        ],
    }],
)
print("Text Chat completion output:", chat_response.choices[0].message.content)

# Single-image input inference
image_url = [
    "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
]
prompt = "What's in this image?"
for img in image_url:
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": img}},
        ],
    }]
    chat_response = client.chat.completions.create(model=model_id, messages=messages)
    print("Single image Chat completion output:", chat_response.choices[0].message.content)
```

Here is the stack trace for the failure when the image request is sent to `vllm==0.8.4`:
```text
INFO:     127.0.0.1:52340 - "POST /v1/chat/completions HTTP/1.1" 200 OK
ERROR 04-15 17:28:02 [core.py:387] EngineCore hit an exception: Traceback (most recent call last):
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 380, in run_engine_core
ERROR 04-15 17:28:02 [core.py:387]     engine_core.run_busy_loop()
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 402, in run_busy_loop
ERROR 04-15 17:28:02 [core.py:387]     self._process_engine_step()
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 431, in _process_engine_step
ERROR 04-15 17:28:02 [core.py:387]     outputs = self.step_fn()
ERROR 04-15 17:28:02 [core.py:387]               ^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 207, in step
ERROR 04-15 17:28:02 [core.py:387]     output = self.model_executor.execute_model(scheduler_output)
ERROR 04-15 17:28:02 [core.py:387]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 77, in execute_model
ERROR 04-15 17:28:02 [core.py:387]     output = self.collective_rpc("execute_model",
ERROR 04-15 17:28:02 [core.py:387]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
ERROR 04-15 17:28:02 [core.py:387]     answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 04-15 17:28:02 [core.py:387]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/utils.py", line 2378, in run_method
ERROR 04-15 17:28:02 [core.py:387]     return func(*args, **kwargs)
ERROR 04-15 17:28:02 [core.py:387]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 04-15 17:28:02 [core.py:387]     return func(*args, **kwargs)
ERROR 04-15 17:28:02 [core.py:387]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 242, in execute_model
ERROR 04-15 17:28:02 [core.py:387]     output = self.model_runner.execute_model(scheduler_output)
ERROR 04-15 17:28:02 [core.py:387]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 04-15 17:28:02 [core.py:387]     return func(*args, **kwargs)
ERROR 04-15 17:28:02 [core.py:387]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1002, in execute_model
ERROR 04-15 17:28:02 [core.py:387]     self._execute_mm_encoder(scheduler_output)
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 888, in _execute_mm_encoder
ERROR 04-15 17:28:02 [core.py:387]     self.encoder_cache[req_id][input_id] = scatter_mm_placeholders(
ERROR 04-15 17:28:02 [core.py:387]                                            ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387]   File "/home/mgoin/venvs/vllm-rel/lib/python3.12/site-packages/vllm/v1/worker/utils.py", line 58, in scatter_mm_placeholders
ERROR 04-15 17:28:02 [core.py:387]     placeholders[is_embed] = embeds
ERROR 04-15 17:28:02 [core.py:387]     ~~~~~~~~~~~~^^^^^^^^^^
ERROR 04-15 17:28:02 [core.py:387] RuntimeError: shape mismatch: value tensor of shape [1980, 5120] cannot be broadcast to indexing result of shape [7920, 5120]
```
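
For triage, the failing line `placeholders[is_embed] = embeds` is a boolean-mask assignment, so it needs one embedding row per `True` entry in the mask. The sketch below is a pure-Python stand-in for that check, not vLLM's actual code: `scatter_rows` is a hypothetical helper written only for illustration. It also works through the two row counts from the error message, which differ by a factor of exactly 4. The trace alone does not say which side holds the wrong count.

```python
def scatter_rows(is_embed, embeds, fill=None):
    """Hypothetical stand-in for `placeholders[is_embed] = embeds`.

    Places one row of `embeds` at every position where `is_embed` is True
    and `fill` everywhere else. Raises if the counts disagree, which is the
    condition the RuntimeError in the trace reports.
    """
    n_slots = sum(1 for flag in is_embed if flag)
    if n_slots != len(embeds):
        raise ValueError(
            f"shape mismatch: {len(embeds)} embedding rows "
            f"for {n_slots} masked slots"
        )
    rows = iter(embeds)
    return [next(rows) if flag else fill for flag in is_embed]


# Row counts copied from the error message above.
VALUE_ROWS = 1980    # rows in the value tensor (`embeds`)
MASKED_SLOTS = 7920  # rows selected by the mask (`is_embed`)

ratio, remainder = divmod(MASKED_SLOTS, VALUE_ROWS)
print(f"mask selects {ratio}x the rows provided (remainder {remainder})")

# Matching counts scatter cleanly.
print(scatter_rows([True, False, True], ["e0", "e1"]))

# A mask with 4x as many True entries as rows fails, as in the trace.
try:
    scatter_rows([True] * 8, ["e0", "e1"])
except ValueError as exc:
    print("failed as expected:", exc)
```

Because the remainder is zero, the mismatch looks like a consistent 4x disagreement between the number of placeholder positions and the number of encoder output rows, not an off-by-a-few error.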


### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
