Status: Open
Labels: bug (Something isn't working)
Description
⚙️ Your current environment
The output of `python collect_env.py`:
### Environment Information ###
Operating System: `Linux-6.8.0-1043-aws-x86_64-with-glibc2.35`
Python Version: `3.10.12 (main, Nov 4 2025, 08:48:33) [GCC 11.4.0]`
llm-compressor Version: `0.8.1`
compressed-tensors Version: `0.12.2`
transformers Version: `4.53.3`
torch Version: `2.6.0+cu126`
CUDA Devices: `['NVIDIA L40S']`
AMD Devices: `None`
🐛 Describe the bug
Hi team, I've encountered an issue where FP8-quantized Llava-Onevision and Qwen2-VL models produce identical outputs for all input images (e.g., "blue" for red/blue/green images), while the non-quantized models work correctly. FP8-quantized Qwen2.5-VL models work well.
Could you please take a look?
🛠️ Steps to reproduce
- Reproduction: load `nm-testing/llava-onevision-qwen2-7b-ov-hf-FP8-dynamic` or `nm-testing/Qwen2-VL-7B-Instruct-FP8-dynamic` with vLLM and test with differently colored images (red/blue/green): all produce identical outputs. `RedHatAI/Qwen2.5-VL-3B-Instruct-FP8-dynamic` answers correctly.
- Expected: different outputs for different images (as with the non-quantized models)
- Actual: all images produce "blue" / "purple" (identical outputs)

Scripts:
- test_nm_testing_fp8_qwen2vl.py
- test_redhat_fp8.py
- test_nm_testing_fp8.py
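A minimal sketch of the reproduction described above, in case the attached scripts aren't handy. Assumptions: vLLM's offline `LLM` multimodal API (`multi_modal_data={"image": ...}`), one of the model IDs from this report, and a placeholder prompt string — for a real run, build the prompt with the model's own chat template. The helpers at the top are dependency-free; the heavy imports happen only under `__main__`.

```python
def solid_rgb_bytes(color, width=64, height=64):
    """Raw RGB pixel buffer for a solid-color test image."""
    palette = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}
    r, g, b = palette[color]
    return bytes((r, g, b)) * (width * height)


def answers_are_degenerate(answers):
    """True if the model gave the same answer for every color -- the reported bug."""
    return len({a.strip().lower() for a in answers}) == 1


if __name__ == "__main__":
    # GPU-only part: imported here so the helpers above stay standalone.
    from PIL import Image
    from vllm import LLM, SamplingParams

    # Model ID taken from the report; swap in the Llava-Onevision ID to test that path.
    llm = LLM(model="nm-testing/Qwen2-VL-7B-Instruct-FP8-dynamic", max_model_len=4096)
    params = SamplingParams(temperature=0, max_tokens=16)

    answers = []
    for color in ("red", "green", "blue"):
        img = Image.frombytes("RGB", (64, 64), solid_rgb_bytes(color))
        # Placeholder prompt -- use the model's chat template in practice.
        out = llm.generate(
            {"prompt": "What color is this image?", "multi_modal_data": {"image": img}},
            params,
        )
        answers.append(out[0].outputs[0].text)

    print("answers:", answers)
    print("bug reproduced (identical outputs):", answers_are_degenerate(answers))
```

With the affected FP8 checkpoints, `answers_are_degenerate` should come back `True`; with `RedHatAI/Qwen2.5-VL-3B-Instruct-FP8-dynamic` or the unquantized models it should be `False`.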