Commit 6192bc9
[Bugfix] fix tensor not same device in qwen2_5_vl_without_padding (vllm-project#2051)
bugfix cherry-pick from v0.9.1-dev
vllm-project#2007
### What this PR does / why we need it?
Minimum reproducing code:
```python
# test.py
from vllm import LLM, SamplingParams
prompts = [
"Hello, my name is",
"The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
llm = LLM(model="Qwen2.5-VL-7B-Instruct", max_model_len=26240)
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
prompt = output.prompt
generated_text = output.outputs[0].text
print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```
```bash
export USE_OPTIMIZED_MODEL=0
python test.py
```
exception as follow:
```
[rank0]: File "/home/xxx/vllm_ascend/models/qwen2_5_vl_without_padding.py", line 84, in forward
[rank0]: q = torch_npu.npu_rotary_mul(q, cos, sin)
[rank0]: File "/home/anaconda3/envs/xxx/lib/python3.10/site-packages/torch/_ops.py", line 1116, in __call__
[rank0]: return self._op(*args, **(kwargs or {}))
[rank0]: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, npu:0 and cpu! (when checking argument for argument r1 in method wrapper__npu_rotary_mul)
```
In `AscendQwen2_5_VisionAttention_Without_Padding`,
`torch_npu.npu_rotary_mul(q, cos, sin)`, `cos`/`sin` on cpu, but `q` on
npu, so there will be an error.
`qwen2_5_vl_without_padding.py` need this bugfix, because
`AscendQwen2_5_VisionTransformer_Without_Padding.rot_pos_emb` in
wen2_5_vl_without_padding.py is from vllm and `inv_freq` will create on
cpu.
https://github.com/vllm-project/vllm/blob/40d86ee412eeeca93e0c37432db6b96829cb64e2/vllm/model_executor/models/qwen2_5_vl.py#L482
```python
inv_freq = 1.0 / (theta**(torch.arange(0, dim, 2, dtype=torch.float, device='cpu') / dim))
```
`qwen2_5_vl.py` do not need, because
`AscendQwen2_5_VisionRotaryEmbedding` in qwen2_5_vl.py rewrite
`AscendQwen2_5_VisionRotaryEmbedding` and `inv_freq` will create on
device.
```python
inv_freq = 1.0 / (theta**(torch.arange(0, dim, 2, dtype=torch.float) / dim))
```
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
CI passed with new added/existing test.
- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@18cc33d
Signed-off-by: pjgao <[email protected]>
Co-authored-by: pjgao <[email protected]>1 parent 72eceff commit 6192bc9
File tree
2 files changed
+12
-1
lines changed- tests/ut/models
- vllm_ascend/models
2 files changed
+12
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
231 | 231 | | |
232 | 232 | | |
233 | 233 | | |
| 234 | + | |
| 235 | + | |
234 | 236 | | |
235 | 237 | | |
236 | 238 | | |
| |||
239 | 241 | | |
240 | 242 | | |
241 | 243 | | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
242 | 248 | | |
243 | 249 | | |
244 | 250 | | |
| |||
264 | 270 | | |
265 | 271 | | |
266 | 272 | | |
267 | | - | |
| 273 | + | |
268 | 274 | | |
269 | 275 | | |
270 | 276 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
41 | 41 | | |
42 | 42 | | |
43 | 43 | | |
| 44 | + | |
| 45 | + | |
44 | 46 | | |
45 | 47 | | |
46 | 48 | | |
| |||
160 | 162 | | |
161 | 163 | | |
162 | 164 | | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
163 | 168 | | |
164 | 169 | | |
165 | 170 | | |
| |||
0 commit comments