Commit d72b0be

[XPU]Fix for Qwen-OMNI crash (vllm-project#35249)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
1 parent 42489e4 commit d72b0be

1 file changed: +2 -1 lines changed

vllm/_xpu_ops.py

Lines changed: 2 additions & 1 deletion
@@ -105,9 +105,10 @@ def flash_attn_varlen_func(
 assert len(window_size) == 2
 real_window_size = (window_size[0], window_size[1])  # noqa: F841
 
-# In encode attention, v maybe not contiguous and current
+# In encode attention, k and v maybe not contiguous and current
 # kernel can't handle it
 if block_table is None:
+    k = k.contiguous()
     v = v.contiguous()
 return flash_attn_varlen_func(
     out=out,
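
A minimal sketch of why the added `k = k.contiguous()` matters, assuming PyTorch and a fused QKV projection that is split with `chunk`. The helper name `ensure_contiguous_kv` and the tensor shapes are hypothetical and are not part of vLLM. The sketch only mirrors the guard in the hunk above.

```python
import torch


def ensure_contiguous_kv(k, v, block_table=None):
    """Hypothetical helper mirroring the guard in the hunk above.

    Without a block table (the encoder-attention path), k and v are
    forced into contiguous memory before being handed to the kernel.
    """
    if block_table is None:
        # .contiguous() copies only when the layout is not already
        # contiguous; otherwise it returns the tensor itself.
        k = k.contiguous()
        v = v.contiguous()
    return k, v


# Assumed setup: 2 tokens, hidden size 4, fused QKV of width 3 * 4.
qkv = torch.arange(2 * 12, dtype=torch.float32).reshape(2, 12)

# chunk() returns views into qkv, so each piece keeps a row stride of 12
# and is therefore not contiguous.
q, k, v = qkv.chunk(3, dim=-1)

k_fixed, v_fixed = ensure_contiguous_kv(k, v, block_table=None)

print(k.is_contiguous(), v.is_contiguous())
print(k_fixed.is_contiguous(), v_fixed.is_contiguous())
```

The values are unchanged by the copy; only the memory layout differs. When a block table is passed, the sketch returns the inputs untouched, matching the `if block_table is None:` condition in the diff.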
