This repository was archived by the owner on Sep 4, 2025. It is now read-only.

Commit a5d87a1

re-enable avoid torch slice fix when chunked prefill is disabled (#209)
1 parent: cc2039c

File tree

1 file changed (+1, −1)

vllm/attention/backends/rocm_flash_attn.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -573,7 +573,7 @@ def forward(
                 else:
                     out = output
                 ops.paged_attention_rocm(
-                    output[num_prefill_tokens:],
+                    out,
                     exp_sums,
                     max_logits,
                     tmp_output,
```
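For context, the pattern this one-line change restores is roughly the following. This is a minimal sketch, not the actual vLLM code: the helper name `decode_output_view` is hypothetical, and the real logic lives inline in `forward()` just before the `ops.paged_attention_rocm` call.

```python
import torch


def decode_output_view(output: torch.Tensor,
                       num_prefill_tokens: int) -> torch.Tensor:
    """Choose the tensor the decode kernel should write into.

    Sketch of the pattern restored by this commit (hypothetical helper,
    not the real vLLM API): compute the decode view once, and skip the
    slice entirely when the batch holds no prefill tokens, i.e. when
    chunked prefill is disabled.
    """
    if num_prefill_tokens > 0:
        # Chunked prefill: prefill and decode tokens share one batch, so
        # only the tail of `output` belongs to the decode kernel.
        return output[num_prefill_tokens:]
    # No prefill tokens: the whole tensor is decode output, so the
    # per-step torch slice can be avoided.
    return output
```

Before this commit, `out` was computed this way but the kernel call still passed `output[num_prefill_tokens:]` directly, so the precomputed tensor was never used; the diff above makes the call site consume `out`.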
