Commit 3da9787

hliuca authored and mawong-amd committed
[Bugfix] fix tmp_out and exp_sums dimensions (vllm-project#17438)
Signed-off-by: Hui Liu <96135754+hliuca@users.noreply.github.com>
1 parent 8c2ce97 commit 3da9787

1 file changed: +1 -1 lines changed


vllm/attention/ops/chunked_prefill_paged_decode.py

Lines changed: 1 addition & 1 deletion
@@ -289,7 +289,7 @@ def chunked_prefill_paged_decode(
         max_num_partitions = ((max_seq_len + _PARTITION_SIZE_ROCM - 1) //
                               _PARTITION_SIZE_ROCM)
         assert _PARTITION_SIZE_ROCM % block_size == 0
-        total_num_seq = query.shape[0]
+        total_num_seq = block_table.shape[0]
         tmp_output = torch.empty(
             size=(total_num_seq, num_query_heads, max_num_partitions,
                   head_size),
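
The effect of the one-line change on the allocated shape can be sketched in plain Python, without torch. This is an illustration under stated assumptions, not code from the commit: it assumes `query` is laid out as [num_tokens, num_query_heads, head_size] and `block_table` as [num_seqs, max_blocks_per_seq], so the two first dimensions differ whenever a batch holds more tokens than sequences. The constant `PARTITION_SIZE`, the helper names, and all sizes are hypothetical; the actual value of `_PARTITION_SIZE_ROCM` is not shown in the diff.

```python
# Hypothetical stand-in for _PARTITION_SIZE_ROCM; the real value is not in the diff.
PARTITION_SIZE = 256


def num_partitions(max_seq_len: int, partition_size: int = PARTITION_SIZE) -> int:
    """Ceiling division, same form as the max_num_partitions expression in the diff."""
    return (max_seq_len + partition_size - 1) // partition_size


def tmp_output_shape(total_num_seq: int, num_query_heads: int,
                     max_seq_len: int, head_size: int) -> tuple:
    """Shape tuple in the order passed as size= to torch.empty in the diff."""
    return (total_num_seq, num_query_heads, num_partitions(max_seq_len), head_size)


# Assumed layouts (not confirmed by the diff itself):
#   query:       [num_tokens, num_query_heads, head_size]
#   block_table: [num_seqs, max_blocks_per_seq]
query_shape = (10, 8, 128)    # 10 tokens in the batch
block_table_shape = (3, 64)   # 3 sequences

# First dimension taken from query.shape[0] (before the commit).
before = tmp_output_shape(query_shape[0], 8, 1000, 128)
# First dimension taken from block_table.shape[0] (after the commit).
after = tmp_output_shape(block_table_shape[0], 8, 1000, 128)

print(before)  # (10, 8, 4, 128)
print(after)   # (3, 8, 4, 128)
```

Under these assumptions, the first dimension of `tmp_output` follows the number of sequences rather than the number of tokens after the change, while the partition count is unaffected.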
