Commit 5929779

rasmith authored and micah-wil committed
Cast bn to int64 to avoid integer overflow
Signed-off-by: Randall Smith <[email protected]>
1 parent bfe0b20 commit 5929779

1 file changed: +1 -1 lines changed

vllm/attention/ops/prefix_prefill.py

Lines changed: 1 addition & 1 deletion
@@ -151,7 +151,7 @@ def _fwd_kernel(Q,
         start_n = tl.multiple_of(start_n, BLOCK_SIZE)
         # -- compute qk ----
         bn = tl.load(B_Loc + cur_batch * stride_b_loc_b +
-                     (start_n // BLOCK_SIZE) * stride_b_loc_s)
+                     (start_n // BLOCK_SIZE) * stride_b_loc_s).to(tl.int64)
         # [D,BLOCK_SIZE]
         off_k = (
             bn[None, :] * stride_k_cache_bs + cur_kv_head * stride_k_cache_h +
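
Note on the overflow: the commit message says the cast avoids integer overflow, and `bn` is multiplied by `stride_k_cache_bs` right after it is loaded. The pure-Python sketch below illustrates how such a product can wrap around if it is evaluated in 32-bit signed arithmetic. It assumes the loaded block indices are 32-bit, which the diff does not state, and the numeric values and the `wrap_int32` helper are hypothetical and not taken from vLLM.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Emulate two's-complement wraparound of a 32-bit signed integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


# Hypothetical values chosen only for illustration (not from the commit):
block_number = 40_000        # stands in for one element of `bn`
stride_k_cache_bs = 65_536   # elements between consecutive cache blocks

# Python integers are arbitrary precision, so this is the exact product,
# which is also what 64-bit arithmetic would yield for these values.
exact_offset = block_number * stride_k_cache_bs

# The same product as it would appear after wrapping to 32 signed bits.
int32_offset = wrap_int32(exact_offset)

print("exact offset :", exact_offset)
print("int32 offset :", int32_offset)
print("overflowed   :", exact_offset > INT32_MAX)
```

With these values the exact product exceeds `INT32_MAX`, so the wrapped result is negative and would point outside the cache. Casting `bn` to `tl.int64` before the multiplication, as the diff does, keeps the offset computation in 64 bits.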
