Commit f0a7c13

Apply suggestion from @matthewdouglas
1 parent 22685ac commit f0a7c13

1 file changed (+1, -1 lines)

csrc/kernels.hip

Lines changed: 1 addition & 1 deletion
@@ -2613,7 +2613,7 @@ template <typename T, int THREADS, int BITS> __global__ void kgemm_4bit_inferenc
 // per threadblock:
 // load step-by-step in chunks of [BNB_WARP_SIZE,warps]: 1xBNB_WARP_SIZE * [BNB_WARP_SIZE,warps] -> [1,warps]
 // 4 warps -> 4 loads per iter
-// 1xBNB_WARP_SIZE * BNB_WARP_SIZEx4 -> 1x4 outputs per thread block
+// 1 x BNB_WARP_SIZE * BNB_WARP_SIZE x 4 -> 1x4 outputs per thread block
 typedef hipcub::WarpReduce<float, BNB_WARP_SIZE> WarpReduce;
 __shared__ typename WarpReduce::TempStorage temp_storage[THREADS/BNB_WARP_SIZE];
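
The comment touched by this commit describes a shape calculation: a 1 x BNB_WARP_SIZE slice multiplied by a BNB_WARP_SIZE x 4 tile gives 1x4 outputs per thread block, one output per warp. The following is a minimal host-side C++ sketch of that arithmetic only, not the kernel itself. The names `kWarpSize`, `kWarps`, and `block_step` are hypothetical, the value 32 is an assumption chosen for illustration, and the inner loop merely stands in for the warp-wide sum that `hipcub::WarpReduce` would perform on the device.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for BNB_WARP_SIZE and THREADS / BNB_WARP_SIZE.
// The value 32 is an assumption; the real value comes from the build.
constexpr std::size_t kWarpSize = 32;
constexpr std::size_t kWarps = 4;

// One iteration for one thread block:
// (1 x kWarpSize) * (kWarpSize x kWarps) -> (1 x kWarps).
// b_tile is stored row-major, so element (lane, warp) is at lane * kWarps + warp.
std::array<float, kWarps> block_step(const std::vector<float>& a_slice,
                                     const std::vector<float>& b_tile) {
    assert(a_slice.size() == kWarpSize);
    assert(b_tile.size() == kWarpSize * kWarps);

    std::array<float, kWarps> out{};
    for (std::size_t warp = 0; warp < kWarps; ++warp) {
        // Each lane contributes one product; summing over lanes plays the
        // role of the warp-level reduction.
        float sum = 0.0f;
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            sum += a_slice[lane] * b_tile[lane * kWarps + warp];
        }
        out[warp] = sum;
    }
    return out;
}
```

Each of the four warps ends up with a single scalar, which is why the comment reads "1x4 outputs per thread block".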
