Commit afca0b9

Author: sangchengmeng
Message: (fix) fix rms_norm op
1 parent 14f1974, commit afca0b9

1 file changed: +1 -1 lines changed

lightllm/models/llama/triton_kernel/rmsnorm.py

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@ def rmsnorm_forward(x: torch.Tensor, weight, eps, out=None):
     if N > BLOCK_SIZE:
         raise RuntimeError("This layer norm doesn't support feature dim >= 64KB.")
     # heuristics for number of warps
-    num_warps = min(max(BLOCK_SIZE // 256, 1), 8)
+    num_warps = min(max(BLOCK_SIZE // 256, 1), 4)
     num_warps = triton.next_power_of_2(num_warps)
     if BLOCK_SIZE > 16384:
         BLOCK_SIZE = 16384
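
To see what the one-line change does to the heuristic, here is a minimal sketch that evaluates the expression from the diff with the old cap (8) and the new cap (4). It makes several assumptions: `pick_num_warps` and its `cap` parameter are hypothetical names introduced only for this illustration, `next_power_of_2` is a pure-Python stand-in for `triton.next_power_of_2` so the snippet runs without Triton installed, and the sample `BLOCK_SIZE` values are arbitrary. It does not claim to explain why the commit lowers the cap.

```python
def next_power_of_2(n: int) -> int:
    """Pure-Python stand-in for triton.next_power_of_2 (assumed: smallest power of two >= n)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def pick_num_warps(block_size: int, cap: int) -> int:
    """Hypothetical helper: the heuristic from the diff with the upper bound as a parameter."""
    num_warps = min(max(block_size // 256, 1), cap)
    return next_power_of_2(num_warps)


if __name__ == "__main__":
    # Arbitrary sample block sizes, chosen only for illustration.
    for block_size in (128, 256, 1024, 2048, 4096, 16384):
        old = pick_num_warps(block_size, cap=8)  # before the commit
        new = pick_num_warps(block_size, cap=4)  # after the commit
        print(f"BLOCK_SIZE={block_size:>5}  old num_warps={old}  new num_warps={new}")
```

Under these assumptions the two versions agree whenever `BLOCK_SIZE // 256` is at most 4 and differ only for `BLOCK_SIZE >= 1280`, where the old expression yields 8 warps and the new one yields 4.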
