Commit 94aecac

fix
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1 parent be25bfd commit 94aecac

1 file changed: +2 -3 lines changed

lightllm/common/quantization/triton_quant/fp8/fp8w8a8_block_quant_kernel.py

Lines changed: 2 additions & 3 deletions
@@ -19,9 +19,8 @@ def weight_quant_kernel(x_ptr, s_ptr, y_ptr, M, N, BLOCK_SIZE: tl.constexpr):
     amax = tl.max(tl.abs(x))
 
     max_fp8e4m3_val = 448.0
-    scale = amax / (max_fp8e4m3_val + 1e-6)
-
-    y = (x / scale).to(y_ptr.dtype.element_ty)
+    scale = amax / max_fp8e4m3_val
+    y = (x / (scale + 1e-6)).to(y_ptr.dtype.element_ty)
 
     tl.store(y_ptr + offs, y, mask=mask)
     tl.store(s_ptr + pid_m * n_blocks + pid_n, scale)
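
The effect of moving the epsilon is easiest to see on a block whose values are all zero. The sketch below is plain Python, not Triton, and only mirrors the arithmetic in the hunk above. The helper names (`ieee_div`, `quant_before`, `quant_after`) are hypothetical and the final cast to FP8 is left out. It assumes the kernel's division follows IEEE-754 semantics, where 0/0 yields NaN.

```python
import math

FP8_E4M3_MAX = 448.0  # same constant as max_fp8e4m3_val in the diff
EPS = 1e-6            # same epsilon as in the diff


def ieee_div(a, b):
    """Divide the way IEEE-754 hardware does; plain Python raises on b == 0.

    Assumes b is never negative zero, which holds here because scale >= 0.
    """
    if b == 0.0:
        return math.nan if a == 0.0 else math.copysign(math.inf, a)
    return a / b


def quant_before(block):
    """Pre-commit arithmetic: epsilon is added to the FP8 maximum."""
    amax = max(abs(v) for v in block)
    scale = amax / (FP8_E4M3_MAX + EPS)
    # scale is 0.0 for an all-zero block, so this divides 0 by 0.
    return [ieee_div(v, scale) for v in block], scale


def quant_after(block):
    """Post-commit arithmetic: epsilon is added to the divisor instead."""
    amax = max(abs(v) for v in block)
    scale = amax / FP8_E4M3_MAX
    # The divisor is at least EPS, so it is never zero.
    return [v / (scale + EPS) for v in block], scale


zeros = [0.0, 0.0, 0.0, 0.0]
y_old, s_old = quant_before(zeros)  # every entry is NaN, scale is 0.0
y_new, s_new = quant_after(zeros)   # every entry is 0.0, scale is 0.0
```

Under these assumptions the old expression turns an all-zero block into NaNs, while the new one keeps it at zero. In both versions the stored scale for such a block is 0.0.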
