Commit bfa4b4f (1 parent: 142190e)

Don't clamp FP32 residual during quantization

File tree: 1 file changed, +2 −1 lines

exllamav2/conversion/measure.py (2 additions, 1 deletion)
@@ -131,7 +131,8 @@ def test_error(module, hidden_states, target_states, cache, attn_params):
     x = x.cuda()
     xref = xref.cuda()
     xtest = module.forward(x, cache, attn_params)
-    xtest.clamp_(-65504, 65504)
+    if not module.model.config.arch.lm.residual_stream_fp32:
+        xtest.clamp_(-65504, 65504)
     xtest = xtest[0].float()
     xref = xref[0].float()
     rfn_sum += torch.linalg.norm(xtest - xref, 'fro') / torch.linalg.norm(xref, 'fro')
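For context: the clamp to ±65504 guards against values that exceed FP16's largest finite value before the error is measured, but for architectures that keep the residual stream in FP32 it would needlessly truncate valid activations, so the commit makes it conditional. A minimal sketch of the measurement step, with `rfn_error` and its arguments as hypothetical stand-ins for the module state used in `measure.py`:

```python
import torch

def rfn_error(xtest: torch.Tensor, xref: torch.Tensor,
              residual_stream_fp32: bool) -> float:
    # 65504 is the largest finite FP16 value; clamping keeps the comparison
    # finite when the residual stream is stored in half precision.
    if not residual_stream_fp32:
        xtest = xtest.clamp(-65504, 65504)
    xtest = xtest.float()
    xref = xref.float()
    # Relative Frobenius-norm error, as accumulated into rfn_sum above
    return (torch.linalg.norm(xtest - xref, 'fro')
            / torch.linalg.norm(xref, 'fro')).item()

# An FP32 residual value beyond the FP16 range is no longer clamped,
# so comparing a tensor against itself now yields zero error.
x = torch.tensor([[70000.0, 1.0]])
print(rfn_error(x, x, residual_stream_fp32=True))   # -> 0.0
```

With `residual_stream_fp32=False` the same call reports a nonzero error, since the clamp alters `xtest` but not `xref`; that is exactly the distortion the commit removes for FP32 residual streams.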

0 commit comments