Commit 2dd1b62

Fix per-token dynamic quant (#393)
1 parent 6625cd3 commit 2dd1b62

1 file changed: +1 -1 lines changed

src/compressed_tensors/quantization/utils/helpers.py

Lines changed: 1 addition & 1 deletion
@@ -165,7 +165,7 @@ def compute_dynamic_scales_and_zp(
 
     keep_dims = True
     if args.strategy == QuantizationStrategy.TOKEN:
-        dim = {1, 2}
+        dim = {0, 1}
         reduce_dims = tuple(idx for idx in range(value.ndim) if idx not in dim)
     elif args.strategy == QuantizationStrategy.TENSOR:
         reduce_dims = None
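
To make the effect of the one-line change concrete, here is a minimal pure-Python sketch. It assumes activations are laid out as (batch, seq_len, hidden), which the commit does not state, and it assumes the statistics are later reduced over `reduce_dims` with dimensions kept. The helper names `reduce_dims_for` and `per_token_absmax` are hypothetical and are not part of compressed_tensors.

```python
def reduce_dims_for(ndim, keep):
    """Same expression as in the diff: reduce over every axis not in `keep`."""
    return tuple(idx for idx in range(ndim) if idx not in keep)


# Assumed layout (not stated in the commit): (batch, seq_len, hidden).
ndim = 3
old_reduce = reduce_dims_for(ndim, {1, 2})  # (0,): reduces across the batch axis
new_reduce = reduce_dims_for(ndim, {0, 1})  # (2,): reduces across the hidden axis


def per_token_absmax(x):
    """Hypothetical illustration: abs-max over the hidden axis of a nested
    list shaped (batch, seq_len, hidden), keeping a trailing axis of size 1."""
    return [[[max(abs(v) for v in token)] for token in seq] for seq in x]


# Two sequences of two tokens each, hidden size 2.
x = [
    [[1.0, -4.0], [0.5, 2.0]],
    [[-3.0, 1.0], [8.0, -0.25]],
]
stats = per_token_absmax(x)  # one statistic per (batch, token) pair
```

Under that assumed layout, keeping axes {0, 1} leaves one statistic per token, which is what a per-token strategy needs. Keeping {1, 2} would instead reduce across the batch axis and mix values from unrelated sequences.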
