Commit 326f802

add initialize
1 parent f91f6d1 commit 326f802

1 file changed: +6 -0 lines changed

src/compressed_tensors/quantization/lifecycle/initialize.py

Lines changed: 6 additions & 0 deletions
@@ -234,6 +234,12 @@ def initialize_qparams(
         num_cols = strategy_cdiv(observed_shape[-1], block_structure[-1], strategy)
         expected_shape = (num_rows, num_cols)
 
+    elif strategy == QuantizationStrategy.ATTN_HEAD:
+        if len(observed_shape) < 2:
+            raise ValueError("Attention quant requires at least 2 observed dimensions")
+
+        expected_shape = (observed_shape[-2], 1)
+
     else:
         assert False, f"Unknown strategy {strategy}"
 
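
The shape rule added by the new ATTN_HEAD branch can be tried in isolation. The following is a minimal sketch, not the library's code: `attn_head_qparam_shape` is a hypothetical standalone helper that mirrors only the added lines, and it assumes `observed_shape` is a plain tuple of ints. The commit does not say which tensor dimension sits at index -2, so the sketch makes no claim about that either.

```python
from typing import Tuple


def attn_head_qparam_shape(observed_shape: Tuple[int, ...]) -> Tuple[int, int]:
    """Hypothetical helper mirroring the ATTN_HEAD branch in this commit."""
    # The branch needs at least two observed dimensions, because it
    # indexes the second-to-last one.
    if len(observed_shape) < 2:
        raise ValueError("Attention quant requires at least 2 observed dimensions")

    # One entry per index along dimension -2, plus a trailing singleton
    # dimension, exactly as in the added diff lines.
    return (observed_shape[-2], 1)


if __name__ == "__main__":
    print(attn_head_qparam_shape((8, 64)))     # (8, 1)
    print(attn_head_qparam_shape((2, 8, 64)))  # (8, 1)
```

Any leading dimensions before index -2 are ignored by this rule, so a 2-D and a 3-D observed shape with the same second-to-last size give the same expected shape.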
