Commit 1ef32e3

remove attention head
Signed-off-by: Kyle Sayers <[email protected]>
1 parent 8053b51 commit 1ef32e3

1 file changed: +0 -6 lines changed

src/compressed_tensors/quantization/lifecycle/initialize.py

Lines changed: 0 additions & 6 deletions
@@ -234,12 +234,6 @@ def initialize_qparams(
         num_cols = strategy_cdiv(observed_shape[-1], block_structure[-1], strategy)
         expected_shape = (num_rows, num_cols)

-    elif strategy == QuantizationStrategy.ATTN_HEAD:
-        if len(observed_shape) < 2:
-            raise ValueError("Attention quant requires at least 2 observed dimensions")
-
-        expected_shape = (observed_shape[-2], 1)
-
     else:
         assert False, f"Unknown strategy {strategy}"
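
For reviewers who want to see what the deleted branch computed, here is a minimal standalone sketch of its shape logic. The helper name `attn_head_expected_shape` is hypothetical and is not part of compressed_tensors. The body only mirrors the six removed lines and makes no claim about how the rest of `initialize_qparams` behaves.

```python
from typing import Sequence, Tuple


def attn_head_expected_shape(observed_shape: Sequence[int]) -> Tuple[int, int]:
    """Hypothetical helper mirroring the removed ATTN_HEAD branch."""
    # The removed branch rejected observed shapes with fewer than 2 dimensions.
    if len(observed_shape) < 2:
        raise ValueError("Attention quant requires at least 2 observed dimensions")
    # Expected qparam shape: the second-to-last observed dimension, by 1.
    return (observed_shape[-2], 1)


print(attn_head_expected_shape((32, 128)))  # (32, 1)
```

With this branch gone, the hunk shows `else:` following directly after the preceding branch, so the `if`/`elif` chain no longer has a case for `QuantizationStrategy.ATTN_HEAD`.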
