Commit aca81d6

raise error if block quantization is used, as it is not yet supported (#1476)
SUMMARY: More info at #1464 and #1475. For now, just raise an error if the user tries to block-quantize. This provides a more useful error than the downstream error reported in #1464. TEST PLAN: n/a. Signed-off-by: Brian Dellabetta <[email protected]>
1 parent 92cbf01 commit aca81d6

File tree: 1 file changed (+8, -0 lines)


src/llmcompressor/observers/base.py

Lines changed: 8 additions & 0 deletions
@@ -145,6 +145,14 @@ def get_qparams(
                 dim={0, 1},
             )

+        elif self.quantization_args.strategy == QuantizationStrategy.BLOCK:
+            # TODO (#1475) add support for block-wise quantization
+            raise NotImplementedError(
+                "Block-wise quantization is not yet supported, "
+                "consider group-wise quantization instead. More info at "
+                "https://github.com/vllm-project/llm-compressor/issues/1475"
+            )
+
         return self._scale, self._zero_point

     def get_qparams_along_dim(
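The change is a fail-fast guard: when the unsupported BLOCK strategy is selected, the observer raises immediately with an actionable message instead of letting the user hit a confusing downstream error. Below is a minimal, self-contained sketch of this pattern; the `QuantizationStrategy` enum and `get_qparams` function here are simplified stand-ins for the real llm-compressor/compressed-tensors APIs, not the actual implementation.

```python
from enum import Enum


class QuantizationStrategy(Enum):
    # Simplified stand-in for compressed-tensors' QuantizationStrategy
    TENSOR = "tensor"
    GROUP = "group"
    BLOCK = "block"


def get_qparams(strategy: QuantizationStrategy):
    """Sketch of the observer's get_qparams with the new guard."""
    if strategy == QuantizationStrategy.BLOCK:
        # Fail fast with an actionable error instead of a confusing
        # downstream failure (see #1464).
        raise NotImplementedError(
            "Block-wise quantization is not yet supported, "
            "consider group-wise quantization instead. More info at "
            "https://github.com/vllm-project/llm-compressor/issues/1475"
        )
    # Placeholder for the real scale/zero-point computation.
    return "scale", "zero_point"
```

Usage: supported strategies return quantization parameters as before, while BLOCK raises `NotImplementedError` at configuration time rather than mid-run.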
