When using asymmetric quantization with PackedQuantizationCompressor, zero-points are packed during compression but are not unpacked during decompression. As a result, models that use the GROUP or CHANNEL strategies cannot be loaded or run for inference.
This tracks the feature request in vllm-project/llm-compressor#1704 and proposes adding zero-point unpack support in decompress_weight.
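To illustrate the round trip that decompression needs to complete, here is a minimal pure-Python sketch of packing low-bit zero-points into 32-bit words and unpacking them again. The helper names `pack_to_int32` and `unpack_from_int32` are hypothetical, and the bit layout (unsigned values, lowest bits first) is an assumption made for illustration; it is not taken from the compressor's actual implementation.

```python
from typing import List


def pack_to_int32(values: List[int], num_bits: int) -> List[int]:
    """Pack unsigned num_bits-wide integers into 32-bit words, low bits first.

    Hypothetical helper; the layout is an assumption, not the library's format.
    """
    if 32 % num_bits != 0:
        raise ValueError("num_bits must divide 32")
    per_word = 32 // num_bits
    mask = (1 << num_bits) - 1
    packed = []
    for start in range(0, len(values), per_word):
        word = 0
        for i, v in enumerate(values[start:start + per_word]):
            if not 0 <= v <= mask:
                raise ValueError(f"value {v} does not fit in {num_bits} bits")
            word |= v << (i * num_bits)
        packed.append(word)
    return packed


def unpack_from_int32(packed: List[int], num_bits: int, count: int) -> List[int]:
    """Inverse of pack_to_int32: recover the first `count` values."""
    if 32 % num_bits != 0:
        raise ValueError("num_bits must divide 32")
    per_word = 32 // num_bits
    mask = (1 << num_bits) - 1
    values = []
    for word in packed:
        for i in range(per_word):
            values.append((word >> (i * num_bits)) & mask)
    # The last word may contain padding slots, so trim to the original length.
    return values[:count]


# One 4-bit zero-point per group (illustrative values only).
zero_points = [3, 8, 15, 0, 7, 1, 12, 9, 5]
packed = pack_to_int32(zero_points, num_bits=4)
restored = unpack_from_int32(packed, num_bits=4, count=len(zero_points))
assert restored == zero_points
```

If the unpack step is skipped, the dequantization code receives the packed words instead of one zero-point per group or channel, so the zero-point shape no longer matches the scales. That is the mismatch this proposal aims to remove.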
References: