Labels: bug (Something isn't working)
Description
I want to test awq_lite with an MX-format input_quantizer, but the input_quantizer is disabled after calling awq_lite. My quant_cfg and quant summary:
```python
quant_cfg = {
    "quant_cfg": {
        "*weight_quantizer": {
            "num_bits": (2, 1),
            "block_sizes": {-1: 16, "type": "dynamic", "scale_bits": (4, 3)},
            "axis": None,
            "enable": True,
        },
        "*input_quantizer": {
            "num_bits": 8,
            "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
            "enable": True,
        },
        **_default_disabled_quantizer_cfg,
    },
    "algorithm": "awq_lite",
}
```
and the resulting quant summary (awq_lite):
```
model.layers.23.mlp.up_proj.input_quantizer TensorQuantizer(disabled)
model.layers.23.mlp.up_proj.output_quantizer TensorQuantizer(disabled)
model.layers.23.mlp.up_proj.weight_quantizer TensorQuantizer((2, 1) bit fake block_sizes:{-1: 16, 'type': 'dynamic', 'scale_bits': (4, 3)}, amax=2.4219 calibrator=MaxCalibrator quant)
```
When I try awq_clip with an MX-format input_quantizer, the input_quantizer is enabled as expected:
```python
{
    "quant_cfg": {
        "*weight_quantizer": {
            "num_bits": (2, 1),
            "block_sizes": {-1: 16, "type": "dynamic", "scale_bits": (4, 3)},
            "axis": None,
            "enable": True,
        },
        "*input_quantizer": {
            "num_bits": 8,
            "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
            "enable": True,
        },
        **_default_disabled_quantizer_cfg,
    },
    "algorithm": {"method": "awq_clip"},
}
```
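To rule out a config typo on my side, here is a quick check (plain Python, runnable without ModelOpt) that the two configs above differ only in the `algorithm` field; `_default_disabled_quantizer_cfg` is stubbed out with an empty dict here since it is identical in both configs anyway:

```python
# Stand-in for my helper; its contents are the same in both configs,
# so an empty dict is enough for this comparison.
_default_disabled_quantizer_cfg = {}

shared_quant_cfg = {
    "*weight_quantizer": {
        "num_bits": (2, 1),
        "block_sizes": {-1: 16, "type": "dynamic", "scale_bits": (4, 3)},
        "axis": None,
        "enable": True,
    },
    "*input_quantizer": {
        "num_bits": 8,
        "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
        "enable": True,
    },
    **_default_disabled_quantizer_cfg,
}

awq_lite_cfg = {"quant_cfg": shared_quant_cfg, "algorithm": "awq_lite"}
awq_clip_cfg = {"quant_cfg": shared_quant_cfg, "algorithm": {"method": "awq_clip"}}

# The quantizer sections are identical; only the algorithm selection differs.
assert awq_lite_cfg["quant_cfg"] == awq_clip_cfg["quant_cfg"]
assert awq_lite_cfg["algorithm"] != awq_clip_cfg["algorithm"]
print("configs differ only in 'algorithm'")
```

So the different input_quantizer state appears to come from the algorithm path itself, not from the quantizer config.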
and the quant summary (awq_clip):
```
model.layers.23.mlp.down_proj.input_quantizer TensorQuantizer(8 bit fake block_sizes={-1: 32, 'type': 'dynamic', 'scale_bits': (8, 0)}, amax=None calibrator=MaxCalibrator quant)
model.layers.23.mlp.down_proj.output_quantizer TensorQuantizer(disabled)
model.layers.23.mlp.down_proj.weight_quantizer TensorQuantizer((2, 1) bit fake block_sizes={-1: 16, 'type': 'dynamic', 'scale_bits': (4, 3)}, amax=0.5000 calibrator=MaxCalibrator quant)
```