Commit 5d13bcb

Block matmul and kv_cache in dynamic quantization (#673)
Currently disabling matmul and kv_cache in dynamic quantization mode

---------

Signed-off-by: Danny Semiat <[email protected]>
1 parent b333e4b commit 5d13bcb

1 file changed: 7 additions, 0 deletions

calibration/quantization_config/maxabs_quant_dynamic_quantization.json

Lines changed: 7 additions & 0 deletions
@@ -4,5 +4,12 @@
     "scale_format": "CONST",
     "scale_method": "act_maxabs_pcs_pow2_weight_maxabs_pts_pow2_hw",
     "dynamic_quantization": true,
+    "blocklist": {
+        "types": [
+            "Matmul",
+            "KVCache",
+            "VLLMKVCache"
+        ]
+    },
     "dump_stats_path": ""
 }
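
To illustrate what the added `blocklist` entry is meant to express, here is a minimal Python sketch of how a type-based blocklist could be applied when deciding which module types to quantize. It is not the quantization library's actual code: the helper `is_blocked`, the exact-name matching rule, and the sample type `Linear` are assumptions made for this example, and the embedded config contains only the keys visible in the hunk above (the real file has earlier lines that the diff does not show).

```python
import json

# Only the keys visible in the diff hunk; the real file has more lines above these.
CONFIG_TEXT = """
{
    "scale_format": "CONST",
    "scale_method": "act_maxabs_pcs_pow2_weight_maxabs_pts_pow2_hw",
    "dynamic_quantization": true,
    "blocklist": {
        "types": ["Matmul", "KVCache", "VLLMKVCache"]
    },
    "dump_stats_path": ""
}
"""


def is_blocked(module_type: str, config: dict) -> bool:
    """Hypothetical helper: True if module_type is listed under blocklist.types.

    Exact string matching is an assumption; the real library may match differently.
    """
    blocked_types = config.get("blocklist", {}).get("types", [])
    return module_type in blocked_types


config = json.loads(CONFIG_TEXT)

# "Linear" is an illustrative type name that is not in the blocklist.
module_types = ["Linear", "Matmul", "KVCache", "VLLMKVCache"]
to_quantize = [t for t in module_types if not is_blocked(t, config)]
print(to_quantize)
```

Under these assumptions, only the non-blocklisted type remains a candidate for dynamic quantization, which matches the commit's stated intent of disabling matmul and kv_cache in this mode.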
