Commit 399b3a2

Update on "[ExecuTorch][XNNPACK] Don't partition per_tensor weights with qd8"

This is not supported, so we shouldn't partition it. Add an expectedFailure test to indicate that this is not supported.

Differential Revision: [D70343584](https://our.internmc.facebook.com/intern/diff/D70343584/)

[ghstack-poisoned]
1 parent 4a29500 commit 399b3a2

File tree

1 file changed: +5 -4 lines changed

backends/xnnpack/partition/config/gemm_configs.py

Lines changed: 5 additions & 4 deletions
@@ -161,17 +161,18 @@ def _get_weight_deps(
             return False, []
         gemm_deps.append(weight)
 
+        if is_per_tensor(dequant_node) and precision == ConfigPrecisionType.DYNAMIC_QUANT:
+            why(node, "XNNPACK does not support per tensor quantized weights for dynamic quantization of activations")
+            return False, []
+
         if is_per_channel(dequant_node) or is_per_channel_group(dequant_node):
             if len(dequant_node.all_input_nodes) < 2:
                 # Expected channel quantized to have scale/zp nodes
                 why(node, "Expected channel quantized to have scale/zp nodes")
                 return False, []
 
-            if is_per_tensor(dequant_node) and precision == ConfigPrecisionType.DYNAMIC_QUANT:
-                why(node, "XNNPACK does not support per tensor quantized weights for dynamic quantization of activations")
-                return False, []
-
             gemm_deps.extend(dequant_node.all_input_nodes[1:3])
+
         return (True, gemm_deps)
 
     def _get_output_deps(
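
The commit message mentions adding an expectedFailure test for this unsupported combination; that test is not part of this file's diff. Below is a minimal, hypothetical sketch of the pattern using unittest.expectedFailure, where weight_deps_supported is an illustrative stand-in for the partitioner check hoisted in _get_weight_deps above, not ExecuTorch's real test harness or API:

import unittest


# Illustrative stand-in for the partitioner logic in
# backends/xnnpack/partition/config/gemm_configs.py (hypothetical helper,
# not part of the ExecuTorch API).
def weight_deps_supported(is_per_tensor_weight: bool, is_dynamic_quant: bool) -> bool:
    # Mirrors the hoisted check: per-tensor quantized weights combined with
    # dynamically quantized (qd8) activations are rejected by the partitioner.
    if is_per_tensor_weight and is_dynamic_quant:
        return False
    return True


class QD8PerTensorWeightTest(unittest.TestCase):
    @unittest.expectedFailure
    def test_per_tensor_weights_with_qd8(self):
        # Expected to fail: this combination is intentionally unsupported.
        self.assertTrue(weight_deps_supported(True, True))


if __name__ == "__main__":
    unittest.main()

The decorator inverts the outcome in the test report: the failing assertion is recorded as an expected failure, documenting that per-tensor weights with qd8 activations are unsupported by design rather than by accident.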
