Commit 066f34b

ssjia committed
Update base for Update on "[ET-VK][AOT] Enable exporting Q8 Quantized Linear + Convolution"
As title. Introduce fusion patterns to enable fusing quantized convolution and linear graph patterns into a custom op.

## Changes

Introduce the concept of using custom pattern detection functions to detect graph patterns rather than solely relying on SubgraphMatcher. The issue with SubgraphMatcher is that a large number of graph patterns may need to be exported to obtain variants for different combinations of decompositions/quantization workflows. Having a custom detection function improves maintainability.

Implement detection + replacement functions for quantized linear and quantized conv2d.

Differential Revision: [D81323425](https://our.internmc.facebook.com/intern/diff/D81323425/)

[ghstack-poisoned]
1 parent c680357 commit 066f34b

1 file changed: +4 -0 lines changed

backends/vulkan/runtime/graph/ops/impl/QuantizedConvolution.cpp

Lines changed: 4 additions & 0 deletions
@@ -478,6 +478,8 @@ void conv2d_q8csw_linear_tiled_impl(
   const ValueRef padding = args.at(idx++);
   const ValueRef dilation = args.at(idx++);
   const ValueRef groups = args.at(idx++);
+  const ValueRef orig_OC = args.at(idx++);
+  (void)orig_OC;
   const ValueRef output = args.at(idx++);

   const ValueRef packed_weight = prepack_q8_linear_weight(graph, weight);
@@ -552,6 +554,8 @@ void conv2d_q8ta_q8csw_linear_tiled_impl(
   const ValueRef padding = args.at(idx++);
   const ValueRef dilation = args.at(idx++);
   const ValueRef groups = args.at(idx++);
+  const ValueRef orig_OC = args.at(idx++);
+  (void)orig_OC;
   const ValueRef output = args.at(idx++);

   const ValueRef packed_weight = prepack_q8_linear_weight(graph, weight);
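
Both hunks read operator arguments by position with `args.at(idx++)`, so a new argument placed before `output` has to be consumed even when the implementation does not use it yet; otherwise `output` would be read from the wrong slot. The following is a minimal, self-contained sketch of that idea, not ExecuTorch code: `ValueRef` is modeled here as a plain integer, and `ParsedConvArgs`, `parse_tail_args`, and the `consume_orig_oc` flag are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: a plain integer stands in for the runtime's value handle.
using ValueRef = int32_t;

// Hypothetical holder for the trailing arguments shown in the diff.
struct ParsedConvArgs {
  ValueRef padding;
  ValueRef dilation;
  ValueRef groups;
  ValueRef output;
};

// Reads the trailing arguments positionally, starting at idx.
// When consume_orig_oc is true, the slot between groups and output is
// read and discarded, mirroring the two lines added in the diff.
ParsedConvArgs parse_tail_args(
    const std::vector<ValueRef>& args,
    std::size_t idx,
    bool consume_orig_oc) {
  ParsedConvArgs parsed{};
  parsed.padding = args.at(idx++);
  parsed.dilation = args.at(idx++);
  parsed.groups = args.at(idx++);
  if (consume_orig_oc) {
    const ValueRef orig_OC = args.at(idx++);
    (void)orig_OC; // read only to keep later slots aligned; unused here
  }
  parsed.output = args.at(idx++);
  return parsed;
}
```

With an argument list of `{padding, dilation, groups, orig_OC, output}`, calling `parse_tail_args(args, 0, true)` picks up the last element as `output`, while passing `false` would return the `orig_OC` slot as `output` instead. The `(void)` cast is a common way to silence unused-variable warnings while keeping the positional read in place.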
