
Commit 97d037e

Author: ssjia
Update on "[ET-VK] Quantized Int8 Convolution"
This diff implements int8 quantized conv2d using the quantized linear layer introduced below. Note that the current implementation does not yet support depthwise convs; a specialized implementation will need to be added for that.

Differential Revision: [D81330809](https://our.internmc.facebook.com/intern/diff/D81330809/)

[ghstack-poisoned]
2 parents: ec9b15e + 4981a7c
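
For context, the sketch below shows one way an int8 conv2d can be lowered onto a quantized linear (matmul) kernel: an im2col transform gathers each convolution window into a row, and the convolution then reduces to patches x weight. This is only a conceptual illustration assuming symmetric quantization (zero points of 0) and a batch size of 1; the names im2col and q8csw_linear and the layouts chosen here are hypothetical, not the APIs used in QuantizedConvolution.cpp.

#include <cstddef>
#include <cstdint>
#include <vector>

// Gather each conv window into one row so that conv2d becomes a matmul:
//   output[OH*OW, OC] = patches[OH*OW, IC*KH*KW] x weight[OC, IC*KH*KW]^T
// Zero padding is valid here because an activation zero point of 0 is assumed.
std::vector<std::int8_t> im2col(
    const std::vector<std::int8_t>& input, // [IC, H, W], i.e. NCHW with N = 1
    int IC, int H, int W, int KH, int KW, int stride, int pad) {
  const int OH = (H + 2 * pad - KH) / stride + 1;
  const int OW = (W + 2 * pad - KW) / stride + 1;
  std::vector<std::int8_t> patches(
      static_cast<std::size_t>(OH) * OW * IC * KH * KW, 0);
  for (int oh = 0; oh < OH; ++oh) {
    for (int ow = 0; ow < OW; ++ow) {
      std::int8_t* row =
          &patches[static_cast<std::size_t>(oh * OW + ow) * IC * KH * KW];
      for (int ic = 0; ic < IC; ++ic) {
        for (int kh = 0; kh < KH; ++kh) {
          for (int kw = 0; kw < KW; ++kw) {
            const int ih = oh * stride - pad + kh;
            const int iw = ow * stride - pad + kw;
            if (ih >= 0 && ih < H && iw >= 0 && iw < W) {
              row[(ic * KH + kh) * KW + kw] = input[(ic * H + ih) * W + iw];
            }
          }
        }
      }
    }
  }
  return patches;
}

// int8 x int8 matmul with per-output-channel weight scales: the "quantized
// linear" that the convolution is expressed in terms of (symmetric
// quantization assumed, so no zero-point correction terms).
std::vector<float> q8csw_linear(
    const std::vector<std::int8_t>& patches,  // [M, K], M = OH*OW
    const std::vector<std::int8_t>& weight,   // [OC, K], K = IC*KH*KW
    const std::vector<float>& weight_scales,  // [OC]
    float input_scale, int M, int K, int OC) {
  std::vector<float> out(static_cast<std::size_t>(M) * OC);
  for (int m = 0; m < M; ++m) {
    for (int oc = 0; oc < OC; ++oc) {
      std::int32_t acc = 0;
      for (int k = 0; k < K; ++k) {
        acc += static_cast<std::int32_t>(patches[m * K + k]) *
               static_cast<std::int32_t>(weight[oc * K + k]);
      }
      out[static_cast<std::size_t>(m) * OC + oc] =
          static_cast<float>(acc) * input_scale * weight_scales[oc];
    }
  }
  return out;
}

The resulting [OH*OW, OC] matrix only needs a transpose/reshape back to [OC, OH, OW]. A depthwise conv does not fit this scheme directly, since each output channel sees only its own input channel, which is why the commit message calls out that a specialized implementation is still needed.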

File tree

1 file changed, 4 insertions(+), 0 deletions(-)

backends/vulkan/runtime/graph/ops/impl/QuantizedConvolution.cpp

Lines changed: 4 additions & 0 deletions
@@ -478,6 +478,8 @@ void conv2d_q8csw_linear_tiled_impl(
   const ValueRef padding = args.at(idx++);
   const ValueRef dilation = args.at(idx++);
   const ValueRef groups = args.at(idx++);
+  const ValueRef orig_OC = args.at(idx++);
+  (void)orig_OC;
   const ValueRef output = args.at(idx++);
 
   const ValueRef packed_weight = prepack_q8_linear_weight(graph, weight);
@@ -552,6 +554,8 @@ void conv2d_q8ta_q8csw_linear_tiled_impl(
   const ValueRef padding = args.at(idx++);
   const ValueRef dilation = args.at(idx++);
   const ValueRef groups = args.at(idx++);
+  const ValueRef orig_OC = args.at(idx++);
+  (void)orig_OC;
   const ValueRef output = args.at(idx++);
 
   const ValueRef packed_weight = prepack_q8_linear_weight(graph, weight);
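
Both hunks follow the same pattern: the newly appended orig_OC argument is read so that idx stays aligned with the caller's argument order, and the (void) cast keeps unused-variable warnings quiet until the value is actually consumed. A minimal, self-contained illustration of the idiom, with ValueRef and the argument list as stand-ins for the real graph types:

#include <cstddef>
#include <cstdint>
#include <vector>

using ValueRef = std::int32_t; // stand-in for the real graph value handle

void parse_conv_args(const std::vector<ValueRef>& args) {
  std::size_t idx = 0;
  const ValueRef groups = args.at(idx++);
  // Newly appended argument: it must still be read so that idx lines up
  // with the arguments that follow it.
  const ValueRef orig_OC = args.at(idx++);
  (void)orig_OC; // suppresses unused-variable warnings until orig_OC is used
  const ValueRef output = args.at(idx++);
  (void)groups; // likewise unused in this stripped-down sketch
  (void)output;
}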
