
Commit 06ffc8c

ssjia committed
Update on "[ET-VK] Implement linear_dq8ta_q4gsw"

Title says it all! This builds upon the quantized linear support introduced in the previous diffs to enable dynamically quantized linear. Also included in this diff is a cleanup of the glslh files shared across the quantized linear implementations.

Differential Revision: [D81931060](https://our.internmc.facebook.com/intern/diff/D81931060/)

[ghstack-poisoned]
1 parent 994ca6d commit 06ffc8c
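The op name suggests the following semantics (my reading of the name, not something the diff spells out): activations are dynamically quantized to 8 bits per token ("dq8ta"), and weights are 4-bit group-symmetric ("q4gsw"). Below is a minimal NumPy reference sketch under that assumption. `linear_dq8ta_q4gsw_ref` and its signature are hypothetical illustrations, not the Vulkan implementation; `choose_qparams_per_row` merely mirrors the prototype name added in the CMakeLists diff further down.

```python
import numpy as np

def choose_qparams_per_row(x, qmin=-128, qmax=127):
    """Compute one (scale, zero_point) pair per row of x for 8-bit
    asymmetric quantization. 'Per row' corresponds to per-token when
    rows are tokens. Hypothetical helper, named after the prototype
    added in the CMake diff."""
    x_min = np.minimum(x.min(axis=1), 0.0)
    x_max = np.maximum(x.max(axis=1), 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    scale = np.where(scale == 0.0, 1.0, scale)  # avoid divide-by-zero
    zero_point = np.round(qmin - x_min / scale).astype(np.int32)
    return scale, zero_point

def linear_dq8ta_q4gsw_ref(x, w_q4, w_scales, group_size):
    """Reference sketch: dynamically quantize activations to int8 per
    token, then matmul against 4-bit group-symmetric weights (kept here
    as integer values in [-8, 7]), dequantizing group by group."""
    scale, zp = choose_qparams_per_row(x)
    x_q = np.clip(np.round(x / scale[:, None]) + zp[:, None], -128, 127)
    x_dq = (x_q - zp[:, None]) * scale[:, None]  # dequantized activations
    # Dequantize weights one group at a time along the input dimension;
    # each group of columns shares one per-output-channel scale.
    K = w_q4.shape[1]
    w_dq = np.empty_like(w_q4, dtype=np.float64)
    for g in range(K // group_size):
        cols = slice(g * group_size, (g + 1) * group_size)
        w_dq[:, cols] = w_q4[:, cols] * w_scales[:, g][:, None]
    return x_dq @ w_dq.T
```

The per-token ("dynamic") part is what distinguishes this from the statically quantized linear ops in the earlier diffs: the activation scales are computed at runtime from each row, rather than calibrated ahead of time.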

File tree

4 files changed (+2, −803 lines)


backends/vulkan/patterns/quantized_linear.py

Lines changed: 0 additions & 2 deletions
@@ -116,7 +116,6 @@ def __init__(self, mm_node: torch.fx.Node) -> None:
 
         # If input is not quantized, then we are done
         if self.quantize_input_node is None:
-            raise Exception("Input is not quantized")
             self.match_found = True
             return
 

@@ -478,7 +477,6 @@ def replace_quantized_linear_patterns(
             and match.is_weight_pergroup_quantized()
             and utils.is_in_4bit_range(weight_tensor)
         ):
-            raise Exception("Unsupported pattern")
             make_linear_q4gsw_op(
                 ep, graph_module, match, weight_tensor, weight_scales_tensor
             )

backends/vulkan/test/custom_ops/CMakeLists.txt

Lines changed: 2 additions & 2 deletions
@@ -92,7 +92,7 @@ if(TARGET vulkan_backend)
   # Define operator prototypes
   add_operator_prototype(add)
   add_operator_prototype(q8csw_linear)
-  add_operator_prototype(quantized_q4gaw_linear)
-  add_operator_prototype(quantized_int4_linear)
   add_operator_prototype(q8csw_conv2d)
+  add_operator_prototype(q4gsw_linear)
+  add_operator_prototype(choose_qparams_per_row)
 endif()

backends/vulkan/test/custom_ops/quantized_int4_linear.cpp

Lines changed: 0 additions & 366 deletions
This file was deleted.
