Update on "[ET-VK] Implement linear_dq8ta_q4gsw"

ssjia · ssjia · commit 31f68011a644 · 2025-09-08T14:04:06.000-07:00
Title says it all! Build upon the support for quantized linear introduced in the previous diffs to enable dynamically quantized linear. Also included in this diff is a cleanup of the glslh files used across quantized linear implementations. Differential Revision: [D81931060](https://our.internmc.facebook.com/intern/diff/D81931060/) [ghstack-poisoned]
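For readers unfamiliar with the scheme, here is a minimal pure-Python sketch of what a dynamically quantized linear layer computes. The test docstring in this diff describes it as int8 dynamic activation plus int4 weight. The per-row (per-token) asymmetric activation quantization and the group-wise symmetric weight quantization below are assumptions based on the operator name, not a description of the actual Vulkan shader. All function names are hypothetical.

```python
# Illustrative sketch only: not the ExecuTorch Vulkan kernel or the TorchAO API.
# Assumptions: activations are quantized per row (asymmetric int8) at runtime,
# weights are quantized per group along the input dimension (symmetric int4).


def quantize_activation_per_token(row, qmin=-128, qmax=127):
    """Quantize one activation row to int8 using a range computed at runtime."""
    lo = min(min(row), 0.0)  # keep 0.0 inside the representable range
    hi = max(max(row), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for an all-zero row
    zero_point = max(qmin, min(qmax, int(round(qmin - lo / scale))))
    q = [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in row]
    return q, scale, zero_point


def quantize_weight_groupwise(w_row, group_size, qmin=-8, qmax=7):
    """Quantize one weight row to int4 with one symmetric scale per group."""
    q, scales = [], []
    for start in range(0, len(w_row), group_size):
        group = w_row[start:start + group_size]
        scale = max(abs(v) for v in group) / qmax or 1.0
        scales.append(scale)
        q.extend(max(qmin, min(qmax, int(round(v / scale)))) for v in group)
    return q, scales


def dynamic_quantized_linear_sketch(x, weight, group_size):
    """Compute y[m][n] ~= sum_k x[m][k] * weight[n][k] from quantized values."""
    # Weights are quantized once, ahead of time.
    q_weight = [quantize_weight_groupwise(w_row, group_size) for w_row in weight]
    out = []
    for row in x:
        # Activations are quantized dynamically, once per input row.
        qx, x_scale, x_zp = quantize_activation_per_token(row)
        y = []
        for qw, w_scales in q_weight:
            acc = 0.0
            for g, w_scale in enumerate(w_scales):
                start = g * group_size
                stop = min(start + group_size, len(row))
                # Integer accumulation within a group, then rescale to float.
                int_acc = sum((qx[k] - x_zp) * qw[k] for k in range(start, stop))
                acc += x_scale * w_scale * int_acc
            y.append(acc)
        out.append(y)
    return out
```

The inner integer accumulation is the part that relies on 8-bit integer arithmetic, which is consistent with the skip reason given for the swiftshader test in the diff below.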
diff --git a/.github/workflows/pull.yml b/.github/workflows/pull.yml
@@ -939,6 +939,7 @@ jobs:
         # Run e2e testing for selected operators. More operators will be tested via this
         # route in the future.
         python -m unittest backends/vulkan/test/test_vulkan_delegate.py -k "*pt2e*"
+        python -m unittest backends/vulkan/test/test_vulkan_delegate.py -k "*torchao*"
 
   nxp-build-test:
     name: nxp-build-test
diff --git a/backends/vulkan/test/test_vulkan_delegate.py b/backends/vulkan/test/test_vulkan_delegate.py
@@ -2651,6 +2651,7 @@ def forward(self, x):
             rtol=1e-1,
         )
 
+    @unittest.skip("Cannot run on swiftshader due to no 8-bit int support")
     def test_vulkan_backend_torchao_8da4w_quantized_linear(self):
         """
         Test TorchAO 8da4w quantization (int8 dynamic activation + int4 weight) with Vulkan backend.
