Commit 0d56e9a

Update on "[ET-VK] Add custom VkInt4WeightOnlyQuantizer for vulkan"
## Context

This diff adds the `VkInt4WeightOnlyQuantizer` class which enables 4-bit quantization of linear layers via source transformation. This quantizer class is copied from `torchao.quantization.GPTQ.WeightOnlyInt4Linear` with some minor changes as annotated in the implementation.

Note that the pt2e quantization flow does not yet support groupwise quantization, so source transformation is the only way to perform groupwise quantization at the moment.

Differential Revision: [D64406457](https://our.internmc.facebook.com/intern/diff/D64406457/)

[ghstack-poisoned]
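For readers unfamiliar with the term, groupwise 4-bit weight quantization can be sketched roughly as follows. This is a generic min/max scheme written for illustration only; it is not the `VkInt4WeightOnlyQuantizer` or torchao implementation. `QuantizedRow`, `quantize_row_int4`, and `dequantize_row_int4` are hypothetical names, and the codes are stored one per byte rather than packed two per byte.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for one quantized weight row (not an ExecuTorch type).
struct QuantizedRow {
  std::vector<uint8_t> q;     // one 4-bit code (0..15) per weight, left unpacked
  std::vector<float> scales;  // one scale per group
  std::vector<float> mins;    // one minimum value per group
};

// Quantize a row of weights in groups of `group_size` elements.
// Assumption: asymmetric min/max quantization, w ~= min + scale * code.
QuantizedRow quantize_row_int4(const std::vector<float>& w, size_t group_size) {
  assert(group_size > 0 && w.size() % group_size == 0);
  QuantizedRow out;
  out.q.resize(w.size());
  for (size_t g = 0; g < w.size(); g += group_size) {
    auto first = w.begin() + static_cast<std::ptrdiff_t>(g);
    auto last = first + static_cast<std::ptrdiff_t>(group_size);
    auto [lo_it, hi_it] = std::minmax_element(first, last);
    const float lo = *lo_it;
    float scale = (*hi_it - lo) / 15.0f;  // 16 levels span the group's range
    if (scale == 0.0f) {
      scale = 1.0f;  // constant group: any scale works, avoid dividing by zero
    }
    for (size_t i = g; i < g + group_size; ++i) {
      const float code = std::round((w[i] - lo) / scale);
      out.q[i] = static_cast<uint8_t>(std::clamp(code, 0.0f, 15.0f));
    }
    out.scales.push_back(scale);
    out.mins.push_back(lo);
  }
  return out;
}

// Reconstruct approximate float weights from the codes and per-group parameters.
std::vector<float> dequantize_row_int4(const QuantizedRow& r, size_t group_size) {
  std::vector<float> w(r.q.size());
  for (size_t i = 0; i < w.size(); ++i) {
    const size_t g = i / group_size;
    w[i] = r.mins[g] + r.scales[g] * static_cast<float>(r.q[i]);
  }
  return w;
}
```

Because each group carries its own scale and minimum, the rounding error of any weight is bounded by half of that group's scale, which is the motivation for quantizing per group rather than per tensor.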
2 parents ff546dc + d19036a commit 0d56e9a

1 file changed: +2 -2 lines changed

backends/vulkan/test/op_tests/linear_weight_int4_test.cpp

Lines changed: 2 additions & 2 deletions
@@ -202,15 +202,15 @@ void test_vulkan_linear_int4(
   ASSERT_TRUE(at::allclose(vk_out, out_ref, 1e-4, 1e-4));
 }
 
-TEST(VulkanSDPATest, test_reference_impl) {
+TEST(VulkanInt4LinearTest, test_reference_impl) {
   test_reference_linear_int4(
       /*B = */ 1,
       /*M = */ 4,
       /*K = */ 128,
       /*N = */ 32);
 }
 
-TEST(VulkanSDPATest, test_vulkan_impl) {
+TEST(VulkanInt4LinearTest, test_vulkan_impl) {
   if (!vkcompute::api::context()
            ->adapter_ptr()
            ->has_full_int8_buffers_support()) {
