Commit 2d32f6f

Nathanael See committed
Update on "[ET-VK][int4] patch 4-bit linear op for ensuring w-packed in/out"
If the partitioner uses the channels-packed setting for activations, the checks in the 4-bit linear op throw. Remove the checks and conditionally re-pack the input and output tensors when they are not width-packed.

Differential Revision: [D68813946](https://our.internmc.facebook.com/intern/diff/D68813946/) [ghstack-poisoned]
2 parents 345eee4 + 986b149 commit 2d32f6f

1 file changed: +2 -2 lines changed

backends/vulkan/runtime/graph/ops/impl/QuantizedLinear.cpp

Lines changed: 2 additions & 2 deletions
@@ -352,8 +352,8 @@ void add_q_4w_linear_node(
 local_wg_size,
 // Inputs and Outputs
 {{out_W_packed, vkapi::MemoryAccessType::WRITE},
-{{mat1_W_packed, mat2, scales_and_zeros},
-vkapi::MemoryAccessType::READ}},
+{{mat1_W_packed, mat2, scales_and_zeros},
+vkapi::MemoryAccessType::READ}},
 // Shader params buffers
 ubos,
 // Specialization Constants
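
The conditional re-packing described in the commit message can be sketched roughly as below. This is a minimal, self-contained illustration and not the code in QuantizedLinear.cpp: `PackedDim`, `Tensor`, `Graph`, `ensure_width_packed`, and `add_linear_sketch` are hypothetical stand-ins, and the sketch assumes that a non-width-packed output is written through a width-packed temporary and copied back afterwards. Only the names `mat1_W_packed` and `out_W_packed` come from the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the ExecuTorch Vulkan API.
enum class PackedDim { Width, Height, Channels };

struct Tensor {
  std::string name;
  PackedDim packed_dim;
};

// Records the operations that would be added to the compute graph.
struct Graph {
  std::vector<std::string> ops;
};

// Returns `t` unchanged if it is already width-packed; otherwise records a
// re-pack step and returns a width-packed copy.
Tensor ensure_width_packed(Graph& graph, const Tensor& t) {
  if (t.packed_dim == PackedDim::Width) {
    return t;
  }
  graph.ops.push_back("repack:" + t.name);
  return Tensor{t.name + "_W_packed", PackedDim::Width};
}

// Instead of throwing on channels-packed activations, re-pack only when needed.
void add_linear_sketch(Graph& graph, const Tensor& mat1, const Tensor& out) {
  const Tensor mat1_W_packed = ensure_width_packed(graph, mat1);

  // Assumption: a non-width-packed output is produced via a temporary.
  const bool out_needs_repack = out.packed_dim != PackedDim::Width;
  Tensor out_W_packed = out;
  if (out_needs_repack) {
    out_W_packed = Tensor{out.name + "_W_packed", PackedDim::Width};
  }

  // The kernel itself only ever sees width-packed tensors.
  assert(mat1_W_packed.packed_dim == PackedDim::Width);
  assert(out_W_packed.packed_dim == PackedDim::Width);
  graph.ops.push_back("linear:" + mat1_W_packed.name + "->" + out_W_packed.name);

  // Copy the result back into the caller's layout if a temporary was used.
  if (out_needs_repack) {
    graph.ops.push_back("repack:" + out_W_packed.name + "->" + out.name);
  }
}
```

With channels-packed `mat1` and `out`, this sketch records a re-pack before and after the linear op; with width-packed tensors it records only the linear op, so no extra work is added on the already-supported path.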
