Commit 8794595

Update on "[ET-VK][ez] Add support for buffer backed qparams in int4 linear + add checks for physical limits when allocating"
## Context

Currently, the groupwise quantized int4 linear op implementation forces the scales and zero tensor to be a `Texture3D`. However, for models such as transformers that have a logit linear layer, the required image extents may exceed the maximum image extents available on the device.

## Changes

* Add support for the scales and zero tensor being a `Buffer` instead of a `Texture3D`
* When allocating buffers or images for tensors, check that the requested resource fits within the physical device limits

Differential Revision: [D72662176](https://our.internmc.facebook.com/intern/diff/D72662176/)

[ghstack-poisoned]
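The limit checks described in the changes above can be sketched in C++ roughly as follows. This is an illustration only, not the code from this commit: `DeviceLimits`, `image_fits`, `buffer_fits`, `choose_qparams_storage`, and the `Storage` enum are hypothetical names. In a real Vulkan backend the limits would come from `VkPhysicalDeviceLimits` (for example `maxImageDimension3D` and `maxStorageBufferRange`). The automatic fallback from texture to buffer is also an assumption; the commit message only says that buffer-backed qparams are supported and that allocations are checked.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for the limits queried from the physical device.
struct DeviceLimits {
  uint32_t max_image_dim_3d;   // largest extent allowed on any 3D image axis
  uint64_t max_buffer_bytes;   // largest storage buffer range, in bytes
};

// True if every axis of the requested 3D image fits within the device limit.
bool image_fits(const std::array<uint32_t, 3>& extents,
                const DeviceLimits& limits) {
  for (uint32_t e : extents) {
    if (e > limits.max_image_dim_3d) {
      return false;
    }
  }
  return true;
}

// True if numel elements of bytes_per_element bytes fit in one buffer.
// Division is used so the check cannot overflow.
bool buffer_fits(uint64_t numel, uint64_t bytes_per_element,
                 const DeviceLimits& limits) {
  if (bytes_per_element == 0) {
    return false;
  }
  return numel <= limits.max_buffer_bytes / bytes_per_element;
}

enum class Storage { Texture3D, Buffer, Unsupported };

// Assumed policy for illustration: prefer a texture, fall back to a buffer.
Storage choose_qparams_storage(const std::array<uint32_t, 3>& extents,
                               uint64_t numel, uint64_t bytes_per_element,
                               const DeviceLimits& limits) {
  if (image_fits(extents, limits)) {
    return Storage::Texture3D;
  }
  if (buffer_fits(numel, bytes_per_element, limits)) {
    return Storage::Buffer;
  }
  return Storage::Unsupported;
}
```

With these assumed limits, a qparams tensor whose widest axis exceeds `max_image_dim_3d` (as can happen for a logit layer with a large vocabulary) would be placed in a buffer rather than failing at image creation.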
2 parents 356277b + 955973d commit 8794595

1 file changed: +2 -2 lines changed

backends/vulkan/runtime/graph/ops/glsl/pack_int4_linear_weight_transposed_interleaved.glsl

Lines changed: 2 additions & 2 deletions
@@ -109,8 +109,8 @@ void main() {
       in_vals[r][0] = get_first(in_val_packed);
       in_vals[r][1] = get_second(in_val_packed);
     } else {
-      in_vals[r][0] = uint8_t(254);
-      in_vals[r][1] = uint8_t(254);
+      in_vals[r][0] = uint8_t(0);
+      in_vals[r][1] = uint8_t(0);
     }
   }
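The branch changed in this hunk can be mirrored on the CPU as a minimal sketch. The bodies of `get_first` and `get_second` are not shown in the diff; the version below assumes the first value sits in the high nibble and the second in the low nibble. `NibblePair`, `load_or_pad`, and `fits_in_nibble` are hypothetical helpers added for illustration. The commit does not state why the padding value changed. The sketch only shows that `0` is a valid 4-bit value, while `254` does not fit in a nibble.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed nibble layout: first value in the high 4 bits, second in the low 4.
inline uint8_t get_first(uint8_t packed) {
  return static_cast<uint8_t>((packed & 0xF0) >> 4);
}
inline uint8_t get_second(uint8_t packed) {
  return static_cast<uint8_t>(packed & 0x0F);
}

struct NibblePair {
  uint8_t first;
  uint8_t second;
};

// Hypothetical helper: unpack the byte for row r, or return zero padding
// when r is past the end of the data (the else branch in the diff).
NibblePair load_or_pad(const std::vector<uint8_t>& packed_rows, std::size_t r) {
  if (r < packed_rows.size()) {
    const uint8_t p = packed_rows[r];
    return {get_first(p), get_second(p)};
  }
  return {0, 0};
}

// True if v can be stored in 4 bits without spilling into a neighbour.
inline bool fits_in_nibble(uint8_t v) {
  return v <= 0x0F;
}
```

Under this assumed layout, `load_or_pad` returns `{0xA, 0xB}` for a stored byte `0xAB` and `{0, 0}` for any out-of-range row, so every value that is later repacked stays within the 4-bit range.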
