Skip to content

Commit a5d888c

Browse files
committed
Update on "[ET-VK][ez] Add support for buffer backed qparams in int4 linear + add checks for physical limits when allocating"
## Context Currently, the groupwise quantized int4 linear op implementation forces the scales and zero tensor to be a `Texture3D`. However, for i.e. transformer models that have a logit linear layer, the image extents required may exceed the maximum image extents available on the device. ## Changes * Add support for the scales and zero tensor being a `Buffer` instead of a `Texture3D` * Add checks when allocating buffers or images for tensors that the requested resource fits within the physical device limits Differential Revision: [D72662176](https://our.internmc.facebook.com/intern/diff/D72662176/) [ghstack-poisoned]
2 parents 8794595 + 99e2b64 commit a5d888c

File tree

0 file changed

+0
-0
lines changed

    0 file changed

    +0
    -0
    lines changed

    0 commit comments

    Comments
     (0)