Commit a5d888c


Update on "[ET-VK][ez] Add support for buffer backed qparams in int4 linear + add checks for physical limits when allocating"

## Context

Currently, the groupwise quantized int4 linear op implementation forces the scales and zero tensor to be a `Texture3D`. However, for models such as transformers that have a logit linear layer, the required image extents may exceed the maximum image extents available on the device.

## Changes

* Add support for the scales and zero tensor being a `Buffer` instead of a `Texture3D`
* Add checks, when allocating buffers or images for tensors, that the requested resource fits within the physical device limits

Differential Revision: [D72662176](https://our.internmc.facebook.com/intern/diff/D72662176/)

[ghstack-poisoned]
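To make the limit check described in this commit message concrete, here is a minimal C++ sketch. It is not the code from this commit, which changes no files in this view. `DeviceLimits`, `Extents3D`, `StorageType`, `image_fits`, `buffer_fits`, and `pick_storage` are all hypothetical names invented for illustration; the two limit fields only mirror `maxImageDimension3D` and `maxStorageBufferRange` from Vulkan's `VkPhysicalDeviceLimits`. The fallback from texture to buffer in `pick_storage` is also an assumption about how a caller might use such checks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the limits a Vulkan device reports
// (compare VkPhysicalDeviceLimits::maxImageDimension3D and
// VkPhysicalDeviceLimits::maxStorageBufferRange). Not the ExecuTorch API.
struct DeviceLimits {
  uint32_t max_image_dim_3d;
  uint64_t max_storage_buffer_range;  // in bytes
};

// Requested extents of a 3D image, in texels.
struct Extents3D {
  uint32_t x;
  uint32_t y;
  uint32_t z;
};

enum class StorageType { Texture3D, Buffer, None };

// True if every requested extent is within the device's 3D image limit.
inline bool image_fits(const Extents3D& e, const DeviceLimits& lim) {
  return e.x <= lim.max_image_dim_3d && e.y <= lim.max_image_dim_3d &&
         e.z <= lim.max_image_dim_3d;
}

// True if numel elements of elem_size bytes fit in one storage buffer
// binding. Dividing the limit avoids overflow in numel * elem_size.
inline bool buffer_fits(uint64_t numel, uint64_t elem_size,
                        const DeviceLimits& lim) {
  assert(elem_size > 0);
  return numel <= lim.max_storage_buffer_range / elem_size;
}

// Assumed policy: prefer a texture, fall back to a buffer, and report
// None when neither resource fits within the device limits.
inline StorageType pick_storage(const Extents3D& e, uint64_t numel,
                                uint64_t elem_size, const DeviceLimits& lim) {
  if (image_fits(e, lim)) {
    return StorageType::Texture3D;
  }
  if (buffer_fits(numel, elem_size, lim)) {
    return StorageType::Buffer;
  }
  return StorageType::None;
}
```

With a hypothetical device whose 3D image limit is 2048 texels per axis, a scales tensor that would need a 4096-wide image fails `image_fits` but can still be placed in a buffer as long as its byte size stays under the storage buffer limit.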

2 parents 8794595 + 99e2b64 commit a5d888c

0 files changed, 0 lines changed
