Update on "[ET-VK][ez] Add support for buffer backed qparams in int4 linear + add checks for physical limits when allocating"
## Context
Currently, the groupwise quantized int4 linear op implementation forces the scales and zero tensor to be a `Texture3D`. However, for some models, e.g. transformer models that have a logit linear layer, the required image extents may exceed the maximum image extents available on the device.
## Changes
* Add support for the scales and zero tensor being a `Buffer` instead of a `Texture3D`
* Add checks, when allocating buffers or images for tensors, that the requested resource fits within the physical device limits
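
As a rough illustration of the kind of limit check described above, here is a minimal, self-contained sketch. It is not the code from this diff: `DeviceLimits`, `image_fits`, `buffer_fits`, and `pick_qparams_storage` are hypothetical names, and the two limit fields only mirror the Vulkan `VkPhysicalDeviceLimits` members `maxImageDimension3D` and `maxStorageBufferRange`. Whether the actual change selects the storage type automatically is not stated here; the fallback helper is an assumption made for the example.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for the subset of VkPhysicalDeviceLimits used here.
struct DeviceLimits {
  uint32_t max_image_dimension_3d;    // cf. maxImageDimension3D
  uint64_t max_storage_buffer_range;  // cf. maxStorageBufferRange
};

// Hypothetical storage kinds for the scales and zero tensor.
enum class StorageType { Buffer, Texture3D };

// True if every image extent is non-zero and within the 3D image limit.
bool image_fits(
    const std::array<uint32_t, 3>& extents,
    const DeviceLimits& limits) {
  for (uint32_t extent : extents) {
    if (extent == 0 || extent > limits.max_image_dimension_3d) {
      return false;
    }
  }
  return true;
}

// True if the requested byte size is non-zero and within the buffer limit.
bool buffer_fits(uint64_t nbytes, const DeviceLimits& limits) {
  return nbytes > 0 && nbytes <= limits.max_storage_buffer_range;
}

// Assumed policy for illustration only: use a texture when the extents fit,
// otherwise fall back to a buffer.
StorageType pick_qparams_storage(
    const std::array<uint32_t, 3>& extents,
    const DeviceLimits& limits) {
  return image_fits(extents, limits) ? StorageType::Texture3D
                                     : StorageType::Buffer;
}
```

With a made-up 3D image limit of 2048, extents of `{32000, 1, 1}` (for instance, one dimension sized by a large vocabulary) fail `image_fits`, so the sketch falls back to `StorageType::Buffer`.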
Differential Revision: [D72662176](https://our.internmc.facebook.com/intern/diff/D72662176/)
[ghstack-poisoned]