Update base for Update on "[ET-VK] Fast path for choose_qparams"
The current implementations of `choose_qparams` are too slow to be practically usable.
As a temporary workaround to unblock LLM optimizations, this diff/PR introduces a fast path for computing per-channel quantization parameters for 2D matrices, implemented as the `choose_qparams_per_row` shader.
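
For reviewers unfamiliar with the operation, here is a rough CPU reference of what per-row parameter selection for a 2D matrix typically computes. It is a sketch only, not the shader code: it assumes asymmetric affine quantization with one (scale, zero_point) pair per row, and the signature, the default int8 range, and the `eps` guard are illustrative assumptions rather than details taken from this diff.

```python
from typing import List, Sequence, Tuple


def choose_qparams_per_row(
    matrix: Sequence[Sequence[float]],
    quant_min: int = -128,   # assumed int8 range; not specified in this diff
    quant_max: int = 127,
    eps: float = 1e-9,       # hypothetical guard against a zero scale
) -> Tuple[List[float], List[int]]:
    """Return one (scale, zero_point) pair per row of a 2D matrix."""
    scales: List[float] = []
    zero_points: List[int] = []
    for row in matrix:
        # Extend the range to include 0.0 so that zero is exactly representable.
        lo = min(min(row), 0.0)
        hi = max(max(row), 0.0)
        # Map [lo, hi] onto [quant_min, quant_max].
        scale = max((hi - lo) / (quant_max - quant_min), eps)
        # The zero point is the quantized value that represents 0.0.
        zero_point = quant_min - round(lo / scale)
        zero_point = max(quant_min, min(quant_max, zero_point))
        scales.append(scale)
        zero_points.append(int(zero_point))
    return scales, zero_points


if __name__ == "__main__":
    weights = [[0.0, 255.0], [-1.0, 3.0]]
    print(choose_qparams_per_row(weights, quant_min=0, quant_max=255))
```

Each row needs only a min/max reduction followed by a few scalar operations, so rows are independent of each other and can be processed in parallel.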
Differential Revision: [D81800024](https://our.internmc.facebook.com/intern/diff/D81800024/)
[ghstack-poisoned]