Summary:
Pull Request resolved: #4958
X-link: https://github.com/facebookresearch/FBGEMM/pull/1979
We upgrade the FP4 grouped kernel with a new API, `f4f4bf16_grouped_mm`, that can be used from torch and vLLM. The kernel supports both MX and NV FP4, selected by the scale dtype (E8M0 for MX, E4M3 for NV). The API largely matches the existing one we added for MXFP8. We also add unit tests for these new APIs.
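As a rough illustration of the scale-dtype dispatch described above, here is a minimal Python sketch. It is not the kernel's code: `Fp4Recipe`, `select_fp4_recipe`, and the string dtype labels are hypothetical names chosen for this example, and the real signature of `f4f4bf16_grouped_mm` is not shown here.

```python
from enum import Enum


class Fp4Recipe(Enum):
    """Hypothetical labels for the two FP4 scaling flavors."""
    MXFP4 = "mxfp4"  # block scales stored as E8M0
    NVFP4 = "nvfp4"  # block scales stored as E4M3


# Assumption: dtypes are represented as plain strings for illustration.
# The actual kernel would inspect the dtype of the scale tensors instead.
_SCALE_DTYPE_TO_RECIPE = {
    "float8_e8m0fnu": Fp4Recipe.MXFP4,
    "float8_e4m3fn": Fp4Recipe.NVFP4,
}


def select_fp4_recipe(scale_dtype: str) -> Fp4Recipe:
    """Pick the FP4 flavor from the scale dtype, rejecting anything else."""
    try:
        return _SCALE_DTYPE_TO_RECIPE[scale_dtype]
    except KeyError:
        raise ValueError(
            f"unsupported scale dtype for FP4 grouped mm: {scale_dtype!r}"
        ) from None
```

Keying the choice on the scale dtype lets a single entry point serve both formats without an extra mode flag, which is one plausible reason the API can mirror the MXFP8 one.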
Next steps:
- Full re-tune of the kernel
- Add other layouts to better support the backward pass
Reviewed By: q10
Differential Revision: D83171662
fbshipit-source-id: ba0abe5e1adf151e1b98b19b0abb14c9325d7966