- [ ] sm100a matmul kernels: BF16/FP16, NVFP4 (from the competition)
- [ ] FP8 kernel for 5090 using mxfp8 instruction