CUDA: Accelerate MXFP4 table lookup using __byte_perm (#15451)
#234
server.yml
on: push
server-windows
7m 58s
Matrix: server