Please bring NVFP4 training support over from NVIDIA/TransformerEngine; see https://github.com/NVIDIA/TransformerEngine/pull/2177 for the upstream change.