Commit 83a1f60
feat: Expose bias and FP8_MXFP4 MOE CUTLASS backend features to pytorch (NVIDIA#5410)
Signed-off-by: Daniel Stokes <[email protected]>
1 parent ef43b95 commit 83a1f60
File tree
5 files changed, +165 -37 lines changed
- cpp/tensorrt_llm/thop
- tensorrt_llm/_torch
  - auto_deploy/custom_ops
  - custom_ops
  - modules/fused_moe