Commit d84ce33
[feat] Sharding logic split to pattern detection and executor for EP and BMM (fixes NVIDIA#5916) (#94)
* Updated tests
* Fixed TP sharding bug
* Fixed sharding tests
* Fixed sharding tests 1.1
* Import fix

Signed-off-by: greg-kwasniewski1 <[email protected]>
Signed-off-by: Lucas Liebenwein <[email protected]>
Co-authored-by: Lucas Liebenwein <[email protected]>
Parent commit: 4d89913

File tree

7 files changed: +455 −261 lines changed

tensorrt_llm/_torch/auto_deploy/transformations/library/__init__.py

Lines changed: 0 additions & 1 deletion

@@ -3,7 +3,6 @@
 from .attention import *
 from .collectives import *
 from .eliminate_redundant_transposes import *
-from .ep_sharding import *
 from .fused_moe import *
 from .fusion import *
 from .kvcache import *

tensorrt_llm/_torch/auto_deploy/transformations/library/ep_sharding.py

Lines changed: 0 additions & 144 deletions
This file was deleted.
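The commit title describes splitting sharding logic into a pattern-detection phase and an executor phase for EP (expert parallelism) and BMM ops. A minimal hypothetical sketch of that two-phase structure is below; the names (`ShardingMatch`, `detect_shards`, `execute_shards`) and the string-based "graph" are illustrative assumptions, not TensorRT-LLM APIs.

```python
# Hypothetical sketch of a detection/executor split for graph sharding.
# Phase 1 scans the graph and records matches without mutating anything;
# phase 2 applies all recorded transforms in one place. Names are
# illustrative only, not actual TensorRT-LLM auto_deploy APIs.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ShardingMatch:
    """One detected opportunity to shard a node (e.g. an EP or BMM op)."""
    node_name: str
    kind: str                 # "ep" or "bmm"
    apply: Callable[[], str]  # deferred transform for this node


def detect_shards(graph: List[str]) -> List[ShardingMatch]:
    """Detection pass: find shardable nodes, record them, mutate nothing."""
    matches: List[ShardingMatch] = []
    for node in graph:
        if node.startswith("moe"):
            matches.append(
                ShardingMatch(node, "ep", lambda n=node: f"{n}[ep-sharded]"))
        elif node.startswith("bmm"):
            matches.append(
                ShardingMatch(node, "bmm", lambda n=node: f"{n}[bmm-sharded]"))
    return matches


def execute_shards(graph: List[str], matches: List[ShardingMatch]) -> List[str]:
    """Executor pass: apply every recorded transform to the graph."""
    sharded = {m.node_name: m.apply() for m in matches}
    return [sharded.get(node, node) for node in graph]


graph = ["embed", "moe.experts.0", "bmm.proj", "norm"]
matches = detect_shards(graph)
print(execute_shards(graph, matches))
# → ['embed', 'moe.experts.0[ep-sharded]', 'bmm.proj[bmm-sharded]', 'norm']
```

Keeping detection separate from execution lets the two sharding flavors (EP and BMM) share one executor while each contributes its own matcher, which is the kind of consolidation the deletion of the standalone `ep_sharding.py` module suggests.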
