Open
Description
Hello @Autumn1998, I'd like to ask some questions about Hybrid-EP.
- I found that Hybrid-EP uses cudaMemcpy to copy data between the user's hidden tensor and the RDMA-registered buffer, in both directions. With the permute version, the cudaMemcpy in postprocessing can be fused, but the one in preprocessing still remains. So I wonder whether Hybrid-EP could use zero copy, like this.
- There seem to be redundant stores here.
Thanks!