[Feature]: AutoDeploy: investigate+optimize comm collectives in TEP mode #11068

@lucaslie

Description

🚀 The feature, motivation and pitch

Currently, we have one all_reduce in TEP mode (TP for attention, TP+EP in MoE). Let's investigate the current optimal configuration in the PT manual backend to better understand whether our TEP performance is on par with the current PT backend.

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Labels

  • AutoDeploy: <NV> AutoDeploy Backend
  • feature request: New feature or request. This includes new model, dtype, functionality support

Projects

Status

Rejected

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
