Investigate NanoV3 perf sweep with tp=2,4,8 #9917

@galagam

Description

The initial perf sweep yields poor results. Investigate the reason.

Image

The vLLM sweep looks fine (tp=8 is broken on vLLM ToT).
Image

Metadata

Assignees

Labels

AutoDeploy: <NV> AutoDeploy Backend
Scale-out: <NV> Multi-GPU and distributed inference scaling issues, tensor/pipeline/data parallelism

Type

No type

Projects

Status

Ready

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
