[RL] Improve vLLM/Generator startup time with cudagraphs and support_torch_compile #2509

@Lucaskabela

Description

As noted in the initial PR #2486, startup time with our vLLMWrapper is considerably longer (about 4x) than with native vLLM. We should investigate where that startup time goes and how to reduce it.
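As a starting point for the investigation, here is a minimal timing sketch for comparing the two startup paths side by side. It is only a sketch: `time_startup` and `slowdown` are hypothetical helper names, and the factories passed in are stand-ins. In practice each factory would wrap whatever constructs the native vLLM engine and the vLLMWrapper generator, which this sketch does not assume anything about.

```python
import time
from typing import Callable, Dict


def time_startup(
    factories: Dict[str, Callable[[], object]], repeats: int = 1
) -> Dict[str, float]:
    """Return the best wall-clock construction time in seconds per factory.

    Each factory is a zero-argument callable that builds one engine/generator.
    (Hypothetical helper, not part of vLLM or this repo.)
    """
    results: Dict[str, float] = {}
    for name, factory in factories.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            factory()  # construct the object; the result is discarded
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


def slowdown(results: Dict[str, float], baseline: str, candidate: str) -> float:
    """Ratio of candidate startup time to baseline startup time."""
    return results[candidate] / results[baseline]


if __name__ == "__main__":
    # Stand-in factories: replace the sleeps with the real constructors.
    timings = time_startup(
        {
            "native": lambda: time.sleep(0.01),
            "wrapper": lambda: time.sleep(0.04),
        }
    )
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.3f}s")
    print(f"wrapper / native: {slowdown(timings, 'native', 'wrapper'):.1f}x")
```

Running each path in a fresh process would likely give cleaner numbers, since compilation caches and CUDA context creation can carry over between constructions in the same process.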
