Show input and output length on vLLM dashboard #6992
Conversation
Signed-off-by: Huy Do <[email protected]>
@huydhn is attempting to deploy a commit to the Meta Open Source Team on Vercel. A member of the Team first needs to authorize it.
For latency and throughput benchmarks, input and output length aren't set directly, but from what I see …
Signed-off-by: Huy Do <[email protected]>
One last note here is that input and output lengths are set in the latency and serving benchmarks, but not throughput. I tried to look into the default that it uses, but there is none for the ShareGPT dataset (https://github.com/vllm-project/vllm/blob/main/vllm/benchmarks/datasets.py#L402). A few other datasets, like Random or Sonnet, define input and output lengths, but ShareGPT does not. I will keep these fields empty for the throughput benchmark for now.
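
As a rough sketch of that fallback (the record shape and field names here are hypothetical, not the dashboard's actual schema), the new columns can simply render blank when a run carries no lengths:

```typescript
// Sketch only: a hypothetical record shape, not the real benchmark schema.
interface BenchmarkRecord {
  suite: "latency" | "serving" | "throughput";
  dataset: string;
  inputLen?: number; // left unset for throughput runs on ShareGPT
  outputLen?: number;
}

// Render the new columns, leaving them blank when the benchmark
// (e.g. throughput on ShareGPT) does not define input/output lengths.
function formatLengths(r: BenchmarkRecord): string {
  const fmt = (v?: number) => (v !== undefined ? String(v) : "");
  return `${fmt(r.inputLen)} / ${fmt(r.outputLen)}`;
}
```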
Fixes #6974
Input and output lengths are new dimensions on the dashboard that need to be displayed after pytorch/pytorch-integration-testing#42. This PR also cleans up some old TODO code paths for the vLLM dashboard.
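
A minimal sketch of what treating the lengths as dashboard dimensions could look like (the `Result` shape and field names are assumptions for illustration, not the PR's actual code):

```typescript
// Sketch: group results by an extended dimension key. The field names
// (model, device, inputLen, outputLen) are assumptions for illustration.
interface Result {
  model: string;
  device: string;
  inputLen?: number;
  outputLen?: number;
  metric: number;
}

function groupByDimensions(results: Result[]): Map<string, Result[]> {
  const groups = new Map<string, Result[]>();
  for (const r of results) {
    // An empty string stands in for "length not set" (throughput + ShareGPT).
    const key = [r.model, r.device, r.inputLen ?? "", r.outputLen ?? ""].join("|");
    const bucket = groups.get(key) ?? [];
    bucket.push(r);
    groups.set(key, bucket);
  }
  return groups;
}
```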
Testing
Different input and output lengths now show up correctly with their benchmark results on the preview.