-
Notifications
You must be signed in to change notification settings - Fork 119
Add multimodal benchmarking usage docs #568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Mark Kurtz <[email protected]>
Signed-off-by: Mark Kurtz <[email protected]>
Signed-off-by: michelia <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
125b659 to
2d7503b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds documentation for benchmarking multimodal models (image, video, and audio) with GuideLLM using OpenAI-compatible endpoints, and links these guides into the broader docs navigation.
Changes:
- Added dedicated guides for image, video, and audio benchmarking, covering setup, data loading, request formatting, metrics, and example
guidellm benchmarkcommands. - Introduced a multimodal benchmarking index page that explains prerequisites and links to each modality-specific guide.
- Updated the main Guides index to surface the new multimodal benchmarking docs.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
docs/guides/multimodal/image.md |
New guide for benchmarking vision-language models with image inputs, including data-column mapping, request formatting, metrics, and VQA/captioning examples. |
docs/guides/multimodal/video.md |
New guide for benchmarking video-language models with video inputs, including data loading, video request formatting, metrics, and QA/captioning examples. |
docs/guides/multimodal/audio.md |
New guide for benchmarking audio models for ASR, translation, and audio chat, with detailed encoder options, metrics, and three example benchmark commands. |
docs/guides/multimodal/index.md |
New multimodal overview page describing prerequisites and linking to image, video, and audio benchmarking guides. |
docs/guides/index.md |
Adds a “Multimodal Benchmarking” card that links the main Guides index to the new multimodal documentation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Co-authored-by: Copilot <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Co-authored-by: Copilot <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Co-authored-by: Copilot <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Co-authored-by: Copilot <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
sjmonson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall just a few nits around explaining arguments. There is also a lot of redundancy around explaining column mapping, output files, etc that would possible be better to have dedicated docs for with backlinks, but we can address that in a future PR.
Co-authored-by: Samuel Monson <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Co-authored-by: Samuel Monson <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Co-authored-by: Samuel Monson <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Co-authored-by: Samuel Monson <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Co-authored-by: Samuel Monson <[email protected]> Signed-off-by: Mark Kurtz <[email protected]>
Summary
This PR adds documentation for benchmarking multimodal models (image, video, and audio) with GuideLLM using OpenAI-compatible endpoints, and links these guides into the broader docs navigation.
Details
Test Plan
Use of AI
## WRITTEN BY AI ##)