rgerganov
Contributor

Summary

Add documentation on how to run GuideLLM against a llama.cpp server.

Details

GuideLLM can run against a llama.cpp server when the model metadata is prefetched and the server is started with the right arguments.

Test Plan

Verified that GuideLLM runs successfully when following the documented steps.

Related Issues

n/a


  • [x] "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • [ ] Includes AI-assisted code completion
  • [ ] Includes code generated by an AI application
  • [ ] Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes `## WRITTEN BY AI ##`)

GuideLLM can run against a llama.cpp server when the model metadata is
prefetched and the server is started with the right arguments.

Signed-off-by: Radoslav Gerganov <[email protected]>
Collaborator

@sjmonson sjmonson left a comment

Good addition

@sjmonson sjmonson merged commit 82ab6bf into vllm-project:main Sep 11, 2025
17 checks passed
tukwila pushed a commit to tukwila/guidellm that referenced this pull request Sep 17, 2025