
server : implement /api/version endpoint for ollama compatibility (#15167 ) #15177


Open
albert-polak wants to merge 2 commits into master

Conversation

@albert-polak commented Aug 8, 2025

This PR implements a minimal /api/version endpoint to make llama.cpp compatible with tools expecting the Ollama API, such as the Copilot Chat VS Code extension.

Fixes #15167
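For context, Ollama's /api/version endpoint returns a single JSON object containing a version string. Below is a minimal, self-contained sketch of such a handler, assuming the cpp-httplib and nlohmann::json stack that llama-server already uses; the standalone main() and the returned version string are illustrative placeholders, not the exact code in this PR.

```cpp
// Minimal sketch of an Ollama-style /api/version handler.
// NOTE: standalone example only; the real server registers handlers on its
// existing httplib::Server instance, and the version string here is a placeholder.
#include "httplib.h"            // cpp-httplib, already a llama-server dependency
#include <nlohmann/json.hpp>    // nlohmann::json, already a llama-server dependency

int main() {
    httplib::Server svr;

    // GET /api/version -> {"version": "..."}, matching the Ollama response shape.
    svr.Get("/api/version", [](const httplib::Request &, httplib::Response & res) {
        nlohmann::json body = { { "version", "0.0.0" } };   // placeholder value
        res.set_content(body.dump(), "application/json");
    });

    svr.listen("127.0.0.1", 8080);
    return 0;
}
```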

@65a (Contributor) commented Aug 9, 2025

Drive-by comment, not an approver. Maybe we should return the actual llama.cpp version on this endpoint, and have a generic LLAMA_API_VERSION_OVERRIDE env var for cases where it's necessary to return specific values?
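A rough sketch of how that override could be wired in, assuming the LLAMA_API_VERSION_OVERRIDE name proposed here; get_version_string is a hypothetical helper, not existing llama.cpp code:

```cpp
// Sketch only: env-var override for the reported version.
// get_version_string is a hypothetical helper; LLAMA_API_VERSION_OVERRIDE is the
// env var name suggested in this comment, not something llama.cpp currently reads.
#include <cstdlib>
#include <string>

static std::string get_version_string(const std::string & build_version) {
    // If the override is set, report exactly that (for clients expecting a
    // specific Ollama-style version); otherwise report the real build version.
    if (const char * override_value = std::getenv("LLAMA_API_VERSION_OVERRIDE")) {
        return override_value;
    }
    return build_version;   // e.g. "6121" from the llama.cpp build number
}
```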

@Green-Sky (Collaborator):

> Drive-by comment, not an approver. Maybe we should return the actual llama.cpp version on this endpoint, and have a generic LLAMA_API_VERSION_OVERRIDE env var for cases where it's necessary to return specific values?

I think so too.
I also don't really see the point in faking/pretending to be ollama by default.

@albert-polak (Author):

> Drive-by comment, not an approver. Maybe we should return the actual llama.cpp version on this endpoint, and have a generic LLAMA_API_VERSION_OVERRIDE env var for cases where it's necessary to return specific values?
>
> I think so too. I also don't really see the point in faking/pretending to be ollama by default.

Yeah, that is probably a better idea, but the llama.cpp versioning convention probably doesn't follow the Ollama one.

[image]

Would you suggest splitting it manually by inserting dots?

@65a (Contributor) commented Aug 11, 2025

In the example, I'd return 6121 unless overridden, I guess.

@albert-polak (Author):

> Drive-by comment, not an approver. Maybe we should return the actual llama.cpp version on this endpoint, and have a generic LLAMA_API_VERSION_OVERRIDE env var for cases where it's necessary to return specific values?
>
> I think so too. I also don't really see the point in faking/pretending to be ollama by default.
>
> Yeah, that is probably a better idea, but the llama.cpp versioning convention probably doesn't follow the Ollama one.
>
> [image] Would you suggest splitting it manually by inserting dots?

It actually does comply with the Ollama versioning; it is treated as 6121.0.0. Committed some changes.

@ngxson (Collaborator) commented Aug 11, 2025

If it's purely for compatibility, why don't we hard-code the version number to something like 99.99.99.99?

Tbh I don't feel confident spending a lot of code just to match a short-lived integration. VSCode will eventually have OAI-compat support; the Ollama-compat path is currently a short-term solution.

@ngxson (Collaborator) commented Aug 11, 2025

> Drive-by comment, not an approver. Maybe we should return the actual llama.cpp version on this endpoint, and have a generic LLAMA_API_VERSION_OVERRIDE env var for cases where it's necessary to return specific values?

What's the use case? Does any downstream app check for this version? And even if it checks, does an incorrect version number block you from doing certain things?

@albert-polak (Author):

> Drive-by comment, not an approver. Maybe we should return the actual llama.cpp version on this endpoint, and have a generic LLAMA_API_VERSION_OVERRIDE env var for cases where it's necessary to return specific values?
>
> What's the use case? Does any downstream app check for this version? And even if it checks, does an incorrect version number block you from doing certain things?

That's exactly right: if the endpoint isn't there, the VS Code Copilot Chat extension can't get the model list due to a certain commit (linked in issue #15167). It's connected to PR #12896. But just returning the llama.cpp build version works, as I commented above. It gets treated as 6121.0.0, which I don't think will ever be surpassed.
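For illustration, a client doing a semver-style minimum-version check typically pads missing components with zeros, so a bare build number like "6121" compares as 6121.0.0 and clears any realistic minimum. A small sketch of that comparison logic (not the Copilot Chat extension's actual code; the minimum value below is just an example):

```cpp
// Illustrative semver-style comparison; this is NOT the Copilot Chat extension's
// actual check, just a sketch of how "6121" ends up being read as 6121.0.0.
#include <array>
#include <cstdio>
#include <sstream>
#include <string>

static std::array<int, 3> parse_version(const std::string & s) {
    std::array<int, 3> parts = {0, 0, 0};   // missing components default to 0
    std::istringstream in(s);
    std::string tok;
    for (size_t i = 0; i < 3 && std::getline(in, tok, '.'); ++i) {
        parts[i] = std::stoi(tok);
    }
    return parts;
}

int main() {
    const auto server  = parse_version("6121");   // llama.cpp build number -> {6121, 0, 0}
    const auto minimum = parse_version("0.6.4");  // example minimum a client might require
    std::printf("server >= minimum: %s\n", server >= minimum ? "yes" : "no"); // prints "yes"
    return 0;
}
```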

@Green-Sky (Collaborator):

ggml presents its version as 0.0.xxxx.

@albert-polak (Author):

> ggml presents its version as 0.0.xxxx.

The llama.cpp build version, as shown by ./llama-cli --version, is being treated as "build_version.0.0".

Successfully merging this pull request may close these issues.

Misc. bug: VSCode copilot chat now asks for a minimum version
4 participants