@TFWol For code completions you will need a llama.cpp server. The reason is that llama.vscode uses the /infill endpoint for better performance on local machines; as far as I know, the other providers don't offer an /infill endpoint.
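
To illustrate what the extension relies on, here is a minimal sketch of an /infill request (assuming llama-server is running locally on port 8080 with a fill-in-the-middle-capable model such as Qwen2.5-Coder; the exact payload llama.vscode sends may differ):

```python
# Minimal sketch of a fill-in-the-middle request against llama-server's
# /infill endpoint. Assumes a local server started with something like:
#   llama-server -m model.gguf --port 8080
import json
import urllib.request

payload = {
    "input_prefix": "def add(a, b):\n    ",    # code before the cursor
    "input_suffix": "\n\nprint(add(1, 2))\n",  # code after the cursor
    "n_predict": 32,                           # cap the completion length
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/infill",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    # The server returns the suggested infill text in the "content" field.
    print(json.loads(resp.read())["content"])
```

Unlike a plain /completion call, /infill lets the server see both the code before and after the cursor, which is why completions are faster and more relevant for local models.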
For chat or agent (tools) functionality, llama.cpp is not mandatory; any OpenAI-compatible API should work. (Don't use llama-vscode.use_openai_endpoint; I have to remove it. It is for completion, but it is very slow.)
Currently the documentation is here. I know it is not enough; I will try to improve it.

How to make it work for chat-related functionality

  1. Set the property endpoint_chat to an endpoint that speaks the OpenAI API, for example https://openrouter.ai/api (for OpenRouter); see the settings sketch below…
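
As a sketch, the setting might look like this in VS Code's settings.json. The llama-vscode. prefix is an assumption based on the llama-vscode.use_openai_endpoint setting named above; check the extension's settings UI for the exact key:

```jsonc
// Hypothetical settings.json entry, assuming the property lives under
// the llama-vscode namespace.
{
  "llama-vscode.endpoint_chat": "https://openrouter.ai/api"
}
```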

Answer selected by TFWol