
Commit f2fc583

Authored by ohmeow, igardev, and ggerganov
endpoints : add experimental OpenAI support (#16)
* initial openai compatible api endpoint integration
* fix watch
* added openAiClientModel to config; tested with local vllm server
* fixed config and completions to work with FIM models by default
* remove unnecessary try catch
* core : remove repeating suffix of a suggestion + fix speculative FIM (#18)
  * Remove repeating suffix of a suggestion
  * If linesuffix is empty - cut the repeating suffix of the suggestion.
  * If there is a linesuffix, suggest only one line, don't make hidden second request
  * Fix the caching of the future suggestion in case of max inputPrefix length.
* core : disable trimming of suggestions
* release : v0.0.6
* readme : add CPU-only configs
* fixed configuration/settings UI
* fixed conflicts
* fix watch
* fixed
* fixes
* update version
* readme : add example
* core : fix cutting the lines of a suggestion (#22)
  * Fix the problem with cutting the lines of a suggestion after the first one.
  * Remove the less important checks on cutting the suggestion.
* Fix manual trigger without cache + accept always on pressing a Tab (#25)
  * Ensure Ctrl+Shift+L always makes a new request to the servers.
  * If a suggestion is visible - pressing a Tab always accepts it.
* fixed conflicts
* fix watch
* fixed
* fixes
* initial openai compatible api endpoint integration
* added openAiClientModel to config; tested with local vllm server
* fixed config and completions to work with FIM models by default
* fixed
* make api key optional for openai compatible endpoints as well
* updated to work with llama.cpp without api key
* removed this.handleOpenAICompletion() call from prepareLlamaForNextCompletion per @igardev
* updated package-lock.json after build

Co-authored-by: igardev <[email protected]>
Co-authored-by: igardev <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
1 parent 3ae637e commit f2fc583
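
For context, the four settings this commit introduces can be combined in a user's settings.json. A minimal sketch, assuming a local OpenAI-compatible server such as the vllm instance mentioned in the commit message (the model name is the example from the setting's description, and the template is the commit's default):

    {
        "llama-vscode.is_openai_compatible": true,
        "llama-vscode.openai_client_model": "Qwen2.5-Coder-14B-4-bit",
        "llama-vscode.api_key": "",
        "llama-vscode.openai_prompt_template": "<|fim_prefix|>{inputPrefix}{prompt}<|fim_suffix|>{inputSuffix}<|fim_middle|>"
    }

The api_key is left empty here because the commit makes it optional for OpenAI-compatible endpoints as well as for llama.cpp.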

File tree

6 files changed: +379 −91 lines changed

.prettierrc

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+{
+    "printWidth": 140,
+    "tabWidth": 4,
+    "useTabs": false
+}

.vscode/settings.json

Lines changed: 8 additions & 8 deletions
@@ -1,11 +1,11 @@
 // Place your settings in this file to overwrite default and user settings.
 {
-  "files.exclude": {
-    "out": false // set this to true to hide the "out" folder with the compiled JS files
-  },
-  "search.exclude": {
-    "out": true // set this to false to include "out" folder in search results
-  },
-  // Turn off tsc task auto detection since we have the necessary tasks as npm scripts
-  "typescript.tsc.autoDetect": "off"
+    "files.exclude": {
+        "out": false // set this to true to hide the "out" folder with the compiled JS files
+    },
+    "search.exclude": {
+        "out": true // set this to false to include "out" folder in search results
+    },
+    // Turn off tsc task auto detection since we have the necessary tasks as npm scripts
+    "typescript.tsc.autoDetect": "off"
 }

package-lock.json

Lines changed: 179 additions & 3 deletions
Some generated files are not rendered by default.

package.json

Lines changed: 22 additions & 3 deletions
@@ -114,10 +114,25 @@
                     "default": true,
                     "description": "If code completion should be trggered automatically (true) or only by pressing Ctrl+l."
                 },
-                "llama-vscode.api_key": {
+                "llama-vscode.api_key": {
                     "type": "string",
                     "default": "",
-                    "description": "llama.cpp server api key (optional)"
+                    "description": "llama.cpp server api key or OpenAI endpoint API key (optional)"
+                },
+                "llama-vscode.is_openai_compatible": {
+                    "type": "boolean",
+                    "default": false,
+                    "description": "If the server exposes an OpenAI API compatible endpoint."
+                },
+                "llama-vscode.openai_client_model": {
+                    "type": "string",
+                    "default": "",
+                    "description": "The FIM friendly model supported by your OpenAI compatible endpoint to be used (e.g., Qwen2.5-Coder-14B-4-bit)"
+                },
+                "llama-vscode.openai_prompt_template": {
+                    "type": "string",
+                    "default": "<|fim_prefix|>{inputPrefix}{prompt}<|fim_suffix|>{inputSuffix}<|fim_middle|>",
+                    "description": "The prompt template to be used for the OpenAI compatible endpoint."
                 },
                 "llama-vscode.n_prefix": {
                     "type": "number",
@@ -203,8 +218,12 @@
             }
         }
     },
+    "scripts": {
+        "watch": "tsc -watch -p ./"
+    },
     "dependencies": {
-        "axios": "^1.1.2"
+        "axios": "^1.1.2",
+        "openai": "^4.80.1"
     },
     "devDependencies": {
         "@types/jest": "^29.5.14",

0 commit comments
