A method for code indexing using the local model qwen3-embedding-8b #1411
planb788 started this conversation in 4. Show and tell
Because Kilo Code does not officially support the Ollama version of qwen3-embedding-8b, I used AI to write code that serves a GGUF model behind a simulated OpenAI-compatible API. The attached compressed package contains the code file and its dependency files. Download the package to your local machine and extract it, then edit `openai_embedding_api.py`: set `GGUF_MODEL_PATH` to the absolute path of your local `Qwen3-Embedding-8B-Q8_0.gguf` file (you need to download the model yourself from the official Qwen model repository), change `"your-secret-key"` to your own key, and save. Next, open a terminal in the code directory and install the required dependencies:
```bash
pip install -r requirements.txt
```

Then start the service:

```bash
python openai_embedding_api.py
```
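For reference, the simulated API only needs to implement the `/v1/embeddings` route that Kilo Code calls. Below is a minimal sketch of what such a server can look like, assuming FastAPI, uvicorn, and llama-cpp-python; the actual code in the archive may differ in details such as pooling configuration and batching.

```python
# Minimal sketch of an OpenAI-compatible embedding server (hypothetical;
# the archive's actual implementation may differ).
# Assumed dependencies: pip install fastapi uvicorn llama-cpp-python
from fastapi import Depends, FastAPI, Header, HTTPException
from llama_cpp import Llama
from pydantic import BaseModel

GGUF_MODEL_PATH = "/absolute/path/to/Qwen3-Embedding-8B-Q8_0.gguf"  # edit me
API_KEY = "your-secret-key"  # edit me

# embedding=True tells llama.cpp to compute embeddings; depending on the
# GGUF metadata you may also need to set pooling_type explicitly.
llm = Llama(model_path=GGUF_MODEL_PATH, embedding=True, verbose=False)

app = FastAPI()

class EmbeddingRequest(BaseModel):
    model: str
    input: str | list[str]

def check_key(authorization: str = Header(default="")) -> None:
    # OpenAI-style clients send "Authorization: Bearer <key>".
    if authorization != f"Bearer {API_KEY}":
        raise HTTPException(status_code=401, detail="invalid api key")

@app.post("/v1/embeddings")
def embeddings(req: EmbeddingRequest, _: None = Depends(check_key)) -> dict:
    texts = [req.input] if isinstance(req.input, str) else req.input
    data = [
        # llm.embed returns one vector per input text (4096 floats here).
        {"object": "embedding", "index": i, "embedding": llm.embed(t)}
        for i, t in enumerate(texts)
    ]
    return {"object": "list", "data": data, "model": req.model,
            "usage": {"prompt_tokens": 0, "total_tokens": 0}}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)
```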
You can then use the embedding model for code indexing. In Kilo Code, click the cylindrical icon (shaped like three stacked disks) at the bottom, set the base_url to http://127.0.0.1:8000/v1, set the model dimension to 4096, enter qwen3-embedding-8b as the model name, and fill in the key you just set. Click save, then click start indexing. I am not responsible for guiding the usage of Qdrant.

Attachment: qwensimulateopenai.zip
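Before starting indexing, you can sanity-check the running service with the official openai Python client (`pip install openai`); the base_url, key, and model name below are the values configured above:

```python
from openai import OpenAI

# Point the client at the local service instead of api.openai.com.
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="your-secret-key")
resp = client.embeddings.create(model="qwen3-embedding-8b", input="def hello(): pass")
print(len(resp.data[0].embedding))  # expect 4096 for Qwen3-Embedding-8B
```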