Add RAG search for Ask With AI with project context #56

igardev · 2025-05-04T12:14:26Z

Press Ctrl+Shift+; or select "Ask with AI with project context" from the llama.vscode menu, enter the question, and press Enter.
The extension searches the project for chunks of text that are close to the query and sends the top 5 to the AI together with the query.

The chunks are created when the project is opened. They are kept in memory only and are lost when VS Code is closed.
The chunks are filtered in 2 steps: first with BM25 (the keywords are extracted with a REST request to the chat model), and then the BM25 result is filtered by comparing the embeddings of the chunks with the embedding of the query.
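The two filtering steps can be pictured with a minimal sketch like the one below. It is not the extension's actual code: the `Chunk` shape, the function names, and the `bm25Keep` and `topK` parameters are hypothetical, the embeddings are assumed to be already computed, and the BM25 constants are the commonly used defaults rather than values taken from this PR.

```typescript
// Hypothetical chunk shape; the extension's real data structures may differ.
interface Chunk {
  id: number;
  text: string;
  embedding: number[]; // assumed to be precomputed by the embeddings server
}

const tokenize = (s: string): string[] =>
  s.toLowerCase().match(/[a-z0-9_]+/g) ?? [];

// Step 1: score every chunk against the keywords with Okapi BM25.
function bm25Scores(chunks: Chunk[], keywords: string[], k1 = 1.2, b = 0.75): number[] {
  const docs = chunks.map((c) => tokenize(c.text));
  const avgLen = docs.reduce((n, d) => n + d.length, 0) / Math.max(docs.length, 1);
  const terms = new Set(keywords.map((k) => k.toLowerCase()));
  return docs.map((doc) => {
    let score = 0;
    for (const term of terms) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue; // term absent from this chunk
      const df = docs.filter((d) => d.includes(term)).length;
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * doc.length) / avgLen));
    }
    return score;
  });
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Step 2: keep the best BM25 candidates, then re-rank them by embedding similarity.
function retrieve(
  chunks: Chunk[],
  keywords: string[],
  queryEmbedding: number[],
  bm25Keep: number,
  topK: number
): Chunk[] {
  const scores = bm25Scores(chunks, keywords);
  const candidates = chunks
    .map((chunk, i) => ({ chunk, score: scores[i] }))
    .filter((c) => c.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, bm25Keep)
    .map((c) => c.chunk);
  return candidates
    .map((chunk) => ({ chunk, sim: cosine(chunk.embedding, queryEmbedding) }))
    .sort((x, y) => y.sim - x.sim)
    .slice(0, topK)
    .map((c) => c.chunk);
}
```

Running the cheap keyword filter first means that only the surviving candidates need an embedding comparison, which is presumably why the steps are ordered this way.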

An embeddings server (property endpoint_embeddings, default http://127.0.0.1:8010) and a chat server (property endpoint_chat, default http://127.0.0.1:8011) must be running to use this functionality. Tested with all-MiniLM-L6-v2 - a very small embedding model.
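One of the later commits in this PR notes that when the embeddings endpoint is not available, the extension shows a message and uses only BM25 filtering. A rough sketch of that fallback decision is shown below. All names here (`RankedChunk`, `Selection`, `selectChunks`) and the warning text are hypothetical, and the sketch assumes the caller has already tried the embeddings request and passes `null` when it failed.

```typescript
// Hypothetical types and names, used only for this sketch.
interface RankedChunk {
  text: string;
  bm25: number; // BM25 score from the keyword step
}

interface Selection {
  chunks: RankedChunk[];
  warning?: string; // message to show to the user, if any
}

// similarities: one value per candidate from the embeddings step, or null
// when the embeddings endpoint could not be reached.
function selectChunks(
  candidates: RankedChunk[],
  similarities: number[] | null,
  topK: number
): Selection {
  if (similarities === null || similarities.length !== candidates.length) {
    // Fallback: no usable embeddings, so rank by BM25 score alone.
    const byBm25 = [...candidates].sort((x, y) => y.bm25 - x.bm25);
    return {
      chunks: byBm25.slice(0, topK),
      warning: "Embeddings server is not available; using BM25 filtering only.",
    };
  }
  // Normal path: rank the BM25 candidates by embedding similarity.
  const ranked = candidates
    .map((chunk, i) => ({ chunk, sim: similarities[i] }))
    .sort((x, y) => y.sim - x.sim)
    .slice(0, topK)
    .map((r) => r.chunk);
  return { chunks: ranked };
}
```

Returning the warning alongside the chunks keeps the ranking logic free of UI calls, so the caller can decide how to display the message.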

The rag_* properties can be configured to fine-tune the RAG search.

A separate pull request for the webui will improve the user experience: the request will be sent immediately, with no need to click the Send button in the webui.

igardev and others added 13 commits May 4, 2025 14:57

Add RAG search for Ask With AI with project context

325bec8

Remove duplicated call for getting context.

f7a5136

Chat with project supports providing files as context with @ prefix (e.g. @test.cpp)

498d484

Reindex files if rag settings are changed

fe4d547

Add menu item for starting embedding server on mac.

beab2fd

Improve excluding the files from .gitignore; reduce the memory usage by the BM25 algorithm.

75128e3

Update file chunks on save improvement, progress bar for calculating embeddings for RAG search.

dbd21f4

Add prefix llama-vscode for the shortcut commands. This way it is easier to filter them.

a73e9ee

Removed sending extra context chunks to the chat server. Show error in case of problem with embeddings server. If embeddings server endpoint is not available - shows message and uses only BM25 filtering.

0c7c356

Typing error fix in translations

93b8112

style : fix whitespaces + disable extra context for chat edit

0fd7b66

config : adjust params

c3cf5ea

menu : fix embedding commands

286783c

ggerganov merged commit d375aed into master May 9, 2025

ggerganov deleted the chat-with-project-context branch May 9, 2025 09:35
