Libsql Integration #6021
|
@daniel-lxs |
|
The index file for the Roo-Code codebase is 11 GB, which is too big. I tried the SQLite solution, and the generated index file is only 1.1 GB. I use batch processing, so not everything is loaded into memory at once. This is why I prefer the db solution. Qdrant loads all codebases into memory (including those that have been indexed but are not active). Of course, the shortcomings of the SQLite solution are as you said: for whole-codebase retrieval, all rows need to be queried and scored. Although some results can be cached, performance is still poor in large codebases. In a project with 170,000 blocks, a search takes about 5-7 s. I think this is still acceptable, and performance is better in small codebases: in the Roo-Code codebase, a search takes about 3 s. The advantage of the SQLite solution is its smaller size.
|
|
@NaccOll Thanks a lot for sharing this information. @penberg Could you please help me with the issues highlighted in the PR description? The main file in this PR is src/services/code-index/vector-store/libsql-client.ts. I would really appreciate your help here.
|
@daniel-lxs For the Roo-Code codebase of 65,840 code blocks, I was able to index it in 16 minutes with a final db file size of 1.8 GB. The search function ran in milliseconds. Now the only thing left is to download the .node native module (9 MB) when the extension starts up (after installation).
|
@daniel-lxs @mrubens
After this, libsql can use the native module. I have verified this feature on Ubuntu 24.04 and Windows 11. Here are the screenshots of the output panel logs.
|
|
UPDATE 3:
Last time I said it took me 16 minutes to index Roo-Code. I had created the vector index at the start, which led to frequent vector index updates on insertion. Now I have implemented a new strategy: create the index last, after all the code blocks have been ingested into the db.
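The "index last" strategy can be sketched as an ordered list of SQL statements. This is a minimal illustration, not the code from this PR: the table name `code_blocks`, the index name, the column names, and the `buildIngestPlan` helper are all hypothetical. The SQL relies on libSQL's documented vector features (`F32_BLOB`, `vector32`, `libsql_vector_idx`).

```typescript
// Hypothetical shapes for illustration only.
interface CodeBlock {
	filePath: string
	embedding: number[]
}

interface Statement {
	sql: string
	args: string[]
}

// Build the statements in the order they should run:
// create the table, insert every block, and only then create the vector index,
// so the index is built once instead of being updated on every insert.
function buildIngestPlan(blocks: CodeBlock[], dimensions: number): Statement[] {
	const plan: Statement[] = [
		{
			sql: `CREATE TABLE IF NOT EXISTS code_blocks (file_path TEXT, embedding F32_BLOB(${dimensions}))`,
			args: [],
		},
	]
	for (const block of blocks) {
		plan.push({
			sql: "INSERT INTO code_blocks (file_path, embedding) VALUES (?, vector32(?))",
			// vector32 accepts a text literal such as "[1,2,3]".
			args: [block.filePath, JSON.stringify(block.embedding)],
		})
	}
	plan.push({
		sql: "CREATE INDEX IF NOT EXISTS code_blocks_idx ON code_blocks (libsql_vector_idx(embedding))",
		args: [],
	})
	return plan
}
```

Each statement could then be passed, in order, to whatever client executes SQL against the database file; batching the inserts inside a transaction would be a natural refinement.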
04d6720 to e567132
|
UPDATE 4: |
    return false
} catch (error) {
    console.error(`Failed to initialize vector store table ${this.tableName}:`, error)
    throw new Error(`LibsqlConnectionFailed ${error instanceof Error ? error.message : String(error)}`)
Typo/consistency: The error message throws LibsqlConnectionFailed but elsewhere the naming is LibSQL.... Consider changing it to LibSQLConnectionFailed for consistency.
- throw new Error(`LibsqlConnectionFailed ${error instanceof Error ? error.message : String(error)}`)
+ throw new Error(`LibSQLConnectionFailed ${error instanceof Error ? error.message : String(error)}`)
|
Can you use the project name as the prefix of dbName? This would help identify an unneeded codebase index directly, instead of deleting it through Roo Code, which I will need later when I find I don't have enough disk space. Another suggestion is to move the libsql download into libsql-client.ts. I found that you download libsql when the extension starts, but it is very likely that users do not even need the code index function.
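Both suggestions can be sketched in a few lines. Everything here is hypothetical and not taken from the PR: `projectDbName` shows one way to prefix the database file name with the project folder name (plus a short hash of the full path to keep names distinct), and `lazy` shows how an expensive step such as the native module download could be deferred until the code index feature is first used.

```typescript
// Simple non-cryptographic string hash (djb2 style), returned as 8 hex characters.
function shortHash(text: string): string {
	let hash = 5381
	for (let i = 0; i < text.length; i++) {
		hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0
	}
	return ("00000000" + hash.toString(16)).slice(-8)
}

// Hypothetical naming scheme: "<project folder>-<hash of full path>.db".
function projectDbName(workspacePath: string): string {
	const segments = workspacePath.split(/[\\/]/).filter((s) => s.length > 0)
	const projectName = segments.length > 0 ? segments[segments.length - 1] : "workspace"
	// Replace characters that are awkward in file names.
	const safeName = projectName.replace(/[^A-Za-z0-9._-]/g, "_")
	return `${safeName}-${shortHash(workspacePath)}.db`
}

// Run `load` on first call only and reuse its result afterwards.
// With a promise-returning loader, this defers a download until it is needed.
function lazy<T>(load: () => T): () => T {
	let loaded = false
	let value: T | undefined
	return () => {
		if (!loaded) {
			value = load()
			loaded = true
		}
		return value as T
	}
}
```

With a scheme like this, a user browsing the storage directory could see which file belongs to which project, and the download would only happen when the vector store is first initialized.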
|
Hey @PaperBoardOfficial, thanks for putting this together. After reviewing the changes, we believe the implementation adds a fair amount of complexity and could impact resource usage. I'm open to further discussion, but for now I'm going to close this PR. That doesn't mean we don't appreciate the time and effort you put into it, it's just that this particular approach doesn't align with our current roadmap for the project. Thank you again. |
|
@daniel-lxs Since you said you're open to chatting more, maybe we vibe-check this with tight benchmarks or a scoped-down test? Just to see if it really needs to be put in the bin. |
|
@PaperBoardOfficial I'm all for it, if we can come up with solid data that this doesn't impact performance it will be way easier to sell to the dev team |
|
Thanks @daniel-lxs . I’m thinking of running some benchmarks, looking at memory, CPU, and maybe comparing it with local Qdrant to get a clear picture. That said, if resource usage turns out to be minimal, would complexity or roadmap alignment still be a blocker? If the answer is a "maybe", that’s totally fair. I’d rather not sink more time into it if it's unlikely to move forward. In that case, I’ll keep the LibSQL integration in my fork and use it for personal builds or maybe share it with other downstreams like Kilo. Appreciate you being upfront either way. |


Related GitHub Issue
Closes: #5682
Roo Code Task Context (Optional)
Description
The aim of this PR is to add a local file-based vector store so that the user doesn't have to set up Docker locally.
I have added LibSQL. It is a fork of SQLite with built-in vector indexing, and it uses the DiskANN algorithm for searching vectors. Reference.
Why I chose LibSQL?
I wanted to use a file-based db like SQLite, but it doesn't have built-in vector indexing, so a vector search scans all the rows, computes the cosine similarity for each, and returns the top k results. This is time-consuming.
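To make the cost of that full scan concrete, here is a minimal sketch of brute-force top-k search over in-memory rows. The `Row` shape and function names are hypothetical; the point is only that every row is scored on every query, so the work grows linearly with the number of code blocks.

```typescript
// Hypothetical row shape: an id plus its embedding vector.
interface Row {
	id: number
	embedding: number[]
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
	let dot = 0
	let normA = 0
	let normB = 0
	for (let i = 0; i < a.length; i++) {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	// Treat a zero vector as dissimilar to everything.
	if (normA === 0 || normB === 0) return 0
	return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Full scan: score every row against the query, then keep the k best.
function bruteForceTopK(rows: Row[], query: number[], k: number): { id: number; score: number }[] {
	return rows
		.map((row) => ({ id: row.id, score: cosineSimilarity(row.embedding, query) }))
		.sort((x, y) => y.score - x.score)
		.slice(0, k)
}
```

An approximate nearest-neighbor index such as DiskANN avoids scoring every row, which is the motivation for choosing a store that has one built in.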
I was considering using HNSW or FAISS with SQLite (SQLite for metadata and HNSW or FAISS for vectors), but this would require too much effort and complexity.
Then I found out that mastra also uses libsql, and after reading the documentation I found that LibSQL already supports a vector similarity search algorithm called DiskANN. This blog also motivated me: Using LibSQL on mobile devices
Some points:
Things to work on:
This PR is too long so main things to look at:
Test Procedure
Pre-Submission Checklist
Screenshots / Videos
Documentation Updates
Additional Notes
Get in Touch
discord username: paperboard_52655
Important
Integrates LibSQL as a vector store provider, adding support for configuration, UI updates, and tests.
libsql-client.ts and updates CodeIndexPopover.tsx for UI changes.
libsql-client.spec.ts.
codebase-index.ts and config-manager.ts to include LibSQL configuration options.
copyLibSQLVersion function in esbuild.ts for version management.
downloadLibsqlNative function in libsql-download.ts for downloading native modules.
package.json to include @libsql/client dependency.
This description was created for 6296649.