
Conversation

@NaccOll
Contributor

@NaccOll NaccOll commented Jul 21, 2025

Related GitHub Issue

Closes: #5682

Description

Upgrade to Node.js 22 and use the built-in node:sqlite module to implement local vector storage.

Qdrant is still used by default, but users can switch to the local store according to their own preferences.
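
A minimal sketch of the idea, assuming Node.js 22.5+ where node:sqlite is available behind the --experimental-sqlite flag; the table name and helper functions are illustrative, not the PR's actual code:

import { DatabaseSync } from "node:sqlite"

const db = new DatabaseSync(".roo/vector/vector_store.db")
db.exec(`CREATE TABLE IF NOT EXISTS vectors (
  id TEXT PRIMARY KEY,
  file_path TEXT NOT NULL,
  embedding BLOB NOT NULL
)`)

// Store an embedding as a raw Float32 blob.
function insertVector(id: string, filePath: string, embedding: Float32Array): void {
  db.prepare("INSERT OR REPLACE INTO vectors (id, file_path, embedding) VALUES (?, ?, ?)").run(
    id,
    filePath,
    Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength),
  )
}

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0,
    normA = 0,
    normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Brute-force search: scan every row and rank by cosine similarity.
function search(query: Float32Array, topK = 10) {
  const rows = db.prepare("SELECT id, file_path, embedding FROM vectors").all() as Array<{
    id: string
    file_path: string
    embedding: Uint8Array
  }>
  return rows
    .map((row) => ({
      id: row.id,
      filePath: row.file_path,
      score: cosineSimilarity(
        query,
        new Float32Array(row.embedding.buffer, row.embedding.byteOffset, row.embedding.byteLength / 4),
      ),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
}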

Test Procedure

Unit Test:

src\services\code-index\vector-store\__tests__\local-vector-store.spec.ts

Integration test

  1. Load the RooCode extension
  2. Click the index button, switch the vector store to Local, and click Save
  3. Click Start and wait for indexing to complete; .roo/vector/vector_store.db will be generated
  4. Use the codebase_search tool

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos


Important

Adds support for configuring a local vector store using Node.js 22's node:sqlite, allowing users to choose between local and Qdrant storage options.

  • Behavior:
    • Adds support for local vector store configuration using node:sqlite in Node.js 22.
    • Users can switch between local and Qdrant vector stores via settings.
  • Configuration:
    • Updates codebaseIndexConfigSchema in codebase-index.ts to include codebaseIndexVectorStoreProvider and codebaseIndexLocalVectorStoreDirectory.
    • Modifies ClineProvider and webviewMessageHandler to handle new vector store settings.
  • Implementation:
    • Introduces LocalVectorStore class in local-vector-store.ts for handling local storage operations.
    • Implements getLocalVectorStoreDirectoryPath in storage.ts for directory management.
  • Testing:
    • Adds unit tests in local-vector-store.spec.ts to validate local vector store functionality.
  • UI:
    • Updates CodeIndexPopover.tsx to allow users to select vector store provider and configure local storage path.
  • Misc:
    • Upgrades Node.js version to 22.17.1 in .nvmrc, .tool-versions, and package.json.

This description was created by Ellipsis for 5d70de1349ed8cd93f86e8f9a847fc1711f0da62. You can customize this summary. It will automatically update as commits are pushed.

@NaccOll NaccOll requested review from cte, jr and mrubens as code owners July 21, 2025 16:23
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Jul 21, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Jul 21, 2025
@NaccOll NaccOll mentioned this pull request Jul 22, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jul 22, 2025
@hannesrudolph hannesrudolph added PR - Needs Preliminary Review and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Jul 22, 2025
Member

@daniel-lxs daniel-lxs left a comment


Hey @NaccOll, I saw in the description that Node 22 is required for this implementation. That makes sense, but updating such a core dependency can have side effects in other parts of the project.

Is there any alternative approach we could explore, maybe using an external library, that wouldn't require changing the Node version?

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Jul 24, 2025
@NaccOll
Contributor Author

NaccOll commented Jul 25, 2025

@daniel-lxs

I have tried the alternatives:

better-sqlite3 has better performance, but it needs native binaries packaged per platform, which would greatly increase the size of the extension.

sql.js only needs a single wasm file of about 1 MB, but performance is worse (testing this PR against the RooCode codebase, codebase_search takes 2-3 seconds), and it loads the entire database into memory; to persist anything, the whole database file must be written out at once.
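
For context, the sql.js constraint looks roughly like this (illustrative sketch, not code from this PR):

import initSqlJs from "sql.js"
import * as fs from "node:fs"

const dbPath = ".roo/vector/vector_store.db"
const SQL = await initSqlJs() // loads the ~1 MB wasm runtime
// The entire database file is read into memory up front:
const db = new SQL.Database(fs.existsSync(dbPath) ? fs.readFileSync(dbPath) : undefined)

// ...all inserts and queries then operate on the in-memory copy...

// Persisting requires serializing and rewriting the whole file at once:
fs.writeFileSync(dbPath, Buffer.from(db.export()))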

If you are concerned about upgrading Node.js, I suggest you look at libSQL first.

It has better performance than SQLite, but the generated DB file is larger (the RooCode DB file generated by that approach is 1.1 GB), and it would also greatly increase the size of the extension. The current plan there would be to download the platform-specific native binary when the extension starts.

Upgrading Node.js brings risks; on the other side of the scale are extension size and cross-platform compatibility.

@NaccOll NaccOll force-pushed the local-vector-store branch from 7e03915 to e161d57 July 25, 2025 12:00
NaccOll added 3 commits July 26, 2025 22:27
- Upgrade Node.JS 22.17.1
- Updated WebviewMessage interface to include options for local vector store provider and directory.
- Implemented synchronous function to retrieve storage path for conversations.
- Enhanced CodeIndexPopover component to handle local vector store settings.
- Added validation tests for new settings in CodeIndexPopover.
- Updated ExtensionStateContext to initialize new settings.
- Translated new settings labels and descriptions into multiple languages.
@NaccOll NaccOll force-pushed the local-vector-store branch from e161d57 to 5a00745 July 26, 2025 14:28
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Jul 26, 2025
@NaccOll
Contributor Author

NaccOll commented Jul 26, 2025

@daniel-lxs Please review.
I replaced sqlite with LanceDB, and dependencies are only downloaded when the user selects the local store. Performance is greatly improved, and the extension size is unchanged. (A sketch of the pattern follows.)
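
Roughly, the on-demand pattern looks like this (illustrative sketch; the @lancedb/lancedb package name and table name are assumptions, not necessarily the PR's exact code):

// Imported lazily, so the dependency is only resolved (and only downloaded)
// when the user has actually selected the local vector store.
async function searchLocal(storageDir: string, queryVector: number[], topK = 10) {
  const lancedb = await import("@lancedb/lancedb")
  const db = await lancedb.connect(storageDir)
  const table = await db.openTable("code_chunks") // table name is illustrative
  return await table.search(queryVector).limit(topK).toArray()
}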

@NaccOll NaccOll requested a review from daniel-lxs July 26, 2025 14:44
@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Draft / In Progress] in Roo Code Roadmap Jul 26, 2025
@daniel-lxs
Member

Please see #5682 (comment)

@NaccOll
Contributor Author

NaccOll commented Jul 27, 2025

@daniel-lxs
I don't quite understand the problem you're describing:

  1. Because the dependency is downloaded on demand, it doesn't increase the size of the extension package; users who stay with Qdrant never download it.

  2. Compared to Qdrant, memory usage is greatly reduced.

In fact, I don't know how Qdrant avoids memory exhaustion. When I start Qdrant with Docker, it loads all codebase indexes into memory, including projects I have indexed but do not currently have open in VS Code. Because I have indexed many projects with Qwen3-Embedding-8B (the vector size is 4096), running Qdrant consumes more than 10 GB of memory, which is unacceptable to me.

@daniel-lxs
Member

@NaccOll
You can configure how Qdrant stores vectors, as shown here: https://qdrant.tech/documentation/concepts/storage/
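
For example, a collection can be created with vectors, the HNSW index, and payloads all kept on disk (sketch using the JS REST client; the collection name and vector size are illustrative):

import { QdrantClient } from "@qdrant/js-client-rest"

const client = new QdrantClient({ url: "http://127.0.0.1:6333" })
await client.createCollection("roo-code-index", {
  vectors: { size: 4096, distance: "Cosine", on_disk: true },
  hnsw_config: { on_disk: true },
  on_disk_payload: true,
})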

@NaccOll
Contributor Author

NaccOll commented Jul 28, 2025

@daniel-lxs
I have tried this, but it didn't work. I'm on a Windows machine and run Qdrant through Docker.

Maybe this is a problem specific to Docker on Windows, but I don't know where to start analyzing it, and my goal is to use the code index with a limited amount of memory.

Even when I set memmap_threshold, on_disk, and the other configuration options, the next time I start the computer, Qdrant in Docker loads all the indexes back into RAM.

That's why I want a local vector solution. Its biggest advantage for me is not that it works out of the box, but that it works on demand. Even if a single project's index is 3 GB, it costs me only 3 GB of memory (actually less, since the data stays on disk, albeit at lower performance), instead of loading the indexes of projects I'm not currently developing and consuming more than 10 GB.

Not to mention that the local solution is smaller on disk than Qdrant. I found that even when Qdrant does nothing, a collection consumes about 400 MB of storage; a project that takes only 100 MB with the local solution takes 650 MB with Qdrant. This may be inherent to Qdrant's storage format.

I sincerely hope you will test my PR. I have implemented dynamic download (via npm), which does not increase the size of the extension, and the local store is not the default; the default is still Qdrant. I'm not saying you must adopt this PR, but I hope the underlying idea, loading index memory on demand without increasing the extension size, can be adopted. It is important to me.
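
A sketch of what that dynamic download can look like (hypothetical helper; the PR's actual mechanism may differ):

import { execFile } from "node:child_process"
import { createRequire } from "node:module"
import { promisify } from "node:util"

const execFileAsync = promisify(execFile)

// Install @lancedb/lancedb into the extension's own storage directory the
// first time the local vector store is selected; later runs find it cached.
async function ensureLanceDb(storageDir: string): Promise<void> {
  const require = createRequire(import.meta.url)
  try {
    require.resolve("@lancedb/lancedb", { paths: [storageDir] })
  } catch {
    // On Windows, npm may need to be invoked as "npm.cmd" or via a shell.
    await execFileAsync("npm", ["install", "@lancedb/lancedb", "--prefix", storageDir])
  }
}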

Below is my docker-compose.yaml

version: "3.8"
services:
  qdrant:
    image: qdrant/qdrant:v1.14.1
    restart: always
    ports:
      - 127.0.0.1:6333:6333
    # environment:
    #   QDRANT__STORAGE__OPTIMIZERS__MEMMAP_THRESHOLD: 20000
    volumes:
      - qdrant_storage:/qdrant/storage
      - ./qdrant/config.yaml:/qdrant/config/config.yaml
volumes:
  qdrant_storage:
And below is my qdrant/config.yaml (mounted into the container above):

log_level: INFO

# Logging configuration
# Qdrant logs to stdout. You may configure to also write logs to a file on disk.
# Be aware that this file may grow indefinitely.
# logger:
#   # Logging format, supports `text` and `json`
#   format: text
#   on_disk:
#     enabled: true
#     log_file: path/to/log/file.log
#     log_level: INFO
#     # Logging format, supports `text` and `json`
#     format: text

storage:
  # Where to store all the data
  storage_path: ./storage

  # Where to store snapshots
  snapshots_path: ./snapshots

  snapshots_config:
    # "local" or "s3" - where to store snapshots
    snapshots_storage: local
    # s3_config:
    #   bucket: ""
    #   region: ""
    #   access_key: ""
    #   secret_key: ""

  # Where to store temporary files
  # If null, temporary snapshots are stored in: storage/snapshots_temp/
  temp_path: null

  # If true - point payloads will not be stored in memory.
  # It will be read from the disk every time it is requested.
  # This setting saves RAM by (slightly) increasing the response time.
  # Note: those payload values that are involved in filtering and are indexed - remain in RAM.
  #
  # Default: true
  on_disk_payload: true

  # Maximum number of concurrent updates to shard replicas
  # If `null` - maximum concurrency is used.
  update_concurrency: null

  # Write-ahead-log related configuration
  wal:
    # Size of a single WAL segment
    wal_capacity_mb: 32

    # Number of WAL segments to create ahead of actual data requirement
    wal_segments_ahead: 0

  # Normal node - receives all updates and answers all queries
  node_type: "Normal"

  # Listener node - receives all updates, but does not answer search/read queries
  # Useful for setting up a dedicated backup node
  # node_type: "Listener"

  performance:
    # Number of parallel threads used for search operations. If 0 - auto selection.
    max_search_threads: 0

    # Max number of threads (jobs) for running optimizations across all collections, each thread runs one job.
    # If 0 - have no limit and choose dynamically to saturate CPU.
    # Note: each optimization job will also use `max_indexing_threads` threads by itself for index building.
    max_optimization_threads: 0

    # CPU budget, how many CPUs (threads) to allocate for an optimization job.
    # If 0 - auto selection, keep 1 or more CPUs unallocated depending on CPU size
    # If negative - subtract this number of CPUs from the available CPUs.
    # If positive - use this exact number of CPUs.
    optimizer_cpu_budget: 0

    # Prevent DDoS of too many concurrent updates in distributed mode.
    # One external update usually triggers multiple internal updates, which breaks internal
    # timings. For example, the health check timing and consensus timing.
    # If null - auto selection.
    update_rate_limit: null

    # Limit for number of incoming automatic shard transfers per collection on this node, does not affect user-requested transfers.
    # The same value should be used on all nodes in a cluster.
    # Default is to allow 1 transfer.
    # If null - allow unlimited transfers.
    #incoming_shard_transfers_limit: 1

    # Limit for number of outgoing automatic shard transfers per collection on this node, does not affect user-requested transfers.
    # The same value should be used on all nodes in a cluster.
    # Default is to allow 1 transfer.
    # If null - allow unlimited transfers.
    #outgoing_shard_transfers_limit: 1

    # Enable async scorer which uses io_uring when rescoring.
    # Only supported on Linux, must be enabled in your kernel.
    # See: <https://qdrant.tech/articles/io_uring/#and-what-about-qdrant>
    #async_scorer: false

  optimizers:
    # The minimal fraction of deleted vectors in a segment, required to perform segment optimization
    deleted_threshold: 0.2

    # The minimal number of vectors in a segment, required to perform segment optimization
    vacuum_min_vector_number: 1000

    # Target amount of segments optimizer will try to keep.
    # Real amount of segments may vary depending on multiple parameters:
    #  - Amount of stored points
    #  - Current write RPS
    #
    # It is recommended to select default number of segments as a factor of the number of search threads,
    # so that each segment would be handled evenly by one of the threads.
    # If `default_segment_number = 0`, will be automatically selected by the number of available CPUs
    default_segment_number: 0

    # Do not create segments larger this size (in KiloBytes).
    # Large segments might require disproportionately long indexation times,
    # therefore it makes sense to limit the size of segments.
    #
    # If indexation speed has more priority for you - make this parameter lower.
    # If search speed is more important - make this parameter higher.
    # Note: 1Kb = 1 vector of size 256
    # If not set, will be automatically selected considering the number of available CPUs.
    max_segment_size_kb: null

    # Maximum size (in KiloBytes) of vectors to store in-memory per segment.
    # Segments larger than this threshold will be stored as read-only memmapped file.
    # To enable memmap storage, lower the threshold
    # Note: 1Kb = 1 vector of size 256
    # To explicitly disable mmap optimization, set to `0`.
    # If not set, will be disabled by default.
    memmap_threshold_kb: 10000

    # Maximum size (in KiloBytes) of vectors allowed for plain index.
    # Default value based on https://github.com/google-research/google-research/blob/master/scann/docs/algorithms.md
    # Note: 1Kb = 1 vector of size 256
    # To explicitly disable vector indexing, set to `0`.
    # If not set, the default value will be used.
    indexing_threshold_kb: 20000

    # Interval between forced flushes.
    flush_interval_sec: 5

    # Max number of threads (jobs) for running optimizations per shard.
    # Note: each optimization job will also use `max_indexing_threads` threads by itself for index building.
    # If null - have no limit and choose dynamically to saturate CPU.
    # If 0 - no optimization threads, optimizations will be disabled.
    max_optimization_threads: null

  # This section has the same options as 'optimizers' above. All values specified here will overwrite the collections
  # optimizers configs regardless of the config above and the options specified at collection creation.
  #optimizers_overwrite:
  #  deleted_threshold: 0.2
  #  vacuum_min_vector_number: 1000
  #  default_segment_number: 0
  #  max_segment_size_kb: null
  #  memmap_threshold_kb: null
  #  indexing_threshold_kb: 20000
  #  flush_interval_sec: 5
  #  max_optimization_threads: null

  # Default parameters of HNSW Index. Could be overridden for each collection or named vector individually
  hnsw_index:
    # Number of edges per node in the index graph. Larger the value - more accurate the search, more space required.
    m: 16

    # Number of neighbours to consider during the index building. Larger the value - more accurate the search, more time required to build index.
    ef_construct: 100

    # Minimal size (in KiloBytes) of vectors for additional payload-based indexing.
    # If payload chunk is smaller than `full_scan_threshold_kb` additional indexing won't be used -
    # in this case full-scan search should be preferred by query planner and additional indexing is not required.
    # Note: 1Kb = 1 vector of size 256
    full_scan_threshold_kb: 10000

    # Number of parallel threads used for background index building.
    # If 0 - automatically select.
    # Best to keep between 8 and 16 to prevent likelihood of building broken/inefficient HNSW graphs.
    # On small CPUs, less threads are used.
    max_indexing_threads: 0

    # Store HNSW index on disk. If set to false, index will be stored in RAM. Default: false
    on_disk: true

    # Custom M param for hnsw graph built for payload index. If not set, default M will be used.
    payload_m: null

  # Default shard transfer method to use if none is defined.
  # If null - don't have a shard transfer preference, choose automatically.
  # If stream_records, snapshot or wal_delta - prefer this specific method.
  # More info: https://qdrant.tech/documentation/guides/distributed_deployment/#shard-transfer-method
  shard_transfer_method: null

  # Default parameters for collections
  collection:
    # Number of replicas of each shard that network tries to maintain
    replication_factor: 1

    # How many replicas should apply the operation for us to consider it successful
    write_consistency_factor: 1

    # Default parameters for vectors.
    vectors:
      # Whether vectors should be stored in memory or on disk.
      on_disk: true

    # shard_number_per_node: 1

    # Default quantization configuration.
    # More info: https://qdrant.tech/documentation/guides/quantization
    quantization: null

    # Default strict mode parameters for newly created collections.
    strict_mode:
      # Whether strict mode is enabled for a collection or not.
      enabled: false

      # Max allowed `limit` parameter for all APIs that don't have their own max limit.
      max_query_limit: null

      # Max allowed `timeout` parameter.
      max_timeout: null

      # Allow usage of unindexed fields in retrieval based (eg. search) filters.
      unindexed_filtering_retrieve: null

      # Allow usage of unindexed fields in filtered updates (eg. delete by payload).
      unindexed_filtering_update: null

      # Max HNSW value allowed in search parameters.
      search_max_hnsw_ef: null

      # Whether exact search is allowed or not.
      search_allow_exact: null

      # Max oversampling value allowed in search.
      search_max_oversampling: null

  # Maximum number of collections allowed to be created
  # If null - no limit.
  max_collections: null

service:
  # Maximum size of POST data in a single request in megabytes
  max_request_size_mb: 32

  # Number of parallel workers used for serving the api. If 0 - equal to the number of available cores.
  # If missing - Same as storage.max_search_threads
  max_workers: 0

  # Host to bind the service on
  host: 0.0.0.0

  # HTTP(S) port to bind the service on
  http_port: 6333

  # gRPC port to bind the service on.
  # If `null` - gRPC is disabled. Default: null
  # Comment to disable gRPC:
  grpc_port: 6334

  # Enable CORS headers in REST API.
  # If enabled, browsers would be allowed to query REST endpoints regardless of query origin.
  # More info: https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
  # Default: true
  enable_cors: true

  # Enable HTTPS for the REST and gRPC API
  enable_tls: false

  # Check user HTTPS client certificate against CA file specified in tls config
  verify_https_client_certificate: false

  # Set an api-key.
  # If set, all requests must include a header with the api-key.
  # example header: `api-key: <API-KEY>`
  #
  # If you enable this you should also enable TLS.
  # (Either above or via an external service like nginx.)
  # Sending an api-key over an unencrypted channel is insecure.
  #
  # Uncomment to enable.
  # api_key: your_secret_api_key_here

  # Set an api-key for read-only operations.
  # If set, all requests must include a header with the api-key.
  # example header: `api-key: <API-KEY>`
  #
  # If you enable this you should also enable TLS.
  # (Either above or via an external service like nginx.)
  # Sending an api-key over an unencrypted channel is insecure.
  #
  # Uncomment to enable.
  # read_only_api_key: your_secret_read_only_api_key_here

  # Uncomment to enable JWT Role Based Access Control (RBAC).
  # If enabled, you can generate JWT tokens with fine-grained rules for access control.
  # Use generated token instead of API key.
  #
  # jwt_rbac: true

  # Hardware reporting adds information to the API responses with a
  # hint on how many resources were used to execute the request.
  #
  # Warning: experimental, this feature is still under development and is not supported yet.
  #
  # Uncomment to enable.
  # hardware_reporting: true

cluster:
  # Use `enabled: true` to run Qdrant in distributed deployment mode
  enabled: false

  # Configuration of the inter-cluster communication
  p2p:
    # Port for internal communication between peers
    port: 6335

    # Use TLS for communication between peers
    enable_tls: false

  # Configuration related to distributed consensus algorithm
  consensus:
    # How frequently peers should ping each other.
    # Setting this parameter to lower value will allow consensus
    # to detect disconnected nodes earlier, but too frequent
    # tick period may create significant network and CPU overhead.
    # We encourage you NOT to change this parameter unless you know what you are doing.
    tick_period_ms: 100

    # Compact consensus operations once we have this amount of applied
    # operations. Allows peers to join quickly with a consensus snapshot without
    # replaying a huge amount of operations.
    # If 0 - disable compaction
    compact_wal_entries: 128

# Set to true to prevent service from sending usage statistics to the developers.
# Read more: https://qdrant.tech/documentation/guides/telemetry
telemetry_disabled: false

# TLS configuration.
# Required if either service.enable_tls or cluster.p2p.enable_tls is true.
tls:
  # Server certificate chain file
  cert: ./tls/cert.pem

  # Server private key file
  key: ./tls/key.pem

  # Certificate authority certificate file.
  # This certificate will be used to validate the certificates
  # presented by other nodes during inter-cluster communication.
  #
  # If verify_https_client_certificate is true, it will verify
  # HTTPS client certificate
  #
  # Required if cluster.p2p.enable_tls is true.
  ca_cert: ./tls/cacert.pem

  # TTL in seconds to reload certificate from disk, useful for certificate rotations.
  # Only works for HTTPS endpoints. Does not support gRPC (and intra-cluster communication).
  # If `null` - TTL is disabled.
  cert_ttl: 3600

@NaccOll NaccOll closed this Jul 31, 2025
@github-project-automation github-project-automation bot moved this from PR [Draft / In Progress] to Done in Roo Code Roadmap Jul 31, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jul 31, 2025
@NaccOll NaccOll deleted the local-vector-store branch August 3, 2025 15:52

Labels

enhancement New feature or request PR - Draft / In Progress size:L This PR changes 100-499 lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Local Embedding and Local Vector Store for Indexing

3 participants