Skip to content

Conversation

@roomote
Copy link
Contributor

@roomote roomote bot commented Aug 29, 2025

Description

This PR addresses Issue #7509 by exposing the codebase indexing concurrency settings in the VS Code extension settings UI, allowing users to optimize indexing performance based on their hardware capabilities.

Changes

Configuration Schema

  • Added three new configurable settings to codebase-index.ts type definitions:
    • codebaseIndexParsingConcurrency: Number of files to parse in parallel (1-50, default: 10)
    • codebaseIndexMaxPendingBatches: Maximum batches to accumulate before waiting (1-100, default: 20)
    • codebaseIndexBatchProcessingConcurrency: Number of batches to process in parallel (1-50, default: 10)

Implementation

  • Config Management: Extended CodeIndexConfigManager to read and provide the new concurrency settings
  • Scanner Integration: Modified DirectoryScanner to accept and use configurable concurrency values instead of hardcoded constants
  • Service Factory: Updated to pass configuration from manager to scanner
  • VS Code Settings: Added new configuration properties to package.json with appropriate defaults and constraints
  • Localization: Added user-friendly descriptions for the new settings

Testing

  • Added comprehensive test suite for the new configuration functionality
  • All existing tests pass without regression
  • Test coverage includes default values, custom values, and boundary conditions

Impact

  • Performance Improvement: Power users can increase concurrency settings to significantly speed up codebase indexing
  • Greater Flexibility: All users can fine-tune the indexing process to match their hardware capabilities
  • Enhanced User Control: Important performance-tuning options are now accessible through the UI instead of being hardcoded

Testing Instructions

  1. Open VS Code settings (UI or JSON)
  2. Search for "codebase index"
  3. Verify the three new settings appear with proper descriptions
  4. Adjust the values and verify they are within the allowed ranges
  5. Run codebase indexing and observe performance changes with different settings

Related Issue

Fixes #7509

Review Notes

The implementation follows existing patterns in the codebase and includes proper validation, error handling, and test coverage. The review tool analysis showed 95% confidence with a PROCEED recommendation.


Important

Expose codebase indexing concurrency settings in VS Code extension UI for customizable performance optimization.

  • Configuration:
    • Added codebaseIndexParsingConcurrency, codebaseIndexMaxPendingBatches, and codebaseIndexBatchProcessingConcurrency to codebase-index.ts.
    • Updated package.json to include new settings with defaults and constraints.
    • Extended CodeIndexConfigManager to handle new settings.
  • Implementation:
    • Modified DirectoryScanner to use configurable concurrency values.
    • Updated service-factory.ts to pass concurrency settings to DirectoryScanner.
  • Testing:
    • Added tests in config-manager-concurrency.spec.ts for default, custom, and boundary values.
    • Updated config-manager.spec.ts to include new settings in configuration tests.
  • Impact:
    • Users can adjust concurrency settings to optimize indexing performance based on hardware.

This description was created by Ellipsis for 8cc9424. You can customize this summary. It will automatically update as commits are pushed.

- Add configurable concurrency settings to codebase-index types
- Update config-manager to read and provide concurrency settings
- Modify DirectoryScanner to accept and use configurable concurrency values
- Add VS Code configuration properties for the new settings
- Add localization strings for user-friendly descriptions
- Add tests for the new configuration functionality

This allows users to optimize indexing performance based on their hardware capabilities.
@roomote roomote bot requested review from cte, jr and mrubens as code owners August 29, 2025 03:52
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Aug 29, 2025
Copy link
Contributor Author

@roomote roomote bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed my own code and found issues. Shocking, I know.

Review Findings

Critical Issues

  1. Unused import in constants/index.ts - The import of CODEBASE_INDEX_DEFAULTS appears to be unused. The constants (PARSING_CONCURRENCY, MAX_PENDING_BATCHES, BATCH_PROCESSING_CONCURRENCY) are still hardcoded instead of referencing the defaults from the types package.

  2. Missing runtime validation - The DirectoryScanner accepts config values but doesn't validate they're within acceptable bounds. Consider adding defensive validation.

Suggestions

  1. Documentation opportunity - Consider adding JSDoc comments to the DirectoryScannerConfig interface to explain what each setting controls and their performance implications.

  2. Test uncertainty - The test in config-manager-concurrency.spec.ts line 135 has a comment suggesting uncertainty about whether concurrency changes should require restart. This should be clarified.

Minor Notes

  1. The implementation correctly follows existing patterns and includes proper test coverage.
  2. The localization strings are properly added for the new settings.

Overall, the PR successfully addresses the issue requirements but needs some cleanup around constant usage and validation.

@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Aug 29, 2025
@daniel-lxs daniel-lxs moved this from Triage to Issue [In Progress] in Roo Code Roadmap Aug 29, 2025
@daniel-lxs daniel-lxs moved this from Issue [In Progress] to PR [Needs Prelim Review] in Roo Code Roadmap Aug 29, 2025
@hannesrudolph hannesrudolph added PR - Needs Preliminary Review and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Aug 29, 2025
@daniel-lxs
Copy link
Member

We recently exposed the batch threshold, we have experimented with the settings and found out that they rarely increase the processing speed before hitting rate limits or timeouts with the embeddings providers.

I don't see any value in exposing these settings unless we have prove that changing one does decrease indexing time.

@daniel-lxs daniel-lxs closed this Sep 2, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Prelim Review] to Done in Roo Code Roadmap Sep 2, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 2, 2025
@daniel-lxs daniel-lxs deleted the feat/expose-indexing-concurrency-settings branch September 2, 2025 15:04
@VooDisss
Copy link

VooDisss commented Sep 2, 2025

@daniel-lxs I think that exposing those settings would allow the user to increase embedding speeds when running locally, cause currently I see no difference between using 0.7B and 7B model from qwen for embedding

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request PR - Needs Preliminary Review size:L This PR changes 100-499 lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

[Feature] Expose Codebase Indexing Concurrency Settings in UI

5 participants