fix: optimize parser performance by reusing parser instances #7478

roomote · 2025-08-28T05:44:29Z

This PR attempts to address Issue #7476 regarding high CPU usage during code indexing.

Problem

The CPU profile analysis revealed that the parseContent function was being called excessively (32,538 hits), with new Parser instances being created for each file instead of reusing existing ones.

Solution

Implemented a caching strategy to optimize parser performance:

Added global caches for parser instances and loaded languages
Reuse existing parser instances instead of creating new ones for each file
Cache loaded WASM language files to avoid redundant loading
This significantly reduces CPU usage during code indexing

Changes

Modified src/services/tree-sitter/languageParser.ts to implement parser and language caching

Testing

All existing tests pass without modification
Type checking passes
Linting passes

Performance Impact

This optimization should significantly reduce CPU usage by:

Eliminating redundant WASM file loads
Reusing parser instances instead of creating new ones
Reducing memory allocation/deallocation cycles

Fixes #7476

Feedback and guidance are welcome!

Important

Optimizes parser performance in languageParser.ts by implementing caching for parser instances and loaded languages, reducing CPU usage.

Behavior:
- Implements caching in languageParser.ts to reuse parser instances and cache loaded languages.
- Reduces CPU usage by avoiding redundant WASM file loads and parser creations.
Caching:
- Adds parserInstanceCache and languageCache to store parser instances and loaded languages.
- loadRequiredLanguageParsers() checks caches before loading languages or creating parsers.
Performance:
- Significantly reduces CPU usage during code indexing by reusing resources.
Testing:
- All existing tests pass without modification.
- Type checking and linting pass.

^{This description was created by}^{for 32fae63. You can customize this summary. It will automatically update as commits are pushed.}

- Add global caches for parser instances and loaded languages - Reuse existing parser instances instead of creating new ones for each file - Cache loaded WASM language files to avoid redundant loading - This significantly reduces CPU usage during code indexing Fixes #7476

roomote

Reviewed my own code and found bugs I created 5 minutes ago. Classic.

roomote · 2025-08-28T05:48:30Z

src/services/tree-sitter/languageParser.ts

+
+		// Reuse parser instance from cache or create new one
+		let parser: ParserT
+		if (parserInstanceCache.has(languageName)) {


Critical issue: The caching logic has a fundamental flaw. Multiple file extensions map to the same language (e.g., .js, .jsx, .json all use "javascript"), but you're caching by languageName while returning parsers keyed by extension.

This means when processing ["test.js", "test.jsx"], the second iteration will reuse the cached parser but still store it under a different key. More importantly, if different extensions need different parser configurations, this approach won't work.

Consider either:

Cache by extension instead of language name

Or ensure the parserKey logic aligns with the caching key

roomote · 2025-08-28T05:48:30Z

src/services/tree-sitter/languageParser.ts

+		}
+
+		// Create query
+		query = new Query(language, queryString)


Performance concern: While you're caching parser instances and languages, Query objects are still being created fresh on every call. Since Query construction can be expensive (it parses and compiles the query string), should we also cache these?

roomote · 2025-08-28T05:48:30Z

src/services/tree-sitter/languageParser.ts

+// Global cache for parser instances to avoid recreating them
+const parserInstanceCache: Map<string, ParserT> = new Map()
+// Global cache for loaded languages to avoid reloading WASM files
+const languageCache: Map<string, LanguageT> = new Map()


Memory management concern: These caches will grow indefinitely without any eviction policy. For large codebases with many file types, this could lead to memory issues over time.

Consider:

Adding a maximum cache size with LRU eviction

Providing a way to clear caches when needed

Adding cache statistics for monitoring

roomote · 2025-08-28T05:48:31Z

src/services/tree-sitter/languageParser.ts

 let isParserInitialized = false

+// Global cache for parser instances to avoid recreating them
+const parserInstanceCache: Map<string, ParserT> = new Map()


Thread safety consideration: If this code runs in a multi-threaded environment (workers, etc.), these global Map objects could face race conditions. While JavaScript is single-threaded in most contexts, VS Code extensions can use workers.

Is this a concern for the current architecture?

daniel-lxs · 2025-08-28T15:07:29Z

Closing, see #7476 (comment)

This is already done where the languageParser function is used, there's no need to do it again on the actual function

roomote bot requested review from cte, jr and mrubens as code owners August 28, 2025 05:44

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap Aug 28, 2025

github-project-automation bot moved this to Triage in Roo Code Roadmap Aug 28, 2025

github-project-automation bot moved this to New in Roo Code Roadmap Aug 28, 2025

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Aug 28, 2025

hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Aug 28, 2025

roomote bot commented Aug 28, 2025

View reviewed changes

roomote bot mentioned this pull request Aug 28, 2025

Extension causes high cpu load #7476

Closed

daniel-lxs closed this Aug 28, 2025

github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Aug 28, 2025

github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 28, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: optimize parser performance by reusing parser instances #7478

fix: optimize parser performance by reusing parser instances #7478

Uh oh!

roomote bot commented Aug 28, 2025 •

edited by ellipsis-dev bot

Loading

Uh oh!

roomote bot left a comment

Uh oh!

roomote bot Aug 28, 2025

Uh oh!

roomote bot Aug 28, 2025

Uh oh!

roomote bot Aug 28, 2025

Uh oh!

roomote bot Aug 28, 2025

Uh oh!

daniel-lxs commented Aug 28, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fix: optimize parser performance by reusing parser instances #7478

fix: optimize parser performance by reusing parser instances #7478

Uh oh!

Conversation

roomote bot commented Aug 28, 2025 • edited by ellipsis-dev bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Solution

Changes

Testing

Performance Impact

Uh oh!

roomote bot left a comment

Choose a reason for hiding this comment

Uh oh!

roomote bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

roomote bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

roomote bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

roomote bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

daniel-lxs commented Aug 28, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

roomote bot commented Aug 28, 2025 •

edited by ellipsis-dev bot

Loading