fix: resolve Go duplicate references in tree-sitter queries (#5367) #5377
…nc#5367)
- Replace broad statement captures with function-scoped queries
- Eliminates overlapping captures that caused duplicate references
- Improves search quality and indexing performance for Go projects
- Add test to validate no duplicate line ranges are captured
- Maintains backward compatibility with existing functionality

Fixes RooCodeInc#5367
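The "no duplicate line ranges" check mentioned in this commit could be sketched roughly as follows. The `LineRange` shape and the `findDuplicateRanges` helper are hypothetical names chosen for illustration; the actual test in the repository may be structured differently.

```typescript
// Hypothetical shape: the line span covered by one tree-sitter capture.
interface LineRange {
  startLine: number;
  endLine: number;
}

// Returns the "start-end" keys of every range that appears more than once.
// An empty result means no two captures cover the same span.
function findDuplicateRanges(ranges: LineRange[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const range of ranges) {
    const key = `${range.startLine}-${range.endLine}`;
    if (seen.has(key)) {
      duplicates.add(key);
    }
    seen.add(key);
  }
  return Array.from(duplicates);
}

// Illustrative data: a broad statement capture and a function capture
// that both cover lines 10-14, plus one unrelated capture.
const overlappingCaptures: LineRange[] = [
  { startLine: 10, endLine: 14 },
  { startLine: 10, endLine: 14 },
  { startLine: 20, endLine: 25 },
];

const duplicateKeys = findDuplicateRanges(overlappingCaptures); // ["10-14"]
```

A test built on a helper like this would assert that the result is empty for the captures produced from a sample Go file.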
It seems like codebase indexing isn't parsing
- Update Go tree-sitter queries to capture full declarations instead of just identifiers
- Implement language-specific character thresholds (50 chars for Go vs 100 default)
- Fix inspectGo.spec.ts test to match new query behavior
- Add comprehensive test coverage for Go indexing fix

This ensures Go files are properly indexed for semantic search while preventing duplicate references. All tests now pass.

- Changed MIN_BLOCK_CHARS from 100 to 50 in parser.ts
- Updated tests to expect single-block captures for small Go files
- Removed language-specific threshold logic
- Fixes Go files not being indexed due to high character threshold

Fixes RooCodeInc#5367
daniel-lxs
left a comment
Looks good. Since the change to MIN_BLOCK_CHARS is generic (reducing it from 100 to 50), I think we can either remove the language-specific tests or replace them with a single generic test that applies to all languages.
- Remove go-indexing-fix.spec.ts as requested in PR feedback
- Add generic test in parser.spec.ts to verify 50-character threshold
- Test ensures content under 50 chars is filtered, 50+ chars is indexed
- Applies to all languages, not just Go
I've addressed the feedback by replacing the Go-specific test with a generic test that verifies the MIN_BLOCK_CHARS threshold for all languages. Changes made:
The test ensures the 50-character minimum threshold is respected across all language parsers.
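As a rough illustration of what such a generic boundary test checks, the sketch below exercises lengths just below, at, and just above the threshold. `meetsThreshold` is a hypothetical stand-in for the parser's length check, not the function used in parser.spec.ts, and it assumes the comparison is on raw content length.

```typescript
// Value stated in this PR; the constant lives in parser.ts in the real code.
const MIN_BLOCK_CHARS = 50;

// Hypothetical predicate standing in for the parser's length check.
function meetsThreshold(content: string, minChars: number = MIN_BLOCK_CHARS): boolean {
  return content.length >= minChars;
}

// Boundary cases: one character under, exactly at, and one over the threshold.
const boundaryCases = [
  { content: "x".repeat(49), expected: false },
  { content: "x".repeat(50), expected: true },
  { content: "x".repeat(51), expected: true },
];

// Collect any case where the predicate disagrees with the expectation.
const boundaryFailures = boundaryCases.filter(
  (testCase) => meetsThreshold(testCase.content) !== testCase.expected,
);
```

Because the check depends only on content length, the same cases apply regardless of which language parser produced the block.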
CI checks are now passing! The failing tests were due to the markdown tests still expecting the old MIN_BLOCK_CHARS value of 100, while it was changed to 50 as part of the fix. I've updated the tests to work correctly with the new threshold:
All platform unit tests are now passing on both Ubuntu and Windows.
daniel-lxs
left a comment
Thank you!
LGTM
Description
This PR implements a universal 50-character minimum threshold for code block indexing across all languages, replacing the previous 100-character threshold. This change ensures that Go files (and other languages with concise syntax) are properly indexed.
The root cause of the Go indexing issue was that Go's tree-sitter queries were capturing small identifiers (4-12 characters) that were filtered out by the 100-character minimum threshold, resulting in zero blocks being created for Go files.
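The root cause described above can be illustrated with a minimal sketch. The `CapturedBlock` shape, the `filterByMinChars` helper, and the sample Go snippet are assumptions made for illustration; they are not the code in parser.ts, and the sketch assumes the threshold is compared against raw content length.

```typescript
// Hypothetical shape for a captured block; not the project's actual type.
interface CapturedBlock {
  startLine: number;
  endLine: number;
  content: string;
}

// Keep only blocks whose content is at least minChars long.
function filterByMinChars(blocks: CapturedBlock[], minChars: number): CapturedBlock[] {
  return blocks.filter((block) => block.content.length >= minChars);
}

// Identifier-only captures (4-12 characters each), as in the original Go queries.
const identifierCaptures: CapturedBlock[] = [
  { startLine: 3, endLine: 3, content: "main" },
  { startLine: 7, endLine: 7, content: "parseConfig" },
];

// A full-declaration capture of a small Go function.
// Its length falls between 50 and 100 characters.
const declarationCapture: CapturedBlock = {
  startLine: 7,
  endLine: 9,
  content: [
    "func parseConfig(path string) (*Config, error) {",
    "\treturn loadFromDisk(path)",
    "}",
  ].join("\n"),
};

// Identifier captures are dropped under either threshold.
const identifiersAt100 = filterByMinChars(identifierCaptures, 100); // empty
const identifiersAt50 = filterByMinChars(identifierCaptures, 50); // empty

// The full declaration is dropped at 100 but kept at 50.
const declarationAt100 = filterByMinChars([declarationCapture], 100); // empty
const declarationAt50 = filterByMinChars([declarationCapture], 50); // one block
```

Under these assumptions, lowering the threshold alone does not rescue identifier-sized captures, which is consistent with the PR also changing the Go queries to capture full declarations.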
Changes Made
1. Universal 50-Character Threshold
2. Updated Tests for New Behavior
Testing
Test Coverage
Verification Results
Translations
No translations required. All changes are in backend code files that handle internal parsing and indexing logic. There are no user-facing strings, UI components, or documentation changes that require translation.
Verification of Acceptance Criteria
Checklist
Fixes #5367
Important
Reduces code block indexing threshold to 50 characters for all languages, ensuring proper indexing of Go files and updates tests accordingly.
- … parser.ts, replacing the previous 100-character threshold.
- … go-indexing-fix.spec.ts to test Go indexing with the new threshold.
- … parseSourceCodeDefinitions.go.spec.ts and simple-go-test.spec.ts to expect single-block captures.
- … inspectGo.spec.ts.
- … go.ts to capture full declarations instead of just identifiers.

This description was created for def262f.