Update codebase search description to emphasize English query require… #4089
Hey @ChuKhaLi, that was very quick! Thank you! I made a tiny change to reduce redundant mentions of the word "English". According to my testing, this seems to help keep the queries exclusively in English.
I haven't been following closely - why do the queries need to be in English? Does it depend on the embedding model used? My quick research seems to indicate that the OpenAI embedding models at least are multilingual.
@mrubens It is a good idea, however, to check for alternative ways to deal with this. While this specific fix works, it might cause unexpected side effects. Let me look into this some more.
@mrubens Here are my results trying the same query in different languages: I have successfully tested the codebase_search tool by searching for "write to file tool" in four different languages as requested:
The codebase_search tool demonstrated excellent multilingual search capabilities, successfully finding relevant results in all four languages. The English search naturally returned the most technical implementation details, while the other languages focused more on UI translations and localized content. This confirms that the semantic search functionality works effectively across different languages.
It seems that other languages are supported; however, result quality takes a big hit, especially on codebases that contain files in the language of the query, like our translations. My conclusion
Great, let’s do it then. Thanks for digging in!
Related GitHub Issue
Closes: #3987
Description
Update codebase search description to emphasize English query requirement
Test Procedure
Type of Change
… src or test files.
Pre-Submission Checklist
… (npm run lint).
… (console.log) has been removed.
… (npm test).
… main branch.
… npm run changeset if this PR includes user-facing changes or dependency updates.
Screenshots / Videos
Documentation Updates
Additional Notes
Get in Touch
Important
Updates getCodebaseSearchDescription() to require English search queries for accurate semantic matching.
Updates getCodebaseSearchDescription() in codebase-search.ts to emphasize that search queries must be in English for accurate semantic matching.
This description was created for e6b5718.
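For readers who have not opened the diff, here is a rough sketch of what a tool description with an English-query requirement could look like. The PR page does not show the actual body of getCodebaseSearchDescription() in codebase-search.ts, so the argument shape (`cwd`), the section layout, and all of the wording below are assumptions made for illustration only.

```typescript
// Hypothetical sketch: the real function body is not shown on this page.
// The `cwd` argument and every line of the description text are assumptions.
interface ToolDescriptionArgs {
  cwd: string; // workspace root the search is scoped to (assumed parameter)
}

function getCodebaseSearchDescription(args: ToolDescriptionArgs): string {
  return [
    "## codebase_search",
    "Description: Find the code most relevant to a query using semantic search.",
    // State the language requirement once, next to the behavior it affects.
    "Queries MUST be in English; translate the user's request first if needed.",
    `Paths are relative to the workspace directory: ${args.cwd}`,
    "Parameters:",
    "- query: (required) The search query.",
  ].join("\n");
}

// Usage: build the description for a given workspace and print it.
console.log(getCodebaseSearchDescription({ cwd: "/path/to/workspace" }));
```

The requirement is stated a single time here, in line with the comment in the conversation that reducing redundant mentions of "English" seemed to keep queries in English during testing.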