feat: add user-configurable search score threshold slider for semantic search (#5027) #5041
Pull Request Overview
This PR adds a user-configurable search score threshold for semantic search by introducing a slider in the settings UI, extending model profiles and config schema, and wiring it through the backend config manager and embedding services.
- Added new i18n keys and UI slider component for adjusting the minimum similarity score (0.0–1.0).
- Extended `EMBEDDING_MODEL_PROFILES` with `scoreThreshold` and `queryPrefix`, plus helper getters.
- Updated the config manager and schema to support user-set thresholds with priority over model defaults, and applied prefixes in embedder implementations.
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| webview-ui/src/i18n/locales/en/settings.json | Added searchMinScoreLabel and searchMinScoreDescription entries |
| webview-ui/src/components/settings/CodeIndexSettings.tsx | Introduced range slider UI for search score threshold |
| webview-ui/src/components/settings/__tests__/CodeIndexSettings.spec.tsx | Updated tests to verify slider rendering, value display, and change handling |
| src/shared/embeddingModels.ts | Added scoreThreshold and queryPrefix to profiles; new getters |
| src/services/code-index/config-manager.ts | Implemented currentSearchMinScore logic with user, model, and default priority |
| src/services/code-index/interfaces/config.ts | Made searchMinScore required in service config interface |
| src/services/code-index/embedders/openai.ts | Applied model query prefix before embedding |
| src/services/code-index/embedders/openai-compatible.ts | Applied model query prefix before embedding |
| src/services/code-index/embedders/ollama.ts | Applied model query prefix before embedding |
| packages/types/src/codebase-index.ts | Extended Zod schema with optional codebaseIndexSearchMinScore |
✅ No security or compliance issues detected. Reviewed everything up to b62bb38. Security Overview
hannesrudolph
left a comment
Implementation adds configurable search score threshold to replace hardcoded 0.4 value. Priority system: user setting → model-specific threshold → default constant.
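The priority cascade described here can be sketched as follows. This is a minimal illustration and not the PR's actual code: `resolveSearchMinScore`, `MODEL_SCORE_THRESHOLDS`, and `DEFAULT_SEARCH_MIN_SCORE` are hypothetical names, and only the 0.4 default and the 0.15 value for nomic-embed-code are taken from this review.

```typescript
// Hypothetical names for illustration. The 0.4 default and the 0.15
// nomic-embed-code threshold are the only values taken from the review.
const DEFAULT_SEARCH_MIN_SCORE = 0.4

const MODEL_SCORE_THRESHOLDS = new Map<string, number>([["nomic-embed-code", 0.15]])

// Resolve the effective threshold: user setting, then model-specific
// threshold, then the default constant.
function resolveSearchMinScore(userSetting: number | undefined, modelId: string | undefined): number {
    const modelThreshold = modelId !== undefined ? MODEL_SCORE_THRESHOLDS.get(modelId) : undefined
    // `??` only falls through on null/undefined, so an explicit user value of 0 is kept.
    return userSetting ?? modelThreshold ?? DEFAULT_SEARCH_MIN_SCORE
}
```

A user value always wins, even when it is 0; a model without a profile entry falls back to the default.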
Technical issues found:
- Query prefix concatenation doesn't validate token limits. Texts near MAX_ITEM_TOKENS will exceed limits after prefix addition.
- Hindi translation at line 63 contains Arabic/Urdu characters (اپنی) mixed with Devanagari.
- No test coverage for priority cascade logic in currentSearchMinScore getter.
Architecture considerations:
- EMBEDDING_MODEL_PROFILES hardcodes model configs. Future model additions require code changes instead of configuration.
- Query prefix logic doesn't check for existing prefixes, could result in double-prefixing on reprocessing.
- Type inconsistency: searchMinScore marked required in interface but optional in Zod schema.
Implementation notes:
- Slider correctly uses nullish coalescing (existing bot comments appear outdated).
- nomic-embed-code prefix "Represent this query for searching relevant code: " - is this documented by Nomic?
- All embedder implementations (OpenAI, Ollama, OpenAI-compatible) follow same prefix pattern.
Changes are focused and maintain backward compatibility. The 0.15 threshold for nomic-embed-code matches issue requirements.
- Import SEARCH_MIN_SCORE constant to avoid magic number duplication
- Replace logical OR (||) with nullish coalescing (??) for numeric defaults to properly handle 0 values
- Add aria-label attribute to slider for screen reader accessibility
Fixes issues identified by GitHub Copilot and Ellipsis bots in PR #5041
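The difference between `||` and `??` for a numeric default matters because 0 is a valid threshold. The sketch below is illustrative only: the two function names are hypothetical, and the 0.4 value for `SEARCH_MIN_SCORE` is the previously hardcoded default mentioned in the review.

```typescript
const SEARCH_MIN_SCORE = 0.4 // default value cited in the review

// Logical OR treats every falsy value, including 0, as "not set".
function thresholdWithOr(userValue: number | undefined): number {
    return userValue || SEARCH_MIN_SCORE
}

// Nullish coalescing only falls back on null or undefined.
function thresholdWithNullish(userValue: number | undefined): number {
    return userValue ?? SEARCH_MIN_SCORE
}
```

With `||`, a user who drags the slider to 0 silently gets 0.4 back; with `??`, the 0 is respected.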
- Add token limit validation in OpenAI and Ollama embedders to prevent exceeding MAX_ITEM_TOKENS with query prefixes
- Add comprehensive test coverage for currentSearchMinScore getter priority system
- Ensure fallback to unprefixed text when token limit would be exceeded
- Add console warnings when falling back to unprefixed embeddings
- Add comprehensive test coverage for currentSearchMinScore getter with priority system
- Add token limit validation in OpenAI and Ollama embedders to prevent embedding failures
- Fix UI validation to distinguish between indexing vs search settings
- Remove duplicate search threshold slider in UI
- Fix TypeScript type definitions for optional searchMinScore property
- Move search score threshold setting to advanced configuration section
- Make advanced section collapsible with toggle button (collapsed by default)
- Position advanced section after action buttons for better UX flow
- Add similarity score display badges in search results (3-decimal precision)
- Include smooth CSS transitions and proper accessibility (aria-expanded, aria-controls)
- Update all locale files with 'Advanced Configuration' translation
- Update tests to handle collapsible behavior with comprehensive coverage
Addresses UX feedback from mrubens on PR #5041 to de-emphasize complex settings for general users while keeping them accessible for advanced users.
# Conflicts:
# webview-ui/src/components/chat/CodebaseSearchResult.tsx
3375283 to a9dfaea
- Add missing query prefix implementation in OpenAI embedder
- Add token limit validation when adding query prefixes to all embedders
- Fix translation error in Spanish locale (Russian text replaced with Spanish)
- Ensure double-prefix prevention in all embedders
- Maintain consistent error handling across embedders
- Add double-prefix guard to OpenAI embedder to prevent duplicate query prefixes
- Update Advanced Configuration UI to match ModesView.tsx style with proper hover effects and aria attributes
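The double-prefix guard and the token-limit fallback from these commits could look roughly like the sketch below. It is an assumption-laden illustration, not the embedders' real code: `applyQueryPrefix` and `estimateTokens` are hypothetical helpers, the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and the limit is passed in as a parameter because the actual value of MAX_ITEM_TOKENS is not stated here.

```typescript
// Crude stand-in for a tokenizer (assumption): about four characters per token.
function estimateTokens(text: string): number {
    return Math.ceil(text.length / 4)
}

// Hypothetical helper: prepend a model's query prefix unless the text already
// has it or the prefixed text would exceed the token limit.
function applyQueryPrefix(text: string, prefix: string | undefined, maxItemTokens: number): string {
    if (!prefix) return text // model defines no prefix
    if (text.startsWith(prefix)) return text // guard against double-prefixing

    const prefixed = prefix + text
    if (estimateTokens(prefixed) > maxItemTokens) {
        // Fall back to the unprefixed text and warn, as the commits describe.
        console.warn("Query prefix would exceed the token limit; embedding unprefixed text instead.")
        return text
    }
    return prefixed
}
```

Because of the `startsWith` check, calling the helper again on its own output returns the text unchanged, which is what makes reprocessing safe.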
daniel-lxs
left a comment
LGTM
Description
Fixes #5027
Implements PR 1 of the solution for nomic-embed-code model compatibility with semantic code indexing. This PR adds a user-configurable search score threshold setting that allows users to control the minimum similarity score for semantic search results, replacing the previous hardcoded approach.
Changes Made
🔧 Backend Changes
- `packages/types/src/codebase-index.ts`: Extended `codebaseIndexConfigSchema` with optional `codebaseIndexSearchMinScore` field (number, 0-1 range)
- `src/services/code-index/config-manager.ts`: Updated `currentSearchMinScore` getter to implement priority system:
- `searchMinScore` field to track user preference
🎨 Frontend Changes
- `webview-ui/src/components/settings/CodeIndexSettings.tsx`: Added intuitive slider interface:
🧪 Test Updates
- `webview-ui/src/components/settings/__tests__/CodeIndexSettings.spec.tsx`: Updated tests to:
How It Works
Users can now:
Testing
Verification of Acceptance Criteria
Future Impact
This implementation enables PR 2 which will add nomic-embed-code support for all providers with required query prefixes, completing the full solution for issue #5027 without requiring hardcoded model-specific thresholds.
Screenshots
The new slider interface provides intuitive control over search sensitivity:
Checklist
Important
Introduces a user-configurable search score threshold for semantic search with a new slider interface and comprehensive testing.
- `codebaseIndexConfigSchema` in `codebase-index.ts`.
- `config-manager.ts`: user setting > model-specific threshold > default (0.4).
- `CodeIndexSettings.tsx` for threshold configuration (0.0-1.0 range, 0.05 steps).
- `CodeIndexSettings.spec.tsx` to test slider functionality and value changes.
- `embeddingModels.ts` to include `scoreThreshold` and `queryPrefix` for models.
- `config-manager.spec.ts` for priority system and edge cases.
- `CodeIndexSettings.spec.tsx` for slider component tests.
- `i18n` files for new UI elements.
This description was generated automatically for b62bb38.
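The 0.0-1.0 range and 0.05 step stated in the summary can be enforced on a stored value with a small normalizer. This is a hypothetical sketch: `normalizeThreshold` and the three constants are illustrative names, and the PR may rely on the slider control alone for this.

```typescript
// Range and step taken from the summary; all names are illustrative.
const MIN_SCORE = 0
const MAX_SCORE = 1
const STEP = 0.05

// Clamp a raw value into [0, 1] and snap it to the nearest 0.05 step.
function normalizeThreshold(raw: number): number {
    const clamped = Math.min(MAX_SCORE, Math.max(MIN_SCORE, raw))
    const steps = Math.round(clamped / STEP)
    // Round to two decimals to drop floating-point noise such as 0.15000000000000002.
    return Number((steps * STEP).toFixed(2))
}
```

A normalizer like this would keep values loaded from older or hand-edited settings consistent with what the slider can display.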