description: `Perform semantic search across indexed files using natural language queries. This tool uses vector similarity to find the most relevant content, going beyond simple keyword matching to understand intent and context.
+
+When to use this tool:
+- Finding code examples, functions, or patterns ("error handling in Python", "JWT authentication implementation")
+- Locating documentation or explanations ("how to configure Redis", "API rate limiting guide")
+- Discovering similar functionality across files ("database connection patterns", "logging utilities")
+- Research and exploration of codebases ("machine learning models", "test utilities")
+- Finding files related to specific features or topics
+
+How semantic search works:
+- Searches by meaning and context, not just exact keywords
+- Finds conceptually related content even with different terminology
+- Returns files ranked by relevance with similarity scores
+- Groups results by file to avoid duplicates from multiple matching sections
+
+Response format:
+- Returns lightweight metadata including file paths, relevance scores, and chunk IDs
+- Use 'get_chunk' or 'get_content' tools to fetch actual content from search results
+- Chunks are sorted by relevance score within each file
+- Average similarity score calculated across all matching chunks per file
description: 'Natural language search query describing what you are looking for. Can be concepts, functionality, or specific technical terms.'
 },
 limit: {
 type: 'number',
-description: 'Maximum number of results (default: 10)',
+description: 'Maximum number of files to return (default: 10). Each file may contain multiple matching chunks.',
 default: 10
 }
 },
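The `search` description above says results are grouped by file, chunks are sorted by score within each file, an average score is computed per file, and `limit` counts files rather than chunks. A minimal TypeScript sketch of that post-processing step follows. The `ChunkHit` and `FileResult` shapes and the `groupHitsByFile` name are hypothetical, and ranking files by their average score is an assumption; the commit itself does not show how this is implemented.

```typescript
// Hypothetical shape of one raw vector-search hit (not taken from the commit).
interface ChunkHit {
  filePath: string;
  chunkId: string;
  score: number; // similarity score, higher means more relevant
}

// Hypothetical shape of one grouped result, mirroring the description text.
interface FileResult {
  filePath: string;
  averageScore: number;
  chunks: { chunkId: string; score: number }[];
}

function groupHitsByFile(hits: ChunkHit[], limit: number = 10): FileResult[] {
  // Collect all matching chunks under their file path.
  const byFile = new Map<string, { chunkId: string; score: number }[]>();
  for (const hit of hits) {
    const existing = byFile.get(hit.filePath);
    const entry = { chunkId: hit.chunkId, score: hit.score };
    if (existing === undefined) {
      byFile.set(hit.filePath, [entry]);
    } else {
      existing.push(entry);
    }
  }

  const results: FileResult[] = [];
  byFile.forEach((chunks, filePath) => {
    // Chunks are sorted by relevance score within each file.
    chunks.sort((a, b) => b.score - a.score);
    // Average similarity across all matching chunks of this file.
    const total = chunks.reduce((sum, c) => sum + c.score, 0);
    results.push({ filePath, averageScore: total / chunks.length, chunks });
  });

  // Assumption: files are ranked by their average score.
  results.sort((a, b) => b.averageScore - a.averageScore);
  // `limit` caps the number of files, not the number of chunks.
  return results.slice(0, limit);
}
```

Because the limit is applied after grouping, a single file with many matching chunks still occupies only one slot in the response.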
@@ -73,17 +97,40 @@ Performance note: Initial indexing may take time for large directories, but subs
 },
 {
 name: 'similar_files',
-description: 'Find files similar to a given file',
+description: `Find files that are semantically similar to a given reference file. This tool analyzes the content and context of a file to discover other files with related functionality, similar patterns, or comparable content.
+
+When to use this tool:
+- Discovering related implementations across a codebase ("find files similar to this authentication module")
+- Locating alternative approaches or patterns ("find other components like this React component")
+- Finding documentation or examples related to a specific file
+- Identifying code duplication or similar functionality that could be refactored
+- Exploring unfamiliar codebases by finding files similar to known examples
+- Locating test files, configuration files, or documentation related to a source file
+
+How similarity detection works:
+- Analyzes the semantic content of the reference file
+- Compares against all indexed files using vector similarity
+- Considers code patterns, function signatures, imports, and documentation
+- Returns files ranked by content similarity, not just filename or location similarity
+- Works across different file types and programming languages
+
+Use cases:
+- Code analysis: "Find files similar to this database model to understand the schema patterns"
+- Learning: "Show me other API controllers similar to this one"
+- Maintenance: "Find files with similar error handling patterns"
+- Architecture: "Locate other services that follow this microservice pattern"
+
+Note: The reference file must be indexed for this tool to work. If the file is not found in the index, an error will be returned.`,
 inputSchema: {
 type: 'object',
 properties: {
 file_path: {
 type: 'string',
-description: 'Path to the file to find similar files for'
+description: 'Absolute or relative path to the reference file. This file must have been previously indexed.'
 },
 limit: {
 type: 'number',
-description: 'Maximum number of results (default: 10)',
+description: 'Maximum number of similar files to return (default: 10). Results are sorted by similarity score.',
 default: 10
 }
 },
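The `similar_files` description above states that the reference file is compared against all indexed files by vector similarity, that results are sorted by score, and that an unindexed reference file produces an error. The sketch below illustrates that behavior under stated assumptions: it uses cosine similarity (the description does not name the metric), it represents each file by a single vector in an in-memory `Map`, and the names `cosineSimilarity` and `rankSimilarFiles` are hypothetical. The actual server appears to delegate this to a vector database rather than doing it in process.

```typescript
// Cosine similarity between two equal-length vectors.
// Assumption: the description only says "vector similarity".
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector has no direction; treat it as unrelated
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory index: file path -> one embedding vector per file.
function rankSimilarFiles(
  referencePath: string,
  index: Map<string, number[]>,
  limit: number = 10
): { filePath: string; score: number }[] {
  const reference = index.get(referencePath);
  if (reference === undefined) {
    // Mirrors the note that the reference file must already be indexed.
    throw new Error("File not found in index: " + referencePath);
  }
  const scored: { filePath: string; score: number }[] = [];
  index.forEach((vector, filePath) => {
    if (filePath !== referencePath) {
      scored.push({ filePath, score: cosineSimilarity(reference, vector) });
    }
  });
  // Results are sorted by similarity score, highest first.
  scored.sort((x, y) => y.score - x.score);
  return scored.slice(0, limit);
}
```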
@@ -92,46 +139,133 @@ Performance note: Initial indexing may take time for large directories, but subs
 },
 {
 name: 'get_content',
-description: 'Get file content',
+description: `Retrieve the full content of a file or specific chunks within a file. This tool reads files directly from the filesystem and can optionally return only specific portions of indexed files.
+
+When to use this tool:
+- After performing a search, to retrieve the actual content of relevant files
+- Reading complete files that were identified through semantic search
+- Extracting specific sections of large files using chunk ranges
+- Accessing source code, documentation, or configuration files for analysis
+- Following up on search results with detailed content examination
+
+How chunk selection works:
+- If no chunks parameter is provided, returns the entire file content
+- Chunk ranges allow selective reading of large files (e.g., "2-5" returns chunks 2, 3, 4, and 5)
+- Single chunks can be specified (e.g., "3" returns only chunk 3)
+- Chunks are the same segments created during indexing for semantic search
+- Useful for large files where you only need specific sections identified by search
+
+File access:
+- Reads files directly from the filesystem (not from the search index)
+- Works with any readable file, whether indexed or not
+- Supports all text-based file formats
+- Preserves original formatting and content exactly as stored
+
+Workflow integration:
+1. Use 'search' to find relevant files and identify interesting chunk IDs
+2. Use 'get_content' to retrieve full file content or specific chunks
+3. Analyze the content to understand context and implementation details
+
+Performance note: For large files, using chunk ranges can be more efficient than reading entire files.`,
 inputSchema: {
 type: 'object',
 properties: {
 file_path: {
 type: 'string',
-description: 'Path to the file to retrieve'
+description: 'Absolute or relative path to the file to retrieve. File must be readable and text-based.'
 },
 chunks: {
 type: 'string',
-description: 'Optional chunk range (e.g., "2-5")'
+description: 'Optional chunk range specification. Examples: "3" (single chunk), "2-5" (chunks 2 through 5), "1-3" (first three chunks). Only works for indexed files.'
 }
 },
 required: ['file_path']
 }
 },
 {
 name: 'get_chunk',
-description: 'Get content of a specific chunk by file path and chunk ID',
+description: `Retrieve the content of a specific chunk from an indexed file. This tool provides precise access to individual text segments that were identified during semantic search, allowing efficient retrieval of only the most relevant content.
+
+When to use this tool:
+- After performing a 'search' operation, to fetch the actual content of specific chunks that matched your query
+- When you want to examine only the most relevant sections of a file rather than reading the entire file
+- For targeted content analysis where you need specific text segments identified by their chunk IDs
+- To build contextual responses using only the most semantically relevant portions of files
+- When working with large files and you only need particular sections
+
+How chunks work:
+- Files are divided into overlapping text segments during indexing for better search granularity
+- Each chunk represents a coherent section of text (typically 512 characters with overlap)
+- Chunk IDs are sequential strings ("0", "1", "2", etc.) within each file
+- Search results include chunk IDs for the most relevant sections
+- This tool retrieves the exact content that was semantically matched
+
+Typical workflow:
+1. Use 'search' to find files and get chunk IDs with high relevance scores
+2. Use 'get_chunk' to retrieve the specific content of the most relevant chunks
+3. Analyze or process only the most pertinent text segments
+
+Efficiency benefits:
+- Avoids transferring unnecessary content from large files
+- Provides precise access to semantically relevant text
+- Reduces token usage by fetching only needed sections
+- Enables focused analysis on the most important content
+
+Note: Both the file and the specific chunk must exist in the search index for this tool to work.`,
 inputSchema: {
 type: 'object',
 properties: {
 file_path: {
 type: 'string',
-description: 'Path to the file'
+description: 'Absolute or relative path to the indexed file containing the desired chunk.'
 },
 chunk_id: {
 type: 'string',
-description: 'ID of the chunk to retrieve'
+description: 'ID of the specific chunk to retrieve. This is typically obtained from search results and is a sequential string like "0", "1", "2", etc.'
 }
 },
 required: ['file_path','chunk_id']
 }
 },
 {
 name: 'server_info',
-description: 'Get server information and status',
+description: `Get comprehensive information about the directory indexer server status, configuration, and indexed content. This tool provides a complete overview of the current state of the semantic search system.
+
+When to use this tool:
+- To check if the indexer is properly set up and operational
+- Before starting work to understand what content is already indexed
+- To verify indexing operations completed successfully
+- When debugging search issues or unexpected results
+- To get an overview of available content for semantic search
+- To check system health and identify any configuration problems
+
+Information provided:
+- Server version and operational status
+- Total count of indexed directories, files, and searchable chunks
+- Database size and storage information
+- Most recent indexing timestamp
+- List of all indexed directories with individual statistics
+- File counts and chunk counts per directory
+- Indexing status for each directory (completed, failed, in progress)
+- Error reports and processing issues
+- System consistency checks between database components
+
+Status indicators:
+- Operational status of vector database (Qdrant) connection
+- Embedding service availability
+- Data consistency between SQLite metadata and vector storage
+- Recent errors or warnings that may affect search quality
+
+Use this tool to:
+- Verify setup before performing search operations
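The `get_content` description in this diff defines the `chunks` parameter by example: "3" selects a single chunk and "2-5" selects chunks 2, 3, 4, and 5. A minimal TypeScript sketch of a parser for that format is given here. The `parseChunkRange` name and its error handling are hypothetical, and the sketch deliberately does not decide whether numbering starts at 0 or 1, since the descriptions give both `"0"` as a first chunk ID and `"1-3"` as "first three chunks".

```typescript
// Hypothetical helper: expands a chunk range specification such as "3" or "2-5"
// into the list of chunk numbers it covers (both ends inclusive).
function parseChunkRange(spec: string): number[] {
  // Accept "N" or "N-M" with optional surrounding whitespace.
  const match = /^\s*(\d+)\s*(?:-\s*(\d+))?\s*$/.exec(spec);
  if (match === null) {
    throw new Error('Invalid chunk range: "' + spec + '"');
  }
  const start = Number(match[1]);
  // A single number means a range containing just that chunk.
  const end = match[2] === undefined ? start : Number(match[2]);
  if (end < start) {
    throw new Error('Chunk range end is before start: "' + spec + '"');
  }
  const ids: number[] = [];
  for (let id = start; id <= end; id++) {
    ids.push(id);
  }
  return ids;
}
```

For example, `parseChunkRange("2-5")` yields `[2, 3, 4, 5]` and `parseChunkRange("3")` yields `[3]`, matching the two cases spelled out in the description.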