Commit f9bbb9b

Rename search_code to codebase_search with balanced instructions
- Renamed search_code function to codebase_search across all files
- Updated docstring to emphasize semantic search as the MAIN exploration tool
- Added clear guidance on when to use codebase_search vs grep:
  * Always prefer codebase_search for initial code exploration
  * Use grep only for uncommitted local changes or different branches
- Clarified that semantic search operates on indexed repository state (main/master branch)
- Updated all references in tests, documentation, and imports
- Maintains backward compatibility with existing data source format
1 parent dbf1022 commit f9bbb9b

8 files changed, +46 -33 lines changed

CLAUDE.md

Lines changed: 2 additions & 2 deletions
@@ -48,7 +48,7 @@ This is a Model Context Protocol (MCP) server that provides AI clients with acce
 ### Core Components

 - **`codealive_mcp_server.py`**: Main server implementation using FastMCP framework
-- **Three main tools**: `ask_question`, `search_code`, `get_data_sources`
+- **Three main tools**: `chat_completions`, `codebase_search`, `get_data_sources`
 - **CodeAliveContext**: Manages HTTP client and API credentials
 - **Async lifespan management**: Handles client setup/teardown

@@ -63,7 +63,7 @@ This is a Model Context Protocol (MCP) server that provides AI clients with acce
 ### Data Flow

 1. AI client connects to MCP server via stdio/SSE transport
-2. Client calls tools (`get_data_sources` → `search_code` → `ask_question`)
+2. Client calls tools (`get_data_sources` → `codebase_search` → `chat_completions`)
 3. MCP server translates tool calls to CodeAlive API requests
 4. CodeAlive API returns semantic search results or chat completions
 5. Server formats and returns results to AI client

README.md

Lines changed: 4 additions & 4 deletions
@@ -22,16 +22,16 @@ This MCP (Model Context Protocol) server enables AI clients like Claude Code, Cu
 Once connected, you'll have access to these powerful tools:

 1. **`get_data_sources`** - List your indexed repositories and workspaces
-2. **`search_code`** - Semantic code search across your codebase
-3. **`ask_question`** - AI chat with full project context
+2. **`codebase_search`** - Semantic code search across your indexed codebase (main/master branch)
+3. **`chat_completions`** - AI chat with full project context

 ## 🎯 Usage Examples

 After setup, try these commands with your AI assistant:

 - *"Show me all available repositories"* → Uses `get_data_sources`
-- *"Find authentication code in the user service"* → Uses `search_code`
-- *"Explain how the payment flow works in this codebase"* → Uses `ask_question`
+- *"Find authentication code in the user service"* → Uses `codebase_search`
+- *"Explain how the payment flow works in this codebase"* → Uses `chat_completions`

 ## Table of Contents

src/codealive_mcp_server.py

Lines changed: 3 additions & 3 deletions
@@ -25,7 +25,7 @@

 # Import core components
 from core import codealive_lifespan, setup_debug_logging
-from tools import chat_completions, get_data_sources, search_code
+from tools import chat_completions, get_data_sources, codebase_search

 # Initialize FastMCP server with lifespan and enhanced system instructions
 mcp = FastMCP(
@@ -42,7 +42,7 @@

     When working with a codebase:
     1. First use `get_data_sources` to identify available repositories and workspaces
-    2. Then use `search_code` to find relevant files and code snippets
+    2. Then use `codebase_search` to find relevant files and code snippets
     3. Finally, use `chat_completions` for in-depth analysis of the code

     For effective code exploration:
@@ -90,7 +90,7 @@ async def health_check(request: Request) -> JSONResponse:
 # Register tools
 mcp.tool()(chat_completions)
 mcp.tool()(get_data_sources)
-mcp.tool()(search_code)
+mcp.tool()(codebase_search)


 def main():
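
The `mcp.tool()(codebase_search)` line applies the decorator returned by `mcp.tool()` directly to the function. Assuming the tool name defaults to the function's own name, the rename also changes the name that clients see. A minimal sketch of that call shape follows; `ToolRegistry` is a hypothetical stand-in written for illustration, not the FastMCP API, and the tool body is a placeholder.

```python
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in for the server object (not the real FastMCP API)."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable] = {}

    def tool(self) -> Callable[[Callable], Callable]:
        # Returns a decorator that registers the function under its own name.
        def register(fn: Callable) -> Callable:
            self.tools[fn.__name__] = fn
            return fn

        return register


registry = ToolRegistry()


def codebase_search(query: str) -> dict:
    """Placeholder body; the real tool is async and calls the CodeAlive API."""
    return {"query": query}


# Same call shape as `mcp.tool()(codebase_search)` in the diff above.
registry.tool()(codebase_search)

print(sorted(registry.tools))
```

Under that assumption, a client that still asks for `search_code` would find no tool with that name after this commit.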

src/tests/test_response_transformer.py

Lines changed: 4 additions & 4 deletions
@@ -267,7 +267,7 @@ def test_data_preservation_without_content(self):
             "results": [
                 {
                     "kind": "Symbol",
-                    "identifier": "CodeAlive-AI/codealive-mcp::src/tools/search.py::search_code",
+                    "identifier": "CodeAlive-AI/codealive-mcp::src/tools/search.py::codebase_search",
                     "location": {
                         "path": "src/tools/search.py",
                         "range": {"start": {"line": 18}, "end": {"line": 168}}
@@ -317,13 +317,13 @@ def test_data_preservation_with_content(self):
             "results": [
                 {
                     "kind": "Symbol",
-                    "identifier": "CodeAlive-AI/codealive-mcp::src/tools/search.py::search_code",
+                    "identifier": "CodeAlive-AI/codealive-mcp::src/tools/search.py::codebase_search",
                     "location": {
                         "path": "src/tools/search.py",
                         "range": {"start": {"line": 18}, "end": {"line": 168}}
                     },
                     "score": 0.99,
-                    "content": "async def search_code(\n ctx: Context,\n query: str,\n data_source_ids: Optional[List[str]] = None,\n mode: str = \"auto\",\n include_content: bool = False\n) -> Dict:",
+                    "content": "async def codebase_search(\n ctx: Context,\n query: str,\n data_source_ids: Optional[List[str]] = None,\n mode: str = \"auto\",\n include_content: bool = False\n) -> Dict:",
                     "dataSource": {
                         "type": "repository",
                         "id": "685b21230e3822f4efa9d073",
@@ -369,7 +369,7 @@ def test_data_preservation_with_content(self):
         assert 'endLine="168"' in result

         # Verify content is included
-        assert "async def search_code" in result
+        assert "async def codebase_search" in result
         assert "include_content: Whether to include full file content" in result
         assert "This file provides guidance" in result
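
The fixture identifiers above appear to follow a `<repository>::<path>::<symbol>` layout, which is why the rename touches the last segment only. The sketch below splits such a string into its parts. The three-part layout is inferred from this fixture alone, and `SymbolIdentifier` and `parse_identifier` are hypothetical names, not part of the repository.

```python
from typing import NamedTuple


class SymbolIdentifier(NamedTuple):
    repository: str
    path: str
    symbol: str


def parse_identifier(identifier: str) -> SymbolIdentifier:
    """Split a '<repository>::<path>::<symbol>' string into its three parts.

    Assumption: exactly three '::'-separated segments, as in the test fixture.
    """
    parts = identifier.split("::")
    if len(parts) != 3:
        raise ValueError(f"expected 3 '::'-separated parts, got {len(parts)}")
    return SymbolIdentifier(*parts)


parsed = parse_identifier(
    "CodeAlive-AI/codealive-mcp::src/tools/search.py::codebase_search"
)
print(parsed.symbol)  # codebase_search
```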

src/tests/test_search_tool.py

Lines changed: 6 additions & 6 deletions
@@ -3,13 +3,13 @@
 import pytest
 from unittest.mock import AsyncMock, MagicMock, patch
 from fastmcp import Context
-from tools.search import search_code
+from tools.search import codebase_search


 @pytest.mark.asyncio
 @patch('tools.search.get_api_key_from_context')
-async def test_search_code_returns_dict(mock_get_api_key):
-    """Test that search_code returns a dictionary with structured_content."""
+async def test_codebase_search_returns_dict(mock_get_api_key):
+    """Test that codebase_search returns a dictionary with structured_content."""
     # Mock the API key function
     mock_get_api_key.return_value = "test_key"

@@ -47,8 +47,8 @@ async def test_search_code_returns_dict(mock_get_api_key):
     ctx.request_context.lifespan_context = mock_codealive_context
     ctx.request_context.headers = {"authorization": "Bearer test_key"}

-    # Call search_code
-    result = await search_code(
+    # Call codebase_search
+    result = await codebase_search(
         ctx=ctx,
         query="authenticate_user",
         data_source_ids=["test_id"],
@@ -57,7 +57,7 @@ async def test_search_code_returns_dict(mock_get_api_key):
     )

     # Verify result is a dictionary
-    assert isinstance(result, dict), "search_code should return a dictionary"
+    assert isinstance(result, dict), "codebase_search should return a dictionary"

     # Verify it has structured_content field
     assert "structured_content" in result, "Result should have structured_content field"
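
The test above mocks the context and awaits the tool. A self-contained sketch of that pattern is shown here without pytest. `codebase_search_stub`, the `/search` path, and the payload keys are all hypothetical stand-ins chosen for illustration; they do not describe the real tool or the CodeAlive API.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def codebase_search_stub(ctx, query: str, data_source_ids: list) -> dict:
    """Hypothetical stand-in for the tool: forwards the query to a client on ctx."""
    client = ctx.request_context.lifespan_context.client
    response = await client.post(
        "/search", json={"query": query, "ids": data_source_ids}
    )
    return {"structured_content": response.json()}


def build_mock_context() -> MagicMock:
    # response.json() is synchronous, so a plain MagicMock is enough here.
    response = MagicMock()
    response.json.return_value = {"results": []}

    # client.post is awaited by the stub, so it must be an AsyncMock.
    client = MagicMock()
    client.post = AsyncMock(return_value=response)

    ctx = MagicMock()
    ctx.request_context.lifespan_context.client = client
    return ctx


ctx = build_mock_context()
result = asyncio.run(
    codebase_search_stub(ctx, query="authenticate_user", data_source_ids=["test_id"])
)
assert isinstance(result, dict)
assert "structured_content" in result
```

Using `AsyncMock` only for the awaited call keeps the rest of the mock graph synchronous, which mirrors how an HTTP response object is usually consumed.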

src/tools/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -2,6 +2,6 @@

 from .chat import chat_completions
 from .datasources import get_data_sources
-from .search import search_code
+from .search import codebase_search

-__all__ = ['chat_completions', 'get_data_sources', 'search_code']
+__all__ = ['chat_completions', 'get_data_sources', 'codebase_search']

src/tools/datasources.py

Lines changed: 2 additions & 2 deletions
@@ -48,7 +48,7 @@ async def get_data_sources(ctx: Context, alive_only: bool = True) -> str:
     For workspaces, the repositoryIds can be used to identify and work with
     individual repositories that make up the workspace.

-    Use the returned data source IDs with the search_code and chat_completions functions.
+    Use the returned data source IDs with the codebase_search and chat_completions functions.
     """
     context: CodeAliveContext = ctx.request_context.lifespan_context

@@ -84,7 +84,7 @@ async def get_data_sources(ctx: Context, alive_only: bool = True) -> str:
         result = f"Available data sources:\n{formatted_data}"

         # Add usage hint
-        result += "\n\nYou can use these data source IDs with the search_code and chat_completions functions."
+        result += "\n\nYou can use these data source IDs with the codebase_search and chat_completions functions."

         return result

src/tools/search.py

Lines changed: 23 additions & 10 deletions
@@ -10,25 +10,38 @@
 from utils import transform_search_response_to_xml, handle_api_error


-async def search_code(
+async def codebase_search(
     ctx: Context,
     query: str,
     data_source_ids: Optional[List[str]] = None,
     mode: str = "auto",
     include_content: bool = False
 ) -> Dict:
     """
-    SEMANTIC search across your codebases.
+    Use `codebase_search` tool to search for code in the codebase.

-    This endpoint is optimized for **natural-language** questions and intent-driven queries
-    (not rigid templates). Ask it things like:
+    Semantic search (`codebase_search`) is your MAIN exploration tool for understanding the
+    indexed codebase (typically main/master branch or the specific branch shown in data sources).
+
+    ALWAYS prefer using `codebase_search` over grep/find for initial code exploration because:
+    - It's much faster and more efficient for discovering relevant code
+    - It understands semantic meaning, not just text patterns
+    - It searches the indexed repository state with full context
+
+    IMPORTANT: This searches the INDEXED version of repositories (check branch in get_data_sources),
+    NOT the current local files. Use grep when you specifically need to:
+    - Search uncommitted local changes
+    - Verify recent modifications
+    - Check files on a different branch than the indexed one
+
+    This tool excels at natural-language questions and intent-driven queries like:
       • "What is the authentication flow?"
       • "Where is the user registration logic implemented?"
       • "How do services communicate with the billing API?"
       • "Where is rate limiting handled?"
       • "Show me how we validate JWTs."

-    You can still include function/class names if you know them, but it's not required.
+    You can include function/class names for more targeted results.

     Args:
         query: A natural-language description of what you're looking for.
@@ -57,19 +70,19 @@ async def search_code(

     Examples:
         1. Natural-language question (recommended):
-            search_code(query="What is the auth flow?", data_source_ids=["repo123"])
+            codebase_search(query="What is the auth flow?", data_source_ids=["repo123"])

         2. Intent query:
-            search_code(query="Where is user registration logic?", data_source_ids=["repo123"])
+            codebase_search(query="Where is user registration logic?", data_source_ids=["repo123"])

         3. Workspace-wide question:
-            search_code(query="How do microservices talk to the billing API?", data_source_ids=["workspace456"])
+            codebase_search(query="How do microservices talk to the billing API?", data_source_ids=["workspace456"])

         4. Mixed query with a known identifier:
-            search_code(query="Where do we validate JWTs (AuthService)?", data_source_ids=["repo123"])
+            codebase_search(query="Where do we validate JWTs (AuthService)?", data_source_ids=["repo123"])

         5. Concise results without full file contents:
-            search_code(query="Where is password reset handled?", data_source_ids=["repo123"], include_content=false)
+            codebase_search(query="Where is password reset handled?", data_source_ids=["repo123"], include_content=false)

     Note:
         - At least one data_source_id must be provided
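
The new docstring's guidance on `codebase_search` versus grep can be read as a small decision rule. The sketch below encodes that reading; `choose_search_tool` and its parameters are hypothetical names invented for illustration and do not exist in the repository.

```python
from typing import Optional


def choose_search_tool(
    indexed_branch: str,
    target_branch: Optional[str] = None,
    needs_uncommitted_changes: bool = False,
    verifying_recent_edit: bool = False,
) -> str:
    """Return 'grep' for the three cases the docstring lists, else 'codebase_search'."""
    # Uncommitted or freshly modified files are not in the indexed snapshot.
    if needs_uncommitted_changes or verifying_recent_edit:
        return "grep"
    # The index covers one branch; anything else has to be searched locally.
    if target_branch is not None and target_branch != indexed_branch:
        return "grep"
    return "codebase_search"


print(choose_search_tool("main"))  # codebase_search
print(choose_search_tool("main", target_branch="feature/login"))  # grep
```

The default falls through to semantic search, matching the docstring's instruction to prefer it for initial exploration.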
