pppyb · pppyb · Mar 8, 2025 · Mar 10, 2025 · Mar 10, 2025 · Mar 10, 2025
diff --git a/.code_search_index/documents.pkl b/.code_search_index/documents.pkl
diff --git a/.code_search_index/index.faiss b/.code_search_index/index.faiss
diff --git a/docs/rag_code_search.md b/docs/rag_code_search.md
@@ -0,0 +1,127 @@
+# RAG Code Search in OpenHands
+
+This document describes the Retrieval Augmented Generation (RAG) code search functionality in OpenHands, how it's integrated into the agent system, and how to test it.
+
+## Overview
+
+The RAG code search functionality allows OpenHands agents to search for relevant code in a repository using natural language queries. This is particularly useful for tasks that require understanding or modifying code, as it helps the agent quickly find relevant parts of the codebase.
+
+## How It Works
+
+1. **Indexing**: The first time a repository is searched, an index is built from sentence embeddings of the code files and saved for reuse (by default in `.code_search_index`).
+2. **Searching**: Each query is converted to an embedding and compared against the embeddings of the indexed files.
+3. **Ranking**: Results are ranked by similarity score, and the top `k` are returned to the agent.
+4. **Integration**: The functionality is integrated into OpenHands as an action-observation pair (`CodeSearchAction` and `CodeSearchObservation`).
+
+## Components
+
+### Core Components
+
+- **Code Search Tool**: Implemented in `openhands_aci.tools.code_search_tool`, this is the core functionality that indexes and searches code.
+- **Action**: `CodeSearchAction` in `openhands.events.action.code_search` defines how agents can request code searches.
+- **Observation**: `CodeSearchObservation` in `openhands.events.observation.code_search` defines how search results are returned to agents.
+- **Schema Integration**: The action and observation types are defined in `openhands.core.schema.action` and `openhands.core.schema.observation`.
+
+### Integration with Agent System
+
+The code search functionality is integrated into the OpenHands agent system through:
+
+1. **Action Execution**: The `ActionExecutor` in `openhands.runtime.action_execution_server` can execute `CodeSearchAction` and return `CodeSearchObservation`.
+2. **Agent Usage**: Agents can create `CodeSearchAction` objects to search for code and process the resulting `CodeSearchObservation`.
+
+## Usage
+
+### Basic Usage
+
+```python
+from openhands.events.action.code_search import CodeSearchAction
+from openhands.events.observation.code_search import CodeSearchObservation
+
+# Create a code search action
+action = CodeSearchAction(
+    query="function that handles API requests",
+    repo_path="/path/to/repo",
+    extensions=[".py", ".js"],
+    k=5
+)
+
+# Execute the action. `agent` is a placeholder here; in a real agent,
+# the agent system executes the action and returns the observation.
+observation = agent.execute_action(action)
+
+# Process the observation
+if isinstance(observation, CodeSearchObservation):
+    for result in observation.results:
+        print(f"File: {result['file']}")
+        print(f"Score: {result['score']}")
+        print(f"Content: {result['content']}")
+```
+
+### In an Agent
+
+In a real OpenHands agent, the code search functionality would be used as part of the agent's reasoning process:
+
+1. The agent identifies a need to understand some part of the codebase.
+2. The agent creates a `CodeSearchAction` with an appropriate query.
+3. The agent system executes the action and returns a `CodeSearchObservation`.
+4. The agent processes the observation and uses the results to inform its next actions.
+
+## Testing
+
+### Unit Tests
+
+Unit tests for the code search functionality are in `tests/unit/test_code_search_integration.py`. These tests verify that:
+
+1. `CodeSearchAction` and `CodeSearchObservation` can be created correctly.
+2. The code search functionality is properly integrated with the `ActionExecutor`.
+3. The schema integration is correct.
+
+To run the unit tests:
+
+```bash
+python -m pytest tests/unit/test_code_search_integration.py -v
+```
+
+### Integration Tests
+
+Integration tests that simulate how an agent would use the code search functionality are in `scripts/test_agent_code_search.py`. This script:
+
+1. Creates a `CodeSearchAction` with a specified query.
+2. Executes the action using an `ActionExecutor`.
+3. Processes the resulting `CodeSearchObservation`.
+4. Simulates how an agent would reason about the results.
+
+To run the integration test:
+
+```bash
+python scripts/test_agent_code_search.py --repo /path/to/repo --query "your search query"
+```
+
+### Full Agent Tests
+
+For a more comprehensive test of how the code search functionality is used in a real agent, use `scripts/test_rag_agent_integration.py`. This script:
+
+1. Initializes a full OpenHands agent with a specified repository.
+2. Gives the agent tasks that would benefit from code search.
+3. Analyzes how the agent uses the code search functionality to complete these tasks.
+4. Generates a detailed report of the agent's code search usage.
+
+To run the full agent test:
+
+```bash
+python scripts/test_rag_agent_integration.py --repo /path/to/repo --output results.json
+```
+
+## Limitations and Future Work
+
+### Current Limitations
+
+- The code search functionality currently only works on a single repository at a time.
+- The indexing process can be slow for large repositories.
+- The search results are based purely on semantic similarity and don't consider code structure.
+
+### Future Work
+
+- Improve indexing performance for large repositories.
+- Add support for searching across multiple repositories.
+- Incorporate code structure and dependencies into the search process.
+- Add support for more fine-grained queries (e.g., "find all functions that call X").
+- Integrate with other tools like static analysis to provide more context to the agent.
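Review note on `docs/rag_code_search.md`: the index, search, and rank steps described under "How It Works" can be sketched in a few lines. This is only a toy illustration under stated assumptions: token counts stand in for sentence embeddings, a plain dictionary stands in for the FAISS index, and `embed`, `cosine`, `build_index`, and `search` are hypothetical names that do not come from `openhands_aci`.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(files: dict[str, str]) -> dict[str, Counter]:
    """Indexing: embed every file once so later queries can reuse the vectors."""
    return {path: embed(content) for path, content in files.items()}


def search(index: dict[str, Counter], query: str, k: int = 5) -> list[dict]:
    """Searching and ranking: embed the query, score each file, keep the top k."""
    query_vec = embed(query)
    scored = [
        {"file": path, "score": cosine(query_vec, vec)}
        for path, vec in index.items()
    ]
    scored.sort(key=lambda result: result["score"], reverse=True)
    return scored[:k]


# Hypothetical two-file repository, held in memory for the example.
files = {
    "api.py": "def handle_request(request): return send response for api request",
    "utils.py": "def read_file(path): open file and return contents",
}
index = build_index(files)
results = search(index, "function that handles api request", k=1)
print(results[0]["file"], round(results[0]["score"], 3))
```

A real embedding model would also match paraphrases that share no tokens with the file, which this stand-in cannot do; the control flow (index once, embed the query, rank by similarity, cut at `k`) is the part the sketch is meant to show.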
diff --git a/openhands/agenthub/codeact_agent/function_calling.py b/openhands/agenthub/codeact_agent/function_calling.py
@@ -37,6 +37,7 @@
     IPythonRunCellAction,
     MessageAction,
 )
+from openhands.events.action.code_search import CodeSearchAction
 from openhands.events.event import FileEditSource, FileReadSource
 from openhands.events.tool import ToolCallMetadata
 
@@ -104,6 +105,26 @@ def response_to_actions(response: ModelResponse) -> list[Action]:
                     inputs=arguments,
                 )
 
+            # ================================================
+            # CodeSearchAction
+            # ================================================
+            elif tool_call.function.name == 'code_search':
+                if 'query' not in arguments:
+                    raise FunctionCallValidationError(
+                        f'Missing required argument "query" in tool call {tool_call.function.name}'
+                    )
+
+                # Get repo_path with default to current directory
+                repo_path = arguments.get('repo_path', '.')
+
+                action = CodeSearchAction(
+                    query=arguments['query'],
+                    repo_path=repo_path,
+                    extensions=arguments.get('extensions'),
+                    k=arguments.get('k', 5),
+                    thought=arguments.get('thought', '')
+                )
+
             # ================================================
             # AgentFinishAction
             # ================================================
@@ -212,12 +233,54 @@ def response_to_actions(response: ModelResponse) -> list[Action]:
     return actions
 
 
+# Define the code search tool
+CodeSearchTool = ChatCompletionToolParam(
+    type="function",
+    function={
+        "name": "code_search",
+        "description": "IMPORTANT: Use this tool to search for relevant code in the repository. This is the preferred way to find code related to your task.",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "query": {
+                    "type": "string",
+                    "description": "Natural language query to search for code (e.g., 'how to send a message', 'file handling functions')."
+                },
+                "repo_path": {
+                    "type": "string",
+                    "description": "Path to the Git repository to search. Use the repository path provided in the task."
+                },
+                "extensions": {
+                    "type": "array",
+                    "items": {"type": "string"},
+                    "description": "List of file extensions to include (e.g. [\".py\", \".js\"]). Default is [\".py\"]."
+                },
+                "k": {
+                    "type": "integer",
+                    "description": "Number of results to return. Default is 5."
+                },
+                "thought": {
+                    "type": "string",
+                    "description": "Your reasoning for why this search will help with the task."
+                }
+            },
+            "required": ["query"]
+        }
+    }
+)
+
 def get_tools(
     codeact_enable_browsing: bool = False,
     codeact_enable_llm_editor: bool = False,
     codeact_enable_jupyter: bool = False,
+    codeact_enable_code_search: bool = True,  # Enable code search by default
 ) -> list[ChatCompletionToolParam]:
     tools = [CmdRunTool, ThinkTool, FinishTool]
+
+    # Add code search tool first (if enabled) to make it more prominent
+    if codeact_enable_code_search:
+        tools.insert(0, CodeSearchTool)
+
     if codeact_enable_browsing:
         tools.append(WebReadTool)
         tools.append(BrowserTool)
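Review note on `function_calling.py`: the validation and defaulting done in the `code_search` branch can be exercised in isolation. The sketch below assumes the tool-call arguments arrive as a JSON string; `SearchRequest` and `parse_code_search_call` are hypothetical stand-ins (for `CodeSearchAction` and the branch body), and `ValueError` stands in for `FunctionCallValidationError`.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchRequest:
    """Hypothetical stand-in for CodeSearchAction, limited to the parsed fields."""

    query: str
    repo_path: Optional[str] = None
    extensions: Optional[list[str]] = None
    k: int = 5
    thought: str = ""


def parse_code_search_call(raw_arguments: str) -> SearchRequest:
    """Validate JSON tool-call arguments and apply the defaults shown in the diff."""
    arguments = json.loads(raw_arguments)
    if "query" not in arguments:
        # The diff raises FunctionCallValidationError; ValueError stands in here.
        raise ValueError('Missing required argument "query" in tool call code_search')
    return SearchRequest(
        query=arguments["query"],
        repo_path=arguments.get("repo_path", "."),  # default: current directory
        extensions=arguments.get("extensions"),
        k=arguments.get("k", 5),
        thought=arguments.get("thought", ""),
    )


request = parse_code_search_call('{"query": "file handling functions", "k": 3}')
print(request)
```

Only `query` is required, matching the `"required": ["query"]` entry in the tool schema; every other field falls back to a default.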

diff --git a/openhands/core/config/agent_config.py b/openhands/core/config/agent_config.py
@@ -14,6 +14,7 @@ class AgentConfig(BaseModel):
         codeact_enable_browsing: Whether browsing delegate is enabled in the action space. Default is False. Only works with function calling.
         codeact_enable_llm_editor: Whether LLM editor is enabled in the action space. Default is False. Only works with function calling.
         codeact_enable_jupyter: Whether Jupyter is enabled in the action space. Default is False.
+        codeact_enable_code_search: Whether code search is enabled in the action space. Default is True. Only works with function calling.
         memory_enabled: Whether long-term memory (embeddings) is enabled.
         memory_max_threads: The maximum number of threads indexing at the same time for embeddings. (deprecated)
         llm_config: The name of the llm config to use. If specified, this will override global llm config.
@@ -30,6 +31,7 @@ class AgentConfig(BaseModel):
     codeact_enable_browsing: bool = Field(default=True)
     codeact_enable_llm_editor: bool = Field(default=False)
     codeact_enable_jupyter: bool = Field(default=True)
+    codeact_enable_code_search: bool = Field(default=True)
     enable_prompt_extensions: bool = Field(default=True)
     disabled_microagents: list[str] = Field(default_factory=list)
     enable_history_truncation: bool = Field(default=True)
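Review note on `agent_config.py`: the effect of the new flag on tool ordering can be illustrated with a small stand-in. `ToolFlags` and `select_tools` are hypothetical names; the tool identifiers are represented as plain strings rather than the real `ChatCompletionToolParam` objects, and the browsing default is simply a choice made for this example.

```python
from dataclasses import dataclass


@dataclass
class ToolFlags:
    """Hypothetical stand-in for the two AgentConfig fields used below."""

    codeact_enable_browsing: bool = False
    codeact_enable_code_search: bool = True


def select_tools(flags: ToolFlags) -> list[str]:
    """Mirror the ordering in get_tools, using tool names instead of tool params."""
    tools = ["CmdRunTool", "ThinkTool", "FinishTool"]
    if flags.codeact_enable_code_search:
        # Inserted at position 0 so the model sees code search first.
        tools.insert(0, "CodeSearchTool")
    if flags.codeact_enable_browsing:
        tools.extend(["WebReadTool", "BrowserTool"])
    return tools


print(select_tools(ToolFlags()))
print(select_tools(ToolFlags(codeact_enable_code_search=False)))
```

With the flag left at its default the search tool leads the list; setting `codeact_enable_code_search` to `False` leaves the original three tools untouched.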

diff --git a/openhands/core/message_utils.py b/openhands/core/message_utils.py
@@ -340,6 +340,10 @@ def get_observation_message(
     elif isinstance(obs, AgentCondensationObservation):
         text = truncate_content(obs.content, max_message_chars)
         message = Message(role='user', content=[TextContent(text=text)])
+    elif obs.__class__.__name__ == 'CodeSearchObservation':
+        # Match CodeSearchObservation (from openhands-aci) by class name
+        text = truncate_content(obs.content, max_message_chars)
+        message = Message(role='user', content=[TextContent(text=text)])
     else:
         # If an observation message is not returned, it will cause an error
         # when the LLM tries to return the next message
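Review note on `message_utils.py`: matching an observation by class name behaves differently from an `isinstance` check once subclasses are involved. The classes below are minimal stand-ins defined only for this sketch, and `CachedCodeSearchObservation` is a hypothetical subclass that does not exist in the PR.

```python
class Observation:
    """Minimal stand-in for the observation base class."""

    def __init__(self, content: str) -> None:
        self.content = content


class CodeSearchObservation(Observation):
    """Stand-in for the observation returned by a code search."""


class CachedCodeSearchObservation(CodeSearchObservation):
    """Hypothetical subclass, used only to show where the two checks differ."""


def matches_by_name(obs: Observation) -> bool:
    """The check used in the diff: compare the exact class name."""
    return obs.__class__.__name__ == "CodeSearchObservation"


def matches_by_type(obs: Observation) -> bool:
    """The conventional check: accept the class and any subclass."""
    return isinstance(obs, CodeSearchObservation)


plain = CodeSearchObservation("result text")
cached = CachedCodeSearchObservation("result text")
print(matches_by_name(plain), matches_by_type(plain))
print(matches_by_name(cached), matches_by_type(cached))
```

The name comparison avoids importing the class, but it silently skips subclasses and would also accept any unrelated class that happens to share the name, so an `isinstance` branch may be the safer choice if the import is available.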

diff --git a/openhands/core/schema/action.py b/openhands/core/schema/action.py
@@ -82,5 +82,9 @@ class ActionTypeSchema(BaseModel):
     SEND_PR: str = Field(default='send_pr')
     """Send a PR to github."""
 
+    CODE_SEARCH: str = Field(default='code_search')
+    """Search for relevant code in a codebase using semantic search.
+    """
+
 
-ActionType = ActionTypeSchema()
+ActionType = ActionTypeSchema()
diff --git a/openhands/core/schema/observation.py b/openhands/core/schema/observation.py
@@ -49,5 +49,9 @@ class ObservationTypeSchema(BaseModel):
     CONDENSE: str = Field(default='condense')
     """Result of a condensation operation."""
 
+    CODE_SEARCH: str = Field(default='code_search')
+    """Result of code search, containing relevant code snippets.
+    """
+
 
-ObservationType = ObservationTypeSchema()
+ObservationType = ObservationTypeSchema()
diff --git a/openhands/events/action/__init__.py b/openhands/events/action/__init__.py
@@ -8,6 +8,7 @@
     ChangeAgentStateAction,
 )
 from openhands.events.action.browse import BrowseInteractiveAction, BrowseURLAction
+from openhands.events.action.code_search import CodeSearchAction
 from openhands.events.action.commands import CmdRunAction, IPythonRunCellAction
 from openhands.events.action.empty import NullAction
 from openhands.events.action.files import (
@@ -23,6 +24,7 @@
     'CmdRunAction',
     'BrowseURLAction',
     'BrowseInteractiveAction',
+    'CodeSearchAction',
     'FileReadAction',
     'FileWriteAction',
     'FileEditAction',
@@ -35,4 +37,4 @@
     'MessageAction',
     'ActionConfirmationStatus',
     'AgentThinkAction',
-]
+]
diff --git a/openhands/events/action/code_search.py b/openhands/events/action/code_search.py
@@ -0,0 +1,61 @@
+"""Code search action module."""
+
+from dataclasses import dataclass
+from typing import ClassVar, List, Optional
+
+from openhands.core.schema.action import ActionType
+from openhands.events.action.action import Action, ActionSecurityRisk
+
+
+@dataclass
+class CodeSearchAction(Action):
+    """Search for relevant code in a codebase using semantic search.
+
+    This action uses Retrieval Augmented Generation (RAG) to find relevant code
+    based on natural language queries. It first indexes the codebase (if needed)
+    and then performs a semantic search.
+
+    Attributes:
+        query: Natural language query.
+        repo_path: Path to the Git repository to search (optional if an index already exists in save_dir).
+        save_dir: Directory to save/load the search index (defaults to .code_search_index).
+        extensions: List of file extensions to include (e.g. [".py", ".js"]).
+        k: Number of results to return.
+        remove_duplicates: Whether to remove duplicate file results.
+        min_score: Minimum score threshold to filter out low-quality matches.
+        thought: Reasoning behind the search.
+        action: Type of action to execute.
+        runnable: Indicates whether the action is executable.
+        security_risk: Indicates any security risks associated with the action.
+        blocking: Indicates whether the action is a blocking operation.
+    """
+
+    query: str
+    repo_path: Optional[str] = None
+    save_dir: Optional[str] = None
+    extensions: Optional[List[str]] = None
+    k: int = 5
+    remove_duplicates: bool = True
+    min_score: float = 0.5
+    thought: str = ''
+    action: str = ActionType.CODE_SEARCH
+    runnable: ClassVar[bool] = True
+    security_risk: ActionSecurityRisk | None = None
+    blocking: bool = True  # Set as a blocking operation
+
+    @property
+    def message(self) -> str:
+        """Get a human-readable message describing the code search action."""
+        return f'Search code: {self.query}'
+
+    def __repr__(self) -> str:
+        """Get a string representation of the code search action."""
+        ret = '**Code Search Action**\n'
+        ret += f'Query: {self.query}\n'
+        if self.repo_path:
+            ret += f'Repository: {self.repo_path}\n'
+        if self.extensions:
+            ret += f'Extensions: {", ".join(self.extensions)}\n'
+        ret += f'Number of results: {self.k}\n'
+        ret += f'Thought: {self.thought}\n'
+        return ret
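Review note on `code_search.py`: the docstring names `k`, `min_score`, and `remove_duplicates` but the diff does not show how the search tool applies them. One plausible reading is sketched below; `filter_results` is a hypothetical helper, and the result dictionaries reuse the `file` and `score` keys from the usage example in the docs.

```python
def filter_results(
    results: list[dict],
    k: int = 5,
    min_score: float = 0.5,
    remove_duplicates: bool = True,
) -> list[dict]:
    """Keep at most k results at or above min_score, optionally one per file."""
    kept: list[dict] = []
    seen_files: set[str] = set()
    for result in sorted(results, key=lambda r: r["score"], reverse=True):
        if result["score"] < min_score:
            break  # sorted descending, so the remaining results are lower still
        if remove_duplicates and result["file"] in seen_files:
            continue  # a higher-scoring hit from this file was already kept
        seen_files.add(result["file"])
        kept.append(result)
        if len(kept) == k:
            break
    return kept


# Hypothetical raw hits, as a search backend might return them.
hits = [
    {"file": "a.py", "score": 0.9},
    {"file": "a.py", "score": 0.8},
    {"file": "b.py", "score": 0.6},
    {"file": "c.py", "score": 0.3},
]
print([r["file"] for r in filter_results(hits)])
```

Under this reading, deduplication keeps the best-scoring hit per file, and the threshold is applied before the `k` cutoff, so a query with few good matches can return fewer than `k` results.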