add agent docs including prd and future plans to share with others

tmaiaroto · tmaiaroto · commit 74212cfc5a94 · 2025-10-02T09:28:54.000-07:00
diff --git a/AGENT_DOCS/PLANS/SEMANTIC_ANALYSIS_PLAN.md b/AGENT_DOCS/PLANS/SEMANTIC_ANALYSIS_PLAN.md
@@ -0,0 +1,121 @@
+# Plan for Semantic Analysis Integration in MCP Server
+
+## 1. Overview
+
+This document outlines a phased approach to integrating semantic analysis into the `ast-grep-linter-mcp-server`. The goal is to move beyond purely syntactic pattern matching (`ast-grep`) and enable rules that use type information and symbol resolution. Semantic checks let the server discard matches that only look like violations syntactically, which reduces false positives, and they enable rules that syntax alone cannot express, such as flagging a call only when the callee actually returns an `error`.
+
+We will begin with Go, using the official Go language server (`gopls`), and then establish a framework for incorporating other languages over time.
+
+---
+
+## 2. Core Concept: The Hybrid Analysis Model
+
+The server will operate in a hybrid mode, combining the speed of `ast-grep` for syntactic analysis with the precision of a language server for semantic validation.
+
+The workflow for a semantic rule will be:
+1.  **Syntactic Pre-filtering (`ast-grep`):** A broad `ast-grep` rule finds all potential candidates for a violation. For example, it finds all standalone function calls.
+2.  **Semantic Verification (Language Server):** For each candidate found by `ast-grep`, the MCP server queries a language server (`gopls`) to get semantic information (e.g., "What are the return types of this function?").
+3.  **Final Decision:** The server combines the syntactic and semantic information to make a final, accurate decision on whether to report a violation.
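
The three-step workflow above can be sketched in Go as a filter over syntactic candidates. This is a minimal illustration, not the planned implementation: `Candidate`, `SemanticChecker`, `fakeChecker`, and `Confirm` are hypothetical names, and the in-memory checker stands in for a real `gopls` query.

```go
package main

import "fmt"

// Candidate is a hypothetical record for one syntactic match from ast-grep.
type Candidate struct {
	File   string
	Line   int
	Column int
	Callee string // text captured by the $A metavariable
}

// SemanticChecker is a hypothetical interface. A real implementation would
// ask a language server such as gopls instead of using in-memory data.
type SemanticChecker interface {
	ReturnsError(c Candidate) (bool, error)
}

// fakeChecker stands in for the language server in this sketch.
type fakeChecker map[string]bool

func (f fakeChecker) ReturnsError(c Candidate) (bool, error) {
	returns, ok := f[c.Callee]
	if !ok {
		return false, fmt.Errorf("no semantic info for %q", c.Callee)
	}
	return returns, nil
}

// Confirm keeps only the candidates that the semantic check verifies.
// Candidates whose lookup fails are dropped here; a real server might
// prefer to surface them separately instead.
func Confirm(cands []Candidate, checker SemanticChecker) []Candidate {
	var confirmed []Candidate
	for _, c := range cands {
		ok, err := checker.ReturnsError(c)
		if err != nil || !ok {
			continue
		}
		confirmed = append(confirmed, c)
	}
	return confirmed
}

func main() {
	cands := []Candidate{
		{File: "main.go", Line: 10, Column: 2, Callee: "os.Remove"},
		{File: "main.go", Line: 11, Column: 2, Callee: "wg.Wait"},
	}
	checker := fakeChecker{"os.Remove": true, "wg.Wait": false}
	for _, c := range Confirm(cands, checker) {
		fmt.Printf("%s:%d unchecked error from %s\n", c.File, c.Line, c.Callee)
	}
}
```

Keeping the checker behind an interface means the filter can be unit-tested without a running language server.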
+
+---
+
+## 3. Phase 1: Go Language Support via `gopls`
+
+### 3.1. Architecture
+
+The MCP server will manage a long-running `gopls` process. It will communicate with `gopls` using the Language Server Protocol (LSP) over stdio.
+
+### 3.2. New MCP Tools for Semantic Queries
+
+We will introduce new tools to the MCP server that abstract the LSP communication.
+
+#### Tool 1: `get_definition`
+-   **Purpose:** Finds the definition of a symbol at a given position.
+-   **Arguments:**
+    -   `file_path`: `string`
+    -   `line`: `int`
+    -   `column`: `int`
+-   **Process:**
+    1.  Receives the request.
+    2.  Formats and sends a `textDocument/definition` request to the `gopls` process.
+    3.  Parses the `gopls` response, which contains the location (file and position) of the definition.
+    4.  Reads the source code at the definition's location to extract the full function signature or type definition.
+-   **Returns:** A JSON object with the definition's `file_path`, `start_line`, `end_line`, and the full `signature` text.
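
For reference, step 2 of the process above comes down to writing a JSON-RPC 2.0 message, preceded by a `Content-Length` header, to the `gopls` stdin pipe; this is how LSP frames messages. The sketch below assumes the tool's `line` and `column` arguments are 1-based (the plan does not say) and converts them to the zero-based positions LSP expects. All type and function names are illustrative, and the `file://` URI construction only handles absolute Unix-style paths.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// Position holds zero-based offsets; LSP counts characters in UTF-16 code
// units by default.
type Position struct {
	Line      int `json:"line"`
	Character int `json:"character"`
}

// DefinitionParams mirrors the params of a textDocument/definition request.
type DefinitionParams struct {
	TextDocument struct {
		URI string `json:"uri"`
	} `json:"textDocument"`
	Position Position `json:"position"`
}

// DefinitionRequest is a JSON-RPC 2.0 envelope specialized for this sketch.
type DefinitionRequest struct {
	JSONRPC string           `json:"jsonrpc"`
	ID      int              `json:"id"`
	Method  string           `json:"method"`
	Params  DefinitionParams `json:"params"`
}

// NewDefinitionRequest assumes 1-based line and column inputs (an assumption
// of this sketch) and an absolute Unix-style file path.
func NewDefinitionRequest(id int, filePath string, line, column int) DefinitionRequest {
	req := DefinitionRequest{JSONRPC: "2.0", ID: id, Method: "textDocument/definition"}
	req.Params.TextDocument.URI = "file://" + filePath
	req.Params.Position = Position{Line: line - 1, Character: column - 1}
	return req
}

// WriteMessage frames msg with the Content-Length header used by LSP.
func WriteMessage(w io.Writer, msg interface{}) error {
	body, err := json.Marshal(msg)
	if err != nil {
		return err
	}
	_, err = fmt.Fprintf(w, "Content-Length: %d\r\n\r\n%s", len(body), body)
	return err
}

// ReadMessage reads the headers, then exactly Content-Length bytes of body.
func ReadMessage(r *bufio.Reader) ([]byte, error) {
	length := -1
	for {
		line, err := r.ReadString('\n')
		if err != nil {
			return nil, err
		}
		line = strings.TrimRight(line, "\r\n")
		if line == "" {
			break // a blank line ends the header section
		}
		name, value, found := strings.Cut(line, ":")
		if found && strings.EqualFold(name, "Content-Length") {
			length, err = strconv.Atoi(strings.TrimSpace(value))
			if err != nil {
				return nil, err
			}
		}
	}
	if length < 0 {
		return nil, fmt.Errorf("missing Content-Length header")
	}
	body := make([]byte, length)
	_, err := io.ReadFull(r, body)
	return body, err
}

// roundTrip writes a request to a buffer and reads it back; the buffer
// stands in for the stdin/stdout pipes of a real gopls process.
func roundTrip(req DefinitionRequest) (DefinitionRequest, error) {
	var buf bytes.Buffer
	var got DefinitionRequest
	if err := WriteMessage(&buf, req); err != nil {
		return got, err
	}
	body, err := ReadMessage(bufio.NewReader(&buf))
	if err != nil {
		return got, err
	}
	err = json.Unmarshal(body, &got)
	return got, err
}

func main() {
	got, err := roundTrip(NewDefinitionRequest(1, "/tmp/main.go", 10, 5))
	if err != nil {
		panic(err)
	}
	fmt.Println(got.Method, got.Params.Position.Line, got.Params.Position.Character)
}
```

A real client must also complete the LSP `initialize` / `initialized` handshake with the server before it sends this request.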
+
+#### Tool 2: `get_type_info`
+-   **Purpose:** Gets type information for the symbol at a given position.
+-   **Arguments:**
+    -   `file_path`: `string`
+    -   `line`: `int`
+    -   `column`: `int`
+-   **Process:**
+    1.  Receives the request.
+    2.  Formats and sends a `textDocument/hover` request to `gopls`.
+    3.  Parses the `gopls` response to extract the type information string.
+-   **Returns:** A JSON object containing the `type` as a string.
+
+### 3.3. Enhanced `scan_code` Tool
+
+The `scan_code` tool will be upgraded to support a new type of rule.
+
+**New Rule Property: `semantic_check`**
+
+Rules in `sgconfig.yml` can have an optional `semantic_check` property.
+
+```yaml
+id: go-unchecked-error-semantic
+language: go
+rule:
+  # 1. Syntactic pre-filter: find all standalone function calls
+  pattern: $A($$$)
+  inside:
+    kind: expression_statement
+message: "The error returned by '$A' is not checked."
+severity: "error"
+# 2. Semantic verification step
+semantic_check:
+  # The MCP server will execute this check
+  - type: 'function_returns_error'
+    # It will pass the matched node's info to the check
+    input: '$A'
+```
+
+**New `scan_code` Workflow:**
+1.  Run `ast-grep` as usual.
+2.  For each finding from a rule that has a `semantic_check`:
+    1.  Extract the AST node specified by `input` (e.g., `$A`).
+    2.  Get its position (file, line, column).
+    3.  Call the appropriate new MCP tool (`get_definition` or `get_type_info`).
+    4.  Analyze the returned signature/type to see if it matches the check (e.g., does it include `error` as a return type?).
+    5.  If the semantic check passes, the finding is confirmed and reported. If it fails, the finding is discarded as a false positive.
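
The analysis step above (does the signature include `error` as a return type?) could parse the returned signature rather than match on its text. A minimal sketch, assuming the `signature` string is a body-less Go declaration such as `func Open(name string) (*File, error)`. `ReturnsError` is a hypothetical helper, and it only recognizes the predeclared `error` identifier, not other types that implement the `error` interface.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// ReturnsError reports whether a Go function signature lists error among
// its results. Named results such as (n int, err error) are handled too.
func ReturnsError(signature string) (bool, error) {
	// A function declaration without a body is syntactically valid Go, so
	// the signature can be parsed behind a throwaway package clause.
	src := "package p\n" + signature
	file, err := parser.ParseFile(token.NewFileSet(), "", src, 0)
	if err != nil {
		return false, fmt.Errorf("parse signature: %w", err)
	}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Type.Results == nil {
			continue
		}
		for _, field := range fn.Type.Results.List {
			if ident, ok := field.Type.(*ast.Ident); ok && ident.Name == "error" {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	for _, sig := range []string{
		"func Remove(name string) error",
		"func (wg *WaitGroup) Wait()",
	} {
		ok, err := ReturnsError(sig)
		fmt.Println(sig, ok, err)
	}
}
```

Parsing avoids false matches on parameter types or identifiers that merely contain the word `error`, which a substring search would hit.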
+
+---
+
+## 4. Phase 2: Framework for Multi-Language Support
+
+### 4.1. Language Server Management
+
+The MCP server will be updated to manage a pool of language server processes.
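+-   A configuration file (`language_servers.json`) will map language IDs (e.g., "go", "typescript", "python") to the command needed to start their respective language servers (e.g., `gopls`, `typescript-language-server`, `pyright-langserver`). Each server must speak LSP over stdio; `tsserver` uses its own protocol and Pylance is distributed as a VS Code extension, so LSP-speaking equivalents are listed here.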
+-   The server will start and manage these processes on demand.
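
One possible shape for `language_servers.json` and its loader is sketched below. The plan does not define the schema, so the `command` and `args` fields, the `ServerSpec` and `Registry` types, and the sample entries are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

// ServerSpec is a hypothetical entry in language_servers.json.
type ServerSpec struct {
	Command string   `json:"command"`
	Args    []string `json:"args,omitempty"`
}

// Cmd builds, but does not start, the process for this language server.
func (s ServerSpec) Cmd() *exec.Cmd {
	return exec.Command(s.Command, s.Args...)
}

// Registry maps a language ID to the command that starts its server.
type Registry map[string]ServerSpec

// ParseRegistry decodes the configuration and rejects entries that have
// no command, so misconfiguration fails at load time rather than on demand.
func ParseRegistry(data []byte) (Registry, error) {
	var reg Registry
	if err := json.Unmarshal(data, &reg); err != nil {
		return nil, err
	}
	for lang, spec := range reg {
		if spec.Command == "" {
			return nil, fmt.Errorf("language %q has no command", lang)
		}
	}
	return reg, nil
}

// sample is an illustrative configuration, not one taken from the project.
const sample = `{
  "go": {"command": "gopls"},
  "typescript": {"command": "typescript-language-server", "args": ["--stdio"]}
}`

func main() {
	reg, err := ParseRegistry([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, lang := range []string{"go", "typescript"} {
		fmt.Println(lang, reg[lang].Command, reg[lang].Args)
	}
}
```

Starting servers on demand would then amount to calling `Cmd()` for the first request in a given language and caching the running process.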
+
+### 4.2. Generic Semantic Checks
+
+The `semantic_check` types will be kept as generic as possible to be reusable across languages.
+-   `function_returns_type`: Checks if a function returns a specific type name.
+-   `variable_is_type`: Checks if a variable is of a certain type.
+-   `is_deprecated`: Checks for deprecation annotations.
+
+With generic checks, adding support for a new language primarily involves adding its language server to the configuration and ensuring its output can be parsed.
+
+---
+
+## 5. Implementation Steps
+
+1.  **[Go]** Implement the `gopls` process management within the MCP server.
+2.  **[Go]** Implement the `get_definition` tool by creating an LSP client that can send `textDocument/definition` requests.
+3.  **[Go]** Update the `scan_code` handler to perform the hybrid analysis workflow described in section 3.3.
+4.  **[Go]** Test the new `go-unchecked-error-semantic` rule.
+5.  **[Framework]** Refactor the language server management to support multiple languages via a configuration file.
+6.  **[Framework]** Generalize the semantic check logic.
+7.  **[TypeScript]** Add `typescript-language-server` (an LSP wrapper around `tsserver`) to the configuration and implement a semantic rule for TypeScript as a proof of concept.
diff --git a/AGENT_DOCS/PRD/PRD.md b/AGENT_DOCS/PRD/PRD.md
@@ -0,0 +1,136 @@
+# PRD & Technical Plan: context-sherpa - AI-Powered Code Analysis Server
+
+**Version: 1.2**
+**Date: 2025-10-01**
+**Author: Gemini**
+
+## 1. Overview
+
+This document outlines the requirements for an MCP (Model Context Protocol) server, written in Go, that provides an AI coding agent with tools to interact with `ast-grep`. The primary objective is to create a system where an AI agent can not only lint and validate code using a predefined set of rules but also dynamically create, update, and remove those rules based on natural language feedback from a developer.
+
+The final product will be a single, portable, cross-platform binary with no external runtime dependencies, making setup trivial for the end-user.
+
+## 2. Core Objective & User Story
+
+As a developer using an AI coding agent, I want to:
+
+-   Have my agent automatically validate the code it generates against my project's specific coding patterns.
+-   Be able to provide natural language feedback (e.g., "From now on, all async functions must have a try/catch block") to my agent.
+-   Have the agent intelligently convert my feedback into a permanent, machine-readable linting rule using `ast-grep`.
+-   Be able to easily remove rules that are no longer needed.
+-   Ensure this system is self-contained in a single executable that I can easily run without managing servers, dependencies, or security risks.
+
+## 3. Key Features & Tool Definitions for the AI Agent
+
+The MCP server will expose the following **four** tools to the AI agent. The agent will use the tool descriptions to decide which tool to call based on the user's request.
+
+### Tool 1: `initialize_ast_grep`
+
+-   **Description**: "Initializes an ast-grep project if one is not already present. It creates the `sgconfig.yml` file and a `rules` directory. This tool should be suggested if another tool fails due to a missing configuration file."
+-   **Input Schema**: (None)
+-   **Output Schema**:
+    -   `success` (boolean): `true` if the project was initialized successfully.
+    -   `message` (string): A confirmation message (e.g., "ast-grep project initialized successfully. Created sgconfig.yml and rules/ directory.").
+
+### Tool 2: `scan_code`
+
+-   **Description**: "Scans a given code snippet using the project's central `ast-grep` ruleset (`sgconfig.yml`). Use this to validate code and check for rule violations, for example before committing changes."
+-   **Input Schema**:
+    -   `code` (string, required): The raw source code to scan.
+    -   `language` (string, required): The programming language of the code (e.g., `javascript`, `python`, `go`).
+-   **Output Schema**:
+    -   `success` (boolean): `true` if no issues were found, `false` otherwise.
+    -   `issues` (array of objects): A list of violations found. Each object contains:
+        -   `ruleId` (string): The ID of the rule that was violated.
+        -   `message` (string): The error message for the violation.
+        -   `line` (integer): The line number where the violation occurred.
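
The output schema above maps directly onto Go structs with JSON tags. A sketch, with `Issue`, `ScanResult`, and `NewScanResult` as hypothetical names; deriving `success` from the issue count keeps the two fields from disagreeing.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Issue is one rule violation, using the field names from the schema.
type Issue struct {
	RuleID  string `json:"ruleId"`
	Message string `json:"message"`
	Line    int    `json:"line"`
}

// ScanResult is the scan_code response body.
type ScanResult struct {
	Success bool    `json:"success"`
	Issues  []Issue `json:"issues"`
}

// NewScanResult sets Success from the issues and replaces a nil slice with
// an empty one so the JSON shows "issues": [] rather than null.
func NewScanResult(issues []Issue) ScanResult {
	if issues == nil {
		issues = []Issue{}
	}
	return ScanResult{Success: len(issues) == 0, Issues: issues}
}

// JSON renders the result as a compact JSON string.
func (r ScanResult) JSON() (string, error) {
	data, err := json.Marshal(r)
	return string(data), err
}

func main() {
	res := NewScanResult([]Issue{
		{RuleID: "no-eval", Message: "Do not use eval().", Line: 12},
	})
	out, err := res.JSON()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```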
+
+### Tool 3: `add_or_update_rule`
+
+-   **Description**: "Adds a new rule or updates an existing rule in the project's central `sgconfig.yml` file. Use this after a rule has been generated and confirmed by the user."
+-   **Input Schema**:
+    -   `rule_id` (string, required): A unique identifier for the rule (e.g., `no-console-log`).
+    -   `rule_yaml` (string, required): The complete YAML definition for the rule.
+-   **Output Schema**:
+    -   `success` (boolean): `true` if the file was written successfully.
+    -   `message` (string): A confirmation message (e.g., "Rule 'no-console-log' was added successfully.").
+
+### Tool 4: `remove_rule`
+
+-   **Description**: "Removes a rule from the project's central `sgconfig.yml` file by its unique ID. Use this when a coding standard is no longer desired."
+-   **Input Schema**:
+    -   `rule_id` (string, required): The unique identifier of the rule to remove.
+-   **Output Schema**:
+    -   `success` (boolean): `true` if the rule was found and removed successfully.
+    -   `message` (string): A confirmation message (e.g., "Rule 'no-console-log' was removed successfully.").
+
+## 4. Technical Implementation Plan
+
+This plan details the steps to build the server using Go, embedding the `ast-grep` binary directly.
+
+### Step A: Project Setup
+(No changes)
+
+### Step B: Bundle the ast-grep Binary
+(No changes)
+
+### Step C: Implement the MCP Server
+
+-   In `main()`, create a new MCP server instance.
+-   Register a handler for each of the four tools defined above (the `initialize_ast_grep` registration follows the same pattern as the three shown here).
+    -   `server.RegisterTool("scan_code", scanCodeHandler)`
+    -   `server.RegisterTool("add_or_update_rule", addOrUpdateRuleHandler)`
+    -   `server.RegisterTool("remove_rule", removeRuleHandler)`
+-   Start the server to listen for requests from the AI agent.
+
+### Step D: Implement Tool Handler Functions
+
+-   `scanCodeHandler(req mcp.Request) mcp.Response`: (No changes)
+-   `addOrUpdateRuleHandler(req mcp.Request) mcp.Response`: (No changes)
+-   `removeRuleHandler(req mcp.Request) mcp.Response`: (No changes)
+
+## 5. Non-Functional Requirements
+(No changes)
+
+## 6. Testing Strategy & Usage Example
+
+### Testing
+
+-   **Unit Tests**: Each handler function (`scanCodeHandler`, `addOrUpdateRuleHandler`, etc.) will have corresponding unit tests. External dependencies will be mocked.
+-   **Integration Tests**: An end-to-end test script will be created to compile the binary, start it, and simulate an MCP client making a sequence of calls to add, scan, and remove rules, verifying the `sgconfig.yml` content at each stage.
+
+### Example Usage Workflow
+
+This scenario illustrates the updated interaction between a developer, the AI agent, and the MCP server.
+
+1.  **Agent handles a missing configuration file:**
+    -   **Developer**: "Hey agent, please add a rule to disallow `fmt.Println` in our Go code."
+    -   **AI Agent**: Calls the `add_or_update_rule` tool.
+    -   **MCP Server**: Fails because `sgconfig.yml` does not exist and returns an error: "Error: sgconfig.yml not found. Please run the 'initialize_ast_grep' tool first to set up the project."
+    -   **AI Agent**: "It looks like this project hasn't been set up for ast-grep yet. Would you like me to initialize it for you?"
+    -   **Developer**: "Yes, please."
+    -   **AI Agent**: Calls the `initialize_ast_grep` tool.
+    -   **MCP Server**: Creates `sgconfig.yml` and the `rules/` directory.
+    -   **AI Agent**: "The project has been initialized. I will now add the rule."
+    -   **AI Agent**: Calls `add_or_update_rule` again, which now succeeds.
+
+2.  **Developer sets a new rule:**
+    -   **Developer**: "Hey agent, we should never use the `eval()` function in our JavaScript. It's a security risk. Can you create a rule to block it?"
+    -   **AI Agent**: Understands the intent. It uses its internal knowledge to generate a valid `ast-grep` YAML rule.
+    -   **AI Agent**: "I've drafted this rule: [shows YAML to user]. Does this look correct?"
+    -   **Developer**: "Yes, that's perfect."
+    -   **AI Agent**: Calls the `add_or_update_rule` tool with the YAML to save it to `sgconfig.yml`.
+
+3.  **Agent self-corrects its own code:**
+    -   **Developer**: "Now, please refactor this old utility file to be more modern."
+    -   **AI Agent**: Generates a refactored version of the code, mistakenly using `eval()`.
+    -   **Agent (Internal Step)**: Before showing the code, it calls the `scan_code` tool on its own output.
+    -   **MCP Server**: Runs `ast-grep scan`, finds the `eval()` call, and returns a violation.
+    -   **AI Agent**: Sees the violation, corrects its code to use a safer alternative, and re-scans to confirm compliance.
+    -   **AI Agent**: "Here is the refactored code. I've also updated a part that was using `eval()` to be more secure."
+
+4.  **Developer removes a rule:**
+    -   **Developer**: "You know what, the `no-console-log` rule is too annoying for our debug builds. Please remove it."
+    -   **AI Agent**: Calls the `remove_rule` tool with `rule_id: "no-console-log"`.
+    -   **MCP Server**: Removes the rule from `sgconfig.yml`.
+    -   **AI Agent**: "Done. The `no-console-log` rule has been removed."