|
3 | 3 | [](LICENSE) |
4 | 4 | []() |
5 | 5 |
|
6 | | -**LSAP** (Language Server Agent Protocol) is a semantic abstraction layer that transforms **Language Server Protocol (LSP)** into an agent-native cognitive framework. |
| 6 | +LSAP is an open protocol that defines how AI coding agents interact with Language Servers. Each LSAP capability is designed to be exposed as an **agent tool**: the agent invokes it through function calling and receives Markdown output that is ready for reasoning. |
7 | 7 |
|
8 | | -While traditional LSP was optimized for human-centric, incremental UI updates, LSAP is engineered for the **Progressive Disclosure** of codebase intelligence to LLM Agents. It provides the structured, high-fidelity context necessary for agents to reason about, navigate, and modify complex software systems autonomously. |
9 | | - |
10 | | ---- |
11 | | - |
12 | | -## 🧠 The Core Philosophy: Agent-Native Progressive Disclosure |
13 | | - |
14 | | -The fundamental challenge for Coding Agents is not the lack of information, but the **noise-to-signal ratio**. Standard LSP is often too granular, leading to fragmented context and reasoning failures. LSAP solves this by: |
| 8 | +``` |
| 9 | +┌──────────────┐ function call ┌──────────────┐ LSP ┌──────────────┐ |
| 10 | +│ AI Agent │ ──────────────────► │ LSAP Tool │ ─────────────► │ Language │ |
| 11 | +│ │ ◄────────────────── │ │ ◄───────────── │ Server │ |
| 12 | +└──────────────┘ markdown output └──────────────┘ └──────────────┘ |
| 13 | +``` |
15 | 14 |
|
16 | | -- **Strategic Disclosure**: Dynamically revealing code structure and semantics based on the agent's current task state, ensuring it has _exactly_ what it needs to reason, and nothing more. |
17 | | -- **Semantic Aggregation**: Collapsing multiple low-level LSP round-trips into high-density "Cognitive Snapshots" (e.g., merging definition, signature help, and implementation into a single atomic context). |
18 | | -- **Markdown-First Reasoning**: Serving information in structured Markdown templates that leverage the LLM's pre-trained ability to parse documentation, allowing the agent to "read" the codebase rather than just processing tokens. |
19 | | -- **Contextual Anchoring**: Providing robust "Locating" mechanisms that allow agents to resolve ambiguous intent into precise architectural coordinates. |
| 15 | +This repository contains the protocol specification and a Python reference implementation. |
20 | 16 |
|
21 | 17 | --- |
22 | 18 |
|
23 | | -## 🔄 Cognitive Flow: Strategic Aggregation |
24 | | - |
25 | | -LSAP acts as a sophisticated orchestrator, converting high-level agent intents into coordinated language server operations: |
26 | | - |
27 | | -```mermaid |
28 | | -sequenceDiagram |
29 | | - participant Agent as LLM Coding Agent |
30 | | - participant LSAP as LSAP SymbolCapability |
31 | | - participant Locate as LocateCapability |
32 | | - participant LSP as Language Server (LSP) |
| 19 | +## How It Works |
33 | 20 |
|
34 | | - Note over Agent, LSP: Task: "Understand this method's implementation" |
| 21 | +LSAP capabilities are exposed as tools that agents can call. For example, the `Symbol` capability: |
35 | 22 |
|
36 | | - Agent->>LSAP: SymbolRequest(locate={symbol_path: ["process_data"]}) |
| 23 | +**Tool Definition** (JSON Schema): |
37 | 24 |
|
38 | | - activate LSAP |
39 | | - LSAP->>Locate: LocateRequest |
40 | | -
|
41 | | - activate Locate |
42 | | - Locate->>LSP: textDocument/documentSymbol |
43 | | - LSP-->>Locate: DocumentSymbol[] |
| 25 | +```json |
| 26 | +{ |
| 27 | + "name": "get_symbol", |
| 28 | + "description": "Get the source code and documentation of a symbol", |
| 29 | + "parameters": { |
| 30 | + "type": "object", |
| 31 | + "properties": { |
| 32 | + "file_path": { "type": "string" }, |
| 33 | + "symbol_path": { "type": "array", "items": { "type": "string" } } |
| 34 | + } |
| 35 | + } |
| 36 | +} |
| 37 | +``` |
44 | 38 |
|
45 | | - Locate-->>LSAP: file_path, position |
46 | | - deactivate Locate |
| 39 | +**Agent calls the tool**: |
47 | 40 |
|
48 | | - par Parallel Deep Inspection |
49 | | - LSAP->>LSP: textDocument/hover |
50 | | - LSAP->>LSP: textDocument/documentSymbol |
51 | | - LSAP->>LSP: read file content |
52 | | - end |
| 41 | +```json |
| 42 | +{ |
| 43 | + "name": "get_symbol", |
| 44 | + "arguments": { |
| 45 | + "file_path": "src/auth.py", |
| 46 | + "symbol_path": ["UserService", "authenticate"] |
| 47 | + } |
| 48 | +} |
| 49 | +``` |
53 | 50 |
|
54 | | - LSP-->>LSAP: Hover documentation |
55 | | - LSP-->>LSAP: DocumentSymbol[] |
56 | | - LSP-->>LSAP: Source code content |
| 51 | +**Tool returns Markdown** (directly usable by the agent): |
57 | 52 |
|
58 | | - LSAP->>LSAP: Find symbol from DocumentSymbol |
59 | | - LSAP->>LSAP: Extract code snippet using DocumentReader |
60 | | - LSAP->>LSAP: Aggregate into Markdown |
| 53 | +````markdown |
| 54 | +# Symbol: `UserService.authenticate` (`Method`) at `src/auth.py` |
61 | 55 |
|
62 | | - LSAP-->>Agent: SymbolResponse (Markdown) |
63 | | - deactivate LSAP |
| 56 | +## Implementation |
64 | 57 |
|
65 | | - Note over Agent: Agent receives structured markdown<br/>with documentation + source code |
| 58 | +```python |
| 59 | +def authenticate(self, username: str, password: str) -> Optional[User]: |
| 60 | + """Verify user credentials and return user if valid.""" |
| 61 | + user = self.db.get_user(username) |
| 62 | + if user and user.check_password(password): |
| 63 | + return user |
| 64 | + return None |
66 | 65 | ``` |
| 66 | +```` |
67 | 67 |
|
68 | | ---- |
69 | | - |
70 | | -## 🛠 Case Studies: Agent-Native Design |
71 | | - |
72 | | -LSAP's superiority over standard LSP for coding agents is best demonstrated through its "intent-to-action" mapping: |
| 68 | +
73 | 69 |
|
74 | | -### 1. 📍 Locate: The "Universal Link" for Cognitive Anchoring |
| 70 | +The agent receives structured, readable context without needing to parse JSON or understand LSP internals. |
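As an illustration of the Markdown shape shown in the added README text, here is a minimal sketch of a renderer. The function `render_symbol_markdown` and its parameters are hypothetical and are not part of the `lsap` package; the heading layout simply copies the example output in the diff.

```python
def render_symbol_markdown(
    symbol_path: list[str],
    kind: str,
    file_path: str,
    code: str,
    language: str = "python",
) -> str:
    """Render a symbol in the heading/section layout used by the README example.

    Hypothetical helper: only the output layout comes from the README.
    """
    fence = "`" * 3  # built at runtime so this example stays one clean code block
    title = ".".join(symbol_path)
    return (
        f"# Symbol: `{title}` (`{kind}`) at `{file_path}`\n"
        "\n"
        "## Implementation\n"
        "\n"
        f"{fence}{language}\n"
        f"{code}\n"
        f"{fence}\n"
    )


rendered = render_symbol_markdown(
    ["UserService", "authenticate"],
    "Method",
    "src/auth.py",
    "def authenticate(self, username: str, password: str) -> Optional[User]: ...",
)
print(rendered)
```

The point of the sketch is that the agent-facing payload is plain text with predictable headings, so no JSON decoding step sits between the tool result and the model's context.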
75 | 71 |
|
76 | | -In standard LSP, every request (hover, definition, references) requires a precise `(line, character)` coordinate. However, an LLM agent's "mental model" of the code is often based on **textual evidence** or **symbolic paths**. |
| 72 | +--- |
77 | 73 |
|
78 | | -- **The LSP Way**: The agent must first read the entire file, use its own reasoning to find the line/column of a snippet, and then send a request. This is high-latency, token-expensive, and fragile (a single space change breaks the coordinate). |
79 | | -- **The LSAP Way**: LSAP introduces a **Unified Locating Layer**. Any request can be anchored using: |
80 | | - - **`LocateText`**: Find a position by searching for a code snippet within a file or range. |
81 | | - - **`LocateSymbol`**: Resolve a hierarchical path (e.g., `["User", "Profile", "save"]`) to its exact implementation. |
82 | | - - **Heuristic Resolution**: LSAP uses fuzzy matching and AST context to ensure that if an agent says _"find the `logger` call near the end of the `try` block"_, it resolves to the correct node regardless of formatting changes. |
| 74 | +## Why Not Raw LSP? |
83 | 75 |
|
84 | | -This makes `Locate` the universal entry point—the agent no longer needs to worry about "where" things are in terms of raw coordinates, focusing instead on "what" it wants to inspect. |
| 76 | +Raw LSP requires `line:column` coordinates and returns fragmented JSON: |
85 | 77 |
|
86 | | -### 2. 📞 Call Hierarchy: From Stateful Items to Relational Graphs |
| 78 | +```python |
| 79 | +# Agent would need to: read file → find line number → call LSP → parse response → format output |
| 80 | +# This is error-prone and wastes tokens on coordinate calculation |
| 81 | +```` |
87 | 82 |
|
88 | | -LSP's call hierarchy is a stateful, multi-step process: `prepare` -> `incoming` (for each item). Managing these handles across a long-running agent session is complex. |
| 83 | +LSAP lets agents reference code by **symbol names** and get **complete, formatted context** in one call. |
89 | 84 |
|
90 | | -- **The LSP Way**: The agent must manage `CallHierarchyItem` objects and make sequential calls to expand the tree, often losing context or getting stuck in state management. |
91 | | -- **The LSAP Way**: The agent makes a single `CallHierarchyRequest` specifying a `depth` (e.g., `depth=2`). LSAP recursively traverses the hierarchy and returns a **flattened relational graph** as a single Markdown snapshot. The agent immediately sees the broader architectural impact of a change without needing to manually "click through" nodes. |
| 85 | +| LSP | LSAP | |
| 86 | +| :------------------------------- | :----------------------------------- | |
| 87 | +| `Position(line=42, character=8)` | `symbol_path: ["MyClass", "method"]` | |
| 88 | +| Multiple round-trips | Single request | |
| 89 | +| Raw JSON for IDEs | Markdown for LLMs | |
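To make the first table row concrete, the sketch below shows one way a `symbol_path` could be turned into an LSP-style position by walking a document outline. `OutlineNode` and `resolve_symbol_path` are hypothetical names used for illustration; the diff does not show how the reference implementation resolves paths.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OutlineNode:
    """Hypothetical stand-in for one entry of a document-symbol outline."""
    name: str
    line: int        # zero-based start line, as in LSP positions
    character: int   # zero-based start column
    children: list["OutlineNode"] = field(default_factory=list)


def resolve_symbol_path(
    roots: list[OutlineNode], symbol_path: list[str]
) -> Optional[OutlineNode]:
    """Walk the outline one path segment at a time.

    Returns None if any segment is missing or the path is empty.
    """
    candidates = roots
    node: Optional[OutlineNode] = None
    for segment in symbol_path:
        node = next((c for c in candidates if c.name == segment), None)
        if node is None:
            return None
        candidates = node.children
    return node


outline = [
    OutlineNode("MyClass", 10, 0, [OutlineNode("method", 42, 8)]),
    OutlineNode("helper", 60, 0),
]

found = resolve_symbol_path(outline, ["MyClass", "method"])
print(found.line, found.character)  # 42 8
```

In this sketch the agent supplies only names; the coordinate pair from the LSP column is derived on the tool side.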
92 | 90 |
|
93 | 91 | --- |
94 | 92 |
|
95 | | -## 🛠 Core Capabilities |
| 93 | +## Capabilities (Agent Tools) |
96 | 94 |
|
97 | | -The LSAP specification categorizes capabilities into functional layers, facilitating progressive disclosure of codebase intelligence: |
| 95 | +Each capability is a tool the agent can call: |
98 | 96 |
|
99 | | -### 🌐 Discovery & Resolution |
| 97 | +### Stable |
100 | 98 |
|
101 | | -| Capability | Description | |
102 | | -| :---------------------- | :------------------------------------------------------------------------- | |
103 | | -| 🌐 **Workspace Search** | Global, paginated search for symbols across the entire project. | |
104 | | -| 📍 **Locate** | Resolve ambiguous text snippets or symbol paths to exact file coordinates. | |
| 99 | +| Tool | What the agent gets | Spec | |
| 100 | +| :--------------------- | :-------------------------------------------- | :------------------------------------- | |
| 101 | +| **get_symbol** | Source code, signature, docstring of a symbol | [docs](docs/schemas/symbol.md) | |
| 102 | +| **get_symbol_outline** | List of all symbols in a file | [docs](docs/schemas/symbol_outline.md) | |
| 103 | +| **get_references** | All locations where a symbol is used | [docs](docs/schemas/reference.md) | |
| 104 | +| **get_hover** | Documentation/type info at a position | [docs](docs/schemas/hover.md) | |
| 105 | +| **get_definition** | Where a symbol is defined | [docs](docs/schemas/definition.md) | |
| 106 | +| **search_workspace** | Find symbols by name across the project | [docs](docs/schemas/workspace.md) | |
105 | 107 |
|
106 | | -### 🔍 Deep Inspection |
| 108 | +### Experimental |
107 | 109 |
|
108 | | -| Capability | Description | |
109 | | -| :-------------------- | :-------------------------------------------------------------------------------- | |
110 | | -| 🔍 **Symbol Info** | High-density retrieval of documentation, signatures, and source code for symbols. | |
111 | | -| 🗂 **Symbol Outline** | Generate a hierarchical map (AST-lite) of all symbols within a file. | |
112 | | -| 💬 **Hover** | Quick access to documentation and type information at a specific location. | |
113 | | -| 💡 **Inlay Hints** | Augment source code with static types and runtime values for enhanced reasoning. | |
| 110 | +| Tool | Status | Spec | |
| 111 | +| :--------------------- | :----- | :------------------------------------------- | |
| 112 | +| **get_call_hierarchy** | Beta | [docs](docs/schemas/draft/call_hierarchy.md) | |
| 113 | +| **get_type_hierarchy** | Beta | [docs](docs/schemas/draft/type_hierarchy.md) | |
| 114 | +| **get_diagnostics** | Alpha | [docs](docs/schemas/draft/diagnostics.md) | |
| 115 | +| **rename_symbol** | Alpha | [docs](docs/schemas/draft/rename.md) | |
| 116 | +| **get_inlay_hints** | Alpha | [docs](docs/schemas/draft/inlay_hints.md) | |
| 117 | +| **get_completions** | Alpha | [docs](docs/schemas/completion.md) | |
114 | 118 |
|
115 | | -### 🔗 Relational Mapping |
| 119 | +Full spec: [docs/schemas/README.md](docs/schemas/README.md) |
116 | 120 |
|
117 | | -| Capability | Description | |
118 | | -| :-------------------- | :------------------------------------------------------------------- | |
119 | | -| 🔗 **References** | Trace all usages and call sites of a symbol project-wide. | |
120 | | -| 🏗 **Implementation** | Discover concrete implementations of interfaces or abstract methods. | |
121 | | -| 📞 **Call Hierarchy** | Map incoming and outgoing function call relationships. | |
122 | | -| 🌳 **Type Hierarchy** | Explore complex inheritance and class relationship trees. | |
| 121 | +--- |
123 | 122 |
|
124 | | -### 🩺 Environmental Awareness |
| 123 | +## Locate: How Agents Reference Code |
125 | 124 |
|
126 | | -| Capability | Description | |
127 | | -| :----------------- | :------------------------------------------------------------------------ | |
128 | | -| 🩺 **Diagnostics** | Real-time access to linting issues, syntax errors, and suggested fixes. | |
129 | | -| 📝 **Rename** | Predict and execute safe symbol renaming with project-wide diff analysis. | |
| 125 | +LSAP's `Locate` abstraction lets agents reference code without coordinates: |
130 | 126 |
|
131 | | ---- |
| 127 | +```jsonc
| 128 | +// By symbol path - "get the authenticate method in UserService" |
| 129 | +{"symbol_path": ["UserService", "authenticate"]} |
132 | 130 |
|
133 | | -## 🚀 Quick Start |
| 131 | +// By text pattern - "find where we call self.db" |
| 132 | +{"text": "self.db.<HERE>"} |
134 | 133 |
|
135 | | -LSAP provides a high-level API for agents to interact with codebases. |
| 134 | +// By scope - "lines 10-20 inside the main function" |
| 135 | +{"scope": {"symbol_path": ["main"]}, "line": [10, 20]} |
| 136 | +``` |
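The text-pattern form above can be sketched as follows. This assumes `<HERE>` marks the exact position inside the matched snippet, which the diff implies but does not spell out; `locate_text` is a hypothetical helper, not the `lsap` API.

```python
from typing import Optional

MARKER = "<HERE>"  # marker string taken from the README example


def locate_text(source: str, pattern: str) -> Optional[tuple[int, int]]:
    """Return the zero-based (line, character) the pattern points at, or None.

    Assumptions: the first match wins; with a marker, the position is where the
    marker sits inside the match; without one, it is the start of the match.
    Columns are counted in Python string indices, not UTF-16 code units.
    """
    before, marker, after = pattern.partition(MARKER)
    start = source.find(before + after)
    if start == -1:
        return None
    offset = start + len(before) if marker else start
    line = source.count("\n", 0, offset)
    line_start = source.rfind("\n", 0, offset) + 1
    return line, offset - line_start


source = (
    "class UserService:\n"
    "    def authenticate(self, username, password):\n"
    "        user = self.db.get_user(username)\n"
    "        return user\n"
)

print(locate_text(source, "self.db.<HERE>"))  # (2, 23)
```

A real implementation would also need a policy for multiple matches (for example, restricting the search to a scope), which this sketch leaves out.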
136 | 137 |
|
137 | | -### Python |
| 138 | +--- |
138 | 139 |
|
139 | | -```python |
140 | | -from lsap.symbol import SymbolCapability |
141 | | -from lsap_schema import SymbolRequest |
142 | | -from lsp_client.clients.pyright import PyrightClient |
| 140 | +## Example: Agent Workflow |
143 | 141 |
|
144 | | -async with PyrightClient() as lsp_client: |
145 | | - # Initialize the LSAP capability |
146 | | - symbol_info = SymbolCapability(client=lsp_client) |
| 142 | +A coding agent reviewing a function might: |
147 | 143 |
|
148 | | - # Request high-density information about a symbol |
149 | | - response = await symbol_info(SymbolRequest( |
150 | | - locate={"file_path": "src/main.py", "symbol_path": ["my_function"]} |
151 | | - )) |
| 144 | +1. **Call `get_symbol`** to get the function's implementation |
| 145 | +2. **Call `get_references`** to see how it's used |
| 146 | +3. **Reason** over the Markdown output to identify issues |
152 | 147 |
|
153 | | - if response: |
154 | | - # LSAP responses include pre-rendered markdown for LLM consumption |
155 | | - print(response.markdown) |
156 | 148 | ``` |
| 149 | +Agent: I need to review the handle_request function. |
157 | 150 |
|
158 | | -## 📦 SDKs & Framework Integration |
| 151 | +→ Tool call: get_symbol(file_path="api.py", symbol_path=["handle_request"]) |
| 152 | +← Returns: markdown with source code |
159 | 153 |
|
160 | | -LSAP provides first-class SDKs for both Python and TypeScript, making it effortless to integrate into modern AI Agent frameworks (such as LangChain, AutoGPT, CrewAI, or custom solutions). |
| 154 | +→ Tool call: get_references(file_path="api.py", symbol_path=["handle_request"]) |
| 155 | +← Returns: markdown with all call sites |
161 | 156 |
|
162 | | -- **Python SDK**: High-performance, async-native implementation. Ideal for server-side agents and research environments. |
163 | | -- **TypeScript SDK**: Zod-based schema validation and type-safe utilities. Perfect for browser-based IDEs or Node.js agent runtimes. |
| 157 | +Agent: Based on the implementation and usage, I found a potential SQL injection... |
| 158 | +``` |
164 | 159 |
|
165 | | -These SDKs allow you to treat LSAP capabilities as standard "Tools" within your agent's reasoning loop, providing a consistent interface across different programming languages and LSP servers. |
| 160 | +The agent never deals with line numbers or JSON parsing; it receives context in a format it can reason over directly.
166 | 161 |
|
167 | 162 | --- |
168 | 163 |
|
169 | | -## 🏗 Project Architecture |
170 | | - |
171 | | -LSAP is a cross-language protocol ecosystem: |
| 164 | +## Comparison with Other Approaches |
172 | 165 |
|
173 | | -- **`schema/`**: The source of truth. Formal protocol definitions and data models. |
174 | | -- **`python/`**: Core LSAP Python implementation and its schema. |
175 | | -- **`typescript/`**: Zod-based schema definitions and utilities for TypeScript/Node.js. |
176 | | -- **`web/`**: Minimalist, developer-focused protocol explorer and documentation viewer. |
177 | | -- **`docs/schemas/`**: Detailed specifications for each protocol method and data model. |
| 166 | +| | Claude Code | Serena | Cursor | Aider | LSAP | |
| 167 | +| :----------------- | :---------- | :---------- | :---------- | :---- | :------------ | |
| 168 | +| **Type** | Proprietary | MCP server | IDE feature | CLI | Open protocol | |
| 169 | +| **Position model** | Coordinates | Coordinates | Coordinates | Text | Symbol paths | |
| 170 | +| **Output format** | JSON | Custom | Internal | Text | Markdown | |
| 171 | +| **Cold start** | Low | High | Low | Low | Low | |
| 172 | +| **Type precision** | Yes | Yes | No | No | Yes | |
178 | 173 |
|
179 | | -## 🛠 Protocol Integrity |
| 174 | +LSAP is a protocol specification, not a product. The schema is open and can be implemented for any agent framework. |
180 | 175 |
|
181 | | -LSAP is designed as a single-source-of-truth protocol. The core definitions are maintained in the `schema/` package and automatically propagated to other language implementations: |
| 176 | +--- |
182 | 177 |
|
183 | | -1. **Python**: Core definitions using Pydantic models. |
184 | | -2. **JSON Schema**: Exported from Python models for cross-language compatibility. |
185 | | -3. **TypeScript**: Zod schemas automatically generated from the JSON Schema definitions. |
| 178 | +## Reference Implementation |
186 | 179 |
|
187 | | -Run the codegen pipeline: |
| 180 | +This repo includes a Python implementation you can use directly or as a reference: |
188 | 181 |
|
189 | 182 | ```bash |
190 | | -just codegen |
| 183 | +pip install lsap lsp-client |
191 | 184 | ``` |
192 | 185 |
|
193 | | -## 📖 Protocol Specification |
| 186 | +```python |
| 187 | +from lsap.symbol import SymbolCapability |
| 188 | +from lsap_schema import SymbolRequest |
| 189 | +from lsp_client.clients.pyright import PyrightClient |
| 190 | + |
| 191 | +async def main(): |
| 192 | + async with PyrightClient() as client: |
| 193 | + symbol = SymbolCapability(client) |
194 | 194 |
|
195 | | -For detailed information on each capability, request/response models, and the complete data schema, please refer to our formal documentation: |
| 195 | + response = await symbol(SymbolRequest( |
| 196 | + locate={ |
| 197 | + "file_path": "src/main.py", |
| 198 | + "symbol_path": ["MyClass", "my_method"] |
| 199 | + } |
| 200 | + )) |
196 | 201 |
|
197 | | -- **[Full API Documentation](docs/schemas/README.md)**: A comprehensive guide to all LSAP methods. |
198 | | -- **[JSON Schema Definitions](schema/README.md)**: Formal machine-readable specifications. |
| 202 | + if response: |
| 203 | + print(response.markdown) |
| 204 | +``` |
199 | 205 |
|
200 | | -### Individual Capability Specs: |
| 206 | +TypeScript schemas are also available: |
201 | 207 |
|
202 | | -- [Locate](docs/schemas/locate.md) | [Symbol](docs/schemas/symbol.md) | [Symbol Outline](docs/schemas/symbol_outline.md) |
203 | | -- [Definition](docs/schemas/definition.md) | [Hover](docs/schemas/hover.md) | [Workspace Search](docs/schemas/workspace.md) |
204 | | -- [References](docs/schemas/reference.md) | [Implementation](docs/schemas/implementation.md) |
205 | | -- [Call Hierarchy](docs/schemas/call_hierarchy.md) | [Type Hierarchy](docs/schemas/type_hierarchy.md) |
206 | | -- [Completion](docs/schemas/completion.md) | [Diagnostics](docs/schemas/diagnostics.md) |
207 | | -- [Rename](docs/schemas/rename.md) | [Inlay Hints](docs/schemas/inlay_hints.md) |
| 208 | +```bash |
| 209 | +npm install @lsap/schema |
| 210 | +``` |
208 | 211 |
|
209 | 212 | --- |
210 | 213 |
|
211 | | -## 🚀 Design Principles |
212 | | - |
213 | | -1. **Cognitive Efficiency**: Maximize information density per token. Every byte returned to the agent should contribute to its reasoning process. |
214 | | -2. **Task-Oriented Granularity**: Provide information at the level of abstraction relevant to the agent's current goal (from high-level workspace maps to low-level implementation details). |
215 | | -3. **Deterministic Structure**: Strict schema adherence ensures the agent can rely on a consistent "mental model" of the codebase across different languages and environments. |
216 | | -4. **Agentic Autonomy**: Proactively provide the metadata (like pagination hints or related symbols) that empowers agents to explore the codebase without needing human intervention. |
| 214 | +## Project Structure |
217 | 215 |
|
218 | | -## 📜 License |
| 216 | +``` |
| 217 | +LSAP/ |
| 218 | +├── src/lsap_schema/ # Protocol schema (Pydantic) - source of truth |
| 219 | +├── python/src/lsap/ # Python reference implementation |
| 220 | +├── typescript/ # TypeScript/Zod schemas (generated) |
| 221 | +├── docs/schemas/ # Capability specifications |
| 222 | +└── web/ # Documentation viewer |
| 223 | +``` |
219 | 224 |
|
220 | | -This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. |
| 225 | +Schema generation: `just codegen` (Python → JSON Schema → TypeScript) |
221 | 226 |
|
222 | 227 | --- |
223 | 228 |
|
224 | | -Built for the next generation of AI Software Engineers. |
| 229 | +## License |
| 230 | + |
| 231 | +MIT - see [LICENSE](LICENSE) |