Commit c8b30ac

feat: MCP and Agent server (#49)
* feat: MCP server
* feat: code-analysis agent
1 parent 9e354ce commit c8b30ac

50 files changed: +32793 -154 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,7 @@
 *.dll
 *.so
 *.dylib
+.DS_Store

 # Test binary, built with `go test -c`
 *.test
@@ -75,3 +76,5 @@ src/lang/testdata

 tools
 abcoder
+
+!testdata/asts/*.json

README.md

Lines changed: 87 additions & 18 deletions
@@ -3,20 +3,18 @@
 ![ABCoder](images/ABCoder.png)

 # Overview
-ABCoder, an general AI-oriented code-processing SDK, is designed to enhance coding context for Large-Language-Model (LLM), and boost developing AI-assisted-programming applications.
+ABCoder, a general AI-oriented code-processing **Framework**, is designed to enhance coding context for Large Language Models (LLMs) and to boost the development of AI-assisted-programming applications.


 ## Features

-- Universal Abstract-Syntax-Tree (UniAST), an language-independent, AI-friendly specification of code information, providing a boundless, flexible and structrual coding context for both AI and hunman.
+- Universal Abstract-Syntax-Tree (UniAST), a language-independent, AI-friendly specification of code information, providing a boundless, flexible and structural coding context for both AI and humans.

-- General Parser, parses abitary-language codes to UniAST.
+- General Parser, parses arbitrary-language code to UniAST.

-- General Writer, transforms UniAST back to codes.
-
-- (Comming Soon) General Iterator, a framework for visiting the UniAST and implementing code-batch-processing workflows.
+- General Writer, transforms UniAST back to code.

-- (Comming Soon) Code Retrieval-Augmented-Generation (RAG), provides a set of tools and functions to help the LLM understand your codes much deeper than ever.
+- Code-Retrieval-Augmented-Generation (Code-RAG), provides a set of MCP tools to help the LLM understand your code more precisely.

 Based on these features, developers can easily implement or enhance their AI-assisted-programming applications, such as reviewing, optimizing, translating, etc.

@@ -26,22 +24,94 @@ Based on these features, developers can easily implement or enhance their AI-ass
 see [UniAST Specification](docs/uniast-zh.md)


-# Getting Started
+# Quick Start
+
+## Use ABCoder as an MCP server

 1. Install ABCoder:
-```bash
-go install github.com/cloudwego/abcoder@latest
-```
+
+```bash
+go install github.com/cloudwego/abcoder@latest
+```
+
 2. Use ABCoder to parse a repository to UniAST (JSON)
+
+```bash
+abcoder parse {language} {repo-path} > xxx.json
+```
+
+For example:
+
+```bash
+git clone https://github.com/cloudwego/localsession.git localsession
+abcoder parse go localsession -o /abcoder-asts/localsession.json
+```
+
+3. Integrate ABCoder's MCP tools into your AI agent.
+
+```json
+{
+    "mcpServers": {
+        "abcoder": {
+            "command": "abcoder",
+            "args": [
+                "mcp",
+                "{the-AST-directory}"
+            ]
+        }
+    }
+}
+```
+
+
+4. Enjoy it!
+
+See the demo of using ABCoder MCP in TRAE:
+
+<div align="center">
+
+[<img src="images/abcoder-hertz-trae.png" alt="MCP" width="500"/>](https://www.bilibili.com/video/BV14ggJzCEnK)
+
+</div>
+
+
+## Tips:
+
+- You can add more repo ASTs into the AST directory without restarting the abcoder MCP server.
+
+- Try to use [the recommended prompt](llm/prompt/analyzer.md) and combine it with planning/memory tools like [sequential-thinking](https://github.com/modelcontextprotocol/servers/tree/main/src/sequentialthinking) in your AI agent.
+
+
+## Use ABCoder as an Agent (WIP)
+
+You can also use ABCoder as a command-line Agent, like:
+
 ```bash
-abcoder parse {language} {repo-path} > ast.json
+export API_TYPE='{openai|ollama|ark|claude}'
+export API_KEY='{your-api-key}'
+export MODEL_NAME='{model-endpoint}'
+abcoder agent {the-AST-directory}
 ```
-3. Do your magic with UniAST...
-4. Use ABCoder to write an UniAST back to codes
+For example:
+
 ```bash
-abcoder write {language} ast.json
+$ API_TYPE='ark' API_KEY='xxx' MODEL_NAME='zzz' abcoder agent ./testdata/asts
+
+Hello! I'm ABCoder, your coding assistant. What can I do for you today?
+
+$ what the repo 'localsession' does?
+
+The `localsession` repository appears to be a Go module (`github.com/cloudwego/localsession`) that provides functionality related to managing local sessions. Here's a breakdown of its structure and purpose:
+...
+If you'd like to explore specific functionalities or code details, let me know, and I can dive deeper into the relevant files or nodes. For example:
+- What does `session.go` or `manager.go` implement?
+- How is the backup functionality used?
+
+$ exit
 ```

+- NOTICE: This feature is a work in progress. It only supports code analysis at present.
+

 # Supported Languages

@@ -51,9 +121,8 @@ ABCoder currently supports the following languages:
 | -------- | ----------- | ----------- |
 | Go |||
 | Rust || Coming Soon |
-| C | Coming Soon ||
-| Python | Coming Soon ||
-
+| C |||
+| Python | Coming Soon | Coming Soon |


 # Getting Involved

docs/parser-en.md

Lines changed: 97 additions & 0 deletions

# ABCoder - Language Parser Introduction

ABCoder currently implements its parser on top of the [Language Server Protocol (LSP)](https://microsoft.github.io/language-server-protocol/) to achieve precise dependency collection and to ease future multi-language extension.

## Code Structure

The parser lives under the [lang](/lang) package, which includes:

- uniast: Go definitions of the unified AST structure
- lsp: The LSP client, providing interfaces for file parsing, reference lookup, syntax-tree parsing, definition lookup, etc., as well as the **generic language specification LanguageSpec interface**
- collect: Responsible for LSP symbol collection and UniAST export, which is the core computation logic
- {language}: Mainly implements the lsp#Spec interface for the corresponding {language}. Also includes some calling logic specific to that language's LSP server

## Operation Process

![lang-parser](../images/lang-parser.png)

1. Identify the language from the command-line parameters, start the corresponding LSP server, and pass it the initialization parameters
2. Traverse the repository files and call the `textDocument/documentSymbol` method to get all symbols of each file. For each symbol:
   1. Call the `textDocument/semanticTokens/range` method to get the tokens in the symbol's code
   2. Identify valid entity tokens and call `textDocument/definition` to jump to the corresponding symbol location, thus establishing the dependency relationships between nodes
3. Repeat step 2 until all files are processed. Finally, convert the collected LSP symbols to UniAST format and output the result
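
The steps above can be sketched as a loop over a small client interface. This is only an illustration under stated assumptions, not ABCoder's actual collector: `Client`, `Symbol`, `Token`, `Collect` and `fakeClient` are hypothetical names introduced here, and only the three LSP method names in the comments come from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

// Symbol and Token are hypothetical, simplified stand-ins for LSP results.
type Symbol struct {
	Name string
	File string
}

type Token struct {
	Text     string
	IsEntity bool // assumed outcome of a language-specific entity check
}

// Client is a hypothetical wrapper around the three LSP requests named above.
type Client interface {
	DocumentSymbols(file string) ([]Symbol, error) // textDocument/documentSymbol
	SemanticTokens(sym Symbol) ([]Token, error)    // textDocument/semanticTokens/range
	Definition(tok Token) (Symbol, bool, error)    // textDocument/definition
}

// Collect visits every symbol of every file and records the names of the
// symbols it depends on, resolved through Definition.
func Collect(c Client, files []string) (map[string][]string, error) {
	deps := make(map[string][]string)
	for _, file := range files {
		syms, err := c.DocumentSymbols(file)
		if err != nil {
			return nil, fmt.Errorf("documentSymbol %s: %w", file, err)
		}
		for _, sym := range syms {
			toks, err := c.SemanticTokens(sym)
			if err != nil {
				return nil, fmt.Errorf("semanticTokens %s: %w", sym.Name, err)
			}
			seen := map[string]bool{sym.Name: true} // skip self-references
			for _, tok := range toks {
				if !tok.IsEntity {
					continue
				}
				def, found, err := c.Definition(tok)
				if err != nil {
					return nil, fmt.Errorf("definition %s: %w", tok.Text, err)
				}
				if !found || seen[def.Name] {
					continue
				}
				seen[def.Name] = true
				deps[sym.Name] = append(deps[sym.Name], def.Name)
			}
			sort.Strings(deps[sym.Name])
		}
	}
	return deps, nil
}

// fakeClient is an in-memory Client used only to exercise Collect.
type fakeClient struct {
	symbols map[string][]Symbol // file -> symbols
	tokens  map[string][]Token  // symbol name -> tokens
	defs    map[string]Symbol   // token text -> defining symbol
}

func (f fakeClient) DocumentSymbols(file string) ([]Symbol, error) { return f.symbols[file], nil }
func (f fakeClient) SemanticTokens(sym Symbol) ([]Token, error)    { return f.tokens[sym.Name], nil }
func (f fakeClient) Definition(tok Token) (Symbol, bool, error) {
	def, ok := f.defs[tok.Text]
	return def, ok, nil
}

func newFakeClient() fakeClient {
	run := Symbol{Name: "Run", File: "a.go"}
	helper := Symbol{Name: "Helper", File: "a.go"}
	return fakeClient{
		symbols: map[string][]Symbol{"a.go": {run, helper}},
		tokens: map[string][]Token{
			"Run": {{Text: "Helper", IsEntity: true}, {Text: "x", IsEntity: false}},
		},
		defs: map[string]Symbol{"Helper": helper},
	}
}

func main() {
	deps, err := Collect(newFakeClient(), []string{"a.go"})
	if err != nil {
		panic(err)
	}
	fmt.Println(deps)
}
```

Keeping the LSP calls behind an interface makes the dependency-collection loop testable without starting a real language server; a real client would also need to handle server start-up, initialization and timeouts, which this sketch leaves out.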

## Extending Other Language Implementations

Since UniAST is not completely equivalent to LSP, some language-specific interfaces need to be implemented for the conversion. Using the lang/rust package as a reference, the following capabilities generally need to be implemented:

- GetDefaultLSP(): Map the user-supplied language to a specific lsp.Language and the corresponding LSP name
- CheckRepo(): Check the status of the user's repository, handle toolchain issues according to the language's conventions, and return the first file to open by default (to trigger the LSP server) and the waiting time for server initialization (determined by repository size)
- **LanguageSpec interface**: Core module for handling generic syntax information that LSP does not cover, such as determining whether a token is a standard-library symbol, parsing function signatures, etc.
- ModulePatcher: Post-processing module for collecting language-specific information, for example Rust's `use` symbols (not collected by LSP). Implementing it is optional

### LanguageSpec

```go
// LanguageSpec is the language-specific implementation used to collect LSP symbols and transform them to UniAST
type LanguageSpec interface {
	// initialize a root workspace, and return all modules [modulename=>abs-path] inside
	WorkSpace(root string) (map[string]string, error)

	// given an absolute file path, returns its module name and package path
	// external paths should also be supported
	// FIXME: some languages (like rust) may have sub-mods inside a file, but we still consider it as a single mod here
	NameSpace(path string) (string, string, error)

	// tells if a file belongs to the language AST
	ShouldSkip(path string) bool

	// FileImports parses the file content to get its imports
	FileImports(content []byte) ([]uniast.Import, error)

	// return the first declaration token of a symbol, as Type-Name
	DeclareTokenOfSymbol(sym DocumentSymbol) int

	// tells if a token is an AST entity
	IsEntityToken(tok Token) bool

	// tells if a token is a std token
	IsStdToken(tok Token) bool

	// return the SymbolKind of a token
	TokenKind(tok Token) SymbolKind

	// tells if a symbol is a main function
	IsMainFunction(sym DocumentSymbol) bool

	// tells if a symbol is a language symbol (func, type, variable, etc) in workspace
	IsEntitySymbol(sym DocumentSymbol) bool

	// tells if a symbol is public in workspace
	IsPublicSymbol(sym DocumentSymbol) bool

	// declares whether the language has impl symbols
	// if it returns true, ImplSymbol() will be called
	HasImplSymbol() bool
	// if a symbol is an impl symbol, return the token index of interface type, receiver type and first-method start (-1 means not found)
	// otherwise the collector will use FunctionSymbol() as receiver type token index (-1 means not found)
	ImplSymbol(sym DocumentSymbol) (int, int, int)

	// if a symbol is a Function or Method symbol, return the token index of Receiver (-1 means not found), TypeParameters, InputParameters and Outputs
	FunctionSymbol(sym DocumentSymbol) (int, []int, []int, []int)
}
```

- Rust-parser implementation location: [RustSpec](/lang/rust/spec.go)
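
To give a feel for what an implementation involves, here is a minimal sketch of three of the simpler methods for a hypothetical Go-like language. The `DocumentSymbol` struct below is a trimmed-down stand-in, not the real lsp type, and `partialSpec`, `goLikeSpec` and the string-valued `Kind` field are assumptions made only for this example; the rules it encodes (skip non-`.go` and `_test.go` files, uppercase initial means public) are likewise illustrative.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"unicode"
	"unicode/utf8"
)

// DocumentSymbol is a hypothetical, trimmed-down stand-in for the LSP type.
type DocumentSymbol struct {
	Name string
	Kind string // assumed representation, e.g. "function" or "struct"
}

// partialSpec lists only the three LanguageSpec methods sketched here.
type partialSpec interface {
	ShouldSkip(path string) bool
	IsMainFunction(sym DocumentSymbol) bool
	IsPublicSymbol(sym DocumentSymbol) bool
}

// goLikeSpec is a hypothetical spec for a Go-like language.
type goLikeSpec struct{}

// Compile-time check that goLikeSpec satisfies the partial interface.
var _ partialSpec = goLikeSpec{}

// ShouldSkip reports true for files that are not part of the language AST.
// Skipping test files is an assumption of this sketch.
func (goLikeSpec) ShouldSkip(path string) bool {
	return filepath.Ext(path) != ".go" || strings.HasSuffix(path, "_test.go")
}

// IsMainFunction reports whether the symbol is the program entry point.
func (goLikeSpec) IsMainFunction(sym DocumentSymbol) bool {
	return sym.Kind == "function" && sym.Name == "main"
}

// IsPublicSymbol treats names starting with an uppercase letter as public.
func (goLikeSpec) IsPublicSymbol(sym DocumentSymbol) bool {
	first, _ := utf8.DecodeRuneInString(sym.Name)
	return unicode.IsUpper(first)
}

func main() {
	spec := goLikeSpec{}
	fmt.Println(spec.ShouldSkip("pkg/session.go"))
	fmt.Println(spec.IsMainFunction(DocumentSymbol{Name: "main", Kind: "function"}))
	fmt.Println(spec.IsPublicSymbol(DocumentSymbol{Name: "NewManager", Kind: "function"}))
}
```

The token-index methods (`DeclareTokenOfSymbol`, `ImplSymbol`, `FunctionSymbol`) depend on how the language server tokenizes declarations, so they are better studied in the Rust implementation linked above than guessed at here.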

### ModulePatcher

```go
// ModulePatcher supplements a module with additional information
type ModulePatcher interface {
	// Patch is called after all symbols have been collected
	Patch(ast *parse.Module)
}
```

- Rust-parser implementation: [RustModulePatcher](/lang/rust/patch.go)
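
As a rough illustration of the post-processing idea, the sketch below scans source text for single-line Rust `use` declarations and records them on a module. It does not reproduce the real patcher: `Module` here is a hypothetical stand-in with assumed `Files` and `Imports` fields, `usePatcher` is an invented name, and the line-based matching is deliberately naive.

```go
package main

import (
	"fmt"
	"strings"
)

// Module is a hypothetical stand-in for the real module type.
type Module struct {
	Name    string
	Files   map[string]string   // path -> source text (assumed field)
	Imports map[string][]string // path -> collected imports (assumed field)
}

// ModulePatcher mirrors the interface above, using the stand-in Module.
type ModulePatcher interface {
	Patch(m *Module)
}

// usePatcher records `use` declarations that symbol collection missed.
type usePatcher struct{}

// Compile-time check that usePatcher satisfies ModulePatcher.
var _ ModulePatcher = usePatcher{}

// Patch walks every file line by line and appends each `use path;` it finds.
func (usePatcher) Patch(m *Module) {
	if m.Imports == nil {
		m.Imports = make(map[string][]string)
	}
	for path, src := range m.Files {
		for _, line := range strings.Split(src, "\n") {
			line = strings.TrimSpace(line)
			// Naive match: single-line `use path;` declarations only.
			if !strings.HasPrefix(line, "use ") || !strings.HasSuffix(line, ";") {
				continue
			}
			target := strings.TrimSuffix(strings.TrimPrefix(line, "use "), ";")
			m.Imports[path] = append(m.Imports[path], strings.TrimSpace(target))
		}
	}
}

func main() {
	m := &Module{
		Name:  "demo",
		Files: map[string]string{"src/lib.rs": "use std::fmt;\n\nfn hello() {}\n"},
	}
	usePatcher{}.Patch(m)
	fmt.Println(m.Imports["src/lib.rs"])
}
```

Running the patcher as a separate pass after collection keeps language quirks out of the generic collector, which matches the "called after all symbols have been collected" contract in the interface comment.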

docs/parser-zh.md

Lines changed: 3 additions & 0 deletions
@@ -50,6 +50,9 @@ type LanguageSpec interface {
 	// tells if a file belongs to the language AST
 	ShouldSkip(path string) bool

+	// FileImports parses the file content to get its imports
+	FileImports(content []byte) ([]uniast.Import, error)
+
 	// return the first declaration token of a symbol, as Type-Name
 	DeclareTokenOfSymbol(sym DocumentSymbol) int
