
Commit f9eceaa

refactor: ready for open (#12)
* refactor: remove compress implementations
* refactor: move out main
* go mod tidy
* readme
* feat: comand write
* doc: add docs
* remove
* opt doc
* fix:(go_ast) get receiver name bug
* remove rust ci
* tmp remove line CI
* add NewIdentityFromString()
* move example json to huggingface
* fix
* fix
1 parent 51d99f7 commit f9eceaa

96 files changed

+1409
-6041
lines changed

.github/workflows/rust.yml

Lines changed: 0 additions & 22 deletions
This file was deleted.

.github/workflows/simple_checks.yml

Lines changed: 0 additions & 18 deletions
This file was deleted.

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -74,3 +74,4 @@ src/lang/testdata
7474
*.json
7575

7676
tools
77+
abcoder

Cargo.toml

Lines changed: 0 additions & 28 deletions
This file was deleted.

README.md

Lines changed: 33 additions & 95 deletions
@@ -1,126 +1,64 @@
1-
<!--
2-
Copyright 2025 CloudWeGo Authors
3-
4-
Licensed under the Apache License, Version 2.0 (the "License");
5-
you may not use this file except in compliance with the License.
6-
You may obtain a copy of the License at
7-
8-
https://www.apache.org/licenses/LICENSE-2.0
9-
10-
Unless required by applicable law or agreed to in writing, software
11-
distributed under the License is distributed on an "AS IS" BASIS,
12-
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13-
See the License for the specific language governing permissions and
14-
limitations under the License.
15-
-->
16-
171
# ABCoder: AI-Based Coder(AKA: A Brand-new Coder)
182

193
![ABCoder](images/ABCoder.png)
204

21-
ABCoder, an AI-powered tool, streamlines coding by keeping real-time status updates, providing lossless code compression, and giving development guidance. It enhances testing by identifying quality, generating reports, and auto-creating test cases. It also offers guidance for refactoring, including language stack switches.
22-
23-
# Table of Contents
24-
25-
- [ABCoder: AI-Based Coder(AKA: A Brand-new Coder)](#abcoder-ai-based-coderaka-a-brand-new-coder)
26-
- [Table of Contents](#table-of-contents)
27-
- [Overview](#overview)
28-
- [Quick Start](#quick-start)
29-
- [Prerequisites](#prerequisites)
30-
- [Running through Coze OpenAPI](#running-through-coze-openapi)
31-
- [Status Update](#status-update)
32-
- [Lossless Compression](#lossless-compression)
33-
- [Development Guide](#development-guide)
34-
- [Testing Enhancements](#testing-enhancements)
35-
- [Refactor/Rewrite Guide](#refactorrewrite-guide)
36-
- [Getting Involved](#getting-involved)
37-
385
# Overview
6+
ABCoder is an AI-oriented code-processing SDK designed to enrich the coding context given to Large Language Models (LLMs) and to speed up the development of AI-assisted coding applications.
397

40-
ABCoder is a comprehensive open-source software development tool that aims to utilize artificial intelligence to enhance
41-
the process of coding. This project focuses on various aspects of software development ranging from repository analysis,
42-
issue and pull request tracking, to automated code compression, development guidance, testing enhancement, and
43-
refactoring guidance.
448

45-
# Quick Start
9+
## Features
4610

47-
## Prerequisites
48-
- install git and set your access token for github on cmd-line
49-
- install [rust-toolchain](https://www.rust-lang.org/tools/install) (stable)
50-
- (optional) install [ollama](https://github.com/ollama/ollama) and run your LLM
51-
- (optional) create a [Coze](https://www.coze.com/docs/developer_guides/coze_api_overview?_lang=en) agent and set its OpenAPI key
11+
- Universal Abstract Syntax Tree (UniAST), a language-independent, AI-friendly specification of code information, providing a flexible and structural coding context for both AI and humans.
12+
13+
- General Parser, parses code in arbitrary languages to UniAST.
5214

53-
## Running through Coze OpenAPI
54-
1. Set .env file for configuration on ABCoder's working directory. Taking Coze as an example:
55-
```
56-
# cache for repo,AST and so on
57-
WORK_DIR=tmp_abcoder
15+
- General Writer, transforms UniAST back to code.
16+
17+
- (Coming Soon) General Iterator, a framework for visiting the UniAST easily and implementing batch code-processing workflows.
5818

59-
# exclude dirs for repo parsing, separated by comma
60-
EXCLUDE_DIRS=target,gen-codes
19+
- (Coming Soon) Code RAG, provides a set of tools and functions to help the LLM understand your code more deeply.
6120

62-
# LLM's api type
63-
API_TYPE=coze # coze|ollama
21+
Based on these features, developers can easily implement or enhance their AI-assisted-coding applications, such as reviewing, optimizing, translating, etc.
6422

65-
# LLM's output language
66-
LANGUAGE=zh
6723

68-
# Coze options
69-
COZE_API_TOKEN="{YOUR_COZE_API_TOKEN}"
70-
COZE_BOT_ID={YOUR_COZE_BOT_ID}
71-
```
24+
## Universal-Abstract-Syntax-Tree Specification
7225

73-
2. compile the parsers
74-
```
75-
./script/make_parser.sh
76-
```
26+
See the [UniAST Specification](docs/uniast-zh.md).
7727

78-
3. compile and run ABCoder
79-
```
80-
cargo run --bin cmd compress https://xxx.git
81-
```
8228

83-
4. Once triggered, ABCoder will take three steps:
84-
1. Download the repository in {REPO_DIR}
85-
2. Parse the repository and store the AST in {CACHE_DIR}
86-
3. Call the LLM to compress the repository codes, and refresh the AST for each call.
87-
You can stop the process at anytime after step 2. You can restart the compressing by running the same command.
29+
# Getting Started
8830

89-
5. Export the compressed results
31+
1. Install ABCoder:
32+
```bash
33+
go install github.com/cloudwego/abcoder@latest
9034
```
91-
cargo run --bin cmd export https://xxx.git --out-dir {OUTPUT_DIR}
35+
2. Use ABCoder to parse a repository to UniAST (JSON)
36+
```bash
37+
abcoder parse {language} {repo-path} > ast.json
38+
```
39+
3. Do your magic with UniAST...
40+
4. Use ABCoder to write a UniAST back to code
41+
```bash
42+
abcoder write {language} ast.json
9243
```
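
Step 3 is left open in the README. As one possible starting point, the sketch below loads the JSON produced by `abcoder parse` and lists its top-level fields without assuming anything about the UniAST schema (see the UniAST Specification for the real structure). The helper name `topLevelKeys` and the file name `ast.json` are illustrative assumptions, not part of ABCoder.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// topLevelKeys decodes a JSON object without assuming any schema and
// returns its top-level field names in sorted order.
// It is a hypothetical helper for exploring a UniAST file.
func topLevelKeys(data []byte) ([]string, error) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("decode UniAST JSON: %w", err)
	}
	keys := make([]string, 0, len(doc))
	for k := range doc {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// ast.json is assumed to be the file written in step 2.
	data, err := os.ReadFile("ast.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read ast.json:", err)
		return
	}
	keys, err := topLevelKeys(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	for _, k := range keys {
		fmt.Println(k)
	}
}
```

Decoding into `json.RawMessage` keeps each sub-tree untouched, so the file can be explored first and typed structs added later once the schema is known.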
9344

94-
# Status Update
95-
96-
The system is designed to automatically fetch the latest data from Github upon triggering relevant tasks, ensuring the
97-
repository status is always up-to-date. It can answer queries related to function, defects based on issue and PR
98-
information. For more details, check out our Issues and Pull Requests sections on Github.
99-
100-
# Lossless Compression
101-
102-
The system also offers a lossless compression feature for repository code. The specific implementation methods are being
103-
optimized, and more details will be available soon.
104-
105-
# Development Guide
10645

107-
We welcome all developers wishing to contribute to ABCoder. Our system provides detailed guidance for manual development
108-
and also supports auto-generation of instructions. Check out our Contribution Guide for more information.
46+
# Supported Languages
10947

110-
# Testing Enhancements
48+
ABCoder currently supports the following languages:
11149

112-
The system is designed to analyze existing functions and corresponding tests, identify the overall quality of testing,
113-
produce reports, and automatically generate test cases for weakly covered items. Our goal is to help repositories
114-
enhance and perfect their test cases.
50+
| Language | Parser | Writer |
51+
| -------- | ----------- | ----------- |
52+
| Go |||
53+
| Rust || Coming Soon |
54+
| C | Coming Soon ||
11555

116-
# Refactor/Rewrite Guide
11756

118-
We offer guidance for both small-scale feature iterations and large-scale rewrites, including language stack switches.
119-
Our system provides a detailed guide for manual development and also supports automated guidance generation.
12057

12158
# Getting Involved
12259

12360
We encourage developers to contribute and make this tool more powerful. If you are interested in contributing to the ABCoder
124-
project, kindly check out our Getting Involved Guide.
61+
project, kindly check out our Getting Involved Guide:
62+
- [Parser Extension](docs/parser-zh.md)
12563

12664
> Note: This is a dynamic README and is subject to changes as the project evolves.

docs/parser-zh.md

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
1+
# ABCoder - Introduction to the Language Parser
2+
3+
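ABCoder currently implements its Parser on top of the [LSP](https://microsoft.github.io/language-server-protocol/) (Language Server Protocol), which enables precise dependency collection and makes it easier to add more languages later.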
4+
5+
## Code Structure
6+
7+
The code lives in the [lang](/lang) package and includes:
8+
9+
- uniast: the Go definitions of the unified AST structure
10+
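- lsp: an LSP client that provides interfaces for file parsing, reference lookup, syntax-tree parsing, definition lookup, and so on, as well as **the generic language-specification interface, LanguageSpec**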
11+
- collect: collects LSP symbols and exports them as UniAST; this is the core logic
12+
- {language}: mainly the implementation of the lsp#Spec interface for {language}, plus some call logic specific to the concrete LSP server
13+
14+
## Workflow
15+
16+
![lang-parser](../images/lang-parser.png)
17+
18+
1. Identify the language from the command-line arguments, start the corresponding LSP server, and pass in the initialization parameters
19+
2. Walk the repository files and call the `textDocument/documentSymbol` method to get all symbols in each file. For each symbol:
20+
1. Call the `textDocument/semanticTokens/range` method to get the tokens in the symbol's code
21+
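2. Pick out the tokens that are valid entities and call `textDocument/definition` to jump to the corresponding symbol's location, which establishes the dependency between nodes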
22+
3. Repeat step 2 until all files have been processed. Finally, convert the collected LSP symbols to the UniAST format and output them
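
To make the `textDocument/documentSymbol` call in step 2 concrete, here is a minimal sketch of how such a request is framed on the wire according to the LSP base protocol (a `Content-Length` header followed by a JSON-RPC 2.0 body). The type and function names (`request`, `frame`, `documentSymbolParams`) and the file URI are illustrative assumptions; this is not ABCoder's actual client code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a JSON-RPC 2.0 request message as used by LSP.
type request struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params"`
}

// textDocumentIdentifier and documentSymbolParams mirror the parameter
// shape of textDocument/documentSymbol in the LSP specification.
type textDocumentIdentifier struct {
	URI string `json:"uri"`
}

type documentSymbolParams struct {
	TextDocument textDocumentIdentifier `json:"textDocument"`
}

// frame encodes a request and prepends the Content-Length header
// required by the LSP base protocol.
func frame(id int, method string, params any) ([]byte, error) {
	body, err := json.Marshal(request{JSONRPC: "2.0", ID: id, Method: method, Params: params})
	if err != nil {
		return nil, err
	}
	header := fmt.Sprintf("Content-Length: %d\r\n\r\n", len(body))
	return append([]byte(header), body...), nil
}

func main() {
	// The URI is a made-up example path.
	params := documentSymbolParams{TextDocument: textDocumentIdentifier{URI: "file:///repo/src/main.rs"}}
	msg, err := frame(1, "textDocument/documentSymbol", params)
	if err != nil {
		fmt.Println("encode request:", err)
		return
	}
	fmt.Printf("%q\n", msg)
}
```

The same framing applies to `textDocument/semanticTokens/range` and `textDocument/definition`; only the method name and parameter shape change.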
23+
24+
## Extending to Other Languages
25+
26+
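Because UniAST is not fully equivalent to LSP, some language-specific behavior interfaces must be implemented before the conversion can work. Taking the lang/rust package as a reference, the following capabilities are generally required: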
27+
28+
- GetDefaultLSP(): maps the user-supplied language to a concrete lsp.Language and the name of the corresponding LSP server
29+
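- CheckRepo(): checks the user's repository, handles toolchain issues and the like according to each language's conventions, and returns the first file to open by default (used to trigger the LSP server) together with how long to wait for the server to finish initializing (determined by repository size)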
30+
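- **LanguageSpec interface**: the core module; it handles syntax information that is not part of generic LSP, such as deciding whether a token is a standard-library symbol or parsing function signatures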
31+
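- ModulePatcher: a post-processing module for language-specific information collection, such as collecting Rust `use` symbols (which LSP does not collect). Implementing it is optional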
32+
33+
### LanguageSpec
34+
35+
36+
Supplies the information needed to convert to UniAST during LSP symbol collection; this information is not part of the generic LSP definitions
37+
38+
```go
39+
40+
// Detailed implementation used to collect LSP symbols and transform them to UniAST
41+
type LanguageSpec interface {
42+
// initialize a root workspace, and return all modules [modulename=>abs-path] inside
43+
WorkSpace(root string) (map[string]string, error)
44+
45+
// give an absolute file path and returns its module name and package path
46+
// external paths should also be supported
47+
// FIXME: some languages (like Rust) may have sub-mods inside a file, but we still treat the file as a single mod here
48+
NameSpace(path string) (string, string, error)
49+
50+
// tells if a file belongs to the language AST
51+
ShouldSkip(path string) bool
52+
53+
// return the first declaration token of a symbol, as Type-Name
54+
DeclareTokenOfSymbol(sym DocumentSymbol) int
55+
56+
// tells if a token is an AST entity
57+
IsEntityToken(tok Token) bool
58+
59+
// tells if a token is a std token
60+
IsStdToken(tok Token) bool
61+
62+
// return the SymbolKind of a token
63+
TokenKind(tok Token) SymbolKind
64+
65+
// tells if a symbol is a main function
66+
IsMainFunction(sym DocumentSymbol) bool
67+
68+
// tells if a symbol is a language symbol (func, type, variable, etc) in workspace
69+
IsEntitySymbol(sym DocumentSymbol) bool
70+
71+
// tells if a symbol is public in workspace
72+
IsPublicSymbol(sym DocumentSymbol) bool
73+
74+
// declare if the language has impl symbol
75+
// if it returns true, ImplSymbol() will be called
76+
HasImplSymbol() bool
77+
// if a symbol is an impl symbol, return the token index of interface type, receiver type and first-method start (-1 means not found)
78+
// otherwise the collector will use FunctionSymbol() as receiver type token index (-1 means not found)
79+
ImplSymbol(sym DocumentSymbol) (int, int, int)
80+
81+
// if a symbol is a Function or Method symbol, return the token index of Receiver (-1 means not found), TypeParameters, InputParameters and Outputs
82+
FunctionSymbol(sym DocumentSymbol) (int, []int, []int, []int)
83+
}
84+
```
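
As an illustration of what implementing part of this interface could look like, the sketch below covers only `ShouldSkip` and `IsPublicSymbol` for a Go-like language. It is a sketch under stated assumptions: `DocumentSymbol` here is a hypothetical stand-in with a single `Name` field (the real type lives in the lsp package and may differ), and `goLikeSpec` is an invented name, not an ABCoder type.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// DocumentSymbol is a hypothetical stand-in for the lsp package's symbol
// type; only the field this sketch needs is modeled.
type DocumentSymbol struct {
	Name string
}

// goLikeSpec is an illustrative, partial spec for a Go-like language.
type goLikeSpec struct{}

// ShouldSkip reports whether a file is outside the language's source set.
// Assumption: only files ending in ".go" are parsed.
func (goLikeSpec) ShouldSkip(path string) bool {
	return !strings.HasSuffix(path, ".go")
}

// IsPublicSymbol applies Go's export rule: a name is public when its
// first rune is an upper-case letter.
func (goLikeSpec) IsPublicSymbol(sym DocumentSymbol) bool {
	r, _ := utf8.DecodeRuneInString(sym.Name)
	return unicode.IsUpper(r)
}

func main() {
	spec := goLikeSpec{}
	fmt.Println(spec.ShouldSkip("lang/collect/collect.go"))
	fmt.Println(spec.ShouldSkip("README.md"))
	fmt.Println(spec.IsPublicSymbol(DocumentSymbol{Name: "Parse"}))
	fmt.Println(spec.IsPublicSymbol(DocumentSymbol{Name: "parse"}))
}
```

The remaining methods (token classification, receiver and parameter indexes) depend on the tokens the concrete LSP server returns, so they are left out here.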
85+
86+
- Rust parser implementation: [RustSpec](/lang/rust/spec.go)
87+
88+
89+
90+
### ModulePatcher
91+
92+
Used to post-process module information after collection is complete
93+
94+
```go
95+
// ModulePatcher supplements some information for module
96+
type ModulePatcher interface {
97+
// Patch is called after all symbols have been collected
98+
Patch(ast *parse.Module)
99+
}
100+
```
101+
102+
- Rust parser implementation: [RustModulePatcher](/lang/rust/patch.go)
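
The following sketch shows one way an optional patcher could collect Rust-style `use` declarations after symbol collection. Everything in it is assumed for illustration: the `Module` struct with `Source` and `Imports` fields, the `usePatcher` type, and the `finish` hook are hypothetical and do not reflect the real `parse.Module` layout or RustModulePatcher.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Module is a hypothetical stand-in for the real module type; both
// fields are assumptions made for this sketch.
type Module struct {
	Source  string   // raw source text of the module
	Imports []string // import paths gathered by the patcher
}

// ModulePatcher matches the interface shape in the document,
// using the stand-in Module type.
type ModulePatcher interface {
	Patch(ast *Module)
}

// usePatcher records every single-line `use ...;` declaration.
type usePatcher struct{}

func (usePatcher) Patch(m *Module) {
	sc := bufio.NewScanner(strings.NewReader(m.Source))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "use ") && strings.HasSuffix(line, ";") {
			path := strings.TrimSuffix(strings.TrimPrefix(line, "use "), ";")
			m.Imports = append(m.Imports, path)
		}
	}
}

// finish runs the patcher only when one is provided, since a language
// is not required to implement it.
func finish(m *Module, p ModulePatcher) {
	if p != nil {
		p.Patch(m)
	}
}

func main() {
	m := &Module{Source: "use std::fmt;\nfn main() {}\n"}
	finish(m, usePatcher{})
	fmt.Println(m.Imports)
}
```

A line-based scan like this misses multi-line or nested `use` trees; it is only meant to show where a patcher sits in the flow.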
