Refactored toolAgent.ts into modular components for improved maintainability and testability. Split into config.ts, messageUtils.ts, toolExecutor.ts, tokenTracking.ts, and types.ts modules.
CONTRIBUTING.md
+5 -3 (5 additions & 3 deletions)
@@ -80,17 +80,19 @@ This project and everyone participating in it is governed by our Code of Conduct
 5. Push to your fork and create a Pull Request

 6. Pre-commit Hooks:
-
+
 We use [husky](https://typicode.github.io/husky/) and [lint-staged](https://github.com/okonet/lint-staged) to automatically run linting and formatting on staged files before each commit. This helps maintain code quality and consistency.

 The pre-commit hooks are configured to run:
-- `pnpm lint`: Lints the staged files using ESLint
+
+- `pnpm lint`: Lints the staged files using ESLint
 - `pnpm format`: Formats the staged files using Prettier

 If either of these commands fails due to linting errors or formatting issues, the commit will be aborted. Please fix the reported issues and try committing again.

 You can also run the lint and format commands manually at any time:
docs/LargeCodeBase_Plan.md
+14 -0 (14 additions & 0 deletions)
@@ -11,16 +11,19 @@ This document presents research findings on how leading AI coding tools handle l
 While detailed technical documentation on Claude Code's internal architecture is limited in public sources, we can infer several approaches from Anthropic's general AI architecture and Claude Code's capabilities:

 1. **Chunking and Retrieval Augmentation**:
+
 - Claude Code likely employs retrieval-augmented generation (RAG) to handle large codebases
 - Files are likely chunked into manageable segments with semantic understanding
 - Relevant code chunks are retrieved based on query relevance

 2. **Hierarchical Code Understanding**:
+
 - Builds a hierarchical representation of code (project → modules → files → functions)
 - Maintains a graph of relationships between code components
 - Prioritizes context based on relevance to the current task

 3. **Incremental Context Management**:
+
 - Dynamically adjusts the context window to include only relevant code
 - Maintains a "working memory" of recently accessed or modified files
 - Uses sliding context windows to process large files sequentially
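
Reviewer sketch for the "sliding context windows" bullet in this hunk: one plausible reading is to split a large file into fixed-size line windows that overlap by a few lines, so each window carries some context from the previous one. This is an illustration only and says nothing about how Claude Code works internally; `slidingWindows`, `windowSize`, and `overlap` are hypothetical names introduced for the example.

```typescript
// Hypothetical sketch: split a file into overlapping line windows.
interface LineWindow {
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  text: string;
}

function slidingWindows(source: string, windowSize: number, overlap: number): LineWindow[] {
  if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
    throw new Error("require windowSize > 0 and 0 <= overlap < windowSize");
  }
  const lines = source.split("\n");
  const step = windowSize - overlap; // how far each window advances
  const windows: LineWindow[] = [];
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + windowSize, lines.length);
    windows.push({
      startLine: start + 1,
      endLine: end,
      text: lines.slice(start, end).join("\n"),
    });
    if (end === lines.length) break; // the last window reached the end of the file
  }
  return windows;
}
```

With ten lines, `windowSize = 4`, and `overlap = 1`, the windows cover lines 1-4, 4-7, and 7-10. A larger overlap trades extra tokens for more continuity between windows.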
@@ -35,16 +38,19 @@ While detailed technical documentation on Claude Code's internal architecture is
 Aider's approach to handling large codebases can be inferred from its open-source codebase and documentation:

 1. **Git Integration**:
+
 - Leverages Git to track file changes and understand repository structure
 - Uses Git history to prioritize recently modified files
 - Employs Git's diff capabilities to minimize context needed for changes

 2. **Selective File Context**:
+
 - Only includes relevant files in the context rather than the entire codebase
 - Uses heuristics to identify related files based on imports, references, and naming patterns
 - Implements a "map-reduce" approach where it first analyzes the codebase structure, then selectively processes relevant files

 3. **Prompt Engineering and Chunking**:
+
 - Designs prompts that can work with limited context by focusing on specific tasks
 - Chunks large files and processes them incrementally
 - Uses summarization to compress information about non-focal code parts
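
Reviewer sketch for the "Selective File Context" item in this hunk: the import-based heuristic could look roughly like the following, which follows relative imports outward from a focal file up to a fixed depth. It is not Aider's implementation. The names `extractImports`, `resolveRelative`, and `relatedFiles` are hypothetical, files are held in an in-memory map, and the sketch assumes extension-less relative imports that resolve to `.ts` files.

```typescript
// Hypothetical sketch: find files related to a focal file by following relative imports.
type FileMap = Map<string, string>; // path -> source text

// Collect relative specifiers from `import ... from "./x"` and `export ... from "../y"`.
// Side-effect imports (`import "./x"`) and dynamic imports are not handled here.
function extractImports(source: string): string[] {
  const pattern = /(?:import|export)\s[^'"]*?from\s+['"](\.{1,2}\/[^'"]+)['"]/g;
  const specs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const spec = match[1];
    if (spec) specs.push(spec);
  }
  return specs;
}

// Resolve a relative specifier against the importing file's directory.
function resolveRelative(fromFile: string, spec: string): string {
  const parts = fromFile.split("/").slice(0, -1);
  for (const segment of spec.split("/")) {
    if (segment === "..") parts.pop();
    else if (segment !== ".") parts.push(segment);
  }
  return parts.join("/") + ".ts"; // assumption: imports omit the .ts extension
}

// Breadth-first walk over the import graph, limited to maxDepth hops.
function relatedFiles(files: FileMap, focal: string, maxDepth: number): string[] {
  const seen = new Set<string>([focal]);
  let frontier = [focal];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const spec of extractImports(files.get(file) ?? "")) {
        const target = resolveRelative(file, spec);
        if (files.has(target) && !seen.has(target)) {
          seen.add(target);
          next.push(target);
        }
      }
    }
    frontier = next;
  }
  return Array.from(seen);
}
```

Limiting the depth keeps the selected context small: depth 1 pulls in only direct dependencies, while larger values trade more tokens for broader coverage.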
@@ -90,6 +96,7 @@ Based on the research findings, we recommend the following enhancements to MyCod
 ```

 **Implementation Details:**
+
 - Create a lightweight indexer that runs during project initialization
 - Generate embeddings for code files, focusing on API definitions, function signatures, and documentation
 - Build a graph of relationships between files based on imports/exports and references
@@ -120,6 +127,7 @@ Based on the research findings, we recommend the following enhancements to MyCod
 ```

 **Implementation Details:**
+
 - Develop a working set manager that tracks currently relevant files
 - Implement a relevance scoring algorithm that considers:
   - Semantic similarity to the current task
@@ -148,6 +156,7 @@ Based on the research findings, we recommend the following enhancements to MyCod
 ```

 **Implementation Details:**
+
 - Chunk files at meaningful boundaries (functions, classes, modules)
 - Implement overlapping chunks to maintain context across boundaries
 - Develop a progressive loading strategy:
@@ -181,6 +190,7 @@ Based on the research findings, we recommend the following enhancements to MyCod
 ```

 **Implementation Details:**
+
 - Implement a multi-level caching system:
   - Token cache: Store tokenized representations of files to avoid re-tokenization
   - Embedding cache: Store vector embeddings for semantic search
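
Reviewer sketch for the caching bullets in this hunk: one way to avoid re-tokenizing unchanged files is to key each cache entry by file path and invalidate it when a fingerprint of the content changes. Everything below is an assumption made for illustration: `ContentCache` and `fingerprint` are hypothetical names, the whitespace tokenizer is a stand-in for a real one, and the fingerprint is an FNV-1a-style hash over UTF-16 code units rather than a cryptographic hash.

```typescript
// Hypothetical sketch: cache derived data per file, recomputing only when content changes.

// FNV-1a-style 32-bit hash over UTF-16 code units, used as a cheap content fingerprint.
function fingerprint(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class ContentCache<T> {
  private entries = new Map<string, { fingerprint: number; value: T }>();
  private compute: (content: string) => T;
  hits = 0;
  misses = 0;

  constructor(compute: (content: string) => T) {
    this.compute = compute;
  }

  // Return the cached value for `path` if its content is unchanged, else recompute.
  get(path: string, content: string): T {
    const fp = fingerprint(content);
    const entry = this.entries.get(path);
    if (entry && entry.fingerprint === fp) {
      this.hits++;
      return entry.value;
    }
    this.misses++;
    const value = this.compute(content);
    this.entries.set(path, { fingerprint: fp, value });
    return value;
  }
}

// Token-cache level; the whitespace split is a placeholder tokenizer.
const tokenCache = new ContentCache<string[]>((text) => text.split(/\s+/).filter(Boolean));
```

An embedding-cache level could reuse the same class with a different `compute` function. A 32-bit fingerprint can collide, so a real implementation would likely prefer a stronger hash or combine it with file size and modification time.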
@@ -209,6 +219,7 @@ Based on the research findings, we recommend the following enhancements to MyCod
 ```

 **Implementation Details:**
+
 - Improve task decomposition to identify parallelizable sub-tasks
 - Implement smart context distribution to sub-agents:
   - Provide each sub-agent with only the context it needs
@@ -222,16 +233,19 @@ Based on the research findings, we recommend the following enhancements to MyCod
 ## Implementation Roadmap

 ### Phase 1: Foundation (1-2 months)
+
 - Develop the basic indexing system for project structure and file metadata
 - Implement a simple relevance-based context selection mechanism
 - Create a basic chunking strategy for large files

 ### Phase 2: Advanced Features (2-3 months)
+
 - Implement the semantic indexing system with code embeddings
 - Develop the full context management system with working sets
 - Create the multi-level caching system

 ### Phase 3: Optimization and Integration (1-2 months)
+
 - Enhance sub-agent coordination for parallel processing
 - Optimize performance with better caching and context management