Commit 378487b

Add tree-sitter AST parsing and git2 integration
New features:
- Tree-sitter based AST parsing for 6 languages (TypeScript, JavaScript, Rust, Python, Go, Java)
- Git2 integration with blame tracking, file history, and contributor metadata
- Symbol extraction with signatures, visibility, generics, and doc comments
- Git metadata per file (last commit, author, contributors) and per symbol (code age, last modifier)

Changes:
- Add cli/src/ast/ module with AstParser and 6 language extractors
- Add cli/src/git/ module with GitRepository, BlameInfo, FileHistory
- Update cache types with GitFileInfo and GitSymbolInfo
- Update cache.schema.json with git metadata definitions
- Integrate AST parsing and git metadata into indexer
- Update README with current capabilities, CLI commands, and roadmap
- Add comprehensive testing guide
1 parent 0e864f7 commit 378487b

24 files changed: +7654 -58 lines changed
Lines changed: 257 additions & 0 deletions
@@ -0,0 +1,257 @@
# Validate ACP Schemas for Schema Store Submission

Perform a comprehensive sequential analysis of the ACP JSON schemas (`attempts.schema.json`, `primer.schema.json`, `sync.schema.json`) to prepare them for Schema Store submission.

## Objective

Validate, analyze, and remediate the three new ACP schemas before submitting them to the JSON Schema Store. This requires meticulous attention to schema correctness, cross-schema consistency, edge case handling, and Schema Store compliance.

---

## Phase 1: Individual Schema Structural Validation

For each schema (`attempts.schema.json`, `primer.schema.json`, `sync.schema.json`), perform:

### 1.1 JSON Schema Draft Compliance
- [ ] Verify `$schema` is set to `http://json-schema.org/draft-07/schema#` (the draft-07 meta-schema URI uses `http`, not `https`)
- [ ] Verify `$id` follows the pattern `https://acp-protocol.dev/schemas/v1/{name}.schema.json`
- [ ] Verify `title` and `description` are present and descriptive
- [ ] Check that all `$ref` references resolve correctly within the schema
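
The `$ref` check above can be automated with a short script. The following is a minimal sketch, assuming all references are local JSON Pointers of the form `#/...`; the helper names (`iter_refs`, `resolves_locally`, `unresolved_refs`) are hypothetical and not part of any ACP tooling.

```python
from typing import Any, Iterator, List
from urllib.parse import unquote


def iter_refs(node: Any) -> Iterator[str]:
    """Yield every string-valued `$ref` found anywhere in the document."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            yield ref
        for value in node.values():
            yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def resolves_locally(schema: dict, ref: str) -> bool:
    """Return True if `ref` is a local JSON Pointer that resolves in `schema`."""
    if not ref.startswith("#"):
        return False  # external references are out of scope for this sketch
    pointer = unquote(ref[1:])
    if pointer == "":
        return True  # "#" refers to the document root
    if not pointer.startswith("/"):
        return False  # plain-name anchors are not handled here
    node: Any = schema
    for token in pointer[1:].split("/"):
        # Undo JSON Pointer escaping (RFC 6901): ~1 is "/", ~0 is "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, dict) and token in node:
            node = node[token]
        elif isinstance(node, list) and token.isdigit() and int(token) < len(node):
            node = node[int(token)]
        else:
            return False
    return True


def unresolved_refs(schema: dict) -> List[str]:
    """Return the sorted, de-duplicated list of references that do not resolve."""
    return sorted({r for r in iter_refs(schema) if not resolves_locally(schema, r)})
```

Load each schema with `json.load` and pass the result to `unresolved_refs`; an empty list means every local reference resolved. The walk also visits `examples`, `default`, and `enum` values, so a literal `$ref` key inside example data may be reported as a false positive.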

### 1.2 Required Fields Analysis
- [ ] Identify all `required` arrays and verify referenced properties exist
- [ ] Check for orphan required fields (required but not defined in properties)
- [ ] Verify `required` arrays don't contain duplicates
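
A minimal sketch of the orphan and duplicate checks above, assuming each `required` array sits next to the `properties` object it refers to. The function name `check_required` is hypothetical.

```python
from typing import Any, List, Tuple


def check_required(node: Any, path: str = "#") -> List[Tuple[str, str]]:
    """Return (json_path, problem) pairs for orphan or duplicate `required` entries."""
    problems: List[Tuple[str, str]] = []
    if isinstance(node, dict):
        required = node.get("required")
        if isinstance(required, list):
            props = node.get("properties")
            props = props if isinstance(props, dict) else {}
            seen = set()
            for name in required:
                if not isinstance(name, str):
                    continue  # malformed entry; meta-validation should catch it
                if name in seen:
                    problems.append((path, f"duplicate required entry: {name}"))
                    continue
                seen.add(name)
                if name not in props:
                    problems.append((path, f"orphan required entry: {name}"))
        for key, value in node.items():
            problems.extend(check_required(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(check_required(item, f"{path}/{index}"))
    return problems
```

A `required` array inside an `allOf`, `anyOf`, or `if`/`then` branch may legitimately name properties that are declared on the parent object, so treat orphan reports from those locations as items to review rather than as definite errors.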

### 1.3 Type Consistency
- [ ] All properties have explicit `type` declarations (or use `$ref`, `oneOf`, `anyOf`, `allOf`)
- [ ] No implicit types that could cause validation ambiguity
- [ ] Arrays have `items` defined
- [ ] Objects have `properties` or `additionalProperties` defined

### 1.4 Default Values
- [ ] All `default` values match their declared type
- [ ] Default values are valid according to any `enum`, `pattern`, `minimum`, `maximum` constraints
- [ ] Consider whether the defaults make sense for the use case

### 1.5 Pattern and Format Validation
- [ ] All `pattern` regexes are valid and test correctly
- [ ] All `format` values are standard JSON Schema formats (date-time, uri, email, etc.)
- [ ] Version patterns match semver: `^\\d+\\.\\d+\\.\\d+` (shown JSON-escaped; it anchors only the start, so suffixes such as `-beta.1` are accepted)
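
To test the version pattern, decode it the way a JSON parser would and search with it, since JSON Schema `pattern` is not implicitly anchored. This is a minimal sketch using Python's `re` module as a stand-in for the ECMA 262 dialect that JSON Schema specifies; the two agree for this pattern on ASCII input, but Python's `\d` also matches non-ASCII digits.

```python
import json
import re

# The pattern exactly as it would appear inside a schema file, quotes included.
# json.loads removes one level of backslash escaping.
PATTERN = json.loads(r'"^\\d+\\.\\d+\\.\\d+"')

# Hypothetical stricter variant that rejects anything after the patch number.
STRICT_PATTERN = PATTERN + "$"


def matches(pattern: str, value: str) -> bool:
    """Mimic JSON Schema `pattern`: the regex may match anywhere in the value."""
    return re.search(pattern, value) is not None


for candidate in ["1.2.3", "1.2", "v1.2.3", "1.2.3-beta.1", "1.2.3.4"]:
    print(f"{candidate!r}: pattern={matches(PATTERN, candidate)} "
          f"strict={matches(STRICT_PATTERN, candidate)}")
```

Whether the open-ended form is intended depends on whether the schemas should accept pre-release and build suffixes; the strict variant is shown only for comparison.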

---

## Phase 2: Cross-Schema Consistency Analysis

### 2.1 Shared Definitions Alignment
Check consistency across schemas for:
- [ ] Version field patterns (should be identical)
- [ ] Timestamp formats (should all use `date-time`)
- [ ] ID field patterns and naming conventions
- [ ] File path conventions

### 2.2 Enumeration Consistency
- [ ] `attempt_status` values in attempts.schema.json
- [ ] `capabilities` values referenced in primer.schema.json and sync.schema.json
- [ ] `toolName` values in sync.schema.json
- [ ] Lock levels referenced across schemas match constraints.schema.json

### 2.3 Naming Convention Consistency
- [ ] Property naming style (camelCase vs snake_case)
- [ ] Definition naming style (`definitions` keys in draft-07; `$defs` is the keyword from draft 2019-09 onward)
- [ ] Consistent terminology (e.g., "budget" vs "tokenBudget")
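
The property naming check can be approximated by classifying every key found under a `properties` object. This is a minimal sketch; the regular expressions and the function names are assumptions chosen for illustration, not ACP conventions.

```python
import re
from collections import Counter
from typing import Any, Iterator

CAMEL = re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)*$")
SNAKE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")


def naming_style(name: str) -> str:
    """Classify a property name; single lowercase words fit both styles."""
    is_camel = bool(CAMEL.match(name))
    is_snake = bool(SNAKE.match(name))
    if is_camel and is_snake:
        return "either"
    if is_camel:
        return "camelCase"
    if is_snake:
        return "snake_case"
    return "other"


def property_names(node: Any) -> Iterator[str]:
    """Yield the keys of every `properties` object in a schema document."""
    if isinstance(node, dict):
        props = node.get("properties")
        if isinstance(props, dict):
            yield from props
        for value in node.values():
            yield from property_names(value)
    elif isinstance(node, list):
        for item in node:
            yield from property_names(item)


def style_report(schema: dict) -> Counter:
    """Count how many property names fall into each naming style."""
    return Counter(naming_style(name) for name in property_names(schema))
```

A report that contains both `camelCase` and `snake_case` counts for the same schema, or different dominant styles across the three schemas, points at the names to review.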

### 2.4 Cross-References
- [ ] primer.schema.json sections reference cache/vars data sources correctly
- [ ] sync.schema.json primer config aligns with primer.schema.json structure
- [ ] attempts.schema.json status values align with any references in other schemas

---

## Phase 3: Edge Case Analysis

### 3.1 Boundary Conditions
For each schema, identify and verify handling of:
- [ ] Empty arrays: `[]`
- [ ] Empty objects: `{}`
- [ ] Empty strings: `""`
- [ ] Null values (if allowed)
- [ ] Maximum/minimum values at boundaries
- [ ] Very long strings
- [ ] Unicode characters in strings

### 3.2 attempts.schema.json Specific
- [ ] What happens with 0 attempts?
- [ ] What happens with 0 checkpoints?
- [ ] Maximum number of history entries?
- [ ] File content storage with empty files
- [ ] Hash collision handling (documented?)
- [ ] Circular checkpoint references possible?

### 3.3 primer.schema.json Specific
- [ ] Section with 0 tokens valid?
- [ ] Section with `tokens: "dynamic"` but no data source?
- [ ] Empty `modifiers` array behavior
- [ ] Circular `dependsOn` references possible?
- [ ] `conflictsWith` self-reference?
- [ ] Score calculation with all weights = 0?
- [ ] Empty `formats` object valid?
- [ ] Category with no sections?
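
JSON Schema cannot express "no circular `dependsOn`", so that check has to run on document data. This is a minimal sketch using depth-first search, assuming the dependencies have already been extracted into a mapping from section id to the list of ids it depends on; that shape, and the name `find_cycle`, are assumptions, not the primer format itself.

```python
from typing import Dict, List, Optional


def find_cycle(depends_on: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of ids, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for dep in depends_on.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path, so the path loops back to it.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in depends_on:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

The same function covers the self-reference case: a section that lists itself is reported as a two-element cycle. It could be reused for the circular checkpoint question in 3.2 if checkpoints reference each other by id.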

### 3.4 sync.schema.json Specific
- [ ] Empty `tools` array behavior (auto-detect vs none?)
- [ ] Tool in both `tools` and `exclude`?
- [ ] Custom adapter with same name as built-in?
- [ ] `primer.budget` below minimum required sections?
- [ ] `mergeStrategy: "section"` without `sectionMarker`?
- [ ] Conflicting tool configurations?
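
Two of the questions above can be probed with a small script while deciding whether the schema itself should reject them. This is a rough sketch that assumes `tools`, `exclude`, `mergeStrategy`, and `sectionMarker` are top-level keys of the sync configuration; the real nesting in `sync.schema.json` may differ, and `sync_config_conflicts` is a hypothetical name.

```python
from typing import Any, Dict, List


def sync_config_conflicts(config: Dict[str, Any]) -> List[str]:
    """Return human-readable descriptions of contradictory settings."""
    problems: List[str] = []

    # Assumed layout: both keys hold arrays of tool-name strings.
    tools = config.get("tools") or []
    exclude = config.get("exclude") or []
    overlap = sorted(set(tools) & set(exclude))
    if overlap:
        problems.append("listed in both tools and exclude: " + ", ".join(overlap))

    # Assumed layout: mergeStrategy and sectionMarker are sibling keys.
    if config.get("mergeStrategy") == "section" and not config.get("sectionMarker"):
        problems.append('mergeStrategy is "section" but sectionMarker is missing')

    return problems
```

If the second rule should be enforced by the schema rather than by tooling, draft-07 `if`/`then` (or `dependencies`) can express it; the overlap rule cannot be expressed in JSON Schema and would need to be documented as runtime behavior.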

---

## Phase 4: Error Message Quality

### 4.1 Validation Error Context
- [ ] Each property has a `description` for clear error messages
- [ ] Enums have meaningful value names
- [ ] Pattern constraints have `title` or `description` explaining expected format
- [ ] Complex `allOf`/`anyOf`/`oneOf` have clear descriptions

### 4.2 Error Recovery Guidance
Consider documenting in descriptions:
- [ ] What to do when validation fails
- [ ] Common mistakes and how to fix them
- [ ] Links to documentation

---

## Phase 5: Schema Store Requirements

### 5.1 Schema Store Catalog Entry
Verify each schema is ready for catalog.json entry:
```json
{
  "name": "ACP [Type] Configuration",
  "description": "[Clear description]",
  "fileMatch": ["[patterns]"],
  "url": "https://acp-protocol.dev/schemas/v1/[name].schema.json"
}
```

### 5.2 File Match Patterns
Verify appropriate file patterns:
- [ ] `attempts.schema.json`: `["acp.attempts.json", ".acp/acp.attempts.json"]`
- [ ] `primer.schema.json`: `["primer.json", "primer.defaults.json", ".acp/primer.json"]`
- [ ] `sync.schema.json`: `["acp.sync.json", ".acp/acp.sync.json"]`

### 5.3 URL Accessibility
- [ ] Schema URLs will be accessible at acp-protocol.dev
- [ ] `$id` matches planned hosting URL
- [ ] No localhost or development URLs
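
The `$id` checks above are mechanical enough to script. This is a minimal sketch that compares `$id` against the URL pattern given in section 1.1; the function name `check_id` and the list of development hosts are assumptions.

```python
import re
from typing import Any, Dict, List
from urllib.parse import urlparse

ID_PATTERN = re.compile(r"^https://acp-protocol\.dev/schemas/v1/([a-z]+)\.schema\.json$")
DEV_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0"}  # illustrative, not exhaustive


def check_id(schema: Dict[str, Any], expected_name: str) -> List[str]:
    """Return problems with the schema's `$id`, or an empty list if it looks right."""
    schema_id = schema.get("$id")
    if not isinstance(schema_id, str):
        return ["missing or non-string $id"]

    problems: List[str] = []
    host = urlparse(schema_id).hostname or ""
    if host in DEV_HOSTS:
        problems.append(f"$id points at a development host: {host}")

    match = ID_PATTERN.match(schema_id)
    if match is None:
        problems.append("$id does not follow the planned hosting URL pattern")
    elif match.group(1) != expected_name:
        problems.append(f"$id names '{match.group(1)}', expected '{expected_name}'")
    return problems
```

Call it once per file, for example `check_id(schema, "sync")` for `sync.schema.json`. It does not test that the URL is actually reachable; that still needs a manual or HTTP check once the schemas are published.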

### 5.4 Schema Store Validation
- [ ] Schemas pass `ajv compile` without errors
- [ ] Schemas pass Schema Store's own validation
- [ ] No deprecated JSON Schema keywords used
- [ ] File size reasonable (<100KB each)

---

## Phase 6: Test Fixtures

### 6.1 Valid Examples
For each schema, verify `examples` array contains:
- [ ] Minimal valid document
- [ ] Full-featured document
- [ ] Examples covering major use cases

### 6.2 Invalid Examples (for testing)
Document expected rejections:
- [ ] Missing required fields
- [ ] Invalid enum values
- [ ] Type mismatches
- [ ] Pattern violations
- [ ] Constraint violations (min/max)

---

## Phase 7: Remediation

For each issue found:

### 7.1 Issue Documentation
```markdown
### Issue: [Brief title]
- **Schema**: [which schema]
- **Location**: [JSON path]
- **Severity**: Critical/High/Medium/Low
- **Description**: [What's wrong]
- **Impact**: [What breaks]
- **Fix**: [How to fix]
```
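
If findings are collected by script, a small record type keeps them in the shape of the template above. This is a minimal sketch; the `Issue` class and its field names are hypothetical and simply mirror the template's bullet labels.

```python
from dataclasses import dataclass

SEVERITIES = ("Critical", "High", "Medium", "Low")


@dataclass
class Issue:
    title: str
    schema: str
    location: str  # JSON path of the offending node
    severity: str  # one of SEVERITIES
    description: str
    impact: str
    fix: str

    def to_markdown(self) -> str:
        """Render the issue using the Phase 7.1 template."""
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")
        return "\n".join([
            f"### Issue: {self.title}",
            f"- **Schema**: {self.schema}",
            f"- **Location**: {self.location}",
            f"- **Severity**: {self.severity}",
            f"- **Description**: {self.description}",
            f"- **Impact**: {self.impact}",
            f"- **Fix**: {self.fix}",
        ])
```

Rejecting unknown severities at render time keeps the final summary sortable by severity without a cleanup pass.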

### 7.2 Apply Fixes
- [ ] Make minimal, targeted changes
- [ ] Document each change
- [ ] Re-validate after changes
- [ ] Update examples if affected

### 7.3 Verification
- [ ] Run `ajv compile` on all schemas (`ajv validate` requires a data file, so it applies to the examples, not to the schemas alone)
- [ ] Run `ajv validate` on all examples against their schemas
- [ ] Verify cross-schema references still work
- [ ] Test with real-world sample data

---

## Phase 8: Final Pre-Submission Checklist

### 8.1 Schema Quality
- [ ] All schemas pass JSON Schema meta-validation
- [ ] All examples validate against their schemas
- [ ] No validation warnings
- [ ] Consistent formatting (2-space indent, sorted keys optional)
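
The formatting item can be checked by re-serializing each file and comparing the result with what is on disk. This is a minimal sketch, assuming the target style is exactly what Python's `json.dumps(..., indent=2)` produces; a schema formatted by another tool (for example, one that keeps short arrays on a single line) would be reported as inconsistent even if its indentation is fine.

```python
import json


def is_canonically_formatted(text: str, sort_keys: bool = False) -> bool:
    """Return True if `text` equals its own 2-space-indented re-serialization."""
    data = json.loads(text)
    expected = json.dumps(data, indent=2, sort_keys=sort_keys, ensure_ascii=False)
    # Ignore the trailing newline that editors usually add at end of file.
    return text.rstrip("\n") == expected
```

Read each schema with `open(path, encoding="utf-8").read()` and pass the text in; set `sort_keys=True` only if sorted keys are adopted as a project convention.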

### 8.2 Documentation
- [ ] All properties have descriptions
- [ ] Examples are representative
- [ ] Schema Store entry is prepared

### 8.3 Testing
- [ ] Positive test cases pass
- [ ] Negative test cases fail as expected
- [ ] Edge cases handled appropriately

### 8.4 Submission Ready
- [ ] Schemas are at stable URLs
- [ ] PR to SchemaStore/schemastore prepared
- [ ] Catalog entry JSON ready

---

## Execution Instructions

Run this analysis sequentially. For each phase:

1. **Read** the relevant schema file(s) completely
2. **Analyze** according to the checklist items
3. **Document** all findings (pass/fail/needs attention)
4. **Remediate** any issues before proceeding to the next phase
5. **Verify** fixes don't introduce new issues

After completing all phases, provide:
1. Summary of all issues found and fixed
2. Any remaining concerns or recommendations
3. Final schemas ready for submission
4. Schema Store catalog entry JSON

---

## Schema Locations

```
schemas/v1/attempts.schema.json
schemas/v1/primer.schema.json
schemas/v1/sync.schema.json
```

Begin with Phase 1 on `attempts.schema.json`, complete Phase 1 for each of the three schemas, then move on to the cross-schema analysis in Phase 2 and the remaining phases in order.