Commit c457869

docs: Add scope.md, expand AI enhancement documentation with architecture and targets, and introduce output verification for writers.
1 parent a6339dd commit c457869

7 files changed: +377 -38 lines changed

docs/ai-enhancement.md

Lines changed: 151 additions & 30 deletions
@@ -1,56 +1,177 @@
1-
# AI-Driven Metadata Enhancement for apcore-toolkit
1+
# AI-Driven Metadata Enhancement
22

3-
This document outlines the strategy for using Small Language Models (SLMs) like **Qwen 1.5 (0.6B - 1.7B)** to enhance the metadata extracted by `apcore-toolkit-python`.
3+
This document specifies how `apcore-toolkit` uses Small Language Models (SLMs) to fill metadata gaps that static analysis cannot resolve.
44

55
## 1. Goal
66

7-
The toolkit's primary mission is to make existing code "AI-Perceivable". While static analysis (regex, AST) is efficient, it often fails to:
8-
- Generate meaningful `description` and `documentation` for legacy code.
9-
- Create effective `ai_guidance` for complex error handling.
10-
- Infer `input_schema` for functions using `*args` or `**kwargs`.
7+
The toolkit's primary mission is to make existing code "AI-Perceivable". While static analysis (regex, AST, type hints) is efficient, it often fails to:
118

12-
Using a local SLM allows the toolkit to "understand" the code logic and fill these gaps with high speed and zero cost.
9+
- Generate meaningful `description` and `documentation` for legacy code with no docstrings.
10+
- Create effective `ai_guidance` for complex error handling paths.
11+
- Infer `input_schema` for functions using `*args` or `**kwargs`.
12+
- Determine behavioral `annotations` (e.g., is this function destructive?) from code logic.
1313

14-
## 2. Architecture: Local LLM Provider (Option B)
14+
A local SLM fills these gaps with high speed, zero cost, and no data leakage.
1515

16-
To keep `apcore-toolkit-python` lightweight, we **DO NOT** bundle model weights. Instead, we use an OpenAI-compatible local API provider (e.g., Ollama, vLLM, LM Studio).
16+
## 2. Architecture
1717

18-
### Configuration via Environment Variables
18+
To keep `apcore-toolkit` lightweight, we **do not** bundle model weights. Instead, we call an OpenAI-compatible local API provider.
1919

20-
The AI enhancement feature is controlled by the following environment variables:
20+
### Configuration
2121

2222
| Variable | Description | Default |
2323
|----------|-------------|---------|
24-
| `APCORE_AI_ENABLED` | Whether to enable SLM-based metadata enhancement. | `false` |
25-
| `APCORE_AI_ENDPOINT` | The URL of the OpenAI-compatible local API. | `http://localhost:11434/v1` |
26-
| `APCORE_AI_MODEL` | The model name to use (e.g., `qwen:0.6b`). | `qwen:0.6b` |
27-
| `APCORE_AI_THRESHOLD` | Confidence threshold for AI-generated metadata (0-1). | `0.7` |
28-
29-
## 3. Recommended Setup (Ollama)
24+
| `APCORE_AI_ENABLED` | Enable SLM-based metadata enhancement. | `false` |
25+
| `APCORE_AI_ENDPOINT` | URL of the OpenAI-compatible API. | `http://localhost:11434/v1` |
26+
| `APCORE_AI_MODEL` | Model name (e.g., `qwen:0.6b`). | `qwen:0.6b` |
27+
| `APCORE_AI_THRESHOLD` | Confidence threshold for accepting AI-generated metadata (0.0–1.0). | `0.7` |
28+
| `APCORE_AI_BATCH_SIZE` | Number of modules to enhance per API call. | `5` |
29+
| `APCORE_AI_TIMEOUT` | Timeout in seconds for each API call. | `30` |
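
As a rough illustration of how the variables in this table could be read, the sketch below loads them with the documented defaults. `AIConfig` and `load_ai_config` are hypothetical names used only for this example, and the range check on the threshold is an assumption, not documented toolkit behavior.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AIConfig:
    """Hypothetical container for the APCORE_AI_* settings."""
    enabled: bool
    endpoint: str
    model: str
    threshold: float
    batch_size: int
    timeout: float


def load_ai_config(env=None) -> AIConfig:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    threshold = float(env.get("APCORE_AI_THRESHOLD", "0.7"))
    if not 0.0 <= threshold <= 1.0:
        # Assumption: out-of-range thresholds are rejected rather than clamped.
        raise ValueError("APCORE_AI_THRESHOLD must be between 0.0 and 1.0")
    return AIConfig(
        enabled=env.get("APCORE_AI_ENABLED", "false").strip().lower() == "true",
        endpoint=env.get("APCORE_AI_ENDPOINT", "http://localhost:11434/v1"),
        model=env.get("APCORE_AI_MODEL", "qwen:0.6b"),
        threshold=threshold,
        batch_size=int(env.get("APCORE_AI_BATCH_SIZE", "5")),
        timeout=float(env.get("APCORE_AI_TIMEOUT", "30")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests.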
3030

31-
For the best developer experience, we recommend using [Ollama](https://ollama.com/):
31+
### Recommended Setup (Ollama)
3232

33-
1. **Install Ollama**.
34-
2. **Pull the recommended model**:
33+
1. **Install Ollama**: [ollama.com](https://ollama.com/)
34+
2. **Pull a model**:
3535
```bash
3636
ollama run qwen:0.6b
3737
```
38-
3. **Configure environment**:
38+
3. **Configure**:
3939
```bash
4040
export APCORE_AI_ENABLED=true
41-
export APCORE_AI_MODEL="qwen:0.6b"
4241
```
4342

43+
## 3. Enhancement Targets
44+
45+
The enhancer operates on `ScannedModule` instances **after** static scanning is complete. It only fills fields that are missing or below the confidence threshold.
46+
47+
### 3.1 Description Generation
48+
49+
**When**: `description` is empty or auto-generated (e.g., copied from function name).
50+
51+
**Prompt strategy**: Send the function signature, docstring (if partial), and first 50 lines of the function body. Ask for a ≤200-character description following apcore's convention.
52+
53+
**Audit tag**: `x-generated-by: slm` in `metadata`.
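
The prompt strategy for descriptions could be assembled along these lines. This is a minimal sketch: `build_description_prompt` and the exact prompt wording are assumptions for illustration, and the caller is expected to supply the signature, docstring, and source text.

```python
def build_description_prompt(signature, docstring, source, max_lines=50, max_chars=200):
    """Assemble a description prompt from a signature, a docstring, and leading source lines."""
    # Keep only the first `max_lines` lines so small models are not overloaded.
    body = "\n".join(source.splitlines()[:max_lines])
    return (
        f"Describe this function in at most {max_chars} characters.\n"
        f"Signature: {signature}\n"
        f"Docstring: {docstring or '(none)'}\n"
        f"Source (first {max_lines} lines):\n{body}"
    )
```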
54+
55+
### 3.2 Documentation Generation
56+
57+
**When**: `documentation` is empty and the function has non-trivial logic (>10 lines).
58+
59+
**Prompt strategy**: Send the full function body. Ask for a ≤5000-character Markdown explanation covering purpose, parameters, return value, and error conditions.
60+
61+
### 3.3 Annotation Inference
62+
63+
**When**: All annotations are at their default values (no explicit annotation was set by the scanner).
64+
65+
This is where the SLM adds the most value — inferring behavioral semantics that static analysis cannot determine reliably.
66+
67+
**Prompt strategy**: Send the function body and ask the model to classify each annotation with a confidence score:
68+
69+
| Annotation | What the SLM looks for |
70+
|-----------|----------------------|
71+
| `readonly` | No writes to databases, files, or external services |
72+
| `destructive` | Deletes data, overwrites files, drops resources |
73+
| `idempotent` | Repeating the call with the same input has the same effect as calling it once; safe to retry |
74+
| `requires_approval` | Sends money, deletes accounts, modifies permissions |
75+
| `open_world` | HTTP calls, file I/O, database queries, subprocess calls |
76+
| `streaming` | Yields/iterates results incrementally |
77+
78+
**Acceptance rule**: Only apply an annotation if the SLM's confidence ≥ `APCORE_AI_THRESHOLD`. Otherwise, leave as default and add a warning to `ScannedModule.warnings`.
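
The acceptance rule can be sketched as follows. `ModuleStub` is a hypothetical stand-in for `ScannedModule` carrying only the fields this example needs, and the `(value, confidence)` input shape is an assumption about how a parsed SLM response might look.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleStub:
    """Hypothetical stand-in for ScannedModule with only the fields used here."""
    module_id: str
    annotations: dict = field(default_factory=dict)
    warnings: list = field(default_factory=list)


def apply_annotations(module, inferred, threshold=0.7):
    """Apply each (value, confidence) pair only when confidence >= threshold."""
    for name, (value, confidence) in inferred.items():
        if confidence >= threshold:
            module.annotations[name] = value
        else:
            # Rejected values are left at their defaults and surfaced for review.
            module.warnings.append(
                f"Low confidence ({confidence:.2f}) for annotations.{name}; skipped. Review manually."
            )
    return module
```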
79+
80+
!!! tip "Inspired by HARNESS.md"
81+
The annotation inference approach draws from [CLI-Anything's HARNESS.md](https://github.com/HKUDS/CLI-Anything) methodology, which catalogs undo/redo systems to determine destructiveness. For web frameworks, the equivalent is analyzing database transactions, file operations, and external API calls in the function body.
82+
83+
### 3.4 Schema Inference for Untyped Functions
84+
85+
**When**: `input_schema` is empty and the function uses `*args`, `**kwargs`, or `request` objects without type annotations.
86+
87+
**Prompt strategy**: Send the function body. Ask the model to infer parameter names, types, and whether they are required, based on how `kwargs` keys are accessed in the code.
88+
89+
**Output format**: A JSON Schema object that the toolkit merges into `ScannedModule.input_schema`.
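
One way the merge step might look is sketched below. The precedence rule (statically extracted properties win over inferred ones) and the name `merge_input_schema` are assumptions for illustration; the source only states that the inferred schema is merged into `ScannedModule.input_schema`.

```python
def merge_input_schema(existing, inferred):
    """Merge an inferred JSON Schema into an existing one; existing entries win."""
    merged = {"type": "object", **existing}
    # Start from inferred properties, then let statically extracted ones override.
    properties = dict(inferred.get("properties", {}))
    properties.update(existing.get("properties", {}))
    merged["properties"] = properties
    # Union of required names, preserving order and avoiding duplicates.
    required = list(existing.get("required", []))
    for name in inferred.get("required", []):
        if name not in required:
            required.append(name)
    if required:
        merged["required"] = required
    return merged
```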
90+
4491
## 4. Enhancement Workflow
4592
46-
When `APCORE_AI_ENABLED` is set to `true`, the `Scanner` will:
93+
```
94+
Scanner.scan()
95+
96+
97+
list[ScannedModule] ← static metadata (may have gaps)
98+
99+
100+
AIEnhancer.enhance(modules) ← fills gaps using SLM
101+
102+
├─ For each module:
103+
│ 1. Check which fields are missing/default
104+
│ 2. Build targeted prompt for each gap
105+
│ 3. Call SLM API
106+
│ 4. Parse response, check confidence
107+
│ 5. Merge accepted enhancements
108+
│ 6. Tag with x-generated-by: slm
109+
│ 7. Add warnings for rejected/low-confidence results
110+
111+
112+
list[ScannedModule] ← enriched metadata
113+
114+
115+
Writer.write(modules) ← output as YAML/Python/Registry
116+
```
117+
118+
### Integration with BaseScanner
119+
120+
The enhancer is **not** called automatically by `BaseScanner.scan()`. Framework adapters opt in explicitly:
121+
122+
=== "Python"
123+
124+
```python
125+
from apcore_toolkit import AIEnhancer
126+
127+
scanner = MyFrameworkScanner()
128+
modules = scanner.scan()
129+
130+
if AIEnhancer.is_enabled():
131+
enhancer = AIEnhancer()
132+
modules = enhancer.enhance(modules)
133+
134+
writer.write(modules, output_dir="./bindings")
135+
```
136+
137+
=== "TypeScript"
138+
139+
```typescript
140+
import { AIEnhancer } from "apcore-toolkit";
141+
142+
const scanner = new MyFrameworkScanner();
143+
let modules = scanner.scan();
144+
145+
if (AIEnhancer.isEnabled()) {
146+
const enhancer = new AIEnhancer();
147+
modules = await enhancer.enhance(modules);
148+
}
149+
150+
writer.write(modules, { outputDir: "./bindings" });
151+
```
152+
153+
## 5. Confidence Scoring
154+
155+
Each AI-generated field includes a confidence score (0.0–1.0) stored in `metadata`:
156+
157+
```yaml
158+
metadata:
159+
x-generated-by: slm
160+
x-ai-confidence:
161+
description: 0.92
162+
annotations.destructive: 0.85
163+
annotations.readonly: 0.45 # below threshold, not applied
164+
```
165+
166+
Fields below `APCORE_AI_THRESHOLD` are **not** applied to the module. Instead, a warning is added:
47167
48-
1. **Extract static metadata** from docstrings and type hints.
49-
2. **Identify missing fields** (e.g., empty `description` or missing `ai_guidance`).
50-
3. **Send code snippets** to the local SLM with a structured prompt.
51-
4. **Merge the AI-generated metadata** into the final `ScannedModule`, marking them with a `x-generated-by: "slm"` tag for human audit.
168+
```
169+
"Low confidence (0.45) for annotations.readonly — skipped. Review manually."
170+
```
52171
53-
## 5. Security and Privacy
172+
## 6. Security and Privacy
54173
55-
- **No Data Leakage**: Since the model runs locally, your source code never leaves your machine.
56-
- **Auditability**: All AI-generated fields MUST be reviewed by the developer before committing the generated `apcore.yaml`.
174+
- **No data leakage**: The model runs locally. Source code never leaves the machine.
175+
- **Auditability**: All AI-generated fields are tagged with `x-generated-by: slm` for human review.
176+
- **Opt-in only**: Disabled by default (`APCORE_AI_ENABLED=false`).
177+
- **Graceful degradation**: If the SLM endpoint is unreachable, the enhancer logs a warning and returns modules unchanged.
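
Graceful degradation could be implemented with a thin wrapper like the one below. This is a sketch under stated assumptions: `enhance_or_passthrough`, the `call_slm` callable, and the logger name are hypothetical, and the set of exceptions treated as "unreachable" is a guess at what an HTTP client built on the standard library would raise.

```python
import logging
import urllib.error

logger = logging.getLogger("apcore_toolkit.ai")  # hypothetical logger name


def enhance_or_passthrough(modules, call_slm):
    """Return enhanced modules, or the originals if the endpoint cannot be reached."""
    try:
        return call_slm(modules)
    except (urllib.error.URLError, TimeoutError, ConnectionError) as exc:
        # Do not fail the scan: log and hand back the static metadata untouched.
        logger.warning("SLM endpoint unreachable (%s); returning modules unchanged.", exc)
        return modules
```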

docs/features/output-writers.md

Lines changed: 59 additions & 0 deletions
@@ -86,6 +86,65 @@ Directly registers the scanned modules into an active `apcore.Registry` instance
8686
writer.write(modules, registry);
8787
```
8888

89+
## Output Verification
90+
91+
Writers can optionally **verify** that their output artifacts are well-formed after writing. This prevents silent failures where a writer produces a file that apcore cannot load.
92+
93+
### Verification by Writer Type
94+
95+
| Writer | Verification Checks |
96+
|--------|-------------------|
97+
| `YAMLWriter` | File exists, YAML parses without error, contains required `module_id` and `target` fields |
98+
| `PythonWriter` / `TypeScriptWriter` | File exists, source code parses without syntax errors (AST/TS compiler check) |
99+
| `RegistryWriter` | Module ID is registered, `registry.get(module_id)` returns a valid module |
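
For the Python source case, the check in the table could be approximated with the standard library's `ast` module. `verify_python_file` is a hypothetical helper written for this example; it is not claimed to be how `PythonWriter` performs verification.

```python
import ast
from pathlib import Path


def verify_python_file(path):
    """Return (verified, error) for a generated Python file."""
    file = Path(path)
    if not file.is_file():
        return False, f"file not found: {file}"
    try:
        # Parsing is enough to catch syntax errors without executing the code.
        ast.parse(file.read_text(encoding="utf-8"), filename=str(file))
    except SyntaxError as exc:
        return False, f"syntax error at line {exc.lineno}: {exc.msg}"
    return True, None
```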
100+
101+
### Usage
102+
103+
Verification is enabled via the `verify` parameter:
104+
105+
=== "Python"
106+
107+
```python
108+
from apcore_toolkit import YAMLWriter
109+
110+
writer = YAMLWriter()
111+
results = writer.write(modules, output_dir="./bindings", verify=True)
112+
113+
# results contains verification status per module
114+
for r in results:
115+
if not r.verified:
116+
print(f"WARNING: {r.module_id} — {r.verification_error}")
117+
```
118+
119+
=== "TypeScript"
120+
121+
```typescript
122+
import { YAMLWriter } from "apcore-toolkit";
123+
124+
const writer = new YAMLWriter();
125+
const results = writer.write(modules, { outputDir: "./bindings", verify: true });
126+
127+
for (const r of results) {
128+
if (!r.verified) {
129+
console.warn(`WARNING: ${r.moduleId} — ${r.verificationError}`);
130+
}
131+
}
132+
```
133+
134+
### Verification Result
135+
136+
Each write operation returns a list of `WriteResult` objects:
137+
138+
| Field | Type | Description |
139+
|-------|------|-------------|
140+
| `module_id` | `str` | The module that was written |
141+
| `path` | `str \| None` | Output file path (None for RegistryWriter) |
142+
| `verified` | `bool` | Whether verification passed (always `True` if `verify=False`) |
143+
| `verification_error` | `str \| None` | Error message if verification failed |
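
The fields in this table map naturally onto a small dataclass. The sketch below mirrors them for illustration only; the defaults and the `failed_results` helper are assumptions rather than the toolkit's actual definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WriteResult:
    """Sketch mirroring the documented WriteResult fields."""
    module_id: str
    path: Optional[str] = None  # None for RegistryWriter
    verified: bool = True       # stays True when verification is not requested
    verification_error: Optional[str] = None


def failed_results(results):
    """Collect results whose verification did not pass."""
    return [r for r in results if not r.verified]
```

In a CI job, a non-empty list from `failed_results` would be a reasonable signal to exit with a non-zero status.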
144+
145+
!!! tip "Use in CI"
146+
Enable verification in CI pipelines to catch binding generation issues before deployment. A scan → write → verify cycle ensures that generated artifacts are always loadable by apcore.
147+
89148
## Choosing a Writer
90149

91150
| Use Case | Recommended Writer |

docs/features/overview.md

Lines changed: 8 additions & 3 deletions
@@ -6,16 +6,21 @@
66

77
| Feature | Description |
88
|---------|-------------|
9-
| **[Smart Scanning](scanning.md)** | Abstract base classes and utilities for framework-specific scanners. |
9+
| **[Smart Scanning](scanning.md)** | Abstract base classes and utilities for framework-specific scanners, with a 5-phase ability extraction methodology. |
1010
| **[OpenAPI Integration](openapi.md)** | Extract JSON Schemas directly from OpenAPI operation objects. |
1111
| **[Schema Utilities](pydantic.md)** | Flatten complex models (Pydantic / Zod) for easier AI interaction. |
12-
| **[Output Writers](output-writers.md)** | Export metadata to YAML bindings, Python wrappers, or direct Registry registration. |
12+
| **[Output Writers](output-writers.md)** | Export metadata to YAML bindings, source code wrappers, or direct Registry registration — with optional output verification. |
1313
| **[Formatting](formatting.md)** | Convert data structures into beautiful, human-readable Markdown. |
14-
| **[AI Enhancement](../ai-enhancement.md)** | Use local SLMs to automatically fill in missing metadata. |
14+
| **[AI Enhancement](../ai-enhancement.md)** | Use local SLMs to automatically fill in missing metadata, including behavioral annotation inference. |
1515

1616
## Design Philosophy
1717

1818
- **Framework Agnostic**: The core logic has no dependency on specific web frameworks (Django, Flask, FastAPI).
1919
- **Separation of Concerns**: Scanning (extraction), Schema Utilities (refinement), and Writers (export) are kept distinct.
2020
- **Developer First**: Focuses on automating the tedious tasks of writing `apcore.yaml` or `@module` decorators.
2121
- **AI-Native**: Built with the assumption that the ultimate consumer of this metadata is a Large Language Model (LLM) or AI agent.
22+
- **Dual-Language Parity**: Every feature is implementable in both Python and TypeScript.
23+
24+
## Scope
25+
26+
For a detailed definition of what the toolkit does and does not do, see the [Scope & Boundaries](../scope.md) document.

docs/features/scanning.md

Lines changed: 69 additions & 4 deletions
@@ -13,6 +13,69 @@ The `BaseScanner` ABC (Abstract Base Class) provides a consistent interface and
1313
| `infer_annotations(...)` | Infer `readonly`, `destructive`, or `idempotent` from HTTP methods. |
1414
| `deduplicate_ids(...)` | Automatically resolve duplicate module IDs by appending suffixes (`_2`, `_3`). |
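
The suffixing behavior described for `deduplicate_ids(...)` can be illustrated with a standalone function. `dedupe_module_ids` is a hypothetical name chosen to avoid implying the real method's signature, which is elided in the table.

```python
def dedupe_module_ids(ids):
    """Append _2, _3, ... to repeated IDs, keeping the first occurrence unchanged."""
    seen = {}
    result = []
    for module_id in ids:
        count = seen.get(module_id, 0) + 1
        seen[module_id] = count
        result.append(module_id if count == 1 else f"{module_id}_{count}")
    return result
```

This simple version does not guard against an input that already contains a suffixed name such as `users_2`; a production implementation would need to handle that case.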
1515

16+
## Ability Extraction Methodology
17+
18+
When building a scanner for a new framework, follow this systematic approach to ensure comprehensive metadata extraction. This methodology is adapted from real-world experience scanning 10+ software systems.
19+
20+
### Phase 1: Identify the Backend Engine
21+
22+
Separate the framework's **routing/dispatch layer** from its **business logic layer**. The scanner should target the dispatch layer to discover endpoints, then reach into the business logic layer for metadata.
23+
24+
| Framework | Dispatch Layer | Business Logic |
25+
|-----------|---------------|----------------|
26+
| Django REST | `urlpatterns` + `ViewSet` | serializer methods, queryset logic |
27+
| Flask | `@app.route` + Blueprints | view function body |
28+
| FastAPI | `@router.get/post` | endpoint function with type hints |
29+
| Express | `router.get/post` | handler functions |
30+
| NestJS | `@Controller` + `@Get/@Post` | service methods |
31+
32+
### Phase 2: Map Operations to Modules
33+
34+
For each discovered endpoint, extract the canonical mapping:
35+
36+
```
37+
Framework endpoint → ScannedModule
38+
───────────────── ─────────────
39+
route path module_id
40+
handler function target
41+
request schema input_schema
42+
response schema output_schema
43+
docstring description + documentation
44+
```
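
The last row of the mapping, docstring to `description` plus `documentation`, might be implemented as below. Treating the first docstring line as the description and the remainder as documentation is an assumption for this sketch, and `split_docstring` is a hypothetical helper.

```python
import inspect


def split_docstring(func):
    """Return (description, documentation) from a handler's docstring."""
    # inspect.getdoc normalizes indentation and returns None when no docstring exists.
    doc = inspect.getdoc(func) or ""
    summary, _, rest = doc.partition("\n")
    return summary.strip(), rest.strip()
```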
45+
46+
### Phase 3: Extract Data Models
47+
48+
Leverage the framework's native schema system:
49+
50+
- **Python**: Pydantic models, Django serializers, marshmallow schemas
51+
- **TypeScript**: Zod schemas, class-validator decorators, interfaces
52+
53+
Use the toolkit's `flatten_pydantic_params()` or `flattenParams()` to convert nested models into flat schemas when needed.
54+
55+
### Phase 4: Discover Existing API Contracts
56+
57+
Check for existing machine-readable API definitions that can supplement or replace code scanning:
58+
59+
- OpenAPI/Swagger specs (use `extract_input_schema()` / `extract_output_schema()`)
60+
- GraphQL schemas
61+
- gRPC/Protobuf definitions
62+
- Existing MCP server manifests
63+
64+
### Phase 5: Infer Behavioral Annotations
65+
66+
Go beyond HTTP method heuristics. Analyze the function body for behavioral signals:
67+
68+
| Signal in Code | Inferred Annotation |
69+
|---------------|-------------------|
70+
| `DELETE` method, `.delete()` calls, `DROP` SQL | `destructive=True` |
71+
| `GET` method, no DB writes, pure computation | `readonly=True` |
72+
| `PUT` method, upsert patterns | `idempotent=True` |
73+
| Sends email/SMS, processes payment, modifies permissions | `requires_approval=True` |
74+
| HTTP client calls, file I/O, subprocess | `open_world=True` |
75+
| `yield`, `StreamingResponse`, `async for` | `streaming=True` |
76+
77+
Static analysis can detect some of these patterns. For ambiguous cases, the [AI Enhancement](../ai-enhancement.md) module can assist with SLM-based inference.
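
A few of the signals in the table can be detected statically with Python's `ast` module. The sketch below covers only three patterns (`yield` or `async for`, `.delete()` calls, and `open()` calls) and is intentionally naive; `detect_signals` is a hypothetical helper, not part of `BaseScanner`.

```python
import ast
import textwrap


def detect_signals(source):
    """Scan function source for a few behavioral signals and return annotation names."""
    tree = ast.parse(textwrap.dedent(source))
    signals = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Yield, ast.YieldFrom, ast.AsyncFor)):
            signals.add("streaming")
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Attribute) and func.attr == "delete":
                # Any `.delete()` method call is treated as a destructive hint.
                signals.add("destructive")
            elif isinstance(func, ast.Name) and func.id == "open":
                # Built-in open() implies file I/O.
                signals.add("open_world")
    return signals
```

Name-based matching like this produces false positives (for example, `cache.delete()`), which is one reason ambiguous cases are deferred to the SLM.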
78+
1679
## Implementation Example
1780

1881
When implementing a custom scanner, you inherit from `BaseScanner`:
@@ -93,7 +156,9 @@ Scanners often encounter naming collisions (e.g., `GET /users` and `POST /users`
93156
## Behavioral Inference
94157

95158
`infer_annotations_from_method()` provides a sensible default for mapping HTTP verbs to apcore's `ModuleAnnotations`:
96-
- `GET` $\rightarrow$ `readonly=True`
97-
- `DELETE` $\rightarrow$ `destructive=True`
98-
- `PUT` $\rightarrow$ `idempotent=True`
99-
- Others $\rightarrow$ Default (all False)
159+
- `GET` → `readonly=True`
160+
- `DELETE` → `destructive=True`
161+
- `PUT` → `idempotent=True`
162+
- Others → Default (all False)
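
The verb mapping listed above amounts to a small lookup. The sketch below returns a plain dictionary rather than a `ModuleAnnotations` instance, and `annotations_for_method` is a hypothetical name used only for illustration.

```python
def annotations_for_method(method):
    """Map an HTTP verb to default annotation flags."""
    flags = {"readonly": False, "destructive": False, "idempotent": False}
    key = {"GET": "readonly", "DELETE": "destructive", "PUT": "idempotent"}.get(method.upper())
    if key:
        flags[key] = True
    return flags
```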
163+
164+
For deeper behavioral analysis beyond HTTP methods, see [Phase 5](#phase-5-infer-behavioral-annotations) above and the [AI Enhancement](../ai-enhancement.md) module.

docs/getting-started.md

Lines changed: 1 addition & 1 deletion
@@ -263,7 +263,7 @@ Enhance your metadata using local Small Language Models (SLMs).
263263
```
264264
3. **Run your scanner**: Missing descriptions and documentation will be automatically inferred.
265265

266-
See the [AI Enhancement Guide](AI_ENHANCEMENT.md) for more details.
266+
See the [AI Enhancement Guide](ai-enhancement.md) for more details.
267267

268268
---
269269
