1 | | -# AI-Driven Metadata Enhancement for apcore-toolkit |
| 1 | +# AI-Driven Metadata Enhancement |
2 | 2 |
3 | | -This document outlines the strategy for using Small Language Models (SLMs) like **Qwen 1.5 (0.6B - 1.7B)** to enhance the metadata extracted by `apcore-toolkit-python`. |
| 3 | +This document specifies how `apcore-toolkit` uses Small Language Models (SLMs) to fill metadata gaps that static analysis cannot resolve. |
4 | 4 |
5 | 5 | ## 1. Goal |
6 | 6 |
7 | | -The toolkit's primary mission is to make existing code "AI-Perceivable". While static analysis (regex, AST) is efficient, it often fails to: |
8 | | -- Generate meaningful `description` and `documentation` for legacy code. |
9 | | -- Create effective `ai_guidance` for complex error handling. |
10 | | -- Infer `input_schema` for functions using `*args` or `**kwargs`. |
| 7 | +The toolkit's primary mission is to make existing code "AI-Perceivable". While static analysis (regex, AST, type hints) is efficient, it often fails to: |
11 | 8 |
12 | | -Using a local SLM allows the toolkit to "understand" the code logic and fill these gaps with high speed and zero cost. |
| 9 | +- Generate meaningful `description` and `documentation` for legacy code with no docstrings. |
| 10 | +- Create effective `ai_guidance` for complex error handling paths. |
| 11 | +- Infer `input_schema` for functions using `*args` or `**kwargs`. |
| 12 | +- Determine behavioral `annotations` (e.g., is this function destructive?) from code logic. |
13 | 13 |
14 | | -## 2. Architecture: Local LLM Provider (Option B) |
| 14 | +A local SLM fills these gaps with high speed, zero cost, and no data leakage. |
15 | 15 |
16 | | -To keep `apcore-toolkit-python` lightweight, we **DO NOT** bundle model weights. Instead, we use an OpenAI-compatible local API provider (e.g., Ollama, vLLM, LM Studio). |
| 16 | +## 2. Architecture |
17 | 17 |
18 | | -### Configuration via Environment Variables |
| 18 | +To keep `apcore-toolkit` lightweight, we **do not** bundle model weights. Instead, we call an OpenAI-compatible local API provider (e.g., Ollama, vLLM, LM Studio).
19 | 19 |
20 | | -The AI enhancement feature is controlled by the following environment variables: |
| 20 | +### Configuration |
21 | 21 |
22 | 22 | | Variable | Description | Default | |
23 | 23 | |----------|-------------|---------| |
24 | | -| `APCORE_AI_ENABLED` | Whether to enable SLM-based metadata enhancement. | `false` | |
25 | | -| `APCORE_AI_ENDPOINT` | The URL of the OpenAI-compatible local API. | `http://localhost:11434/v1` | |
26 | | -| `APCORE_AI_MODEL` | The model name to use (e.g., `qwen:0.6b`). | `qwen:0.6b` | |
27 | | -| `APCORE_AI_THRESHOLD` | Confidence threshold for AI-generated metadata (0-1). | `0.7` | |
28 | | - |
29 | | -## 3. Recommended Setup (Ollama) |
| 24 | +| `APCORE_AI_ENABLED` | Enable SLM-based metadata enhancement. | `false` | |
| 25 | +| `APCORE_AI_ENDPOINT` | URL of the OpenAI-compatible API. | `http://localhost:11434/v1` | |
| 26 | +| `APCORE_AI_MODEL` | Model name (e.g., `qwen:0.6b`). | `qwen:0.6b` | |
| 27 | +| `APCORE_AI_THRESHOLD` | Confidence threshold for accepting AI-generated metadata (0.0–1.0). | `0.7` | |
| 28 | +| `APCORE_AI_BATCH_SIZE` | Number of modules to enhance per API call. | `5` | |
| 29 | +| `APCORE_AI_TIMEOUT` | Timeout in seconds for each API call. | `30` | |
30 | 30 |
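These settings could be loaded with a small helper along the following lines (a sketch; `AIConfig` and its field names are illustrative, not part of the toolkit's public API):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AIConfig:
    """Illustrative container for the APCORE_AI_* settings."""
    enabled: bool
    endpoint: str
    model: str
    threshold: float
    batch_size: int
    timeout: float

    @classmethod
    def from_env(cls) -> "AIConfig":
        # Each variable falls back to the documented default.
        return cls(
            enabled=os.getenv("APCORE_AI_ENABLED", "false").lower() == "true",
            endpoint=os.getenv("APCORE_AI_ENDPOINT", "http://localhost:11434/v1"),
            model=os.getenv("APCORE_AI_MODEL", "qwen:0.6b"),
            threshold=float(os.getenv("APCORE_AI_THRESHOLD", "0.7")),
            batch_size=int(os.getenv("APCORE_AI_BATCH_SIZE", "5")),
            timeout=float(os.getenv("APCORE_AI_TIMEOUT", "30")),
        )
```

With no variables set, `AIConfig.from_env()` yields the defaults from the table, and enhancement stays disabled.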
31 | | -For the best developer experience, we recommend using [Ollama](https://ollama.com/): |
| 31 | +### Recommended Setup (Ollama) |
32 | 32 |
33 | | -1. **Install Ollama**. |
34 | | -2. **Pull the recommended model**: |
| 33 | +1. **Install Ollama**: [ollama.com](https://ollama.com/) |
| 34 | +2. **Pull a model**: |
35 | 35 | ```bash |
36 | 36 | ollama pull qwen:0.6b
37 | 37 | ``` |
38 | | -3. **Configure environment**: |
| 38 | +3. **Configure**: |
39 | 39 | ```bash |
40 | 40 | export APCORE_AI_ENABLED=true |
41 | | - export APCORE_AI_MODEL="qwen:0.6b" |
42 | 41 | ``` |
43 | 42 |
| 43 | +## 3. Enhancement Targets |
| 44 | + |
| 45 | +The enhancer operates on `ScannedModule` instances **after** static scanning is complete. It only fills fields that static analysis left empty or at their default values.
| 46 | + |
| 47 | +### 3.1 Description Generation |
| 48 | + |
| 49 | +**When**: `description` is empty or auto-generated (e.g., copied from function name). |
| 50 | + |
| 51 | +**Prompt strategy**: Send the function signature, docstring (if partial), and first 50 lines of the function body. Ask for a ≤200-character description following apcore's convention. |
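The truncation step could be assembled along these lines (a hypothetical helper; the exact prompt template is an implementation detail):

```python
def build_description_prompt(signature: str, docstring: str, body: str) -> str:
    """Compose a description-generation prompt from the signature, any
    partial docstring, and at most the first 50 lines of the body."""
    snippet = "\n".join(body.splitlines()[:50])
    parts = [
        "Write a one-sentence description (max 200 characters) for this function.",
        f"Signature: {signature}",
    ]
    if docstring:
        parts.append(f"Partial docstring: {docstring}")
    parts.append(f"Body:\n{snippet}")
    return "\n\n".join(parts)
```

The 200-character limit is stated in the prompt itself; the enhancer would still validate the model's response length before accepting it.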
| 52 | +
| 53 | +**Audit tag**: `x-generated-by: slm` in `metadata`.
| 54 | +
| 55 | +### 3.2 Documentation Generation
| 56 | +
| 57 | +**When**: `documentation` is empty and the function has non-trivial logic (>10 lines).
| 58 | +
| 59 | +**Prompt strategy**: Send the full function body. Ask for a ≤5000-character Markdown explanation covering purpose, parameters, return value, and error conditions.
| 60 | +
| 61 | +### 3.3 Annotation Inference
| 62 | +
| 63 | +**When**: All annotations are at their default values (no explicit annotation was set by the scanner).
| 64 | +
| 65 | +This is where the SLM adds the most value — inferring behavioral semantics that static analysis cannot determine reliably.
| 66 | +
| 67 | +**Prompt strategy**: Send the function body and ask the model to classify each annotation with a confidence score:
| 68 | +
| 69 | +| Annotation | What the SLM looks for |
| 70 | +|-----------|----------------------|
| 71 | +| `readonly` | No writes to databases, files, or external services |
| 72 | +| `destructive` | Deletes data, overwrites files, drops resources |
| 73 | +| `idempotent` | Repeated calls with the same input have the same effect as a single call; safe to retry |
| 74 | +| `requires_approval` | Sends money, deletes accounts, modifies permissions |
| 75 | +| `open_world` | HTTP calls, file I/O, database queries, subprocess calls |
| 76 | +| `streaming` | Yields/iterates results incrementally |
| 77 | +
| 78 | +**Acceptance rule**: Only apply an annotation if the SLM's confidence ≥ `APCORE_AI_THRESHOLD`. Otherwise, leave as default and add a warning to `ScannedModule.warnings`.
| 79 | + |
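The acceptance rule amounts to a simple gate. A minimal sketch (the function name and input shape are illustrative, not the toolkit's API):

```python
def apply_annotations(
    proposed: dict[str, tuple[bool, float]],
    threshold: float = 0.7,
) -> tuple[dict[str, bool], list[str]]:
    """Split SLM-proposed annotations into accepted values and warnings.

    `proposed` maps an annotation name to a (value, confidence) pair.
    """
    accepted: dict[str, bool] = {}
    warnings: list[str] = []
    for name, (value, confidence) in proposed.items():
        if confidence >= threshold:
            accepted[name] = value
        else:
            # Below-threshold proposals are skipped and surfaced for review.
            warnings.append(
                f"Low confidence ({confidence:.2f}) for annotations.{name} "
                "— skipped. Review manually."
            )
    return accepted, warnings
```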
| 80 | +!!! tip "Inspired by HARNESS.md" |
| 81 | + The annotation inference approach draws from [CLI-Anything's HARNESS.md](https://github.com/HKUDS/CLI-Anything) methodology, which catalogs undo/redo systems to determine destructiveness. For web frameworks, the equivalent is analyzing database transactions, file operations, and external API calls in the function body. |
| 82 | +
| 83 | +### 3.4 Schema Inference for Untyped Functions
| 84 | +
| 85 | +**When**: `input_schema` is empty and the function uses `*args`, `**kwargs`, or `request` objects without type annotations.
| 86 | +
| 87 | +**Prompt strategy**: Send the function body. Ask the model to infer parameter names, types, and whether they are required, based on how `kwargs` keys are accessed in the code.
| 88 | +
| 89 | +**Output format**: A JSON Schema object that the toolkit merges into `ScannedModule.input_schema`.
| 90 | +
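As an illustration, for a hypothetical `create_user(**kwargs)` that reads `kwargs["email"]` unconditionally and calls `kwargs.get("age")`, the inferred schema might look like:

```yaml
input_schema:
  type: object
  properties:
    email:
      type: string
    age:
      type: integer
  required: [email]
```

Keys accessed with bare indexing are treated as required; keys accessed via `.get()` with a fallback are treated as optional.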
44 | 91 | ## 4. Enhancement Workflow |
45 | 92 |
46 | | -When `APCORE_AI_ENABLED` is set to `true`, the `Scanner` will: |
| 93 | +``` |
| 94 | +Scanner.scan() |
| 95 | + │ |
| 96 | + ▼ |
| 97 | +list[ScannedModule] ← static metadata (may have gaps) |
| 98 | + │ |
| 99 | + ▼ |
| 100 | +AIEnhancer.enhance(modules) ← fills gaps using SLM |
| 101 | + │ |
| 102 | + ├─ For each module: |
| 103 | + │ 1. Check which fields are missing/default |
| 104 | + │ 2. Build targeted prompt for each gap |
| 105 | + │ 3. Call SLM API |
| 106 | + │ 4. Parse response, check confidence |
| 107 | + │ 5. Merge accepted enhancements |
| 108 | + │ 6. Tag with x-generated-by: slm |
| 109 | + │ 7. Add warnings for rejected/low-confidence results |
| 110 | + │ |
| 111 | + ▼ |
| 112 | +list[ScannedModule] ← enriched metadata |
| 113 | + │ |
| 114 | + ▼ |
| 115 | +Writer.write(modules) ← output as YAML/Python/Registry |
| 116 | +``` |
| 117 | +
| 118 | +### Integration with BaseScanner
| 119 | +
| 120 | +The enhancer is **not** called automatically by `BaseScanner.scan()`. Framework adapters opt in explicitly:
| 121 | +
| 122 | +=== "Python"
| 123 | +
| 124 | +    ```python
| 125 | +    from apcore_toolkit import AIEnhancer
| 126 | +
| 127 | +    scanner = MyFrameworkScanner()
| 128 | +    modules = scanner.scan()
| 129 | +
| 130 | +    if AIEnhancer.is_enabled():
| 131 | +        enhancer = AIEnhancer()
| 132 | +        modules = enhancer.enhance(modules)
| 133 | +
| 134 | +    writer.write(modules, output_dir="./bindings")
| 135 | +    ```
| 136 | +
| 137 | +=== "TypeScript"
| 138 | +
| 139 | +    ```typescript
| 140 | +    import { AIEnhancer } from "apcore-toolkit";
| 141 | +
| 142 | +    const scanner = new MyFrameworkScanner();
| 143 | +    let modules = scanner.scan();
| 144 | +
| 145 | +    if (AIEnhancer.isEnabled()) {
| 146 | +      const enhancer = new AIEnhancer();
| 147 | +      modules = await enhancer.enhance(modules);
| 148 | +    }
| 149 | +
| 150 | +    writer.write(modules, { outputDir: "./bindings" });
| 151 | +    ```
| 152 | +
| 153 | +## 5. Confidence Scoring
| 154 | +
| 155 | +Each AI-generated field includes a confidence score (0.0–1.0) stored in `metadata`:
| 156 | +
| 157 | +```yaml
| 158 | +metadata:
| 159 | +  x-generated-by: slm
| 160 | +  x-ai-confidence:
| 161 | +    description: 0.92
| 162 | +    annotations.destructive: 0.85
| 163 | +    annotations.readonly: 0.45  # below threshold, not applied
| 164 | +```
| 165 | +
| 166 | +Fields whose confidence falls below `APCORE_AI_THRESHOLD` are **not** applied to the module. Instead, a warning is added:
47 | 167 |
48 | | -1. **Extract static metadata** from docstrings and type hints. |
49 | | -2. **Identify missing fields** (e.g., empty `description` or missing `ai_guidance`). |
50 | | -3. **Send code snippets** to the local SLM with a structured prompt. |
51 | | -4. **Merge the AI-generated metadata** into the final `ScannedModule`, marking them with a `x-generated-by: "slm"` tag for human audit. |
| 168 | +``` |
| 169 | +"Low confidence (0.45) for annotations.readonly — skipped. Review manually." |
| 170 | +``` |
52 | 171 |
53 | | -## 5. Security and Privacy |
| 172 | +## 6. Security and Privacy |
54 | 173 |
55 | | -- **No Data Leakage**: Since the model runs locally, your source code never leaves your machine. |
56 | | -- **Auditability**: All AI-generated fields MUST be reviewed by the developer before committing the generated `apcore.yaml`. |
| 174 | +- **No data leakage**: The model runs locally. Source code never leaves the machine. |
| 175 | +- **Auditability**: All AI-generated fields are tagged with `x-generated-by: slm` for human review. |
| 176 | +- **Opt-in only**: Disabled by default (`APCORE_AI_ENABLED=false`). |
| 177 | +- **Graceful degradation**: If the SLM endpoint is unreachable, the enhancer logs a warning and returns modules unchanged. |
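The graceful-degradation behavior can be sketched as follows (the wrapper below is illustrative, not the toolkit's actual code):

```python
import logging

logger = logging.getLogger("apcore.ai")


def enhance_safely(modules, call_slm):
    """Return enhanced modules, or the originals unchanged if the
    SLM endpoint cannot be reached (graceful degradation)."""
    try:
        return call_slm(modules)
    except (ConnectionError, TimeoutError) as exc:
        # Log and fall back: static metadata is still valid output.
        logger.warning(
            "SLM endpoint unreachable (%s); returning modules unchanged.", exc
        )
        return modules
```

Because the fallback returns the statically scanned modules untouched, enabling AI enhancement can never make the output worse than static analysis alone.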