Commit 80679b6

feat: Introduce initial project documentation, MkDocs site setup, and deployment workflow.

File tree

14 files changed: +876 −0 lines changed

.github/workflows/deploy-docs.yml

Lines changed: 54 additions & 0 deletions
```yaml
name: Deploy Documentation

on:
  push:
    branches:
      - main

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.x"

      - run: pip install mkdocs-material

      - name: Prepare docs for MkDocs
        run: |
          # Sync root-level docs into docs/ to ensure latest content
          cp README.md docs/index.md
          cp CHANGELOG.md docs/changelog.md

          # Fix doc links in index.md: docs/X → X
          # This converts links like (docs/ai-enhancement.md) to (ai-enhancement.md),
          # which is required when index.md is moved inside docs/
          sed -i 's|(docs/|(|g' docs/index.md

      - run: mkdocs build

      - uses: actions/upload-pages-artifact@v3
        with:
          path: site

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4
```
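The `sed` rewrite in the "Prepare docs for MkDocs" step can be sanity-checked locally. A minimal sketch (the sample link is hypothetical; the substitution is the one the workflow uses):

```shell
# Create a sample index.md containing a root-relative doc link
printf 'See the [AI guide](docs/ai-enhancement.md).\n' > index.md

# Apply the workflow's rewrite: (docs/X) -> (X)
sed -i 's|(docs/|(|g' index.md

cat index.md  # -> See the [AI guide](ai-enhancement.md).
```

Note that GNU `sed -i` is assumed, as on the `ubuntu-latest` runner; BSD/macOS `sed` would need `-i ''`.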

CHANGELOG.md

Lines changed: 35 additions & 0 deletions
# Changelog

All notable changes to this project will be documented in this file.

## [0.1.0] - 2026-03-06

Initial release. Extracts shared framework-agnostic logic from `django-apcore` and `flask-apcore` into a standalone toolkit package.

### Added

- `ScannedModule` dataclass — canonical representation of a scanned endpoint
- `BaseScanner` ABC with `filter_modules()`, `deduplicate_ids()`, `infer_annotations_from_method()`, and `extract_docstring()` utilities
- `YAMLWriter` — generates `.binding.yaml` files for `apcore.BindingLoader`
- `PythonWriter` — generates `@module`-decorated Python wrapper files
- `RegistryWriter` — registers modules directly into `apcore.Registry`
- `to_markdown()` — generic dict-to-Markdown conversion with depth control and table heuristics
- `flatten_pydantic_params()` — flattens Pydantic model parameters into scalar kwargs for MCP tool invocation
- `resolve_target()` — resolves `module.path:qualname` target strings
- `enrich_schema_descriptions()` — merges docstring parameter descriptions into JSON Schema properties
- `annotations_to_dict()` / `module_to_dict()` — serialization utilities
- OpenAPI utilities: `resolve_ref()`, `resolve_schema()`, `extract_input_schema()`, `extract_output_schema()`
- Output format factory via `get_writer()`
- 150 tests with 94% code coverage

### Dependencies

- apcore >= 0.9.0
- pydantic >= 2.0
- PyYAML >= 6.0

README.md

Lines changed: 55 additions & 0 deletions
# apcore-toolkit

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python Version](https://img.shields.io/badge/python-3.11%2B-blue)](https://github.com/aipartnerup/apcore-toolkit-python)

**apcore-toolkit** is a shared scanner, schema extraction, and output toolkit for the [apcore](https://github.com/aipartnerup/apcore-python) ecosystem. It provides framework-agnostic logic to extract metadata from existing code and make it "AI-Perceivable".

---

## Key Features

- **🔍 Smart Scanning**: Abstract base classes for framework scanners with filtering and deduplication.
- **📄 Output Generation**: Writers for YAML bindings, Python wrappers, and direct Registry registration.
- **🛠️ Schema Utilities**: Tools for Pydantic model flattening and OpenAPI schema extraction.
- **🤖 AI Enhancement**: Metadata enrichment using local SLMs (Small Language Models).
- **📝 Markdown Formatting**: Convert arbitrary data structures to structured Markdown.

---

## Installation

```bash
pip install apcore-toolkit
```

Requires Python 3.11+ and apcore 0.9.0+.

---

## Core Modules

| Module | Description |
|--------|-------------|
| `ScannedModule` | Canonical dataclass representing a scanned endpoint |
| `BaseScanner` | Abstract base class for framework scanners |
| `YAMLWriter` | Generates `.binding.yaml` files for `apcore.BindingLoader` |
| `PythonWriter` | Generates `@module`-decorated Python wrapper files |
| `RegistryWriter` | Registers modules directly into an `apcore.Registry` |
| `to_markdown` | Converts arbitrary dicts to Markdown with depth control |

---

## Documentation

- **[Getting Started Guide](docs/getting-started.md)** — Installation and core usage
- **[Features Overview](docs/features/overview.md)** — Detailed look at toolkit capabilities
- **[AI Enhancement Guide](docs/ai-enhancement.md)** — Metadata enrichment strategy
- **[Changelog](docs/changelog.md)**

---

## License

Apache-2.0

docs/ai-enhancement.md

Lines changed: 56 additions & 0 deletions
# AI-Driven Metadata Enhancement for apcore-toolkit

This document outlines the strategy for using Small Language Models (SLMs) like **Qwen 1.5 (0.6B - 1.7B)** to enhance the metadata extracted by `apcore-toolkit-python`.

## 1. Goal

The toolkit's primary mission is to make existing code "AI-Perceivable". While static analysis (regex, AST) is efficient, it often fails to:

- Generate meaningful `description` and `documentation` for legacy code.
- Create effective `ai_guidance` for complex error handling.
- Infer `input_schema` for functions using `*args` or `**kwargs`.

Using a local SLM allows the toolkit to "understand" the code logic and fill these gaps with high speed and zero cost.
## 2. Architecture: Local LLM Provider (Option B)

To keep `apcore-toolkit-python` lightweight, we **DO NOT** bundle model weights. Instead, we use an OpenAI-compatible local API provider (e.g., Ollama, vLLM, LM Studio).

### Configuration via Environment Variables

The AI enhancement feature is controlled by the following environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| `APCORE_AI_ENABLED` | Whether to enable SLM-based metadata enhancement. | `false` |
| `APCORE_AI_ENDPOINT` | The URL of the OpenAI-compatible local API. | `http://localhost:11434/v1` |
| `APCORE_AI_MODEL` | The model name to use (e.g., `qwen:0.6b`). | `qwen:0.6b` |
| `APCORE_AI_THRESHOLD` | Confidence threshold for AI-generated metadata (0-1). | `0.7` |
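The table above maps directly onto a small config loader. A minimal sketch of how these variables could be read, with defaults mirroring the table (the `AIConfig` dataclass and `load_ai_config` function are illustrative, not part of the toolkit's API):

```python
import os
from dataclasses import dataclass

@dataclass
class AIConfig:
    """Illustrative holder for the SLM enhancement settings."""
    enabled: bool
    endpoint: str
    model: str
    threshold: float

def load_ai_config(env=os.environ) -> AIConfig:
    # Defaults mirror the environment-variable table above
    return AIConfig(
        enabled=env.get("APCORE_AI_ENABLED", "false").lower() == "true",
        endpoint=env.get("APCORE_AI_ENDPOINT", "http://localhost:11434/v1"),
        model=env.get("APCORE_AI_MODEL", "qwen:0.6b"),
        threshold=float(env.get("APCORE_AI_THRESHOLD", "0.7")),
    )

cfg = load_ai_config({"APCORE_AI_ENABLED": "true"})
print(cfg.enabled, cfg.model)  # -> True qwen:0.6b
```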
## 3. Recommended Setup (Ollama)

For the best developer experience, we recommend using [Ollama](https://ollama.com/):

1. **Install Ollama**.
2. **Pull the recommended model**:
   ```bash
   ollama pull qwen:0.6b
   ```
3. **Configure environment**:
   ```bash
   export APCORE_AI_ENABLED=true
   export APCORE_AI_MODEL="qwen:0.6b"
   ```
## 4. Enhancement Workflow

When `APCORE_AI_ENABLED` is set to `true`, the `Scanner` will:

1. **Extract static metadata** from docstrings and type hints.
2. **Identify missing fields** (e.g., an empty `description` or missing `ai_guidance`).
3. **Send code snippets** to the local SLM with a structured prompt.
4. **Merge the AI-generated metadata** into the final `ScannedModule`, marking it with an `x-generated-by: "slm"` tag for human audit.
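Step 4 can be sketched as a merge that only fills fields the static scan left empty, then tags the result for audit. A simplified illustration of the behavior described above, not the toolkit's actual implementation (the function name and dict representation are hypothetical):

```python
def merge_ai_metadata(static: dict, generated: dict) -> dict:
    """Fill fields the static scan left empty with AI-generated values,
    tagging the result so a human can audit it before committing."""
    merged = dict(static)
    filled_any = False
    for field, value in generated.items():
        if not merged.get(field):  # only fill missing or empty fields
            merged[field] = value
            filled_any = True
    if filled_any:
        # Mark AI-generated content for human review (see step 4 above)
        merged["x-generated-by"] = "slm"
    return merged

static = {"module_id": "users.create", "description": "", "ai_guidance": None}
generated = {"description": "Creates a new user.", "ai_guidance": "Retry on 409."}
result = merge_ai_metadata(static, generated)
print(result["description"])     # -> Creates a new user.
print(result["x-generated-by"])  # -> slm
```

Note that statically extracted values always win: a non-empty field is never overwritten by the SLM's suggestion.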
## 5. Security and Privacy

- **No Data Leakage**: Since the model runs locally, your source code never leaves your machine.
- **Auditability**: All AI-generated fields MUST be reviewed by the developer before committing the generated `apcore.yaml`.

docs/changelog.md

Lines changed: 35 additions & 0 deletions

(Content identical to `CHANGELOG.md` above; the deploy workflow also syncs the root changelog into `docs/` at build time.)

docs/features/formatting.md

Lines changed: 66 additions & 0 deletions
# Formatting Utilities

`apcore-toolkit` includes powerful tools for converting complex data structures into formatted, human-readable Markdown. This is especially useful for creating "AI-perceivable" documentation or for logging results in a readable format.

## `to_markdown()`

The `to_markdown()` function converts an arbitrary dictionary or list into a structured Markdown string.

### Features

- **Depth Control**: Specify how many levels deep the conversion should go.
- **Table Heuristics**: Automatically detects when data can be better represented as a Markdown table.
- **Recursive Processing**: Handles nested dictionaries and lists gracefully.

### Example

```python
from apcore_toolkit import to_markdown

user_data = {
    "name": "Alice",
    "role": "admin",
    "preferences": {
        "theme": "dark",
        "notifications": True
    },
    "recent_activity": [
        {"action": "login", "timestamp": "2024-03-07T12:00:00Z"},
        {"action": "upload", "timestamp": "2024-03-07T12:05:00Z"}
    ]
}

# Convert to Markdown with a title
md = to_markdown(user_data, title="User Profile")
print(md)
```
## Schema Enrichment

The `enrich_schema_descriptions()` utility helps bridge the gap when a JSON Schema lacks parameter descriptions but they are available in a function's docstring.

### Features

- **Description Merging**: Merges descriptions from a dictionary into the `properties` of a JSON Schema.
- **Safe by Default**: Won't overwrite existing descriptions unless explicitly requested.
- **Scanner Integration**: Used by concrete scanners to supplement schemas extracted from Pydantic or OpenAPI with docstring-level documentation.

```python
from apcore_toolkit import enrich_schema_descriptions

raw_schema = {
    "type": "object",
    "properties": {
        "user_id": {"type": "integer"}
    }
}

param_descriptions = {
    "user_id": "The ID of the user to retrieve."
}

# Enrich the schema with parameter descriptions
enriched = enrich_schema_descriptions(raw_schema, param_descriptions)
```
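The merge behavior described above fits in a few lines. A simplified sketch of the "safe by default" semantics, not the library's actual implementation (the function here is named `enrich_descriptions` to avoid confusion with the real utility):

```python
def enrich_descriptions(schema: dict, descriptions: dict, overwrite: bool = False) -> dict:
    """Merge docstring-derived descriptions into a JSON Schema's properties.
    Existing descriptions are kept unless overwrite=True; the input is not mutated."""
    enriched = {
        **schema,
        "properties": {k: dict(v) for k, v in schema.get("properties", {}).items()},
    }
    for name, prop in enriched["properties"].items():
        if name in descriptions and (overwrite or "description" not in prop):
            prop["description"] = descriptions[name]
    return enriched

schema = {"type": "object", "properties": {"user_id": {"type": "integer"}}}
out = enrich_descriptions(schema, {"user_id": "The ID of the user to retrieve."})
print(out["properties"]["user_id"]["description"])  # -> The ID of the user to retrieve.
```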
## Use Case: AI Documentation

By converting complex internal states to Markdown tables or sections, you provide an LLM with a highly structured and easy-to-parse context. This improves the agent's ability to reason about the system's current state and available actions.

docs/features/openapi.md

Lines changed: 50 additions & 0 deletions
# OpenAPI Integration

The `apcore_toolkit.openapi` module provides utilities for extracting JSON Schemas directly from an OpenAPI specification, either by parsing the JSON document or by interacting with live OpenAPI endpoints.

## JSON Schema Extraction

The toolkit handles the extraction and merging of OpenAPI operation parameters into canonical JSON Schemas.

| Method | Description |
|--------|-------------|
| `extract_input_schema(op, doc)` | Merges query, path, and request body parameters into a single object schema. |
| `extract_output_schema(op, doc)` | Extracts the response schema for `200` or `201` status codes. |
| `resolve_ref(ref_string, doc)` | Resolves internal JSON pointer references (e.g., `#/components/schemas/User`). |
| `resolve_schema(schema, doc)` | Recursively resolves `$ref` in a schema object. |
## Parameter Merging

The `extract_input_schema()` function performs intelligent merging:

1. **Path Parameters**: Extracted and marked as `required: true`.
2. **Query Parameters**: Extracted, with required status preserved.
3. **Request Body**: Properties from the `application/json` request body are merged into the same input schema.

This produces the flat `input_schema` required by the `ScannedModule`.
## Example Usage

```python
from apcore_toolkit import ScannedModule
from apcore_toolkit.openapi import extract_input_schema, extract_output_schema

# Load an OpenAPI spec
openapi_spec = { ... }

# Get an operation object
operation = openapi_spec["paths"]["/users"]["post"]

# Extract metadata
input_schema = extract_input_schema(operation, openapi_spec)
output_schema = extract_output_schema(operation, openapi_spec)

# Create a ScannedModule
module = ScannedModule(
    module_id="users.create",
    input_schema=input_schema,
    output_schema=output_schema,
    # ... other metadata
)
```
## Reference Resolution

The toolkit includes a standalone JSON pointer resolver (`resolve_ref`) that ensures complex, nested OpenAPI schemas are correctly flattened into standalone JSON Schema objects, even when components are shared across many endpoints.
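Resolution of internal `#/components/schemas/...` pointers can be sketched in a few lines. A simplified illustration of the idea; the toolkit's `resolve_ref` may handle additional cases:

```python
def resolve_ref(ref: str, doc: dict) -> dict:
    """Follow an internal JSON pointer like '#/components/schemas/User'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only internal references are supported: {ref}")
    node = doc
    for token in ref[2:].split("/"):
        # Unescape JSON pointer tokens per RFC 6901 (~1 -> /, ~0 -> ~)
        node = node[token.replace("~1", "/").replace("~0", "~")]
    return node

doc = {"components": {"schemas": {"User": {"type": "object"}}}}
print(resolve_ref("#/components/schemas/User", doc))  # -> {'type': 'object'}
```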
