zzstoatzz · zzstoatzz · Sep 4, 2025
diff --git a/PERFORMANCE.md b/PERFORMANCE.md
@@ -0,0 +1,106 @@
+# Performance Guide
+
+This guide provides recommendations for optimal mdxify performance in CI/CD and pre-commit scenarios.
+
+## Quick Start: Fastest Invocation
+
+### 1. Use the Python API (Recommended for CI/CD)
+
+For best performance in automated workflows, use the programmatic API:
+
+```python
+# scripts/generate_api_ref.py
+from mdxify import generate_docs
+
+result = generate_docs(
+    "prefect",
+    output_dir="docs/v3/api-ref/python",
+    exclude=["prefect.agent"],
+    anchor_name="Python SDK Reference",
+    include_inheritance=True,
+    repo_url="https://github.com/PrefectHQ/prefect",
+)
+
+print(f"✓ Generated {result['modules_processed']} modules in {result['time_elapsed']:.3f}s")
+if result['modules_failed']:
+    print(f"✗ Failed: {result['modules_failed']} modules")
+```
+
+This avoids all CLI startup overhead and is the fastest option.
+
+### 2. Use `uv run` for CLI (Good Performance)
+
+If you need the CLI, use `uv run` directly:
+
+```bash
+uv run mdxify \
+  --all \
+  --root-module prefect \
+  --output-dir docs/v3/api-ref/python \
+  --exclude prefect.agent
+```
+
+### 3. Use `uvx` Without --refresh-package (Acceptable Performance)
+
+For one-off runs with uvx:
+
+```bash
+uvx mdxify \
+  --all \
+  --root-module prefect \
+  --output-dir docs/v3/api-ref/python
+```
+
+**Note:** Avoid `--refresh-package` unless necessary. It adds ~2s overhead.
+
+## Performance Comparison
+
+Based on benchmarking with Prefect (290 modules):
+
+| Method | Time | Notes |
+|--------|------|-------|
+| Python API | ~0.6-1.0s | Core generation only, no overhead |
+| `uv run` | ~0.7-1.5s | Minimal CLI overhead |
+| `uvx` (no refresh) | ~1.0-2.0s | Some environment resolution |
+| `uvx --refresh-package` | ~3.0-5.0s | Full package refresh |
+
+## Pre-commit Hook Example
+
+For pre-commit/pre-push hooks, use the Python API:
+
+```yaml
+# .pre-commit-config.yaml
+repos:
+  - repo: local
+    hooks:
+      - id: generate-api-docs
+        name: Generate API Documentation
+        entry: python scripts/generate_api_ref.py
+        language: python
+        additional_dependencies: [mdxify]
+        pass_filenames: false
+        stages: [push]
+```
+
+## Tips for Large Codebases
+
+1. **Use parallel processing**: mdxify automatically uses 8 workers for parallel processing
+2. **Exclude unnecessary modules**: Use `--exclude` to skip internal/test modules
+3. **Consider incremental updates**: For development, generate only changed modules
+4. **Pin mdxify version**: Avoid version resolution overhead by pinning: `mdxify==0.x.x`
+
+## Troubleshooting Slow Performance
+
+If mdxify seems slow:
+
+1. **Check for --refresh-package**: Remove it if not needed
+2. **Verify Python environment**: Ensure mdxify is installed in the active environment
+3. **Profile imports**: Heavy user code imports can slow down parsing
+4. **Use verbose mode**: Add `-v` to see per-module timing
+
+## Future Improvements
+
+We're working on:
+- Incremental generation (only rebuild changed modules)
+- Caching of parsed module data
+- Further lazy loading optimizations
diff --git a/README.md b/README.md
@@ -17,6 +17,8 @@ pip install mdxify
 
 ## Usage
 
+### CLI Usage
+
 Generate documentation for all modules in a package:
 
 ```bash
@@ -35,6 +37,24 @@ Exclude internal modules from documentation:
 mdxify --all --root-module mypackage --exclude mypackage.internal --exclude mypackage.tests
 ```
 
+### Programmatic API (Recommended for CI/CD)
+
+For best performance in automated workflows:
+
+```python
+from mdxify import generate_docs
+
+result = generate_docs(
+    "mypackage",
+    output_dir="docs/python-sdk",
+    exclude=["mypackage.internal", "mypackage.tests"],
+)
+
+print(f"Generated {result['modules_processed']} modules in {result['time_elapsed']:.3f}s")
+```
+
+See [PERFORMANCE.md](PERFORMANCE.md) for detailed performance optimization tips.
+
 ### Options
 
 - `modules`: Specific modules to document

diff --git a/docs/mdxify-api.mdx b/docs/mdxify-api.mdx
@@ -0,0 +1,48 @@
+---
+title: api
+sidebarTitle: api
+---
+
+# `mdxify.api`
+
+
+Programmatic API for mdxify.
+
+This module provides a Python API for generating MDX documentation without CLI overhead.
+This is the recommended approach for CI/CD and pre-commit scenarios where performance matters.
+
+
+## Functions
+
+### `generate_docs` <sup><a href="https://github.com/zzstoatzz/mdxify/blob/main/src/mdxify/api.py#L10" target="_blank"><Icon icon="github" style="width: 14px; height: 14px;" /></a></sup>
+
+```python
+generate_docs(root_module: str, output_dir: str | Path = 'docs/python-sdk') -> dict
+```
+
+
+Generate MDX documentation for a Python package.
+
+This is the programmatic API for mdxify, designed for optimal performance
+when called from Python scripts (e.g., in CI/CD pipelines).
+
+**Args:**
+- `root_module`: The root module to document (e.g., 'prefect')
+- `output_dir`: Output directory for MDX files
+- `exclude`: List of module patterns to exclude
+- `anchor_name`: Navigation anchor name in docs.json
+- `repo_url`: GitHub repository URL for source links
+- `branch`: Git branch for source links
+- `include_internal`: Include internal/private modules
+- `include_inheritance`: Include inherited methods in docs
+- `skip_empty_parents`: Skip parent modules with only boilerplate
+- `verbose`: Enable verbose output
+
+**Returns:**
+- Dictionary with generation statistics:
+- - modules_processed: Number of modules processed
+- - modules_failed: Number of modules that failed
+- - time_elapsed: Total time in seconds
+- - files_created: Number of new files created
+- - files_updated: Number of existing files updated
+
diff --git a/repros/20.py b/repros/20.py
@@ -0,0 +1,99 @@
+#!/usr/bin/env python
+import os
+import statistics
+import subprocess
+import time
+from pathlib import Path
+
+# Ensure we're running from mdxify root
+os.chdir(Path(__file__).parent.parent)
+
+def run_benchmark(command: list[str], runs: int = 5) -> dict:
+    """Run a command multiple times and measure performance."""
+    times = []
+    for i in range(runs):
+        start = time.perf_counter()
+        result = subprocess.run(command, capture_output=True, text=True)
+        end = time.perf_counter()
+        elapsed = end - start
+        times.append(elapsed)
+        print(f"  Run {i+1}: {elapsed:.3f}s")
+        if result.returncode != 0:
+            print(f"  Error: {result.stderr}")
+
+    return {
+        "min": min(times),
+        "max": max(times),
+        "mean": statistics.mean(times),
+        "stdev": statistics.stdev(times) if len(times) > 1 else 0,
+        "times": times
+    }
+
+def main():
+    print("=== mdxify Performance Benchmark ===\n")
+
+    # Test on Prefect (matching the issue)
+    prefect_path = Path("sandbox/prefect")
+    if prefect_path.exists():
+        print("Testing on Prefect codebase (290 modules as per issue #20)\n")
+
+        print("1. Testing with uvx (simulating Prefect's current usage):")
+        cmd_uvx = [
+            "uvx", "--with-editable", ".", "--refresh-package", "mdxify",
+            "mdxify", "--all", "--root-module", "prefect", 
+            "--output-dir", str(prefect_path / "docs/v3/api-ref/python"),
+            "--anchor-name", "Python SDK Reference",
+            "--exclude", "prefect.agent",
+            "--include-inheritance",
+            "--repo-url", "https://github.com/PrefectHQ/prefect"
+        ]
+        print("Command: uvx ... mdxify --all --root-module prefect ...")
+        results_uvx = run_benchmark(cmd_uvx, runs=3)
+        print(f"  Average: {results_uvx['mean']:.3f}s ± {results_uvx['stdev']:.3f}s\n")
+
+        print("2. Testing with uvx without --refresh-package:")
+        cmd_uvx_no_refresh = [
+            "uvx", "--with-editable", ".", 
+            "mdxify", "--all", "--root-module", "prefect",
+            "--output-dir", str(prefect_path / "docs/v3/api-ref/python"),
+            "--anchor-name", "Python SDK Reference",
+            "--exclude", "prefect.agent",
+            "--include-inheritance",
+            "--repo-url", "https://github.com/PrefectHQ/prefect"
+        ]
+        print("Command: uvx --with-editable . mdxify ...")
+        results_no_refresh = run_benchmark(cmd_uvx_no_refresh, runs=3)
+        print(f"  Average: {results_no_refresh['mean']:.3f}s ± {results_no_refresh['stdev']:.3f}s\n")
+
+        print("3. Testing with uv run (direct execution):")
+        cmd_uv = [
+            "uv", "run", "mdxify", "--all", "--root-module", "prefect",
+            "--output-dir", str(prefect_path / "docs/v3/api-ref/python"),
+            "--anchor-name", "Python SDK Reference",
+            "--exclude", "prefect.agent",
+            "--include-inheritance",
+            "--repo-url", "https://github.com/PrefectHQ/prefect"
+        ]
+        print("Command: uv run mdxify ...")
+        results_uv = run_benchmark(cmd_uv, runs=3)
+        print(f"  Average: {results_uv['mean']:.3f}s ± {results_uv['stdev']:.3f}s\n")
+
+        print("4. Testing import time only:")
+        import_test = [
+            "uv", "run", "python", "-c", 
+            "import time; s=time.perf_counter(); from mdxify.cli import app; print(f'Import time: {time.perf_counter()-s:.3f}s')"
+        ]
+        print("Testing CLI import time...")
+        subprocess.run(import_test)
+
+        print("\n=== Summary ===")
+        print(f"uvx with --refresh-package: {results_uvx['mean']:.3f}s")
+        print(f"uvx without refresh: {results_no_refresh['mean']:.3f}s")
+        print(f"uv run (direct): {results_uv['mean']:.3f}s")
+        print(f"Overhead from --refresh-package: {results_uvx['mean'] - results_no_refresh['mean']:.3f}s")
+        print(f"Overhead from uvx vs uv run: {results_no_refresh['mean'] - results_uv['mean']:.3f}s")
+    else:
+        print("Prefect test directory not found. Please ensure sandbox/prefect exists.")
+
+if __name__ == "__main__":
+    main()
diff --git a/src/mdxify/__init__.py b/src/mdxify/__init__.py
@@ -1,5 +1,6 @@
 """mdxify - Generate MDX API documentation from Python modules."""
 
+from .api import generate_docs
 from .cli import main
 from .discovery import find_all_modules, get_module_source_file, should_include_module
 from .formatter import escape_mdx_content, format_docstring_with_griffe
@@ -12,6 +13,7 @@
 from .parser import extract_docstring, extract_function_signature, parse_module_fast, parse_modules_with_inheritance, ClassRegistry
 
 __all__ = [
+    "generate_docs",
     "main",
     "find_all_modules",
     "get_module_source_file",