|
1 | | -# aieng-bot-maintain Documentation |
| 1 | +# aieng-bot-maintain |
2 | 2 |
|
3 | | -Comprehensive documentation for the AI Engineering Maintenance Bot - an automated system that manages bot PRs (Dependabot and pre-commit-ci) across all Vector Institute repositories. |
| 3 | +---------------------------------------------------------------------------------------- |
4 | 4 |
|
5 | | -## Getting Started |
| 5 | +[PyPI](https://pypi.org/project/aieng-bot-maintain) |
| 6 | +[code checks](https://github.com/VectorInstitute/aieng-bot-maintain/actions/workflows/code_checks.yml) |
| 7 | +[unit tests](https://github.com/VectorInstitute/aieng-bot-maintain/actions/workflows/unit_tests.yml) |
| 8 | +[docs](https://github.com/VectorInstitute/aieng-bot-maintain/actions/workflows/docs.yml) |
| 9 | +[codecov](https://codecov.io/github/VectorInstitute/aieng-bot-maintain) |
| 10 | + |
6 | 11 |
|
7 | | -- **[Setup Guide](setup.md)** - Complete setup instructions including API keys, tokens, and configuration |
8 | | -- **[Deployment Guide](deployment.md)** - Step-by-step deployment process and monitoring strategies |
9 | | -- **[Testing Guide](testing.md)** - Test cases, validation procedures, and debugging |
10 | 12 |
|
11 | | -## Quick Links |
| 13 | +Centralized maintenance bot that automatically manages bot PRs (Dependabot and pre-commit-ci) across all Vector Institute repositories from a single location. |
12 | 14 |
|
13 | | -- [Main Repository](../) - Return to repository root |
14 | | -- [Workflow Files](../.github/workflows/) - GitHub Actions workflows |
15 | | -- [Prompt Templates](../.github/prompts/) - AI prompt templates for different failure types |
| 15 | +## Features |
16 | 16 |
|
17 | | -## Overview |
| 17 | +- **Organization-wide monitoring** - Scans all VectorInstitute repos every 6 hours |
| 18 | +- **Auto-merge** - Merges bot PRs (Dependabot and pre-commit-ci) when all checks pass |
| 19 | +- **Auto-fix** - Fixes test failures, linting issues, security vulnerabilities, and build errors using the Claude Agent SDK |
| 20 | +- **Centralized operation** - No installation needed in individual repositories |
| 21 | +- **Smart detection** - Categorizes failures and applies appropriate fix strategies |
| 22 | +- **Transparent** - Comments on PRs with status updates |
18 | 23 |
|
19 | | -The bot operates from a single centralized repository and requires no installation in individual repositories. It: |
| 24 | +## Architecture |
20 | 25 |
|
21 | | -- Monitors all VectorInstitute repositories every 6 hours |
22 | | -- Auto-merges bot PRs (Dependabot and pre-commit-ci) when all checks pass |
23 | | -- Automatically fixes common issues using Claude Agent SDK |
24 | | -- Posts transparent status updates on PRs |
| 26 | +``` |
| 27 | +┌─────────────────────────────────┐ |
| 28 | +│ aieng-bot-maintain Repository │ |
| 29 | +│ (This Repo - Central Bot) │ |
| 30 | +│ │ |
| 31 | +│ Runs every 6 hours: │ |
| 32 | +│ 1. Scans VectorInstitute org │ |
| 33 | +│ 2. Finds bot PRs │ |
| 34 | +│ 3. Checks status │ |
| 35 | +│ 4. Merges or fixes PRs │ |
| 36 | +└──────────────┬──────────────────┘ |
| 37 | + │ |
| 38 | + │ Operates on |
| 39 | + ▼ |
| 40 | +┌───────────────────────────────────┐ |
| 41 | +│ VectorInstitute Organization │ |
| 42 | +│ │ |
| 43 | +│ ├─ repo-1 (Bot PR #1) │ |
| 44 | +│ ├─ repo-2 (Bot PR #2) │ |
| 45 | +│ ├─ repo-3 (Bot PR #3) │ |
| 46 | +│ └─ repo-N ... │ |
| 47 | +└───────────────────────────────────┘ |
| 48 | +``` |
25 | 49 |
|
26 | | -## Key Features |
| 50 | +## Quick Start |
27 | 51 |
|
28 | | -### Organization-Wide Monitoring |
29 | | -Scans all repositories in the VectorInstitute organization for open bot PRs (Dependabot and pre-commit-ci) and processes them automatically. |
| 52 | +### Setup (in this repository) |
30 | 53 |
|
31 | | -### Intelligent Auto-Merge |
32 | | -Analyzes PR status checks and automatically approves and merges PRs when all tests pass. |
| 54 | +**1. Create Anthropic API Key** |
| 55 | +- Get from [Anthropic Console](https://console.anthropic.com/settings/keys) |
| 56 | +- Add as repository secret: `ANTHROPIC_API_KEY` |
33 | 57 |
|
34 | | -### AI-Powered Auto-Fix |
35 | | -Uses Claude Agent SDK to analyze failures and directly modify code to fix: |
36 | | -- Test failures from dependency updates |
37 | | -- Linting and formatting issues |
38 | | -- Security audit failures |
39 | | -- Build configuration problems |
| 58 | +**2. Create GitHub Personal Access Token** |
| 59 | +- Go to Settings → Developer settings → Personal access tokens → Fine-grained tokens |
| 60 | +- Configure: Resource owner: `VectorInstitute`, Repository access: `All repositories` |
| 61 | +- Permissions: `contents: write`, `pull_requests: write`, `issues: write` |
| 62 | +- Add as repository secret: `ORG_ACCESS_TOKEN` |
40 | 63 |
|
41 | | -### Centralized Operation |
42 | | -All logic runs from this single repository - target repositories need only: |
43 | | -- Dependabot enabled |
44 | | -- Auto-merge enabled (optional but recommended) |
| 64 | +**3. Enable GitHub Actions** |
| 65 | +- Go to Actions tab → Enable workflows |
45 | 66 |
|
46 | | -## Architecture |
| 67 | +The bot now monitors all VectorInstitute repositories automatically. |
47 | 68 |
|
48 | | -``` |
49 | | -┌───────────────────────────┐ |
50 | | -│ aieng-bot-maintain │ |
51 | | -│ (This Repository) │ |
52 | | -│ │ |
53 | | -│ Workflows: │ |
54 | | -│ • monitor (every 6hrs) │ |
55 | | -│ • fix (on-demand) │ |
56 | | -└────────────┬──────────────┘ |
57 | | - │ |
58 | | - │ Manages |
59 | | - ▼ |
60 | | -┌────────────────────────────┐ |
61 | | -│ VectorInstitute Org Repos │ |
62 | | -│ │ |
63 | | -│ Finds & processes │ |
64 | | -│ bot PRs │ |
65 | | -└────────────────────────────┘ |
66 | | -``` |
| 69 | +## How It Works |
| 70 | + |
| 71 | +**1. Monitor** (every 6 hours) |
| 72 | +- Scans all VectorInstitute repositories for open bot PRs (Dependabot and pre-commit-ci) |
| 73 | +- Checks status of each PR |
| 74 | +- Routes to merge or fix workflow |
| 75 | + |
| 76 | +**2. Auto-Merge** (when all checks pass) |
| 77 | +- Approves PR and enables auto-merge |
| 78 | +- Comments with status |
| 79 | +- PR merges automatically |
| 80 | + |
| 81 | +**3. Auto-Fix** (when checks fail) |
| 82 | +- Clones target repository and PR branch |
| 83 | +- Analyzes failure type: test, lint, security, or build |
| 84 | +- Loads appropriate AI prompt template |
| 85 | +- Uses Claude Agent SDK to automatically apply fixes |
| 86 | +- Commits and pushes fixes to PR |
67 | 87 |
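The routing and failure-analysis steps above can be sketched in shell. This is a minimal illustration, not the bot's actual implementation: the function names `route_pr` and `classify_failure`, the state names, and the check-name patterns are all assumptions chosen for the example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an aggregate check state to the bot's next action.
# The state names are assumed; the real workflow may use different values.
route_pr() {
  case "$1" in
    success) echo "merge" ;;   # all checks passed -> auto-merge path
    failure) echo "fix" ;;     # at least one check failed -> auto-fix path
    *)       echo "wait" ;;    # pending or unknown -> revisit on the next scan
  esac
}

# Hypothetical helper: guess the failure type from a failing check's name.
# The patterns are illustrative; earlier patterns win when several match.
classify_failure() {
  local name
  name="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$name" in
    *lint*|*pre-commit*|*format*) echo "lint" ;;
    *audit*|*security*)           echo "security" ;;
    *test*)                       echo "test" ;;
    *build*)                      echo "build" ;;
    *)                            echo "unknown" ;;
  esac
}
```

For example, `route_pr failure` prints `fix`, and `classify_failure "pip-audit"` prints `security`. Matching on check names is only one possible heuristic; inspecting the failing job's logs would likely be more reliable.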
|
68 | 88 | ## Configuration |
69 | 89 |
|
70 | | -### Required Secrets |
71 | | -- `ANTHROPIC_API_KEY` - API access for Claude (get from [Anthropic Console](https://console.anthropic.com/settings/keys)) |
72 | | -- `ORG_ACCESS_TOKEN` - GitHub PAT with org-wide write permissions |
| 90 | +**Required Secrets** |
| 91 | +- `ANTHROPIC_API_KEY` - Anthropic API access for Claude |
| 92 | +- `ORG_ACCESS_TOKEN` - GitHub PAT with org-wide permissions |
| 93 | + |
| 94 | +**Workflows** |
| 95 | +- `monitor-org-bot-prs.yml` - Scans org for bot PRs (Dependabot and pre-commit-ci) every 6 hours |
| 96 | +- `fix-remote-pr.yml` - Fixes failing PRs using AI |
| 97 | + |
| 98 | +**AI Prompt Templates** (customize for your needs) |
| 99 | +- `fix-merge-conflicts.md` - Resolve merge conflicts with best practices |
| 100 | +- `fix-test-failures.md` - Test failure resolution strategies |
| 101 | +- `fix-lint-failures.md` - Linting/formatting fixes |
| 102 | +- `fix-security-audit.md` - Security vulnerability handling |
| 103 | +- `fix-build-failures.md` - Build/compilation error fixes |
73 | 104 |
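Selecting a template from the list above amounts to a lookup from failure type to file. The sketch below assumes the templates live under `.github/prompts/` and that failure types are labeled `merge-conflict`, `test`, `lint`, `security`, and `build`; both the directory and the labels are assumptions for illustration, as is the function name `template_for`.

```bash
#!/usr/bin/env bash
# Hypothetical lookup: failure-type label -> prompt template path.
# PROMPT_DIR and the labels are assumed; only the file names come from the list above.
PROMPT_DIR=".github/prompts"

template_for() {
  case "$1" in
    merge-conflict) echo "$PROMPT_DIR/fix-merge-conflicts.md" ;;
    test)           echo "$PROMPT_DIR/fix-test-failures.md" ;;
    lint)           echo "$PROMPT_DIR/fix-lint-failures.md" ;;
    security)       echo "$PROMPT_DIR/fix-security-audit.md" ;;
    build)          echo "$PROMPT_DIR/fix-build-failures.md" ;;
    *)              return 1 ;;   # unrecognized type: no template, caller decides
  esac
}
```

Returning a non-zero status for an unrecognized type lets a caller skip the fix step instead of running the agent with the wrong prompt.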
|
74 | | -### Workflows |
75 | | -- `monitor-org-bot-prs.yml` - Scheduled workflow that scans organization |
76 | | -- `fix-remote-pr.yml` - On-demand workflow triggered for failing PRs |
| 105 | +## Capabilities |
77 | 106 |
|
78 | | -### Customization |
79 | | -- Edit `.github/prompts/*.md` files to customize fix strategies |
80 | | -- Adjust cron schedule in monitor workflow for different frequencies |
81 | | -- Modify failure detection logic in workflow files |
| 107 | +**Can fix:** |
| 108 | +- Merge conflicts (dependency files, lock files, code) |
| 109 | +- Linting and formatting issues |
| 110 | +- Security vulnerabilities (dependency updates) |
| 111 | +- Simple test failures from API changes |
| 112 | +- Build configuration issues |
| 113 | + |
| 114 | +**Cannot fix:** |
| 115 | +- Complex logic errors |
| 116 | +- Breaking changes requiring refactoring |
| 117 | +- Issues requiring architectural decisions |
82 | 118 |
|
83 | | -## Common Tasks |
| 119 | +## Manual Testing |
84 | 120 |
|
85 | | -### Manual Testing |
| 121 | +**Trigger via CLI:** |
86 | 122 | ```bash |
87 | | -# Test monitoring workflow |
| 123 | +# Monitor all repositories |
88 | 124 | gh workflow run monitor-org-bot-prs.yml |
89 | 125 |
|
90 | | -# Test fix on specific PR |
| 126 | +# Fix a specific PR (test with aieng-template-mvp#17) |
91 | 127 | gh workflow run fix-remote-pr.yml \ |
92 | 128 | --field target_repo="VectorInstitute/aieng-template-mvp" \ |
93 | 129 | --field pr_number="17" |
94 | 130 | ``` |
95 | 131 |
|
96 | | -### Monitoring Bot Activity |
| 132 | +**Trigger via GitHub UI:** |
| 133 | +Actions → Select workflow → Run workflow → Enter parameters |
| 134 | + |
| 135 | +## Dashboard |
| 136 | + |
| 137 | +**View comprehensive analytics and agent execution traces:** |
| 138 | +- 📊 **[Bot Dashboard](https://catalog.vectorinstitute.ai/aieng-bot-maintain)** - Interactive dashboard with: |
| 139 | + - Overview table of all bot PR fixes |
| 140 | + - Success rates and performance metrics |
| 141 | + - Detailed agent execution traces (like LangSmith/Langfuse) |
| 142 | + - Code diffs with syntax highlighting |
| 143 | + - Failure analysis and reasoning timeline |
| 144 | + |
| 145 | +**Features:** |
| 146 | +- Real-time PR status tracking |
| 147 | +- Agent observability (tool calls, reasoning, actions) |
| 148 | +- Historical metrics and trends |
| 149 | +- Per-repo and per-failure-type analytics |
| 150 | +- Sortable/filterable PR table |
| 151 | + |
| 152 | +**Authentication:** |
| 153 | +- Restricted to @vectorinstitute.ai email addresses |
| 154 | +- Google OAuth 2.0 sign-in |
| 155 | + |
| 156 | +## Monitoring |
| 157 | + |
| 158 | +**View bot activity:** |
| 159 | +- [Dashboard](https://catalog.vectorinstitute.ai/aieng-bot-maintain) - Comprehensive analytics and traces |
| 160 | +- Actions tab - All workflow runs and success/failure rates |
| 161 | +- PR comments - Detailed status updates on each PR |
| 162 | +- Run summary - PR count and actions taken per run |
| 163 | + |
| 164 | +**Debug commands:** |
97 | 165 | ```bash |
98 | | -# View recent runs |
| 166 | +# View recent workflow runs |
99 | 167 | gh run list --workflow=monitor-org-bot-prs.yml --limit 5 |
100 | 168 |
|
101 | | -# View specific run logs |
| 169 | +# View logs for specific run |
102 | 170 | gh run view RUN_ID --log |
| 171 | + |
| 172 | +# Collect metrics manually |
| 173 | +gh workflow run collect-bot-metrics.yml |
103 | 174 | ``` |
104 | 175 |
|
105 | | -### Debugging Issues |
106 | | -Check: |
107 | | -- Actions tab for workflow execution logs |
108 | | -- PR comments for bot status updates |
109 | | -- Repository secrets are properly set |
110 | | -- Token permissions are correct |
| 176 | +## Documentation |
| 177 | + |
| 178 | +- [Setup Guide](docs/setup.md) - Detailed configuration and permissions |
| 179 | +- [Deployment Guide](docs/deployment.md) - Rollout strategy and monitoring |
| 180 | +- [Testing Guide](docs/testing.md) - Test cases and validation |
| 181 | + |
| 182 | +## Troubleshooting |
111 | 183 |
|
112 | | -## Support |
| 184 | +| Issue | Solution | |
| 185 | +|-------|----------| |
| 186 | +| Workflow doesn't run | Check that Actions is enabled and the required secrets are set | |
| 187 | +| Can't find PRs | Verify `ORG_ACCESS_TOKEN` has correct permissions | |
| 188 | +| Can't merge PRs | Ensure token has `contents: write` permission | |
| 189 | +| Can't push fixes | Check token has write access to target repos | |
| 190 | +| Claude API errors | Verify `ANTHROPIC_API_KEY` is valid | |
| 191 | +| Rate limits | Reduce monitoring frequency in workflow cron schedule | |
113 | 192 |
|
114 | | -For issues, questions, or contributions: |
115 | | -- Open an issue in this repository |
116 | | -- Check workflow logs for error details |
117 | | -- Review PR comments for bot activity |
118 | | -- Contact AI Engineering team for urgent issues |
| 193 | +See [Setup Guide](docs/setup.md) for detailed troubleshooting. |
119 | 194 |
|
120 | 195 | --- |
121 | 196 |
|
|