| 1 | 1 | # Code Review Agent |
| 2 | 2 | |
| 3 | 3 | |
| 4 | +## Documentation: GitHub Models PR Review Workflow |
| 5 | + |
| 6 | +## Overview |
| 7 | + |
| 8 | +This GitHub Actions workflow automates **pull request (PR) code reviews** using **GitHub-hosted LLMs** (specifically `gpt-4.1-nano`) via the GitHub Models API. It uses: |
| 9 | + |
| 10 | +* Smart diff file prioritization to manage token limits. |
| 11 | +* Markdown-rich prompts for structured review. |
| 12 | +* Graceful fallbacks for oversized diffs. |
| 13 | + |
| 14 | +## Objective |
| 15 | + |
| 16 | +To enhance pull request quality checks by automatically generating detailed, structured code reviews that: |
| 17 | + |
| 18 | +* Flag security, performance, and maintainability issues |
| 19 | +* Recommend improvements with code snippets |
| 20 | +* Provide clear summaries and checklists |
| 21 | + |
| 22 | +--- |
| 23 | + |
| 24 | +## Workflow Triggers |
| 25 | + |
| 26 | +| Trigger | Description | |
| 27 | +| ------------------- | --------------------------------------------- | |
| 28 | +| `pull_request` | Runs on PRs targeting the `init-proj` branch. | |
| 29 | +| `workflow_dispatch` | Allows manual triggering from GitHub UI. | |
| 30 | + |
| 31 | +```yaml |
| 32 | +name: GitHub Models PR Review |
| 33 | +on: |
| 34 | +  pull_request: |
| 35 | +    branches: |
| 36 | +      - init-proj |
| 37 | +  workflow_dispatch: |
| 38 | +``` |
| 39 | + |
| 40 | +--- |
| 41 | + |
| 42 | +## Permissions |
| 43 | + |
| 44 | +This workflow needs to: |
| 45 | + |
| 46 | +* Read the repository contents (`contents: read`) |
| 47 | +* Comment on PRs (`pull-requests: write`) |
| 48 | +* Create issues if needed (`issues: write`) |
| 49 | +* Call the GitHub Models API with a Personal Access Token (PAT) that has the GitHub Models `read-only` permission, stored as the `GH_PAT_MODELS` secret |
| 50 | + |
| 51 | +```yaml |
| 52 | +permissions: |
| 53 | +  contents: read |
| 54 | +  pull-requests: write |
| 55 | +  issues: write |
| 56 | +``` |
| 57 | + |
| 58 | +--- |
| 59 | + |
| 60 | +## Job: `pr_review` |
| 61 | + |
| 62 | +| Property | Value | |
| 63 | +| --------------- | ------------------ | |
| 64 | +| Runs on | `ubuntu-latest` | |
| 65 | +| Conditional Run | PR or manual event | |
| 66 | + |
| 67 | +### Step 1: Checkout |
| 68 | + |
| 69 | +Fetches the full repository history: |
| 70 | + |
| 71 | +```yaml |
| 72 | +- name: Checkout Repository |
| 73 | +  uses: actions/checkout@v4 |
| 74 | +  with: |
| 75 | +    fetch-depth: 0 |
| 76 | +``` |
| 77 | + |
| 78 | +### Step 2: Generate Diff |
| 79 | + |
| 80 | +Creates a diff from the base branch to HEAD and saves it to `pr_diff.txt`: |
| 81 | + |
| 82 | +```bash |
| 83 | +git fetch origin ${{ github.base_ref }} |
| 84 | +git diff origin/${{ github.base_ref }}...HEAD > pr_diff.txt |
| 85 | +``` |
| 86 | + |
| 87 | +--- |
| 88 | + |
| 89 | +## Step 3: Smart Diff Prioritization |
| 90 | + |
| 91 | +Keeps the request within the model's token limit (OpenAI models cap input size) by shrinking the diff before it is sent: |
| 92 | + |
| 93 | +### Logic: |
| 94 | + |
| 95 | +| Condition | Action | |
| 96 | +| ------------------------- | ------------------------------ | |
| 97 | +| Diff within token limit | Use full diff | |
| 98 | +| Diff exceeds token limit | Prioritize critical files only | |
| 99 | +| Still exceeds safe tokens | Truncate file to \~5000 tokens | |
| 100 | + |
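The decision table above could be expressed as a small shell function. This is a minimal sketch, not the workflow's actual code: the name `choose_strategy`, its two arguments, the `TOKEN_LIMIT` variable, and the returned labels are all hypothetical. Only the 6000-token budget comes from this document, and treating exactly 6000 tokens as "within the limit" is an assumption.

```bash
# Hypothetical helper: pick a strategy from two token estimates.
#   $1 = estimated tokens in the full diff
#   $2 = estimated tokens in the prioritized (focused) diff
# TOKEN_LIMIT defaults to 6000, the budget used elsewhere in this document.
choose_strategy() {
  full_tokens=$1
  focused_tokens=$2
  limit=${TOKEN_LIMIT:-6000}

  if [ "$full_tokens" -le "$limit" ]; then
    echo "full"        # diff fits: send it unchanged
  elif [ "$focused_tokens" -le "$limit" ]; then
    echo "focused"     # too big: send prioritized files only
  else
    echo "truncated"   # still too big: cut to a fixed byte count
  fi
}

choose_strategy 1200 1200     # prints: full
choose_strategy 9000 4500     # prints: focused
choose_strategy 30000 8000    # prints: truncated
```
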
| 101 | +### Categories Used: |
| 102 | + |
| 103 | +| Priority | Category | File Types | |
| 104 | +| -------- | ------------- | ---------------------------------- | |
| 105 | +| 1 | Core Code | `.js`, `.py`, `.java`, etc. | |
| 106 | +| 2 | Configuration | `Dockerfile`, `.env`, `.yml`, etc. | |
| 107 | +| 3 | Tests/Docs | `README`, `test_*.py`, etc. | |
| 108 | +| 4 | Styles/Docs | `.css`, `.md`, `.txt` | |
| 109 | +| 5 | Other | Anything else | |
| 110 | + |
| 111 | +### Snippet: File Categorization |
| 112 | + |
| 113 | +```bash |
| 114 | +if echo "$file" | grep -qE '\.(js|jsx|ts|tsx|py|java|...)'; then |
| 115 | +  PRIORITY=1 |
| 116 | +  CATEGORY="Core Code" |
| 117 | +``` |
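For illustration, the category table can also be written as a single `case` statement. This is a sketch under stated assumptions, not the workflow's implementation: `categorize_file` is a hypothetical name, only the file types named in the table are listed (the workflow's full pattern list is elided above), and checking test and README patterns before the plain extension match is an assumption made so that `test_*.py` lands in category 3 rather than category 1.

```bash
# Hypothetical helper: print "<priority> <category>" for a file path.
# Patterns are limited to the examples named in the category table.
categorize_file() {
  name=$(basename "$1")
  case "$name" in
    test_*.py|README*)      echo "3 Tests/Docs" ;;      # checked first (assumption)
    Dockerfile|.env|*.yml)  echo "2 Configuration" ;;
    *.js|*.py|*.java)       echo "1 Core Code" ;;
    *.css|*.md|*.txt)       echo "4 Styles/Docs" ;;
    *)                      echo "5 Other" ;;
  esac
}

categorize_file src/app.py          # prints: 1 Core Code
categorize_file tests/test_app.py   # prints: 3 Tests/Docs
categorize_file Dockerfile          # prints: 2 Configuration
```
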
| 118 | + |
| 119 | +### Token Calculation (Estimates 1 token ≈ 4 characters): |
| 120 | + |
| 121 | +```bash |
| 122 | +ESTIMATED_TOKENS=$((DIFF_SIZE / 4)) |
| 123 | +``` |
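A self-contained version of the estimate above might look like the following. It is only a sketch: `estimate_tokens` is a hypothetical name, and the demo file is generated on the fly purely to show the arithmetic. The divide-by-4 rule is the one stated in this section.

```bash
# Hypothetical helper: estimate tokens from a file's byte size
# using the rule of thumb 1 token = about 4 characters.
estimate_tokens() {
  size=$(wc -c < "$1")
  echo $((size / 4))    # integer division, rounds down
}

# Demo on a temporary 20000-byte file.
tmp=$(mktemp)
head -c 20000 /dev/zero > "$tmp"
estimate_tokens "$tmp"   # prints: 5000
rm -f "$tmp"
```

The same arithmetic explains why a 20,000-byte truncation corresponds to roughly 5000 tokens.
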
| 124 | + |
| 125 | +### Prioritized Files Output: |
| 126 | + |
| 127 | +* Sorted by priority, then file size. |
| 128 | +* Only included if total token count stays below `6000`. |
| 129 | +* Review summary added to `focused_diff.txt` |
| 130 | + |
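The selection described above can be sketched as a sort followed by a running total. Several details here are assumptions rather than facts about the workflow: the helper name `select_files`, the `priority size_bytes path` input format, ascending order by size, and the choice to skip a file that would overflow the budget and keep going instead of stopping. Paths containing spaces are not handled.

```bash
# Hypothetical helper: read "priority size_bytes path" lines on stdin and
# print the paths that fit in the token budget ($1, default 6000).
select_files() {
  budget=${1:-6000}
  sort -k1,1n -k2,2n |
    awk -v budget="$budget" '
      {
        tokens = int($2 / 4)             # 1 token = about 4 characters
        if (total + tokens <= budget) {  # skip files that would overflow
          total += tokens
          print $3
        }
      }'
}

printf '%s\n' \
  '4 2000 docs/guide.md' \
  '1 16000 src/app.py' \
  '2 400 Dockerfile' \
  '1 12000 src/util.js' |
  select_files 6000
# prints src/util.js, Dockerfile, docs/guide.md (src/app.py would exceed the budget)
```
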
| 131 | +### Final Check: |
| 132 | + |
| 133 | +If the total still exceeds 6000 tokens, truncate the diff to 20,000 bytes (about 5000 tokens at 4 characters per token): |
| 134 | + |
| 135 | +```bash |
| 136 | +head -c 20000 pr_diff.txt > truncated.txt |
| 137 | +``` |
| 138 | + |
| 139 | +--- |
| 140 | + |
| 141 | +## Step 4: Review PR with GitHub Models |
| 142 | + |
| 143 | +Sends the prepared diff to the GitHub Models API and posts the model's response as a PR review comment. |
| 144 | + |
| 145 | +### Prompt Structure Sent to Model: |
| 146 | + |
| 147 | +| Section | Content Details | |
| 148 | +| ---------------------------- | ------------------------------------- | |
| 149 | +| `## Code Review Summary` | High-level summary | |
| 150 | +| `## Critical Issues` | Blocking problems | |
| 151 | +| `## Code Quality Analysis` | Security, Performance, Best Practices | |
| 152 | +| `## Detailed Findings` | Tabular issues | |
| 153 | +| `## Code Examples` | Before/After code with explanations | |
| 154 | +| `## Testing Recommendations` | Test ideas | |
| 155 | +| `## Documentation Notes` | Documentation suggestions | |
| 156 | + |
| 157 | +### Model Call Example: |
| 158 | + |
| 159 | +```js |
| 160 | +fetch('https://models.github.ai/inference/chat/completions', { |
| 161 | +  method: 'POST', |
| 162 | +  headers: { |
| 163 | +    'Authorization': `Bearer ${{ secrets.GH_PAT_MODELS }}`, |
| 164 | +    ... |
| 165 | +  }, |
| 166 | +  body: JSON.stringify({ |
| 167 | +    model: "gpt-4.1-nano", |
| 168 | +    messages: [...], |
| 169 | +    temperature: 0.1, |
| 170 | +    max_tokens: 4000 |
| 171 | +  }) |
| 172 | +}) |
| 173 | +``` |
| 174 | + |
| 175 | +--- |
| 176 | + |
| 177 | +## Fallback: Graceful Failure Handling |
| 178 | + |
| 179 | +If the diff is too large for the model: |
| 180 | + |
| 181 | +* Parse filenames from diff |
| 182 | +* Show summary stats (lines added/removed, file size) |
| 183 | +* Post fallback message with: |
| 184 | + |
| 185 | +  * Review checklist |
| 186 | +  * Manual review strategy |
| 187 | + |
| 188 | +```markdown |
| 189 | +## AI Code Review - Large Diff Analysis |
| 190 | +... |
| 191 | +- [ ] Check for hardcoded secrets |
| 192 | +- [ ] Optimize database queries |
| 193 | +... |
| 194 | +``` |
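The first two fallback steps (parsing filenames and counting added or removed lines) could be done with `sed` and `awk` roughly as follows. This is a sketch, not the workflow's code: `changed_files` and `diff_stats` are hypothetical names, the sample diff is invented for the demo, and the filename pattern is a heuristic that assumes paths do not contain the text ` b/`.

```bash
# Hypothetical helpers for the fallback summary; names are illustrative.

# List the files touched by a unified diff ($1).
changed_files() {
  sed -n 's|^diff --git a/.* b/||p' "$1"
}

# Count files plus added/removed lines. Only lines inside hunks are
# counted, so the ---/+++ file headers are ignored.
diff_stats() {
  awk '
    /^diff --git /    { files++; in_hunk = 0; next }
    /^@@/             { in_hunk = 1; next }
    in_hunk && /^[+]/ { added++ }
    in_hunk && /^-/   { removed++ }
    END { printf "files=%d added=%d removed=%d\n", files, added, removed }
  ' "$1"
}

# Demo on an invented one-file diff.
sample=$(mktemp)
cat > "$sample" <<'EOF'
diff --git a/src/app.py b/src/app.py
--- a/src/app.py
+++ b/src/app.py
@@ -1,2 +1,3 @@
 import os
-print("old")
+print("new")
+print("extra")
EOF
changed_files "$sample"   # prints: src/app.py
diff_stats "$sample"      # prints: files=1 added=2 removed=1
rm -f "$sample"
```
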
| 195 | + |
| 196 | +--- |
| 197 | + |
| 198 | +## Step 5: Display Final Stats |
| 199 | + |
| 200 | +For debugging and transparency: |
| 201 | + |
| 202 | +```bash |
| 203 | +wc -c < pr_diff.txt # Byte size |
| 204 | +wc -l < pr_diff.txt # Line count |
| 205 | +``` |
| 206 | + |
| 207 | +Shows whether the final diff stays within the token limit (\~6700 tokens max). |
| 208 | + |
| 209 | +--- |
| 210 | + |
| 211 | +## Summary Table |
| 212 | + |
| 213 | +| Feature | Status | |
| 214 | +| ----------------------------- | ------------ | |
| 215 | +| Smart diff prioritization | ✅ Enabled | |
| 216 | +| File-based categorization | ✅ Advanced | |
| 217 | +| Token-safe fallbacks | ✅ Included | |
| 218 | +| Markdown review formatting | ✅ Structured | |
| 219 | +| Graceful API error handling | ✅ Robust | |
| 220 | +| Final logging and diagnostics | ✅ Verbose | |
| 221 | + |
| 222 | + |
| 223 | +--- |
| 224 | + |
| 225 | +## Related Files Generated |
| 226 | + |
| 227 | +| File | Description | |
| 228 | +| ----------------------- | --------------------------------- | |
| 229 | +| `pr_diff.txt` | Full or focused diff for LLM | |
| 230 | +| `focused_diff.txt` | Diff after smart prioritization | |
| 231 | +| `sorted_files.txt` | File list sorted by priority/size | |
| 232 | +| `priority_analysis.txt` | Token and category for each file | |