Commit 06cd88f

Merge pull request #3 from EuclidStellar/llm
Documentation Added
2 parents e0955d9 + 0276c1f commit 06cd88f

1 file changed: +229 -0 lines changed

README.md

Lines changed: 229 additions & 0 deletions
@@ -1,3 +1,232 @@
# Code Review Agent

## Documentation: GitHub Models PR Review Workflow

## Overview

This GitHub Actions workflow automates **pull request (PR) code reviews** using **GitHub-hosted LLMs** (specifically `gpt-4.1-nano`) via the GitHub Models API. It uses:

* Smart diff file prioritization to manage token limits.
* Markdown-rich prompts for structured review.
* Graceful fallbacks for oversized diffs.

## Objective

To enhance pull request quality checks by automatically generating detailed, structured code reviews that:

* Flag security, performance, and maintainability issues
* Recommend improvements with code snippets
* Provide clear summaries and checklists

---

## Workflow Triggers

| Trigger             | Description                                   |
| ------------------- | --------------------------------------------- |
| `pull_request`      | Runs on PRs targeting the `init-proj` branch. |
| `workflow_dispatch` | Allows manual triggering from the GitHub UI.  |

```yaml
name: GitHub Models PR Review
on:
  pull_request:
    branches:
      - init-proj
  workflow_dispatch:
```

---

## Permissions

This workflow needs to:

* Read the repo content (`contents: read`)
* Comment on PRs (`pull-requests: write`)
* Create issues if needed (`issues: write`)
* Authenticate to the GitHub Models API with a Personal Access Token (PAT) that has the GitHub Models `read-only` permission, stored as the `GH_PAT_MODELS` secret

```yaml
permissions:
  contents: read
  pull-requests: write
  issues: write
```

---

## Job: `pr_review`

| Property        | Value              |
| --------------- | ------------------ |
| Runs on         | `ubuntu-latest`    |
| Conditional Run | PR or manual event |

### Step 1: Checkout

Fetches the full repository history:

```yaml
- name: Checkout Repository
  uses: actions/checkout@v4
  with:
    fetch-depth: 0
```

### Step 2: Generate Diff

Creates a diff from the base branch to `HEAD` and saves it to `pr_diff.txt`:

```bash
git fetch origin ${{ github.base_ref }}
git diff origin/${{ github.base_ref }}...HEAD > pr_diff.txt
```
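
`github.base_ref` is only populated for `pull_request` events, so a manual `workflow_dispatch` run would see an empty base. The sketch below shows one way such a step could fall back to a fixed branch. The helper name `resolve_base_ref` and the `init-proj` default are assumptions for illustration, not part of the original workflow.

```bash
# Hypothetical helper: pick the branch to diff against.
#   $1 = base ref from the event (empty on workflow_dispatch)
#   $2 = fallback branch (assumed; the original does not specify one)
resolve_base_ref() {
  local event_base="${1:-}"
  local fallback="${2:-init-proj}"
  if [ -n "$event_base" ]; then
    printf '%s\n' "$event_base"
  else
    printf '%s\n' "$fallback"
  fi
}

# Possible use inside the workflow step (not executed here):
#   BASE="$(resolve_base_ref "${{ github.base_ref }}")"
#   git fetch origin "$BASE"
#   git diff "origin/$BASE...HEAD" > pr_diff.txt
```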

---

## Step 3: Smart Diff Prioritization

Keeps the diff within the model's token limit (OpenAI models cap the input size):

### Logic:

| Condition                 | Action                         |
| ------------------------- | ------------------------------ |
| Diff within token limit   | Use full diff                  |
| Diff exceeds token limit  | Prioritize critical files only |
| Still exceeds safe tokens | Truncate file to \~5000 tokens |
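
The three rows above amount to a small decision function. A minimal sketch, assuming the 6000-token budget mentioned later in this document; the function name `choose_strategy` and its return labels are hypothetical:

```bash
# choose_strategy FULL_TOKENS FOCUSED_TOKENS [LIMIT]
#   FULL_TOKENS    = estimated tokens in the full diff
#   FOCUSED_TOKENS = estimated tokens after keeping only prioritized files
#   LIMIT          = token budget (6000 assumed as the default)
choose_strategy() {
  local full_tokens="$1"
  local focused_tokens="$2"
  local limit="${3:-6000}"
  if [ "$full_tokens" -le "$limit" ]; then
    echo "full"          # diff fits as-is
  elif [ "$focused_tokens" -le "$limit" ]; then
    echo "prioritized"   # only the high-priority files fit
  else
    echo "truncate"      # last resort: cut the diff by byte count
  fi
}
```

For example, `choose_strategy 9000 5500` prints `prioritized`.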

### Categories Used:

| Priority | Category      | File Types                         |
| -------- | ------------- | ---------------------------------- |
| 1        | Core Code     | `.js`, `.py`, `.java`, etc.        |
| 2        | Configuration | `Dockerfile`, `.env`, `.yml`, etc. |
| 3        | Tests/Docs    | `README`, `test_*.py`, etc.        |
| 4        | Styles/Docs   | `.css`, `.md`, `.txt`              |
| 5        | Other         | Anything else                      |

### Snippet: File Categorization

```bash
if echo "$file" | grep -qE '\.(js|jsx|ts|tsx|py|java|...)'; then
  PRIORITY=1
  CATEGORY="Core Code"
fi
```
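
The same categorization can be sketched with a `case` statement that covers every row of the table. This is an illustration only: the function name `categorize_file` and the `priority:category` output format are hypothetical, only the file types named above are included, and specific filename patterns are checked before generic extensions (an assumed ordering) so that `test_*.py` is not swallowed by `.py`.

```bash
# Hypothetical sketch: map a path to "priority:category".
categorize_file() {
  local base="${1##*/}"   # strip any directory prefix
  case "$base" in
    test_*.py|README*)                 echo "3:Tests/Docs" ;;
    Dockerfile|*.env|*.yml)            echo "2:Configuration" ;;
    *.js|*.jsx|*.ts|*.tsx|*.py|*.java) echo "1:Core Code" ;;
    *.css|*.md|*.txt)                  echo "4:Styles/Docs" ;;
    *)                                 echo "5:Other" ;;
  esac
}
```

With this ordering, `README.md` lands in Tests/Docs rather than Styles/Docs, because the `README*` pattern is tested first.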

### Token Calculation (Estimates 1 token ≈ 4 characters):

```bash
ESTIMATED_TOKENS=$((DIFF_SIZE / 4))
```
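
As a worked example of this estimate, the sketch below derives the size from a file's byte count with `wc -c`. The function name `estimate_tokens` is hypothetical, and taking the size from `wc -c` is an assumption about where `DIFF_SIZE` comes from.

```bash
# Hypothetical helper: estimate tokens in a file at ~4 characters per token.
estimate_tokens() {
  local size
  size=$(wc -c < "$1")    # byte count of the file
  echo $(( size / 4 ))    # integer division, rounds down
}
```

Under this heuristic a 20,000-byte diff comes out at 5000 tokens, and a 24,000-byte diff sits exactly at 6000.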

### Prioritized Files Output:

* Sorted by priority, then file size.
* Only included if total token count stays below `6000`.
* Review summary added to `focused_diff.txt`.
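
The selection rule above can be sketched as a sort followed by a running total. The input format (`priority tokens path`, one file per line), the function name `select_files`, and the ascending size order are assumptions for illustration; the original does not show them.

```bash
# Hypothetical sketch: read "priority tokens path" lines on stdin and
# print the paths that fit within the token budget.
select_files() {
  local limit="${1:-6000}"
  local total=0 priority tokens path
  # Sort by priority, then by size (ascending order is an assumption).
  sort -k1,1n -k2,2n | while read -r priority tokens path; do
    # Keep the file only if the running total stays below the limit.
    if [ $(( total + tokens )) -lt "$limit" ]; then
      total=$(( total + tokens ))
      printf '%s\n' "$path"
    fi
  done
}
```

A file that would push the total over the budget is skipped rather than ending the loop, so smaller lower-priority files can still be included.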

### Final Check:

If the total still exceeds 6000 tokens, the diff is cut to its first 20,000 bytes (about 5000 tokens at 4 characters per token):

```bash
head -c 20000 pr_diff.txt > truncated.txt
```

---

## Step 4: Review PR with GitHub Models

Sends the diff to the GitHub Models API and posts the model's response as a review comment on the PR.

### Prompt Structure Sent to Model:

| Section                      | Content Details                       |
| ---------------------------- | ------------------------------------- |
| `## Code Review Summary`     | High-level summary                    |
| `## Critical Issues`         | Blocking problems                     |
| `## Code Quality Analysis`   | Security, Performance, Best Practices |
| `## Detailed Findings`       | Tabular issues                        |
| `## Code Examples`           | Before/After code with explanations   |
| `## Testing Recommendations` | Test ideas                            |
| `## Documentation Notes`     | Documentation suggestions             |

### Model Call Example:

```js
fetch('https://models.github.ai/inference/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${{ secrets.GH_PAT_MODELS }}`,
    ...
  },
  body: JSON.stringify({
    model: "gpt-4.1-nano",
    messages: [...],
    temperature: 0.1,
    max_tokens: 4000
  })
})
```

---

## Fallback: Graceful Failure Handling

If the diff is too large for the model, the workflow will:

* Parse filenames from the diff
* Show summary stats (lines added/removed, file size)
* Post a fallback message with:

  * Review checklist
  * Manual review strategy

```markdown
## AI Code Review - Large Diff Analysis
...
- [ ] Check for hardcoded secrets
- [ ] Optimize database queries
...
```
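
The first two fallback steps can be sketched with standard tools, assuming `pr_diff.txt` is a unified diff as produced by `git diff`. The function names `list_changed_files` and `diff_stats` are hypothetical, and the counts are approximate: a removed line whose text itself starts with `-- ` would be mistaken for a file header and skipped.

```bash
# Hypothetical helper: list the files touched by a unified diff,
# read from the "diff --git a/<path> b/<path>" header lines.
list_changed_files() {
  sed -n 's|^diff --git a/.* b/\(.*\)$|\1|p' "$1"
}

# Hypothetical helper: count added and removed lines,
# skipping the "+++ " and "--- " file header lines.
diff_stats() {
  awk '
    /^[+][+][+] / || /^--- / { next }
    /^[+]/ { added++ }
    /^-/   { removed++ }
    END    { printf "added=%d removed=%d\n", added, removed }
  ' "$1"
}
```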

---

## Step 5: Display Final Stats

For debugging and transparency:

```bash
wc -c < pr_diff.txt  # Byte size
wc -l < pr_diff.txt  # Line count
```

Shows whether the final diff stays within the token limit (\~6700 tokens max).

---

## Summary Table

| Feature                       | Status       |
| ----------------------------- | ------------ |
| Smart diff prioritization     | ✅ Enabled    |
| File-based categorization     | ✅ Advanced   |
| Token-safe fallbacks          | ✅ Included   |
| Markdown review formatting    | ✅ Structured |
| Graceful API error handling   | ✅ Robust     |
| Final logging and diagnostics | ✅ Verbose    |

---

## Related Files Generated

| File                    | Description                       |
| ----------------------- | --------------------------------- |
| `pr_diff.txt`           | Full or focused diff for LLM      |
| `focused_diff.txt`      | Diff after smart prioritization   |
| `sorted_files.txt`      | File list sorted by priority/size |
| `priority_analysis.txt` | Token and category for each file  |