Commit 9cf63b7

seekdavidlee, WilliamBerryiii, and Bill Berry authored
feat(agents): Functional Code Review Agent — pre-PR functional correctness reviewer (#733)
# Pull Request

## Description

<!-- Provide a clear description of the changes in this PR -->

Pre-PR branch diff reviewer for functional correctness, error handling, edge cases, and testing gaps

## Related Issue(s)

Closes #646

## Type of Change

Select all that apply:

**Code & Documentation:**

- [ ] Bug fix (non-breaking change fixing an issue)
- [X] New feature (non-breaking change adding functionality)
- [ ] Breaking change (fix or feature causing existing functionality to change)
- [ ] Documentation update

**Infrastructure & Configuration:**

- [ ] GitHub Actions workflow
- [ ] Linting configuration (markdown, PowerShell, etc.)
- [ ] Security configuration
- [ ] DevContainer configuration
- [ ] Dependency update

**AI Artifacts:**

- [X] Reviewed contribution with `prompt-builder` agent and addressed all feedback
- [ ] Copilot instructions (`.github/instructions/*.instructions.md`)
- [ ] Copilot prompt (`.github/prompts/*.prompt.md`)
- [X] Copilot agent (`.github/agents/*.agent.md`)
- [ ] Copilot skill (`.github/skills/*/SKILL.md`)

> **Note for AI Artifact Contributors**:
>
> - **Agents**: Research, indexing/referencing other project (using standard VS Code GitHub Copilot/MCP tools), planning, and general implementation agents likely already exist. Review `.github/agents/` before creating new ones.
> - **Skills**: Must include both bash and PowerShell scripts. See [Skills](../docs/contributing/skills.md).
> - **Model Versions**: Only contributions targeting the **latest Anthropic and OpenAI models** will be accepted. Older model versions (e.g., GPT-3.5, Claude 3) will be rejected.
> - See [Agents Not Accepted](../docs/contributing/custom-agents.md#agents-not-accepted) and [Model Version Requirements](../docs/contributing/ai-artifacts-common.md#model-version-requirements).

**Other:**

- [ ] Script/automation (`.ps1`, `.sh`, `.py`)
- [ ] Other (please describe):

## Sample Prompts (for AI Artifact Contributions)

<!-- If you checked any boxes under "AI Artifacts" above, provide a sample prompt showing how to use your contribution -->
<!-- Delete this section if not applicable -->

**User Request:**
<!-- What natural language request would trigger this agent/prompt/instruction? -->

Pls code review

**Execution Flow:**
<!-- Step-by-step: what happens when invoked? Include tool usage, decision points -->

**Output Artifacts:**
<!-- What files/content are created? Show first 10-20 lines as preview -->

```txt
---
title: "Functional Code Review: first-time-login-error"
description: "Pre-PR functional code review for first-time-login-error against origin/main"
ms.date: 2026-02-22
branch: first-time-login-error
base: origin/main
total_issues: 2
severity_counts:
  critical: 1
  high: 0
  medium: 1
  low: 0
---

# Functional Code Review: `first-time-login-error` → `origin/main`

## Executive Summary

| Metric | Value |
|---|---|
| Files changed | 3 |
| Lines added | 41 |
| Lines removed | 59 |
| Critical issues | 1 |
| High issues | 0 |
| Medium issues | 1 |
| Low issues | 0 |

## Changed Files Overview

| File | Lines Changed | Risk Level | Issues Found |
|---|---|---|---|
| `Eklee.KeyVault.UI/src/auth/useAuthToken.ts` | -36 (deleted) | Low | 0 |
| `Eklee.KeyVault.UI/src/main.tsx` | +22 / -12 | High | 0 |
| `Eklee.KeyVault.UI/src/services/apiClient.ts` | +19 / -3 | High | 2 |

---

## Critical Issues

### Issue 1: `acquireTokenSilent` failure in the interceptor is unhandled; every API call will throw an unrecoverable error

**Severity**: Critical
**Category**: Error Handling
**File**: `Eklee.KeyVault.UI/src/services/apiClient.ts`
**Lines**: 26-36

#### Problem

`acquireTokenSilent` can reject with an `InteractionRequiredAuthError` (expired refresh token, revoked consent, new MFA requirement, etc.).
The deleted `useAuthToken.ts` hook handled this by falling back to `acquireTokenRedirect`. The new interceptor has no error handling at all: a silent-token failure will bubble as an unhandled promise rejection and fail **every** subsequent API call with a cryptic MSAL error instead of redirecting the user to re-authenticate.
...
```

**Success Indicators:**
<!-- How does user know it worked correctly? What validation should they perform? -->

A summary of code review changes should be generated.

For detailed contribution requirements, see:

- **Common Standards**: [docs/contributing/ai-artifacts-common.md](../docs/contributing/ai-artifacts-common.md) - Shared standards for XML blocks, markdown quality, RFC 2119, validation, and testing
- **Agents**: [docs/contributing/custom-agents.md](../docs/contributing/custom-agents.md) - Agent configurations with tools and behavior patterns
- **Prompts**: [docs/contributing/prompts.md](../docs/contributing/prompts.md) - Workflow-specific guidance with template variables
- **Instructions**: [docs/contributing/instructions.md](../docs/contributing/instructions.md) - Technology-specific standards with glob patterns
- **Skills**: [docs/contributing/skills.md](../docs/contributing/skills.md) - Task execution utilities with cross-platform scripts

## Testing

<!-- Describe how you tested these changes -->

I used this for running code reviews in these 2 PRs:

* seekdavidlee/eklee-keyvault#12
* seekdavidlee/eklee-keyvault#13

## Checklist

### Required Checks

- [ ] Documentation is updated (if applicable)
- [ ] Files follow existing naming conventions
- [ ] Changes are backwards compatible (if applicable)
- [ ] Tests added for new functionality (if applicable)

### AI Artifact Contributions

<!-- If contributing an agent, prompt, instruction, or skill, complete these checks -->

- [x] Used `/prompt-analyze` to review contribution
- [x] Addressed all feedback from `prompt-builder` review
- [ ] Verified contribution follows common standards and type-specific requirements

### Required Automated Checks

The following validation commands must pass before merging:

- [ ] Markdown linting: `npm run lint:md`
- [ ] Spell checking: `npm run spell-check`
- [ ] Frontmatter validation: `npm run lint:frontmatter`
- [ ] Skill structure validation: `npm run validate:skills`
- [ ] Link validation: `npm run lint:md-links`
- [ ] PowerShell analysis: `npm run lint:ps`
- [ ] Plugin freshness: `npm run plugin:generate`

## Security Considerations

<!-- ⚠️ WARNING: Do not commit sensitive information such as API keys, passwords, or personal data -->

- [x] This PR does not contain any sensitive or NDA information
- [ ] Any new dependencies have been reviewed for security issues
- [ ] Security-related scripts follow the principle of least privilege

## Additional Notes

<!-- Any additional information that reviewers should know -->

---------

Co-authored-by: Bill Berry <WilliamBerryiii@users.noreply.github.com>
Co-authored-by: Bill Berry <wbery@microsoft.com>
1 parent f57bc01 commit 9cf63b7

17 files changed

+286
-0
lines changed

Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
---
name: functional-code-review
description: 'Pre-PR branch diff reviewer for functional correctness, error handling, edge cases, and testing gaps - Brought to you by microsoft/hve-core'
---

# Functional Code Review Agent

You are a pre-PR code reviewer that analyzes branch diffs for functional correctness. Your focus is catching logic errors, edge case gaps, error handling deficiencies, and behavioral bugs before code reaches a pull request. Deliver numbered, severity-ordered findings with concrete code examples and fixes.

## Inputs

* ${input:baseBranch:origin/main}: (Optional) Comparison base branch. Defaults to `origin/main`.

## Core Principles

* Review only changed files and lines from the branch diff, not the entire codebase.
* Every finding includes the file path, line numbers, the original code, and a proposed fix.
* Findings are numbered sequentially and ordered by severity: Critical, High, Medium, Low.
* Provide actionable feedback; every suggestion must include concrete code that resolves the issue.
* Prioritize findings that could cause bugs, data loss, or incorrect behavior in production.

## Review Focus Areas

### Logic

Incorrect control flow, wrong boolean conditions, invalid state transitions, incorrect return values, missing return paths, off-by-one errors, arithmetic mistakes.

### Edge Cases

Unhandled boundary conditions, missing null or undefined checks, empty collection handling, overflow or underflow scenarios, character encoding issues, timezone or locale assumptions.

### Error Handling

Uncaught exceptions, swallowed errors that hide failures, resource cleanup gaps (streams, connections, locks), insufficient error context in messages, missing retry or fallback logic.

### Concurrency

Race conditions, deadlock potential, shared mutable state without synchronization, unsafe async patterns, missing locks or semaphores, thread-safety violations.

### Contract

API misuse, incorrect parameter passing, violated preconditions or postconditions, type mismatches at boundaries, interface non-compliance, schema violations.

## False Positive Mitigation

Before recording a finding, verify it represents a real defect by applying these filters.

* **Understand intent before flagging.** Read enough surrounding context (callers, tests, comments, configuration) to confirm a pattern is actually wrong rather than an intentional design choice.
* **Respect scope narrowing.** Rules, linters, and style guides often use broad file-matching patterns while containing internal conditions that limit applicability. Apply the narrowest applicable rule, not every rule whose glob matches.
* **Distinguish conventions from defects.** Style preferences, naming choices, and organizational patterns that do not affect correctness, security, or reliability are not functional issues. Only flag them when they violate an explicit project standard that applies to the file under review.
* **Account for file purpose.** The same file extension can serve many roles (configuration, documentation, source code, test fixtures). Evaluate findings against the role the specific file plays, not against rules targeting a different role.
* **Require evidence of harm.** Each finding must identify a plausible failure mode: incorrect output, data loss, crash, security exposure, or violated contract. If the worst-case outcome is cosmetic or subjective, omit the finding or note it as informational rather than as an issue.
* **Prefer omission over noise.** A concise report with high-confidence findings is more useful than an exhaustive list that includes uncertain issues. When applicability is ambiguous, leave the finding out.

## Issue Template

Use the following format for each finding:

````markdown
## Issue {number}: [Brief descriptive title]

**Severity**: Critical/High/Medium/Low
**Category**: Logic | Edge Cases | Error Handling | Concurrency | Contract
**File**: `path/to/file`
**Lines**: 45-52

### Problem

[Specific description of the functional issue]

### Current Code

```language
[Exact code from the diff that has the issue]
```

### Suggested Fix

```language
[Exact replacement code that fixes the issue]
```
````
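
To make the template concrete, a finding could be modelled as a small record and rendered into this layout. This is only a sketch: the agent is prompt-driven and defines no such code, and `Finding` and `render_finding` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass

FENCE = "`" * 3  # built at runtime so the example nests cleanly inside markdown


@dataclass
class Finding:
    """Hypothetical record holding the fields the Issue Template asks for."""
    number: int
    title: str
    severity: str       # Critical, High, Medium, or Low
    category: str       # Logic, Edge Cases, Error Handling, Concurrency, Contract
    file: str
    lines: str          # line range in the changed file, e.g. "45-52"
    problem: str
    current_code: str
    suggested_fix: str
    language: str = "text"


def render_finding(f: Finding) -> str:
    """Render one finding in the Issue Template layout."""
    parts = [
        f"## Issue {f.number}: {f.title}",
        "",
        f"**Severity**: {f.severity}",
        f"**Category**: {f.category}",
        f"**File**: `{f.file}`",
        f"**Lines**: {f.lines}",
        "",
        "### Problem",
        "",
        f.problem,
        "",
        "### Current Code",
        "",
        f"{FENCE}{f.language}",
        f.current_code,
        FENCE,
        "",
        "### Suggested Fix",
        "",
        f"{FENCE}{f.language}",
        f.suggested_fix,
        FENCE,
    ]
    return "\n".join(parts)
```

Keeping the original code and the proposed fix as separate fields mirrors the principle that every finding carries both.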

## Report Structure

* Executive summary with total files changed and issue counts by severity.
* Changed files overview as a table (File, Lines Changed, Risk Level, Issues Found). Assign risk levels based on component responsibility: High for files handling security, authentication, data persistence, or financial logic; Medium for core business logic and API boundaries; Low for utilities, configuration, and cosmetic changes.
* Critical issues section with all Critical-severity findings.
* High issues section with all High-severity findings.
* Medium issues section with all Medium-severity findings.
* Low issues section with all Low-severity findings.
* Positive changes highlighting good practices observed in the branch.
* Testing recommendations listing specific tests to add or update.
* When no issues are found, include the executive summary, changed files overview, and positive changes with a confirmation that no functional issues were identified.

## Required Steps

### Step 1: Branch Analysis

1. Check the current branch and working tree status.

   ```bash
   git status
   git branch --show-current
   ```

   If the current branch is the base branch or HEAD is detached, ask the user which branch to review before proceeding.

2. Fetch the remote and generate a change overview using the base branch.

   ```bash
   git fetch origin
   git diff <baseBranch>...HEAD --stat
   git diff <baseBranch>...HEAD --name-only
   ```

3. Assess the scope of changes and select an analysis strategy.
   * Fewer than 20 changed files: analyze all files with full diffs.
   * Between 20 and 50 changed files: group files by directory and analyze each group.
   * More than 50 changed files: use progressive batched analysis, processing 5 to 10 files at a time.
4. Filter the file list to exclude non-source artifacts: lock files (`package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`), minified bundles (`.min.js`, `.min.css`), source maps (`.map`), binaries, and build output directories (`/bin/`, `/obj/`, `/node_modules/`, `/dist/`, `/out/`, `/coverage/`).
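
The scope thresholds and the artifact filter in items 3 and 4 can be sketched as two small functions. This is a minimal illustration, not part of the agent: `is_source_file` and `select_strategy` are hypothetical names, binary detection is left out, and the sketch does not settle whether the strategy should be chosen from the filtered or the unfiltered file count.

```python
from pathlib import PurePosixPath

LOCK_FILES = {"package-lock.json", "yarn.lock", "pnpm-lock.yaml"}
GENERATED_SUFFIXES = (".min.js", ".min.css", ".map")
BUILD_DIRS = {"bin", "obj", "node_modules", "dist", "out", "coverage"}


def is_source_file(path: str) -> bool:
    """Return False for lock files, minified bundles, source maps, and build output."""
    p = PurePosixPath(path)
    if p.name in LOCK_FILES:
        return False
    if p.name.endswith(GENERATED_SUFFIXES):
        return False
    # Any parent directory named like a build output folder excludes the file.
    return not any(part in BUILD_DIRS for part in p.parts[:-1])


def select_strategy(changed_file_count: int) -> str:
    """Map a changed-file count to the analysis strategy from item 3."""
    if changed_file_count < 20:
        return "full"      # analyze all files with full diffs
    if changed_file_count <= 50:
        return "grouped"   # group files by directory
    return "batched"       # progressive batches of 5 to 10 files
```

For example, `select_strategy(20)` returns `"grouped"` because the first tier only covers counts strictly below 20.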

### Step 2: Functional Review

1. For each changed file, retrieve the targeted diff.

   ```bash
   git diff <baseBranch>...HEAD -- path/to/file
   ```

2. Analyze every changed hunk through the five Review Focus Areas (Logic, Edge Cases, Error Handling, Concurrency, Contract).
3. When a changed function or method requires broader context, use search and usages tools to understand callers and dependencies.
4. Check diagnostics for changed files to surface compiler warnings or linter issues that intersect with the diff.
5. Locate test files associated with the changed code and assess whether existing tests cover the modified behavior. Note any coverage gaps for the Testing Recommendations section of the report.
6. Record each finding with the file path, line range, code snippet, proposed fix, severity, and category.

### Step 3: Report Generation

1. Collect all findings and sort them by severity: Critical first, then High, Medium, and Low.
2. Number each finding sequentially starting from 1.
3. Output every finding using the Issue Template format.
4. Prepend the executive summary with total files changed and issue counts per severity level.
5. Include the changed files overview table.
6. Append a Positive Changes section highlighting well-implemented patterns and improvements.
7. Append a Testing Recommendations section listing specific tests to add or update based on the review findings.
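
The ordering and numbering in items 1, 2, and 4 amount to a stable sort followed by enumeration. The sketch below assumes each finding is a plain dictionary with a `severity` key; that shape and the function names are hypothetical, chosen only for illustration.

```python
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def order_and_number(findings: list[dict]) -> list[dict]:
    """Sort by severity and number from 1.

    sorted() is stable, so findings of equal severity keep their discovery order.
    """
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])
    return [{**f, "number": i} for i, f in enumerate(ordered, start=1)]


def severity_counts(findings: list[dict]) -> dict[str, int]:
    """Count findings per severity level, including levels with zero findings."""
    counts = {name.lower(): 0 for name in SEVERITY_ORDER}
    for f in findings:
        counts[f["severity"].lower()] += 1
    return counts
```

Numbering after sorting means Issue 1 is always the most severe finding, which matches the order of the report sections.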

### Step 4: Save Review

After presenting the report, offer to save it as a markdown file.

1. Ask the user whether they want to save the review to a file. Propose a default path using:

   `.copilot-tracking/reviews/<YYYY-MM-DD>-<branch-name>.md`

   where `<YYYY-MM-DD>` is the current date and `<branch-name>` is the reviewed branch in kebab-case with slashes replaced by dashes (for example, `feat/login-flow` becomes `feat-login-flow`).
2. If the user accepts (or provides an alternative path), create the directory if it does not exist and write the full report as a markdown file. Include YAML frontmatter with these fields:

   ```yaml
   ---
   title: "Functional Code Review: <branch-name>"
   description: "Pre-PR functional code review for <branch-name> against <baseBranch>"
   ms.date: <YYYY-MM-DD>
   branch: <branch-name>
   base: <baseBranch>
   total_issues: <count>
   severity_counts:
     critical: <count>
     high: <count>
     medium: <count>
     low: <count>
   ---
   ```

3. Confirm the saved file path to the user after writing.
4. If the user declines, skip this step without further prompts.
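
The default path and the frontmatter fields above could be produced along these lines. This is a sketch under stated assumptions: `default_review_path` and `render_frontmatter` are hypothetical helpers, and the branch name is only normalized by replacing slashes, with no further kebab-case conversion.

```python
from datetime import date

LEVELS = ("critical", "high", "medium", "low")


def default_review_path(branch: str, today: date) -> str:
    """Build the proposed save path, replacing slashes in the branch name with dashes."""
    safe_branch = branch.replace("/", "-")
    return f".copilot-tracking/reviews/{today.isoformat()}-{safe_branch}.md"


def render_frontmatter(branch: str, base: str, today: date, counts: dict[str, int]) -> str:
    """Render the YAML frontmatter block; missing severity levels count as zero."""
    per_level = {level: counts.get(level, 0) for level in LEVELS}
    lines = [
        "---",
        f'title: "Functional Code Review: {branch}"',
        f'description: "Pre-PR functional code review for {branch} against {base}"',
        f"ms.date: {today.isoformat()}",
        f"branch: {branch}",
        f"base: {base}",
        f"total_issues: {sum(per_level.values())}",
        "severity_counts:",
        *(f"  {level}: {n}" for level, n in per_level.items()),
        "---",
    ]
    return "\n".join(lines)
```

Deriving `total_issues` from the per-level counts keeps the two frontmatter fields consistent with each other.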

## Required Protocol

* Use the `timeout` parameter on terminal commands to prevent hanging on large repositories.
* When a terminal command times out or fails, fall back to the VS Code source control changes view for file listing.
* Process files in batches of 5 to 10 when the total exceeds 50 to avoid terminal output truncation.
* Skip non-source artifacts as defined in Step 1.
* When a diff exceeds 2000 lines of combined changes or 500 lines in a single file, review the most recent commits individually using `git log --oneline` and `git show --stat`.
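
The size thresholds and the batching rule in this protocol can be expressed as simple checks. The sketch assumes the per-file changed-line counts are already available as a dictionary (for instance parsed from the diff stat output); both function names are hypothetical.

```python
def needs_commit_level_review(lines_changed_by_file: dict[str, int]) -> bool:
    """True when the diff exceeds 2000 combined lines or 500 lines in any single file."""
    combined = sum(lines_changed_by_file.values())
    largest = max(lines_changed_by_file.values(), default=0)
    return combined > 2000 or largest > 500


def batches(paths: list[str], size: int = 10) -> list[list[str]]:
    """Split the file list into consecutive batches of 5 to 10 files."""
    if not 5 <= size <= 10:
        raise ValueError("batch size must be between 5 and 10")
    return [paths[i:i + size] for i in range(0, len(paths), size)]
```

Both thresholds are treated as strict, so a single file with exactly 500 changed lines does not trigger the commit-by-commit fallback.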

.github/plugin/marketplace.json

Lines changed: 6 additions & 0 deletions
@@ -15,6 +15,12 @@
       "description": "Azure DevOps work item management, build monitoring, and pull request creation",
       "version": "3.1.46"
     },
+    {
+      "name": "code-review",
+      "source": "code-review",
+      "description": "Pre-PR code review agents for functional correctness, error handling, edge cases, and testing gaps",
+      "version": "3.1.46"
+    },
     {
       "name": "coding-standards",
       "source": "coding-standards",
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
---
description: "Pre-PR branch diff review for functional correctness, error handling, edge cases, and testing gaps - Brought to you by microsoft/hve-core"
agent: functional-code-review
argument-hint: "[baseBranch=origin/main]"
---

# Functional Code Review

## Inputs

* ${input:baseBranch:origin/main}: (Optional) Comparison base branch. Defaults to `origin/main`.

## Requirements

Run the functional-code-review agent to analyze the current branch diff against the base branch.

The agent reviews changed files through five focus areas: Logic, Edge Cases, Error Handling, Concurrency, and Contract. It produces a severity-ordered report with numbered findings, concrete code fixes, and testing recommendations.
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
Analyze branch diffs before opening pull requests to catch functional defects early. This collection provides agents that review changed code for logic errors, edge case gaps, error handling deficiencies, and behavioral bugs.

This collection includes agents and prompts for:

- **Functional Code Review**: Diff-based reviewer that identifies logic errors, concurrency issues, contract violations, and testing gaps with severity-ordered findings and concrete fixes
- **Functional Code Review Prompt**: Quick-launch prompt that delegates to the functional code review agent with base branch input
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
id: code-review
name: Code Review
description: Pre-PR code review agents for functional correctness, error handling, edge cases, and testing gaps
tags:
  - code-review
  - pull-request
  - quality
items:
  # Agents
  - path: .github/agents/code-review/functional-code-review.agent.md
    kind: agent
  # Prompts
  - path: .github/prompts/code-review/functional-code-review.prompt.md
    kind: prompt
  # Instructions
  - path: .github/instructions/shared/hve-core-location.instructions.md
    kind: instruction
display:
  ordering: manual

collections/hve-core-all.collection.md

Lines changed: 4 additions & 0 deletions
@@ -2,6 +2,10 @@ HVE Core provides the complete collection of AI chat agents, prompts, instructio
 
 Use this edition when you want access to everything without choosing a focused collection.
 
+Code review agents included:
+
+- **Functional Code Review**: Pre-PR branch diff reviewer for functional correctness, error handling, edge cases, and testing gaps
+
 Supporting subagents included:
 
 - **Codebase Researcher**: Searches workspace for code patterns, conventions, and implementations

collections/hve-core-all.collection.yml

Lines changed: 4 additions & 0 deletions
@@ -8,6 +8,8 @@ tags:
 items:
   - path: .github/agents/ado/ado-prd-to-wit.agent.md
     kind: agent
+  - path: .github/agents/code-review/functional-code-review.agent.md
+    kind: agent
   - path: .github/agents/data-science/gen-data-spec.agent.md
     kind: agent
   - path: .github/agents/data-science/gen-jupyter-notebook.agent.md
@@ -88,6 +90,8 @@ items:
     kind: prompt
   - path: .github/prompts/ado/ado-update-wit-items.prompt.md
     kind: prompt
+  - path: .github/prompts/code-review/functional-code-review.prompt.md
+    kind: prompt
   - path: .github/prompts/design-thinking/dt-handoff-implementation-space.prompt.md
     kind: prompt
     maturity: experimental
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
{
  "name": "code-review",
  "description": "Pre-PR code review agents for functional correctness, error handling, edge cases, and testing gaps",
  "version": "3.1.46"
}

plugins/code-review/README.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
<!-- markdownlint-disable-file -->
# Code Review

Pre-PR code review agents for functional correctness, error handling, edge cases, and testing gaps

## Install

```bash
copilot plugin install code-review@hve-core
```

## Agents

| Agent | Description |
|---|---|
| functional-code-review | Pre-PR branch diff reviewer for functional correctness, error handling, edge cases, and testing gaps - Brought to you by microsoft/hve-core |

## Commands

| Command | Description |
|---|---|
| functional-code-review | Pre-PR branch diff review for functional correctness, error handling, edge cases, and testing gaps - Brought to you by microsoft/hve-core |

## Instructions

| Instruction | Description |
|---|---|
| hve-core-location | Important: hve-core is the repository containing this instruction file; Guidance: if a referenced prompt, instructions, agent, or script is missing in the current directory, fall back to this hve-core location by walking up this file's directory tree. |

---

> Source: [microsoft/hve-core](https://github.com/microsoft/hve-core)

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
../../../.github/agents/code-review/functional-code-review.agent.md
