-
Notifications
You must be signed in to change notification settings - Fork 26
feat(pr-review): add sub-agent delegation for file-level reviews #164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 5 commits
df860db
e0f7681
6669953
eb1b667
7dcf8b6
199020e
0ef1afc
4f1e5da
3226964
722ea61
bc425c4
c0447ca
2b1fcf3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -24,6 +24,7 @@ Then configure the required secrets (see [Installation](#installation) below). | |||||
| - **A/B Testing**: Support for testing multiple LLM models | ||||||
| - **Review Context Awareness**: Considers previous reviews and unresolved threads | ||||||
| - **Evidence Enforcement**: Optional check that PR descriptions include concrete end-to-end proof the code works, not just test output | ||||||
| - **Sub-Agent Delegation** *(Experimental)*: Split large PR reviews across multiple sub-agents, one per file, then consolidate findings (see [Known Limitations](#known-limitations-sub-agent-delegation)) | ||||||
| - **Observability**: Optional Laminar integration for tracing and evaluation | ||||||
|
|
||||||
| ## Plugin Contents | ||||||
|
|
@@ -141,12 +142,24 @@ PR reviews are automatically triggered when: | |||||
| | `llm-base-url` | No | `''` | Custom LLM endpoint URL | | ||||||
| | `review-style` | No | `roasted` | **[DEPRECATED]** Previously chose between `standard` and `roasted` review styles. Now ignored — the styles have been merged into a single unified skill. | | ||||||
| | `require-evidence` | No | `'false'` | Require the reviewer to enforce an `Evidence` section in the PR description with end-to-end proof: screenshots/videos for frontend work, commands and runtime output for backend or scripts, and an agent conversation link when applicable. Test output alone does not qualify. | | ||||||
| | `use-sub-agents` | No | `'false'` | **(Experimental)** Enable sub-agent delegation for file-level reviews. The main agent acts as a coordinator that delegates per-file review work to `file_reviewer` sub-agents via the SDK TaskToolSet, then consolidates findings into a single PR review. Useful for large PRs with many changed files. | | ||||||
| | `extensions-repo` | No | `OpenHands/extensions` | Extensions repository | | ||||||
| | `extensions-version` | No | `main` | Git ref (tag, branch, or SHA) | | ||||||
| | `llm-api-key` | Yes | - | LLM API key | | ||||||
| | `github-token` | Yes | - | GitHub token for API access | | ||||||
| | `lmnr-api-key` | No | `''` | Laminar API key for observability | | ||||||
|
|
||||||
| ## Known Limitations: Sub-Agent Delegation | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Previous concern addressed: The Known Limitations section clearly documents the experimental nature and constraints (LLM-driven JSON parsing, potential information loss, no integration tests yet, sub-agents have no tools). This transparency is excellent for an experimental feature. |
||||||
|
|
||||||
| The `use-sub-agents` feature is **experimental** and has the following known constraints: | ||||||
|
|
||||||
| - **LLM-driven JSON parsing**: The coordinator agent relies on the LLM to parse and merge JSON responses from sub-agents. There is no code-level validation of sub-agent output, so malformed responses may cause incomplete reviews. | ||||||
| - **Potential information loss during consolidation**: When merging findings from multiple sub-agents, the coordinator may lose or deduplicate findings imperfectly, especially for cross-file issues. | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Nit: Typo - "analyse" should be "analyze" for consistency with US English used elsewhere in the codebase.
Suggested change
|
||||||
| - **No integration tests yet**: Current test coverage verifies prompt formatting only. End-to-end validation of the delegation flow requires manual workflow testing. | ||||||
| - **Sub-agents have no tools**: File reviewer sub-agents analyse the diff in their context window only — they cannot run commands or query the GitHub API. | ||||||
|
|
||||||
| These limitations are acceptable for an opt-in experimental feature and will be addressed as the feature matures. | ||||||
|
|
||||||
| ## A/B Testing Multiple Models | ||||||
|
|
||||||
| Test different LLM models by providing a comma-separated list: | ||||||
|
|
||||||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -32,6 +32,11 @@ | |||||
| REPO_NAME: Repository name in format owner/repo (required) | ||||||
| REQUIRE_EVIDENCE: Whether to require PR description evidence showing the code | ||||||
| works ('true'/'false', default: 'false') | ||||||
| USE_SUB_AGENTS: Enable sub-agent delegation for file-level reviews | ||||||
| ('true'/'false', default: 'false'). When enabled, the main agent acts | ||||||
| as a coordinator that delegates per-file review work to | ||||||
| file_reviewer sub-agents via the TaskToolSet, then consolidates | ||||||
| findings into a single GitHub PR review. | ||||||
|
|
||||||
| For setup instructions, usage examples, and GitHub Actions integration, | ||||||
| see README.md in this directory. | ||||||
|
|
@@ -50,18 +55,29 @@ | |||||
| from typing import Any | ||||||
|
|
||||||
| from lmnr import Laminar | ||||||
| from openhands.sdk import LLM, Agent, AgentContext, Conversation, get_logger | ||||||
| from openhands.sdk import ( | ||||||
| LLM, | ||||||
| Agent, | ||||||
| AgentContext, | ||||||
| Conversation, | ||||||
| Tool, | ||||||
| get_logger, | ||||||
| register_agent, | ||||||
| ) | ||||||
| from openhands.sdk.context import Skill | ||||||
| from openhands.sdk.context.skills import load_project_skills | ||||||
| from openhands.sdk.conversation import get_agent_final_response | ||||||
| from openhands.sdk.git.utils import run_git_command | ||||||
| from openhands.sdk.plugin import PluginSource | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 Critical: Partially unresolved deprecation issue. While Verify:
The previous critical review thread specifically flagged this import as problematic.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Clarification on previous review: The concern about Note: |
||||||
| from openhands.tools.delegate import DelegationVisualizer | ||||||
| from openhands.tools.preset.default import get_default_condenser, get_default_tools | ||||||
| from openhands.tools.task import TaskToolSet | ||||||
|
|
||||||
| # Add the script directory to Python path so we can import prompt.py | ||||||
| script_dir = Path(__file__).parent | ||||||
| sys.path.insert(0, str(script_dir)) | ||||||
|
|
||||||
| from prompt import format_prompt # noqa: E402 | ||||||
| from prompt import format_prompt, get_file_reviewer_skill_content # noqa: E402 | ||||||
|
|
||||||
| logger = get_logger(__name__) | ||||||
|
|
||||||
|
|
@@ -728,6 +744,7 @@ def validate_environment() -> dict[str, Any]: | |||||
| "model": os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929"), | ||||||
| "base_url": os.getenv("LLM_BASE_URL"), | ||||||
| "require_evidence": _get_bool_env("REQUIRE_EVIDENCE"), | ||||||
| "use_sub_agents": _get_bool_env("USE_SUB_AGENTS"), | ||||||
| "pr_info": { | ||||||
| "number": os.getenv("PR_NUMBER"), | ||||||
| "title": os.getenv("PR_TITLE"), | ||||||
|
|
@@ -763,6 +780,53 @@ def fetch_pr_context(pr_number: str) -> tuple[str, str, str]: | |||||
| return pr_diff, commit_id, review_context | ||||||
|
|
||||||
|
|
||||||
| def _create_file_reviewer_agent(llm: LLM) -> Agent: | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Clarification on previous review: The factory function signature is correct. I verified that |
||||||
| """Factory for file_reviewer sub-agents used during delegation. | ||||||
|
|
||||||
| Each sub-agent receives a skill that defines its review persona and | ||||||
| expected output format. It has no tools — the coordinator handles | ||||||
| all GitHub API interaction. | ||||||
| """ | ||||||
| # review_style is read at registration time from the environment | ||||||
| review_style = os.getenv("REVIEW_STYLE", "standard").lower() | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 Suggestion: The deprecated Consider either:
Not blocking, but worth clarifying the intended behavior.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Acceptable: Using the deprecated |
||||||
| skill_content = get_file_reviewer_skill_content(review_style) | ||||||
|
|
||||||
| skills = [ | ||||||
| Skill( | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟠 Important: Verify the factory function signature matches The function accepts If the signature is wrong, agent instantiation will fail at runtime with no static type checking to catch it.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Nit: Spelling consistency - use US English to match the rest of the codebase (README uses "analyze").
Suggested change
|
||||||
| name="file_review_instructions", | ||||||
| content=skill_content, | ||||||
| trigger=None, | ||||||
| ), | ||||||
| ] | ||||||
| return Agent( | ||||||
| llm=llm, | ||||||
| tools=[], # sub-agents only analyse; coordinator posts the review | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Nit: Spelling consistency - "analyse" (British) vs "analyze" (US). The README was updated to use US English ("analyze"), so this comment should match.
Suggested change
|
||||||
| agent_context=AgentContext( | ||||||
| skills=skills, | ||||||
| system_message_suffix=( | ||||||
| "You are a file-level code reviewer sub-agent. " | ||||||
| "Return findings as a JSON array. Do NOT call the GitHub API." | ||||||
| ), | ||||||
| ), | ||||||
| ) | ||||||
|
|
||||||
|
|
||||||
| def _register_sub_agents() -> None: | ||||||
| """Register the file_reviewer agent type. | ||||||
|
|
||||||
| TaskToolSet auto-registers on import, so no explicit | ||||||
| ``register_tool()`` call is needed. | ||||||
| """ | ||||||
| register_agent( | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟠 Important: Related to the deprecation concern - verify that |
||||||
| name="file_reviewer", | ||||||
| factory_func=_create_file_reviewer_agent, | ||||||
| description=( | ||||||
| "Reviews one or more files from a PR diff and returns structured " | ||||||
| "findings as a JSON array." | ||||||
| ), | ||||||
| ) | ||||||
|
|
||||||
|
|
||||||
| def create_conversation( | ||||||
| config: dict[str, Any], | ||||||
| secrets: dict[str, str], | ||||||
|
|
@@ -773,6 +837,9 @@ def create_conversation( | |||||
| handles wiring skills, MCP config, and hooks automatically. | ||||||
| Project-specific skills from the workspace are loaded separately. | ||||||
|
|
||||||
| When ``config["use_sub_agents"]`` is True the coordinator agent is | ||||||
| given the TaskToolSet so it can delegate to file_reviewer sub-agents. | ||||||
|
|
||||||
| Args: | ||||||
| config: Configuration dictionary from validate_environment() | ||||||
| secrets: Secrets to mask in output | ||||||
|
|
@@ -804,9 +871,17 @@ def create_conversation( | |||||
| skills=project_skills, | ||||||
| ) | ||||||
|
|
||||||
| tools = get_default_tools(enable_browser=False) | ||||||
|
|
||||||
| use_sub_agents = config.get("use_sub_agents", False) | ||||||
| if use_sub_agents: | ||||||
| _register_sub_agents() | ||||||
| tools.append(Tool(name=TaskToolSet.name)) | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Acceptable: No validation that sub-agent registration succeeded or that TaskToolSet.name is valid. If registration silently fails, the coordinator would try to use a non-existent tool. For an experimental feature with manual workflow testing, this is acceptable. If/when this graduates from experimental, consider adding validation: try:
_register_sub_agents()
tools.append(Tool(name=TaskToolSet.name))
logger.info(f"Sub-agent delegation enabled — {TaskToolSet.name} registered")
except Exception as e:
logger.error(f"Sub-agent registration failed: {e}")
raise |
||||||
| logger.info("Sub-agent delegation enabled — TaskToolSet added") | ||||||
|
|
||||||
| agent = Agent( | ||||||
| llm=llm, | ||||||
| tools=get_default_tools(enable_browser=False), | ||||||
| tools=tools, | ||||||
| agent_context=agent_context, | ||||||
| system_prompt_kwargs={"cli_mode": True}, | ||||||
| condenser=get_default_condenser( | ||||||
|
|
@@ -816,12 +891,18 @@ def create_conversation( | |||||
|
|
||||||
| # The plugin directory is the parent of the scripts/ directory | ||||||
| plugin_dir = script_dir.parent # plugins/pr-review/ | ||||||
| return Conversation( | ||||||
| agent=agent, | ||||||
| workspace=cwd, | ||||||
| secrets=secrets, | ||||||
| plugins=[PluginSource(source=str(plugin_dir))], | ||||||
| ) | ||||||
| conversation_kwargs: dict[str, Any] = { | ||||||
| "agent": agent, | ||||||
| "workspace": cwd, | ||||||
| "secrets": secrets, | ||||||
| "plugins": [PluginSource(source=str(plugin_dir))], | ||||||
| } | ||||||
| if use_sub_agents: | ||||||
| conversation_kwargs["visualizer"] = DelegationVisualizer( | ||||||
| name="PR Review Coordinator" | ||||||
| ) | ||||||
|
|
||||||
| return Conversation(**conversation_kwargs) | ||||||
|
|
||||||
|
|
||||||
| def run_review( | ||||||
|
|
@@ -930,9 +1011,11 @@ def main(): | |||||
| config = validate_environment() | ||||||
| pr_info = config["pr_info"] | ||||||
| require_evidence = config["require_evidence"] | ||||||
| use_sub_agents = config["use_sub_agents"] | ||||||
|
|
||||||
| logger.info(f"Reviewing PR #{pr_info['number']}: {pr_info['title']}") | ||||||
| logger.info(f"Require PR evidence: {require_evidence}") | ||||||
| logger.info(f"Sub-agent delegation: {use_sub_agents}") | ||||||
|
|
||||||
| try: | ||||||
| pr_diff, commit_id, review_context = fetch_pr_context(pr_info["number"]) | ||||||
|
|
@@ -952,6 +1035,7 @@ def main(): | |||||
| diff=pr_diff, | ||||||
| review_context=review_context, | ||||||
| require_evidence=require_evidence, | ||||||
| use_sub_agents=use_sub_agents, | ||||||
| ) | ||||||
|
|
||||||
| secrets = {} | ||||||
|
|
||||||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -11,6 +11,11 @@ | |||||||||||||||||||||||||||||||||||||||||
| - {pr_number} - The PR number | ||||||||||||||||||||||||||||||||||||||||||
| - {commit_id} - The HEAD commit SHA | ||||||||||||||||||||||||||||||||||||||||||
| - {review_context} - Previous review comments and thread resolution status | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| When sub-agent delegation is enabled, the main agent acts as a coordinator | ||||||||||||||||||||||||||||||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we have "smart activation" of delegation? Prompt the main agent so that it decides when to delegate vs when not to delegate. And maybe also add an override that forces the mode?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Addressed in 4f1e5da: added Tri-state modes:
The This comment was posted by an AI assistant (OpenHands) on behalf of a user. |
||||||||||||||||||||||||||||||||||||||||||
| that splits the diff by file and delegates individual file reviews to | ||||||||||||||||||||||||||||||||||||||||||
| sub-agents via the TaskToolSet, then consolidates results and posts the | ||||||||||||||||||||||||||||||||||||||||||
| final review. | ||||||||||||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| # Template for when there is review context available | ||||||||||||||||||||||||||||||||||||||||||
|
|
@@ -75,6 +80,76 @@ | |||||||||||||||||||||||||||||||||||||||||
| Analyze the changes and post your review using the GitHub API. | ||||||||||||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| # Prompt for the main coordinator agent when sub-agent delegation is enabled. | ||||||||||||||||||||||||||||||||||||||||||
| # The coordinator splits the diff into per-file chunks and delegates each | ||||||||||||||||||||||||||||||||||||||||||
| # to a "file_reviewer" sub-agent via the TaskToolSet, then consolidates | ||||||||||||||||||||||||||||||||||||||||||
| # and posts the review. | ||||||||||||||||||||||||||||||||||||||||||
| SUB_AGENT_PROMPT = """{skill_trigger} | ||||||||||||||||||||||||||||||||||||||||||
| /github-pr-review | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| You are a **review coordinator**. Your job is to delegate the actual file-level | ||||||||||||||||||||||||||||||||||||||||||
| review work to sub-agents and then consolidate their findings into a single | ||||||||||||||||||||||||||||||||||||||||||
| GitHub PR review. | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| ## Pull Request Information | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| - **Title**: {title} | ||||||||||||||||||||||||||||||||||||||||||
| - **Description**: {body} | ||||||||||||||||||||||||||||||||||||||||||
| - **Repository**: {repo_name} | ||||||||||||||||||||||||||||||||||||||||||
| - **Base Branch**: {base_branch} | ||||||||||||||||||||||||||||||||||||||||||
| - **Head Branch**: {head_branch} | ||||||||||||||||||||||||||||||||||||||||||
| - **PR Number**: {pr_number} | ||||||||||||||||||||||||||||||||||||||||||
| - **Commit ID**: {commit_id} | ||||||||||||||||||||||||||||||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Nit: "De-duplicate and drop low-signal noise" is subjective guidance for an LLM. One model's "noise" might be a useful finding to a human. For an experimental feature this is acceptable, but consider either:
Not a blocker given the experimental flag and documented limitations. |
||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| {review_context_section}{evidence_requirements_section} | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| ## Instructions | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| You have access to the **task** tool (TaskToolSet). Follow these steps: | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| 1. **Delegate file reviews** — for each changed file (or small group of | ||||||||||||||||||||||||||||||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 Suggestion: The coordinator prompt has significant responsibilities:
For a production feature, consider whether steps 1, 3, and 4 should be handled in code rather than relying on LLM prompt engineering. For an experimental opt-in feature this is acceptable, but worth documenting as a known limitation.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Previous concern addressed: The coordinator prompt now includes explicit error handling guidance in step 2 ("If a sub-agent returns malformed output... skip its results and note the file in the review body"). Well done. |
||||||||||||||||||||||||||||||||||||||||||
| closely related files), call the task tool with: | ||||||||||||||||||||||||||||||||||||||||||
| - `subagent_type`: `"file_reviewer"` | ||||||||||||||||||||||||||||||||||||||||||
| - `prompt`: the diff chunk for the file(s), together with the PR context | ||||||||||||||||||||||||||||||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 Suggestion: The coordinator has complex responsibilities with no explicit error handling guidance. Consider adding a step for handling malformed sub-agent responses:
Suggested change
This provides explicit guidance for handling failures, which is critical given the LLM-driven parsing noted in Known Limitations. |
||||||||||||||||||||||||||||||||||||||||||
| (title, description, base/head branch). Ask it to return a structured | ||||||||||||||||||||||||||||||||||||||||||
| list of findings with severity, file path, line number, and a short | ||||||||||||||||||||||||||||||||||||||||||
| description. | ||||||||||||||||||||||||||||||||||||||||||
| - `description`: a short label like `"Review src/utils.py"` | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| 2. **Collect results** — each task tool call returns the sub-agent's findings. | ||||||||||||||||||||||||||||||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟠 Important: The coordinator relies entirely on the LLM to parse and merge JSON responses from sub-agents with no code-level validation. This could be fragile:
Consider adding explicit error handling in the code layer, or at minimum, document these limitations in the README as known constraints of the experimental feature. |
||||||||||||||||||||||||||||||||||||||||||
| Merge them all together. De-duplicate and drop low-signal noise. | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| 3. **Post the review** — use the GitHub API (as described by /github-pr-review) | ||||||||||||||||||||||||||||||||||||||||||
| to submit a single PR review with inline comments on the relevant lines. | ||||||||||||||||||||||||||||||||||||||||||
| Keep the top-level review body brief. | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| ## Full Diff | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| The complete diff is provided below. Split it by file when delegating. | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| ```diff | ||||||||||||||||||||||||||||||||||||||||||
| {diff} | ||||||||||||||||||||||||||||||||||||||||||
| ``` | ||||||||||||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| # System-level instruction injected into each file_reviewer sub-agent so it | ||||||||||||||||||||||||||||||||||||||||||
| # knows its role, the review style, and the expected output format. | ||||||||||||||||||||||||||||||||||||||||||
| FILE_REVIEWER_SKILL = """\ | ||||||||||||||||||||||||||||||||||||||||||
| You are a **file-level code reviewer**. You will receive a diff for one or more | ||||||||||||||||||||||||||||||||||||||||||
| files from a pull request together with PR metadata. | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| Review style: {review_style_description} | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| For each issue you find, return a JSON object with: | ||||||||||||||||||||||||||||||||||||||||||
| - `path`: the file path | ||||||||||||||||||||||||||||||||||||||||||
| - `line`: the diff line number (use the NEW file line number) | ||||||||||||||||||||||||||||||||||||||||||
| - `severity`: one of `critical`, `major`, `minor`, `nit` | ||||||||||||||||||||||||||||||||||||||||||
| - `body`: a concise description of the issue with a suggested fix when possible | ||||||||||||||||||||||||||||||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟠 Important: The FILE_REVIEWER_SKILL lacks a concrete JSON schema example. The coordinator will parse LLM-generated JSON with no validation. Suggest adding an explicit example to reduce parsing ambiguity:
Suggested change
Do NOT post anything to the GitHub API — the coordinator agent will handle that. |
||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| Return your findings as a JSON array. If you find no issues, return `[]`. | ||||||||||||||||||||||||||||||||||||||||||
| Do NOT post anything to the GitHub API — the coordinator agent will handle that. | ||||||||||||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| def format_prompt( | ||||||||||||||||||||||||||||||||||||||||||
| skill_trigger: str, | ||||||||||||||||||||||||||||||||||||||||||
|
|
@@ -88,6 +163,7 @@ def format_prompt( | |||||||||||||||||||||||||||||||||||||||||
| diff: str, | ||||||||||||||||||||||||||||||||||||||||||
| review_context: str = "", | ||||||||||||||||||||||||||||||||||||||||||
| require_evidence: bool = False, | ||||||||||||||||||||||||||||||||||||||||||
| use_sub_agents: bool = False, | ||||||||||||||||||||||||||||||||||||||||||
| ) -> str: | ||||||||||||||||||||||||||||||||||||||||||
| """Format the PR review prompt with all parameters. | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
|
|
@@ -105,6 +181,9 @@ def format_prompt( | |||||||||||||||||||||||||||||||||||||||||
| the review context section is omitted from the prompt. | ||||||||||||||||||||||||||||||||||||||||||
| require_evidence: Whether to instruct the reviewer to enforce PR description | ||||||||||||||||||||||||||||||||||||||||||
| evidence showing the code works. | ||||||||||||||||||||||||||||||||||||||||||
| use_sub_agents: When True, use the sub-agent coordinator prompt instead of | ||||||||||||||||||||||||||||||||||||||||||
| the single-agent prompt. The coordinator will delegate | ||||||||||||||||||||||||||||||||||||||||||
| file-level reviews to sub-agents and consolidate results. | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| Returns: | ||||||||||||||||||||||||||||||||||||||||||
| Formatted prompt string | ||||||||||||||||||||||||||||||||||||||||||
|
|
@@ -121,7 +200,9 @@ def format_prompt( | |||||||||||||||||||||||||||||||||||||||||
| _EVIDENCE_REQUIREMENT_SECTION if require_evidence else "" | ||||||||||||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| return PROMPT.format( | ||||||||||||||||||||||||||||||||||||||||||
| template = SUB_AGENT_PROMPT if use_sub_agents else PROMPT | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| return template.format( | ||||||||||||||||||||||||||||||||||||||||||
| skill_trigger=skill_trigger, | ||||||||||||||||||||||||||||||||||||||||||
| title=title, | ||||||||||||||||||||||||||||||||||||||||||
| body=body, | ||||||||||||||||||||||||||||||||||||||||||
|
|
@@ -134,3 +215,26 @@ def format_prompt( | |||||||||||||||||||||||||||||||||||||||||
| evidence_requirements_section=evidence_requirements_section, | ||||||||||||||||||||||||||||||||||||||||||
| diff=diff, | ||||||||||||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| def get_file_reviewer_skill_content(review_style: str = "standard") -> str: | ||||||||||||||||||||||||||||||||||||||||||
| """Return the file_reviewer sub-agent skill content. | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| Args: | ||||||||||||||||||||||||||||||||||||||||||
| review_style: 'standard' or 'roasted' | ||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||
| Returns: | ||||||||||||||||||||||||||||||||||||||||||
| Formatted skill content string for the file_reviewer agent type | ||||||||||||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||||||||||||
| style_descriptions = { | ||||||||||||||||||||||||||||||||||||||||||
| "standard": ( | ||||||||||||||||||||||||||||||||||||||||||
| "Balanced review covering correctness, style, readability, " | ||||||||||||||||||||||||||||||||||||||||||
| "and security. Be constructive." | ||||||||||||||||||||||||||||||||||||||||||
| ), | ||||||||||||||||||||||||||||||||||||||||||
| "roasted": ( | ||||||||||||||||||||||||||||||||||||||||||
| "Linus Torvalds-style brutally honest review. Focus on data " | ||||||||||||||||||||||||||||||||||||||||||
| "structures, simplicity, and pragmatism. No hand-holding." | ||||||||||||||||||||||||||||||||||||||||||
| ), | ||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||
| description = style_descriptions.get(review_style, style_descriptions["standard"]) | ||||||||||||||||||||||||||||||||||||||||||
| return FILE_REVIEWER_SKILL.format(review_style_description=description) | ||||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟢 Nit: Consider adding a Known Limitations subsection under the experimental feature description that mentions:
This sets appropriate expectations for users trying the experimental feature.