test: reduce /create-bug root cause analysis cost with two-phase approach #26829

Open
chrisleewilcox wants to merge 1 commit into main from MMQA-1530-two-phase-root-cause-analysis

@chrisleewilcox chrisleewilcox commented Mar 2, 2026

Description

The root cause analysis in /create-bug (Step 6) currently runs entirely on Opus — all file reads, grep searches, git log/blame calls, and reasoning happen in one expensive context. This refactors Step 6 into a two-phase approach:

  • Phase 1 (Gather Evidence): A Haiku subagent (model: "haiku") performs all the I/O-heavy work — grep/glob/read for code tracing, git log for regression PRs, and pattern searching for scope analysis. Returns a structured summary.
  • Phase 2 (Analyze & Post): The main Opus context receives the distilled summary and does the reasoning — determines root cause, identifies regression PRs, assesses scope, files separate bugs, and posts the comment.

This reduces cost by ~5-8x per analysis while maintaining output quality since reasoning stays on Opus.
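
The two-phase split described above can be sketched roughly as follows. This is an illustrative Python outline, not the contents of the command file: `EvidenceSummary`, `run_two_phase_analysis`, and the injected `gather`/`analyze` callables are hypothetical names, the file path is a placeholder, and the stand-in functions take the place of real model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EvidenceSummary:
    """Structured result of Phase 1 (field names are illustrative assumptions)."""
    relevant_files: List[str] = field(default_factory=list)
    code_flow_trace: str = ""
    git_history_findings: List[str] = field(default_factory=list)
    pattern_matches: List[str] = field(default_factory=list)


def run_two_phase_analysis(
    bug_description: str,
    gather: Callable[[str, str], EvidenceSummary],
    analyze: Callable[[str, EvidenceSummary], str],
    gather_model: str = "haiku",
) -> str:
    """Gather evidence on the cheaper model, then reason over the summary."""
    # Phase 1: I/O-heavy work (grep/glob/read, git log) runs in the subagent.
    summary = gather(gather_model, bug_description)
    # Phase 2: the main context sees only the distilled summary,
    # not the raw file reads and search output.
    return analyze(bug_description, summary)


# Stand-ins for the real model calls so the sketch runs on its own.
def fake_gather(model: str, bug: str) -> EvidenceSummary:
    return EvidenceSummary(
        relevant_files=["app/example.ts"],  # placeholder path
        code_flow_trace=f"traced by {model}",
    )


def fake_analyze(bug: str, summary: EvidenceSummary) -> str:
    return f"Root cause for '{bug}' based on {len(summary.relevant_files)} file(s)"


comment = run_two_phase_analysis("crash on launch", fake_gather, fake_analyze)
print(comment)
```

Passing the two phases in as callables keeps the cost boundary explicit: only what `gather` returns ever reaches the more expensive reasoning step.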

| | Opus (current) | Haiku gathering + Opus reasoning |
|---|---|---|
| Gathering phase tokens | ~100-150k on Opus | ~100-150k on Haiku |
| Reasoning phase tokens | (included above) | ~10-20k on Opus |
| Estimated cost | $1-3 | ~$0.15-0.40 |
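
As a quick consistency check on the ~5-8x figure, the estimated costs in the table can be compared directly. The snippet below is only arithmetic on the numbers quoted above; pairing the low estimates together and the high estimates together is an assumption about how the ranges line up.

```python
# Estimated cost per analysis in USD, copied from the table above as (low, high).
opus_only = (1.00, 3.00)
two_phase = (0.15, 0.40)

# Compare like with like: low estimate against low, high against high.
ratio_low = opus_only[0] / two_phase[0]
ratio_high = opus_only[1] / two_phase[1]

print(f"low-end ratio:  {ratio_low:.1f}x")
print(f"high-end ratio: {ratio_high:.1f}x")
```

Both ratios (roughly 6.7x and 7.5x) fall inside the stated ~5-8x range.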

Related issues

MMQA-1530

Manual testing steps

Feature: Two-phase root cause analysis in /create-bug

  Scenario: user opts in to root cause analysis
    Given a bug report has been created via /create-bug

    When user opts in to root cause investigation at Step 5
    Then a Haiku subagent is launched to gather evidence
    And the subagent returns a structured summary with relevant files, code flow trace, git history findings, and pattern matches
    And the main context analyzes the summary and posts a comment with all required sections
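
One way to make the "structured summary" step of this scenario checkable is a small helper that reports which expected sections are missing. This is a hypothetical sketch: the section names come from the scenario above, but the exact heading text the subagent emits, and the names `REQUIRED_SECTIONS` and `missing_sections`, are assumptions.

```python
from typing import List

# Section names taken from the scenario above; the exact heading text the
# subagent emits is an assumption.
REQUIRED_SECTIONS = [
    "Relevant files",
    "Code flow trace",
    "Git history findings",
    "Pattern matches",
]


def missing_sections(summary_text: str) -> List[str]:
    """Return required sections absent from the summary (case-insensitive)."""
    lowered = summary_text.lower()
    return [name for name in REQUIRED_SECTIONS if name.lower() not in lowered]


# A sample summary that deliberately omits the last section.
sample = """
Relevant files: (placeholder)
Code flow trace: (placeholder)
Git history findings: (placeholder)
"""
print(missing_sections(sample))  # prints ['Pattern matches']
```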

Screenshots/Recordings

Before

Root cause analysis ran entirely on Opus — all file reads, greps, git logs, and reasoning in one context.

After

Run /create-bug in Claude Code terminal, opt in to root cause analysis, and verify the Agent tool is called with model: "haiku" for evidence gathering while the final comment still contains all required sections.

Changelog

CHANGELOG entry: null

Pre-merge author checklist

  • I've followed MetaMask Contributor Docs
  • I've completed the PR template to the best of my ability
  • I've included tests if applicable
  • I've documented my code using JSDoc format if applicable
  • I've applied the right labels on the PR (see Labeling Guideline)

Pre-merge reviewer checklist

  • I've manually tested the PR (e.g. pull and round trip)
  • I confirm that this PR addresses all acceptance criteria described in the ticket

Note

Low Risk
Documentation-only change to the /create-bug Claude command; no production code paths or data handling are modified. Risk is limited to potential workflow/prompt regressions in how root-cause analysis is executed and formatted.

Overview
Refactors /create-bug Step 6 root-cause investigation guidance to a two-phase flow: a Haiku subagent performs I/O-heavy evidence gathering (code search/trace, git log regression hunting, and scope/pattern matching) and returns a structured summary, then the main context performs the reasoning and posts a single consolidated issue comment.

Adds explicit instructions for the Haiku prompt (required sections and constraints on output) and updates the main-context checklist to consume that summary to determine root cause, identify regression PRs, assess blast radius, and file/link follow-up bugs as needed.

Written by Cursor Bugbot for commit 932e944. This will update automatically on new commits.

Use a Haiku subagent for evidence gathering (file reads, grep, git log)
and reserve the main Opus context for reasoning and posting, reducing
cost by ~5-8x per analysis while maintaining output quality.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions bot commented Mar 2, 2026

CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.

@metamaskbot metamaskbot added the team-qa label Mar 2, 2026
@github-actions github-actions bot added the size-S label Mar 2, 2026

github-actions bot commented Mar 2, 2026

🔍 Smart E2E Test Selection

  • Selected E2E tags: None (no tests recommended)
  • Selected Performance tags: None (no tests recommended)
  • Risk Level: low
  • AI Confidence: 98%
🤖 AI reasoning details

E2E Test Selection:
The only changed file is .claude/commands/create-bug.md, which is a Claude AI assistant command configuration file. This file defines how Claude should help create bug reports for MetaMask Mobile - it's internal tooling documentation, not application code.

The changes restructure the "Root Cause Analysis" workflow from three sequential phases to a two-phase approach using a Haiku subagent for evidence gathering and the main context for analysis. This is purely a workflow optimization for the AI assistant.

This file:

  • Is NOT part of the MetaMask Mobile application code
  • Is NOT part of the E2E test infrastructure (tests/, page-objects, fixtures)
  • Is NOT part of CI/CD workflows (.github/workflows/)
  • Does NOT affect any user-facing functionality
  • Does NOT affect any test execution or test framework

No E2E tests are needed because this change has zero impact on the application or testing infrastructure.

Performance Test Selection:
This change is to a Claude AI assistant command configuration file (.claude/commands/create-bug.md). It has no impact on app performance, UI rendering, data loading, or any application code. No performance tests are needed.

@chrisleewilcox chrisleewilcox added the no-changelog, No QA Needed, and needs-dev-review labels Mar 2, 2026
@chrisleewilcox chrisleewilcox marked this pull request as ready for review March 2, 2026 19:38
@github-project-automation github-project-automation bot moved this to Needs dev review in PR review queue Mar 2, 2026
@chrisleewilcox chrisleewilcox changed the title from "refactor: split /create-bug root cause analysis into two-phase approach" to "test: reduce /create-bug root cause analysis cost with two-phase approach" Mar 5, 2026