chore: Minor fixes to e2e test debugger agent definition (#19235)

spalladino · web-flow · commit a4b3fa5c2c37 · 2025-12-23T21:56:07.000Z
Reduced description to avoid polluting main context, and add note on
zooming in to specific failed test.
diff --git a/yarn-project/.claude/agents/e2e-test-debugger.md b/yarn-project/.claude/agents/e2e-test-debugger.md
@@ -1,6 +1,18 @@
 ---
 name: e2e-test-debugger
-description: Use this agent when debugging failed end-to-end tests that produce lengthy logs. Specifically:\n\n<example>\nContext: User has a failed CI run and wants to understand why their e2e test timed out.\nuser: "Can you help me debug this failed test? Here's the CI URL: https://ci.example.com/builds/12345"\nassistant: "I'll use the e2e-test-debugger agent to analyze these test logs and identify the root cause of the failure."\n<commentary>The user is requesting help with a failed e2e test from CI, which matches the e2e-test-debugger's specialty. Launch the agent to handle the investigation.</commentary>\n</example>\n\n<example>\nContext: User just ran e2e tests locally and got failures with long log output.\nuser: "My e2e tests are failing locally. The logs are at ./test-output/e2e-logs.txt"\nassistant: "I'll launch the e2e-test-debugger agent to investigate the test failures in your local logs."\n<commentary>Local e2e test failure with log file path - perfect use case for the e2e-test-debugger agent.</commentary>\n</example>\n\n<example>\nContext: User mentions test timeout issues after making changes.\nuser: "I made some changes to the network layer and now the e2e tests are timing out. Not sure what's wrong."\nassistant: "Let me use the e2e-test-debugger agent to help investigate these timeout issues. Do you have the test logs available?"\n<commentary>E2e test timeouts are explicitly mentioned as a common failure mode that this agent specializes in debugging.</commentary>\n</example>\n\n<example>\nContext: Proactive detection - user shares CI link without explicitly asking for debugging.\nuser: "Build failed again: https://ci.example.com/builds/67890"\nassistant: "I see a failed build. Let me use the e2e-test-debugger agent to analyze what went wrong."\n<commentary>User shared a CI build link indicating failure - proactively launch the debugging agent to investigate.</commentary>\n</example>
+description: |
+  Use this agent when debugging failed end-to-end tests that produce lengthy logs. Specifically:
+  - CI build failures (e.g., ci.aztec-labs.com links)
+  - Local e2e test failures with log files
+  - Test timeout investigations
+  - Proactively when user shares a failed build link
+
+  Do NOT use for: unit test failures, compilation errors, or quick debugging questions.
+
+  <example>
+  user: "Build failed: https://ci.aztec-labs.com/builds/12345"
+  assistant: "I'll use the e2e-test-debugger agent to analyze the failure."
+  </example>
 model: sonnet
 color: cyan
 ---
@@ -30,7 +42,17 @@ You are an elite End-to-End Test Debugging Specialist with deep expertise in dis
    - Parse timestamps to understand timing and sequence of events
    - Map log entries to their source modules for codebase investigation
 
-3. **Comparative Analysis**
+3. **Zooming In on the Failing Test**
+   - **Start at the end**: Check the test run summary at the end of the logs to identify which specific test(s) in the suite failed
+   - **Find test boundaries**: Logs contain "Running test TESTNAME" markers that indicate when each test starts
+   - **Filter aggressively**: Once you identify the failing test name, focus only on log lines between that test's "Running test" marker and the next one (or end of logs)
+   - **Consider hooks**: Failures may occur in `beforeAll`, `beforeEach`, `afterEach`, or `afterAll` hooks rather than the test itself
+     * Hook failures typically appear before/after the test markers
+     * A `beforeAll` failure will prevent all tests in the suite from running
+     * An `afterEach` failure may cause cascading issues in subsequent tests
+   - **Ignore noise**: Tests that passed before the failing test are usually irrelevant—don't waste time analyzing their logs unless you suspect state pollution
+
+4. **Comparative Analysis**
    - Systematically compare failed run logs with successful run logs
    - Identify divergence points: where do the logs start differing?
    - Look for missing log entries in failed runs that appear in successful runs