feat: update webapp testing skill documentation and add bug report for run page regressions

nirukk52 · nirukk52 · commit e0eb1a07b524 · 2025-11-07T22:21:51.000-06:00
- Revise SKILL.md to reflect Playwright E2E testing integration and pre-push hook automation.
- Introduce new bug report for BUG-010 detailing critical regressions on the /run page, including issues with graph events and screenshot visibility.
- Provide steps to reproduce, expected vs actual results, and proposed fixes to guide resolution efforts.
diff --git a/.DS_Store b/.DS_Store
diff --git a/.claude-skills/webapp-testing_skill/SKILL.md b/.claude-skills/webapp-testing_skill/SKILL.md
@@ -1,15 +1,73 @@
 ---
 name: webapp-testing
-description: Playwright-first testing playbook for ScreenGraph. Automate the run page, capture screenshots, verify graph events, and fall back to Cursor browser tools when you need quick manual checks.
+description: Playwright E2E testing for ScreenGraph. Tests read their configuration from .env (the package name is the main key). Automated tests run in the pre-push hook. Use Cursor browser tools for quick manual inspection.
 ---
 
 # ScreenGraph Web App Testing (Playwright + Cursor Tools)
 
-This skill consolidates the former `cursor-browser-mastery`, `cursor-chrome-window-mastery`, and legacy webapp testing guidance into a **Playwright-first** workflow. Use Playwright (via MCP or local scripts) for reliable automation, and keep Cursor’s built-in browser tools handy for fast inspection or ad-hoc verification.
+**Updated 2025-11-08**: FR-020 implemented a unified Playwright test suite that is integrated into the pre-push hook. All tests use the `.env` file as the single source of truth.
 
 ---
 
-## 1. Standard Setup (Do This First)
+## 1. Automated E2E Tests (Primary Method)
+
+### Quick Start
+```bash
+# Headed mode (visual debugging with slowMo)
+bun run test:e2e:headed
+
+# CI mode (headless, fast)
+bun run test:e2e:ci
+
+# Interactive UI mode
+bun run test:e2e:ui
+
+# All of the commands above can be run from the repository root or the frontend directory
+```
+
+### Test Configuration
+- **Package Name**: Read from `VITE_PACKAGE_NAME` in `.env` (currently: `com.jetbrains.kotlinconf`)
+- **Location**: `frontend/tests/e2e/`
+- **Config**: `frontend/playwright.config.ts` (environment-aware)
+- **Helpers**: `frontend/tests/e2e/helpers.ts` (reusable utilities)
+
+### Current Tests
+1. **Landing Page Load** - Verifies frontend health
+2. **Run Page Navigation** - Clicks "Detect My First Drift" → verifies Run Timeline heading
+
+### Pre-Push Hook Integration
+E2E tests run automatically before every push:
+```bash
+git push  # Runs smoke tests + E2E tests automatically
+```
+
+---
+
+## 2. Manual Playwright Testing (For New Tests)
+
+When creating new regression tests, place them in `frontend/tests/e2e/`:
+
+```typescript
+import { test, expect } from "@playwright/test";
+import { TEST_PACKAGE_NAME, TEST_APP_CONFIG } from "./helpers";
+
+test("my new test", async ({ page }) => {
+  await page.goto("/");
+  
+  // Your test logic here
+  // All config comes from .env via helpers
+  console.log(`Testing package: ${TEST_PACKAGE_NAME}`);
+});
+```
+
+**Key Principles:**
+- Use `.env` for all configuration (package name, Appium URL, APK path)
+- Use `helpers.ts` utilities for common operations
+- Write environment-aware tests (work in both headed and headless modes)
+
+---
+
+## 3. Cursor Browser Tools (Quick Inspection)
 1. **Start services**  
    ```bash
    .cursor/commands/start-services
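
Section 2 of the updated SKILL.md says every test takes its configuration from `.env` through `helpers.ts`, but this diff does not show that file. As a rough sketch of what such a lookup could involve, the snippet below parses `.env`-style text and resolves the package name. `parseEnv` and `resolvePackageName` are hypothetical names, not the actual exports of `helpers.ts`, and the real helper may rely on Vite or a dotenv loader instead.

```typescript
// Hypothetical sketch: not the actual contents of frontend/tests/e2e/helpers.ts.

/** Parse KEY=VALUE lines, skipping blank lines and # comments. */
function parseEnv(text: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no "=" on this line, or an empty key
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    if (value.length >= 2 && (first === '"' || first === "'") && value.endsWith(first)) {
      value = value.slice(1, -1);
    }
    env[key] = value;
  }
  return env;
}

/** Return VITE_PACKAGE_NAME, or fail loudly so a misconfigured run stops early. */
function resolvePackageName(env: Record<string, string>): string {
  const name = env["VITE_PACKAGE_NAME"];
  if (!name) {
    throw new Error("VITE_PACKAGE_NAME is missing from .env");
  }
  return name;
}

// Sample .env content using the package name quoted in SKILL.md.
const sample = [
  "# test configuration",
  "VITE_PACKAGE_NAME=com.jetbrains.kotlinconf",
].join("\n");

console.log(resolvePackageName(parseEnv(sample)));
```

Failing with an explicit error when the key is absent keeps a misconfigured `.env` from surfacing later as an unrelated test failure.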
diff --git a/jira/bugs/BUG-010-run-page-regressions/BUG-010-main.md b/jira/bugs/BUG-010-run-page-regressions/BUG-010-main.md
@@ -0,0 +1,119 @@
+# BUG-010: Run Page Regressions - Graph Events & Screenshots Not Visible
+
+> **Line Limit:** 150 lines max (enforced)  
+> **Priority**: P1  
+> **Status**: 🔴 Open
+
+---
+
+## Summary
+The `/run` page has three critical regressions preventing proper visualization of agent activity:
+1. **Graph events not visible** - Timeline events are not rendering
+2. **Screenshots not visible** - Discovered screenshot gallery is empty  
+3. **Stop node not called** - Run completion indicator never fires
+
+**Evidence**: `.playwright-mcp/drift-detection-with-screenshot.png`
+
+---
+
+## Severity / Impact
+- **Severity**: High
+- **Impact**: Complete loss of run visualization - users cannot verify agent behavior. Affects all users attempting drift detection. No workaround available.
+
+---
+
+## Environment
+- **Backend**: `encore run` (local) - Port 4000
+- **Frontend**: `localhost:5173`
+- **Browser/OS**: All browsers, all OS
+- **Package**: `com.jetbrains.kotlinconf` (from `.env`)
+
+---
+
+## Steps to Reproduce
+1. Start services: `bun run dev`
+2. Ensure Appium and a device are running
+3. Navigate to `http://localhost:5173`
+4. Click "Detect My First Drift" button
+5. Wait for navigation to `/run/{runId}` page
+6. **Observe**: Page loads but timeline empty, no screenshots, no stop node
+
+---
+
+## Expected Result
+- Timeline events appear in real-time (SSE updates)
+- Screenshot gallery populates with captured screens
+- Stop node fires when agent completes exploration
+- Run state transitions visible in UI
+
+---
+
+## Actual Result
+- Run Timeline heading visible ✅
+- Cancel Run button present ✅
+- **BUT**: No timeline events render
+- **BUT**: Screenshot gallery remains empty
+- **BUT**: Stop node never appears
+
+---
+
+## Root Cause
+**Investigation required**. Potential causes:
+1. SSE streams not connecting (check Network tab for `/run/stream`, `/graph/stream`)
+2. Backend not publishing events (check `run_event_outbox` table)
+3. Frontend not rendering events (Svelte reactivity issue)
+4. Agent not running (check backend logs for agent activity)
+
+**Diagnostic Commands** (see `.playwright-mcp/BUG-010-INVESTIGATION.md`):
+```bash
+# Check SSE streams (replace {runId} with an actual run ID)
+curl -N http://localhost:4000/run/stream/{runId}
+
+# Check backend logs
+task backend:logs | grep -i "sse\|event"
+
+# Check browser DevTools → Network tab → filter "stream"
+```
+
+---
+
+## Proposed Fix
+1. **Investigate**: Check browser DevTools Network/Console tabs
+2. **Verify**: Backend logs show event publications
+3. **Test**: SSE endpoints directly with curl
+4. **Fix**: Address root cause (SSE connection, event publishing, or rendering)
+5. **Test**: Add automated tests for event/screenshot rendering
+
+**Suggested Test Additions** (`frontend/tests/e2e/run-page.spec.ts`):
+- Wait for graph events: `await waitForGraphEvent(page, "agent.event.screenshot_captured")`
+- Verify screenshot count: `expect(await countDiscoveredScreenshots(page)).toBeGreaterThan(0)`
+- Check stop node: `await waitForStopNode(page, { timeout: 120000 })`
+
+---
+
+## Attachments / Logs
+- **Evidence**: `.playwright-mcp/drift-detection-with-screenshot.png`
+- **Investigation Guide**: `.playwright-mcp/BUG-010-INVESTIGATION.md`
+- **Test Helpers**: `frontend/tests/e2e/helpers.ts`
+
+---
+
+## Owner / Priority
+- **Reported by**: Founder (via Playwright analysis)
+- **Assigned to**: Next available frontend/fullstack engineer
+- **Priority**: P1 (High impact, core feature broken)
+
+---
+
+## Related Items
+- **Testing Infrastructure**: FR-020 (Run Page Regression Harness)
+- **Testing Guide**: `.claude-skills/webapp-testing_skill/SKILL.md`
+- **Original Report**: Request `f0164999-3a34-4705-bd7c-e426eff61c6f`
+
+---
+
+## Notes
+- Current E2E tests only verify page load and navigation to the run page, not event rendering
+- Test package: `com.jetbrains.kotlinconf` (defined in `.env`)
+- Manual reproduction requires the Appium and device services to be running
+- See investigation guide for detailed diagnostic steps
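
The first two suspected causes in the Root Cause section come down to whether events actually arrive on the SSE stream. As a minimal sketch for triaging captured `curl -N` output, the snippet below parses raw `text/event-stream` text and checks whether a given event type was seen. `parseSseStream` and `hasEventType` are hypothetical helpers, and it is an assumption that the event type travels in the SSE `event:` field; the backend may instead place it inside the JSON `data:` payload.

```typescript
// Hypothetical triage helpers; not part of the ScreenGraph codebase.

interface SseEvent {
  event: string; // "message" when the block carries no event: field
  data: string;
  id?: string;
}

/** Parse raw text/event-stream content into discrete events. */
function parseSseStream(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  let eventType = "";
  let dataLines: string[] = [];
  let id: string | undefined;

  const dispatch = (): void => {
    // A block without any data lines produces no event.
    if (dataLines.length > 0) {
      const parsed: SseEvent = { event: eventType || "message", data: dataLines.join("\n") };
      if (id !== undefined) parsed.id = id;
      events.push(parsed);
    }
    eventType = "";
    dataLines = [];
  };

  const lines = raw.split(/\r\n|\n|\r/);
  // A trailing line terminator leaves one empty string that is not a real line.
  if (lines[lines.length - 1] === "") lines.pop();

  for (const line of lines) {
    if (line === "") {
      dispatch(); // a blank line ends the current event block
      continue;
    }
    if (line.startsWith(":")) continue; // comment or keep-alive
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1);
    if (field === "event") eventType = value;
    else if (field === "data") dataLines.push(value);
    else if (field === "id") id = value;
  }
  // A final block that lacks a terminating blank line is intentionally dropped.
  return events;
}

/** True if at least one parsed event has the given type. */
function hasEventType(events: SseEvent[], type: string): boolean {
  return events.some((e) => e.event === type);
}

// Illustrative capture; the "screenId" payload field is made up for this example.
const captured =
  ": keep-alive\n" +
  "event: agent.event.screenshot_captured\n" +
  'data: {"screenId":"abc"}\n' +
  "\n" +
  "data: plain message\n" +
  "\n";

console.log(hasEventType(parseSseStream(captured), "agent.event.screenshot_captured"));
```

If the stream contains only keep-alive comments and no events, that points toward the backend not publishing; if events are present but the UI stays empty, the frontend rendering path is the more likely suspect.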