nirukk52
diff --git a/.claude-skills/dev-log-monitoring_skill/SKILL.md b/.claude-skills/dev-log-monitoring_skill/SKILL.md
Lines changed: 59 additions & 1 deletion
diff --git a/.gitignore b/.gitignore
Lines changed: 5 additions & 0 deletions
diff --git a/BROWSERSTACK_MIGRATION_SUMMARY.md b/BROWSERSTACK_MIGRATION_SUMMARY.md
Lines changed: 112 additions & 0 deletions
diff --git a/FOUNDERS_NOTEPAD.md b/FOUNDERS_NOTEPAD.md
Lines changed: 3 additions & 1 deletion
diff --git a/backend/agent/README.md b/backend/agent/README.md
Lines changed: 89 additions & 0 deletions
@@ -77,12 +77,26 @@ curl -s http://localhost:5173 | head -n 1
 
 ## Step 2: Navigate and Execute Flow with Playwright MCP
 
+### 2.0 Close Existing Browser Session (MANDATORY)
+
+**ALWAYS close any existing browser tabs first to avoid "about:blank" issues:**
+
+```
+# List existing tabs
+mcp_cursor-browser-extension_browser_tabs(action: "list")
+
+# Close the first tab (repeat until the list is empty)
+mcp_cursor-browser-extension_browser_tabs(action: "close", index: 0)
+```
+
+Repeat until no tabs remain. This prevents stale browser state and ensures clean test execution.
+
 ### 2.1 Navigate to Application
 
 Use Playwright MCP to open the app:
 
 ```
-mcp_playwright_browser_navigate(url: "http://localhost:5173")
+mcp_cursor-browser-extension_browser_navigate(url: "http://localhost:5173")
 ```
 
 ### 2.2 Trigger Drift Detection
@@ -405,6 +419,50 @@ curl -s http://localhost:4000/graph/diagnostics
 grep "agent.event.screen_perceived" /tmp/backend-logs.txt
 ```
 
+### BrowserStack Session Issues
+
+**Symptom:** Agent times out during ProvisionApp or EnsureDevice with BrowserStack
+
+**Check:**
+```bash
+# Look for BrowserStack session creation
+grep "Creating Appium session" /tmp/backend-logs.txt | grep browserStack
+
+# Check for timeout errors
+grep -i "timeout\|timed out" /tmp/backend-logs.txt | grep -i browserstack
+
+# Verify device name
+grep "deviceName" /tmp/backend-logs.txt | tail -5
+```
+
+**Debug with the BrowserStack REST API:**
+```bash
+# Check recent sessions
+curl -s -u "USERNAME:KEY" "https://api-cloud.browserstack.com/app-automate/builds.json?limit=3"
+
+# Get session details (replace BUILD_ID and SESSION_ID)
+curl -s -u "USERNAME:KEY" "https://api-cloud.browserstack.com/app-automate/builds/BUILD_ID/sessions/SESSION_ID.json"
+
+# Check available devices
+curl -s -u "USERNAME:KEY" "https://api-cloud.browserstack.com/app-automate/devices.json" | grep "Samsung\|Pixel"
+```
+
+**Common Fixes:**
+1. **Invalid device name**: Query available devices via API - names are account-specific and case-sensitive
+2. **Missing APK upload**: Pre-upload APK in buildAgentContext, pass `bs://` URL to session (CRITICAL)
+3. **Session not closed**: Add `driver.deleteSession()` in Stop node handler
+4. **Timeout issues**: 60s default is sufficient (BrowserStack completes in ~40s)
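Fix 1 above can be checked in code before a session is requested. The following is a minimal sketch, assuming each entry in the `devices.json` response carries a `device` name field (other fields are ignored here). `DeviceEntry`, `DeviceLookup`, and `lookupDevice` are hypothetical names for illustration, not part of the backend.

```typescript
// Assumed shape of one entry in the devices.json response; only the name is used.
interface DeviceEntry {
  device: string;
}

interface DeviceLookup {
  exact: string | undefined; // the case-sensitive match, if any
  suggestions: string[];     // case-insensitive near matches when no exact match exists
}

// Look up a requested device name exactly, and fall back to suggestions
// so a casing mistake is easy to spot in logs.
function lookupDevice(devices: DeviceEntry[], requested: string): DeviceLookup {
  const names = devices.map((entry) => entry.device);
  const exact = names.find((name) => name === requested);
  const needle = requested.toLowerCase();
  const suggestions =
    exact === undefined
      ? names.filter((name) => name.toLowerCase().includes(needle))
      : [];
  return { exact, suggestions };
}
```

Logging `suggestions` when `exact` is undefined would surface a wrong-case device name before the session request times out.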
+
+**Artifact Locations:**
+```bash
+# Screenshots and UI XML stored by Encore
+ls -la ~/Library/Caches/encore/objects/*/artifacts/obj:/artifacts/[RUN_ID]/screenshot/
+ls -la ~/Library/Caches/encore/objects/*/artifacts/obj:/artifacts/[RUN_ID]/ui_xml/
+
+# Open most recent screenshot (sorted by modification time)
+open "$(find ~/Library/Caches/encore/objects/*/artifacts/ -name "*.png" -print0 | xargs -0 ls -t | head -1)"
+```
+
 ## Resources
 
 ### references/log_patterns.md
 
@@ -149,3 +149,8 @@ out
 # Skills (Local Development Only)
 # ============================================
 skills-main/
+
+# ============================================
+# MCP Configuration (Contains Secrets)
+# ============================================
+mcp.json
@@ -0,0 +1,112 @@
+# ✅ BrowserStack Migration Complete
+
+**Date**: 2025-11-15  
+**Branch**: `005-auto-device-provision`  
+**Status**: Ready for testing
+
+---
+
+## What Changed
+
+Replaced **local Appium + local devices** with **BrowserStack cloud device management**.
+
+### Before (Spec 001)
+- Manual Appium server management
+- Local device setup (USB, ADB)
+- Device prerequisite checks
+- 60s Appium startup timeout
+
+### After (BrowserStack)
+- ✅ Managed cloud Appium
+- ✅ Cloud devices (no USB needed)
+- ✅ No local prerequisites
+- ✅ Instant availability
+- ✅ CI/CD ready
+
+---
+
+## Required Setup
+
+Add these credentials to your `.env` file:
+
+```bash
+# BrowserStack Credentials (REQUIRED)
+BROWSERSTACK_USERNAME=your_username_here
+BROWSERSTACK_ACCESS_KEY=your_access_key_here
+
+# Optional (has default)
+BROWSERSTACK_HUB_URL=https://hub.browserstack.com/wd/hub
+```
+
+**Get credentials from**: Your BrowserStack account dashboard
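The three variables above could be validated at startup along these lines. This is a sketch only: it assumes the default hub URL is the value shown in the `.env` example, and `BrowserStackConfig` and `loadBrowserStackConfig` are hypothetical names, not the contents of `backend/config/env.ts`.

```typescript
interface BrowserStackConfig {
  username: string;
  accessKey: string;
  hubUrl: string;
}

// Assumed default, taken from the .env example above.
const DEFAULT_HUB_URL = "https://hub.browserstack.com/wd/hub";

// Read the BrowserStack settings from an env-like map and fail fast
// when a required credential is missing or empty.
function loadBrowserStackConfig(
  env: Record<string, string | undefined>,
): BrowserStackConfig {
  const username = env["BROWSERSTACK_USERNAME"];
  const accessKey = env["BROWSERSTACK_ACCESS_KEY"];
  if (!username || !accessKey) {
    const missing = ["BROWSERSTACK_USERNAME", "BROWSERSTACK_ACCESS_KEY"].filter(
      (name) => !env[name],
    );
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    username,
    accessKey,
    hubUrl: env["BROWSERSTACK_HUB_URL"] || DEFAULT_HUB_URL,
  };
}
```

In a Node process this would typically be called once as `loadBrowserStackConfig(process.env)`, so a missing credential fails at boot rather than at the first run.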
+
+---
+
+## Files Modified
+
+1. ✅ `backend/config/env.ts` - Added 3 BrowserStack env vars
+2. ✅ `backend/agent/nodes/setup/EnsureDevice/appium-lifecycle.ts` - Removed local server management
+3. ✅ `backend/agent/nodes/setup/EnsureDevice/node.ts` - Simplified to hub health check only
+4. ✅ `backend/agent/adapters/appium/webdriverio/session.adapter.ts` - Added HTTPS/path support
+5. ✅ `specs/001-automate-appium-lifecycle/BROWSERSTACK_MIGRATION.md` - Full migration docs
+
+---
+
+## Deprecated Specs
+
+- ⚠️ **Spec 001** (automate-appium-lifecycle) - No longer needed
+- ⚠️ **Spec 005** (auto-device-provision) - BrowserStack handles this
+
+---
+
+## Testing Next Steps
+
+1. **Set credentials in `.env`** (see above)
+2. **Start backend**: `cd backend && encore run`
+3. **Verify health check**: Look for "browserstack hub is healthy" in logs
+4. **Start a run**: Device session should connect to BrowserStack
+5. **Monitor logs**: Check for `actor: "browserstack-lifecycle"` entries
+
+---
+
+## Architecture Summary
+
+```
+Before:
+User → Start Run → EnsureDevice → Check ADB → Start Appium → Connect Device → Run Agent
+
+After:
+User → Start Run → EnsureDevice → Check BrowserStack Hub → Connect Cloud Device → Run Agent
+```
+
+**Key Difference**: No local device or Appium infrastructure is required. Devices and the Appium server run in BrowserStack's cloud.
+
+---
+
+## Documentation
+
+- **Full Migration Guide**: `specs/001-automate-appium-lifecycle/BROWSERSTACK_MIGRATION.md`
+- **Graphiti Memory**: Added to `group_id="screengraph"` with tags: `backend`, `agent`, `browserstack`, `spec-001-deprecated`
+
+---
+
+## Need Help?
+
+**Q: Where do I get BrowserStack credentials?**  
+A: From your BrowserStack account dashboard, or ask the project owner.
+
+**Q: Will runs still work with local devices?**  
+A: No. The system is now BrowserStack-only; local Appium is not supported.
+
+**Q: What if BrowserStack is down?**  
+A: Runs will fail with `BrowserStackUnavailableError` (retryable)
+
+**Q: How do I test locally during development?**  
+A: Use BrowserStack's cloud devices. Testing against local devices is not supported.
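Because the FAQ marks `BrowserStackUnavailableError` as retryable, a caller might wrap session creation in a bounded retry. The sketch below is an assumption about how that could look: the error class is a local stand-in for the one named above, and `withRetry`, `maxAttempts`, and `baseDelayMs` are hypothetical.

```typescript
// Local stand-in for the error named in the FAQ; the real class lives in the backend.
class BrowserStackUnavailableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "BrowserStackUnavailableError";
  }
}

// Retry an async operation with exponential backoff, but only for the
// retryable error type. Any other error is rethrown immediately.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const retryable = err instanceof BrowserStackUnavailableError;
      if (!retryable || attempt >= maxAttempts) throw err;
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Keeping the retry bounded means a prolonged BrowserStack outage still ends the run with the original error instead of hanging.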
+
+---
+
+**Ready to test!** 🚀
+
+Set your credentials and run: `cd backend && encore run`
+
@@ -86,4 +86,6 @@ Labels & Notes – Tag screens (“Paywall”, “KYC”), leave short reviews,
 
 
 Helpful commands
-open "/Users/priyankalalge/Library/Caches/encore/objects/d3u8d93djnh82bnf6l1g/artifacts/obj:/artifacts/"     
+open "/Users/priyankalalge/Library/Caches/encore/objects/d3u8d93djnh82bnf6l1g/artifacts/obj:/artifacts/"     
+
+https://developer.android.com/training/testing/ui-tests/screenshot
@@ -1,3 +1,92 @@
+# ONE GIANT MAP — TECH STACK + CLOUD + NODES + BUSINESS VALUE
+
++-----------------------------------------------------------------------------------------------------------------+
+|                                               BUSINESS LAYER                                                    |
+|  * Primary Use-Cases *                                                                                          |
+|    - Full automated mobile QA                                                                                   |
+|    - Visual regression + drift detection                                                                        |
+|    - Competitive analysis (auto-mapping competitor apps)                                                        |
+|    - CI/CD blocking on UX changes                                                                               |
+|    - Product insights: new flows, abandoned paths, UX stability                                                 |
+|    - Release-over-release diff timelines                                                                         |
+|    - PM-facing screen explorer dashboards                                                                        |
+|    - Engineering reproducible bug reports                                                                         |
+|                                                                                                                 |
+|  * Stakeholders *                                                                                               |
+|    - QA Teams | Product Managers | Founders | Designers | Analysts                                              |
++-----------------------------------------------------------------------------------------------------------------+
+                                                         |
+                                                         v
++-----------------------------------------------------------------------------------------------------------------+
+|                                          CLOUD DEVICE PROVIDERS                                                 |
+|   (Execution backend for ACT + PERCEIVE)                                                                        |
+|     - AWS Device Farm: real devices, parallel runs, screenshots, videos                                         |
+|     - BrowserStack App Automate: instant devices, Appium endpoints                                              |
+|     - Sauce Labs: automated mobile flows + advanced debugging                                                    |
+|                                                                                                                 |
+|    All feed into: ScreenGraph DriverPort → Perceive → Act                                                       |
++-----------------------------------------------------------------------------------------------------------------+
+                                                         |
+                                                         v
++-----------------------------------------------------------------------------------------------------------------+
+|                                        TOOLING + ANALYSIS LAYER                                                 |
+|   *Dynamic Crawlers & Fuzzers*                                                                                  |
+|       - DroidBot, DroidRun, Fastbot, Stoat, Ape, Monkey, MarlonTool, DroidMate                                 |
+|   *State/Model Extractors*                                                                                      |
+|       - Gator, DroidFax, FlowDroid                                                                              |
+|   *Runtime Introspection Tools*                                                                                 |
+|       - Stetho, Flipper, Facebook Infer                                                                         |
+|   *APK / XML Processing*                                                                                        |
+|       - Apktool, AXMLPrinter2, uiautomatorviewer                                                                |
+|   *LLM / AI Engines*                                                                                            |
+|       - Humanoid, GPT/LLM-based Explorers                                                                       |
+|                                                                                                                 |
+|   Role: Enrich ENUMERATE / CHOOSE / VERIFY / PERSIST / DETECTPROGRESS                                          |
++-----------------------------------------------------------------------------------------------------------------+
+                                                         |
+                                                         v
++-----------------------------------------------------------------------------------------------------------------+
+|                                        CORE SCREENGRAPH ENGINE                                                  |
+|   State-Space Engine:                                                                                           |
+|      - ScreenGraph (screenId ↔ hash)                                                                            |
+|      - ActionGraph (edges)                                                                                      |
+|      - Coverage metrics (screens, edges, paths)                                                                 |
+|      - Loop detection, stall scoring                                                                            |
+|                                                                                                                 |
+|   Visual Engine:                                                                                                |
+|      - Perceptual hashing (pHash/dHash/SSIM)                                                                    |
+|      - Pixel diffs, layout diffs, drift scoring                                                                 |
+|                                                                                                                 |
+|   Replay Engine:                                                                                                |
+|      - Deterministic reproduction of any run                                                                    |
++-----------------------------------------------------------------------------------------------------------------+
+                                                         |
+                                                         v
++-----------------------------------------------------------------------------------------------------------------+
+|                           8-NODE DETERMINISTIC LOOP (HEART OF THE SYSTEM)                                      |
+|                                                                                                                 |
+|   [1] PERCEIVE → capture screenshot + XML + hash                                                                |
+|   [2] ENUMERATE → extract actionable elements                                                                   |
+|   [3] CHOOSE → strategy/AI/coverage-guided decision                                                              |
+|   [4] ACT → execute via Appium/ADB/Cloud Device                                                                  |
+|   [5] VERIFY → confirm visual or structural change                                                               |
+|   [6] PERSIST → upsert screen/action + edges in graph                                                            |
+|   [7] DETECT PROGRESS → stall/forward/loop                                                                       |
+|   [8] SHOULD CONTINUE → continue / restart app / switch policy / stop                                            |
+|                                                                                                                 |
+|   Single writer: run_events                                                                                     |
++-----------------------------------------------------------------------------------------------------------------+
+                                                         |
+                                                         v
++-----------------------------------------------------------------------------------------------------------------+
+|                                             RUNTIME CORE                                                         |
+|   - Event log (run_events)                                                                                      |
+|   - Outbox (strict publish ordering)                                                                            |
+|   - Graph projector (screens/actions/edges)                                                                      |
+|   - Deterministic replay core                                                                                    |
++-----------------------------------------------------------------------------------------------------------------+
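The Visual Engine box lists dHash among its perceptual hashes. As an illustration of that technique only (not the project's implementation), here is a sketch that assumes the screenshot has already been converted to grayscale and downscaled to a small grid, conventionally 8 rows by 9 columns. `dHash` and `hammingDistance` are hypothetical helper names.

```typescript
// Difference hash over a grayscale grid. Each bit records whether brightness
// drops between two horizontally adjacent pixels, so an 8x9 grid yields 64 bits.
// Resizing and grayscale conversion are assumed to happen before this call.
function dHash(gray: number[][]): string {
  let bits = "";
  for (const row of gray) {
    for (let x = 0; x + 1 < row.length; x++) {
      bits += row[x] > row[x + 1] ? "1" : "0";
    }
  }
  return bits;
}

// Number of differing bits between two hashes of equal length.
// A small distance suggests the two screens are visually similar.
function hammingDistance(a: string, b: string): number {
  if (a.length !== b.length) throw new Error("hash lengths differ");
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}
```

A drift score could then be derived from the Hamming distance between the hashes of a baseline screenshot and a new one. The threshold for calling two screens "the same" is a tuning choice this sketch does not make.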
+
+
 # ScreenGraph Agent System (MVP Scaffolding)
 
 ## Service Role