nandhabn/ai-assist

AI Flow Recorder: Chrome Extension

A Manifest V3 Chrome Extension that does two things at once:

  1. Records user interaction flows for AI-powered test-automation generation.
  2. Acts as a live AI co-pilot, predicting your next UI action in real time using a deterministic scoring engine with an optional Gemini / ChatGPT fallback.

Features

🎯 Flow Recording

  • Captures clicks, input changes, form submissions, route changes (SPA-aware), and API calls (fetch/XHR).
  • Every event includes:
    • Session ID, timestamp, URL, and route (pathname)
    • Full element metadata: tag, ID, className, innerText, name, type, role, aria-label, data-testid
    • Stable CSS selector and XPath fallback
    • API details: method, endpoint, status code, duration
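
The fields listed above could be modelled roughly as follows. This is only a sketch: the real `RecordedEvent` and `ElementMetadata` types live in `src/types/index.ts` and may differ. Every name ending in `Sketch`, the `kind` union, and the helpers `routeOf` and `makeClickEvent` are assumptions made for illustration.

```typescript
// Hypothetical shapes inferred from the feature list above;
// the real definitions live in src/types/index.ts and may differ.
type EventKindSketch = "click" | "input" | "submit" | "route_change" | "api_call";

interface ElementMetadataSketch {
  tag: string;
  id?: string;
  className?: string;
  innerText?: string;
  name?: string;
  type?: string;
  role?: string;
  ariaLabel?: string;
  testId?: string; // value of the data-testid attribute
}

interface ApiDetailsSketch {
  method: string;
  endpoint: string;
  status: number;
  durationMs: number;
}

interface RecordedEventSketch {
  sessionId: string;
  timestamp: number; // epoch milliseconds
  url: string;
  route: string; // pathname portion of the URL
  kind: EventKindSketch;
  element?: ElementMetadataSketch;
  selector?: string; // stable CSS selector
  xpath?: string; // fallback locator
  api?: ApiDetailsSketch;
}

// Extract the pathname from an absolute URL without relying on DOM globals.
// In a content script, location.pathname would serve the same purpose.
function routeOf(url: string): string {
  const match = /^[a-z][a-z0-9+.-]*:\/\/[^/?#]*([^?#]*)/i.exec(url);
  return match && match[1] ? match[1] : "/";
}

// Build a click event from already-collected element metadata.
function makeClickEvent(
  sessionId: string,
  url: string,
  element: ElementMetadataSketch,
  selector: string,
  now: number,
): RecordedEventSketch {
  return { sessionId, timestamp: now, url, route: routeOf(url), kind: "click", element, selector };
}
```

Keeping `api` and `element` optional lets one event type cover both UI interactions and intercepted network calls.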

🤖 Live AI Agent (Shadow DOM Panel)

  • Floating panel injected via Shadow DOM: zero CSS pollution.
  • Shows the top-3 predicted next actions with confidence score and per-factor score breakdown (proximity, intent, form, role, direction).
  • Highlights the predicted element on hover.
  • "Run" button to execute the predicted action directly.
  • "Fill Form" assist: detects active forms and autofills fields with smart placeholder data.

🧠 Prediction Engine

  • Deterministic-first: weighted scoring across five factors before any AI call.

    Factor      Weight
    Proximity   0.30
    Intent      0.25
    Form        0.25
    Role        0.10
    Direction   0.10
  • Hard form dominance: submit-type actions inside the active form are boosted ×1.25; out-of-form candidates are penalised ×0.75.

  • AI fallback (Gemini or ChatGPT) kicks in only when confidence < 0.2 and the AI cooldown window has passed.

  • AI results are validated by semantic similarity before overriding deterministic results.
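
The weighted scoring and form-dominance rules above can be sketched as follows. The weights and multipliers come from this README. The type names, the assumption that each factor score lies in [0, 1], the clamping step, and applying the multipliers only while a form is active are all illustrative guesses, not the actual code in `predictionEngine.ts`.

```typescript
// Factor weights as documented above (they sum to 1.0).
const WEIGHTS = { proximity: 0.3, intent: 0.25, form: 0.25, role: 0.1, direction: 0.1 } as const;

type Factor = keyof typeof WEIGHTS;
type FactorScores = Record<Factor, number>; // each score assumed to lie in [0, 1]

// Hypothetical candidate shape used only for this sketch.
interface CandidateSketch {
  selector: string;
  scores: FactorScores;
  isSubmit: boolean; // submit-type action
  inActiveForm: boolean; // located inside the currently active form
}

// Weighted sum of the five factor scores.
function baseScore(scores: FactorScores): number {
  return (Object.keys(WEIGHTS) as Factor[]).reduce((sum, f) => sum + WEIGHTS[f] * scores[f], 0);
}

// Apply form dominance when a form is active, then clamp to [0, 1].
function scoreCandidate(c: CandidateSketch, formActive: boolean): number {
  let s = baseScore(c.scores);
  if (formActive) {
    if (c.inActiveForm && c.isSubmit) s *= 1.25; // boost submit actions in the form
    else if (!c.inActiveForm) s *= 0.75; // penalise out-of-form candidates
  }
  return Math.min(1, Math.max(0, s));
}

// Rank candidates by confidence and keep the best three.
function topThree(cands: CandidateSketch[], formActive: boolean) {
  return cands
    .map((c) => ({ selector: c.selector, confidence: scoreCandidate(c, formActive) }))
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, 3);
}
```

With every factor at 0.5, the base score is 0.5. An in-form submit button then scores 0.625, while an out-of-form link scores 0.375.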

📊 Flow Analysis & AI Export (Popup)

  • Flow tab: browse recorded events, view selectors and API calls, export as JSON.
  • AI tab: auto-generated summary, structured LLM prompt for test generation, metrics, and suggested test points.
  • Export as JSON or Markdown.
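
To illustrate the two export formats, here is a minimal sketch that serialises a list of events to JSON and to a Markdown outline. The `ExportEventSketch` shape and the Markdown layout are assumptions for this example; the actual output of `aiFormatter.ts` may look different.

```typescript
// Hypothetical, trimmed-down event shape used only for this export sketch.
interface ExportEventSketch {
  kind: string;
  route: string;
  selector?: string;
  method?: string;
  endpoint?: string;
  status?: number;
}

// JSON export: pretty-printed so the file is readable in a diff.
function toJson(events: ExportEventSketch[]): string {
  return JSON.stringify(events, null, 2);
}

// Markdown export: a title, an event count, and one numbered line per event.
function toMarkdown(title: string, events: ExportEventSketch[]): string {
  const lines: string[] = [`# ${title}`, "", `Events: ${events.length}`, ""];
  events.forEach((e, i) => {
    const detail =
      e.kind === "api_call"
        ? `${e.method ?? "?"} ${e.endpoint ?? "?"} (${e.status ?? "?"})`
        : (e.selector ?? "(no selector)");
    lines.push(`${i + 1}. ${e.kind} on ${e.route}: ${detail}`);
  });
  return lines.join("\n");
}
```

The Markdown form is meant for pasting into an LLM prompt, while the JSON form keeps every field for tooling.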

Tech Stack

Layer       Technology
Language    TypeScript 5
UI (popup)  React 18, CSS modules
Build       Vite 5 (two configs: popup/background + content)
Extension   Manifest V3, service worker, content script
AI          Gemini / ChatGPT via pluggable AIProvider interface

Project Structure

chrome-extension-flow-recorder/
├── public/
│   └── manifest.json               # Extension manifest (MV3)
├── src/
│   ├── background/
│   │   └── background.ts           # Service worker: messaging, storage, tab broadcast
│   ├── config/
│   │   ├── aiConfig.ts             # API key config from VITE_* env vars
│   │   └── prompts.ts              # AI prompt templates
│   ├── content/
│   │   ├── content.ts              # Main content script: recording + agent orchestration
│   │   ├── agentPanel.ts           # Floating AI panel (Shadow DOM)
│   │   ├── agentManager.ts         # Agent lifecycle and prediction scheduling
│   │   ├── autofill.ts             # Form autofill assist
│   │   ├── chatgptBridge.ts        # Bridge script for ChatGPT tab provider
│   │   ├── execution.ts            # Execute predicted actions
│   │   ├── flyout.ts / flyout.css  # Flyout overlay UI
│   │   ├── formDetect.ts           # Active form detection
│   │   ├── prediction.ts           # Prediction wiring in content context
│   │   ├── providers.ts            # AI provider instantiation for content
│   │   ├── rateLimit.ts            # AI call rate limiting
│   │   └── state.ts                # Content-script shared state
│   ├── popup/
│   │   ├── popup.html
│   │   ├── main.tsx                # React entry point
│   │   ├── App.tsx                 # Tabs: Control | Flow | AI
│   │   ├── App.css
│   │   ├── components/
│   │   │   ├── RecorderControl.tsx # Start/Stop/Clear, Agent toggle
│   │   │   ├── FlowViewer.tsx      # Recorded event list and export
│   │   │   ├── AIPanel.tsx         # AI analysis, prompts, export
│   │   │   ├── Dashboard.tsx       # Summary dashboard
│   │   │   └── MissionBar.tsx / .css
│   │   └── styles/                 # Component CSS
│   ├── types/
│   │   ├── index.ts                # RecordedEvent, FlowNode/Edge, ACTION_TYPES, ElementMetadata
│   │   └── ai.ts                   # AIProvider, CompactContext, AIPrediction
│   ├── ui/
│   │   └── agentPanel.ts           # Panel render helpers
│   └── utils/
│       ├── storage.ts              # chrome.storage.local wrapper
│       ├── selectorGenerator.ts    # CSS selector & XPath generation
│       ├── elementAnalyzer.ts      # ElementMetadata, form helpers
│       ├── navigationDetector.ts   # SPA route-change detection
│       ├── apiInterceptor.ts       # Fetch/XHR interception
│       ├── flowAnalyzer.ts         # analyzeEventFlow, detectForms, identifyTestPoints
│       ├── aiFormatter.ts          # prepareFlowData, JSON/Markdown export
│       ├── contextBuilder.ts       # Full PageContext builder
│       ├── predictionEngine.ts     # generatePredictions, maybeUseAI, fillFormFields
│       ├── aiProviderFactory.ts    # createAIProvider(name, apiKey)
│       ├── geminiProvider.ts       # GeminiProvider
│       ├── chatgptProvider.ts      # ChatGPTProvider (OpenAI)
│       ├── chatgptTabProvider.ts   # ChatGPT via tab bridge
│       ├── novaProvider.ts         # Nova provider
│       ├── batchingProvider.ts     # Batching wrapper
│       ├── aiQueue.ts              # AI request queue
│       └── agentExecutor.ts        # Agent action executor
├── vite.config.ts                  # Popup + background build
├── vite.config.content.ts          # Content script build (does not clear dist)
├── tsconfig.json
├── package.json
├── .env                            # API keys (never commit; see below)
└── dist/                           # Build output loaded by Chrome

Setup & Installation

Prerequisites

  • Node.js 18+
  • npm
  • Chrome, Edge, or Brave (Chromium-based)

1. Install dependencies

cd chrome-extension-flow-recorder
npm install

2. Configure API keys (optional)

Create a .env file in the project root. These are build-time keys injected by Vite, so never commit this file.

VITE_GEMINI_API_KEY=your_gemini_api_key
VITE_OPENAI_API_KEY=your_openai_api_key

The Gemini key is used first; ChatGPT is the fallback. If neither key is set, the extension uses deterministic predictions only.
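
The key precedence described above could look like the sketch below. It reads "Gemini first, ChatGPT as fallback" as a simple precedence on which key is present, which is an interpretation. `chooseProvider`, `EnvSketch`, and the `"deterministic-only"` label are hypothetical names; in the extension the environment object would presumably be Vite's `import.meta.env`.

```typescript
type ProviderChoice = "gemini" | "chatgpt" | "deterministic-only";

// Subset of the build-time environment relevant to provider selection.
interface EnvSketch {
  VITE_GEMINI_API_KEY?: string;
  VITE_OPENAI_API_KEY?: string;
}

// Prefer Gemini, fall back to ChatGPT, otherwise run without any AI provider.
// Blank or whitespace-only keys are treated as missing (an assumption).
function chooseProvider(env: EnvSketch): ProviderChoice {
  if (env.VITE_GEMINI_API_KEY?.trim()) return "gemini";
  if (env.VITE_OPENAI_API_KEY?.trim()) return "chatgpt";
  return "deterministic-only";
}
```

Passing the environment in as a parameter keeps the function testable outside a Vite build.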

3. Build

npm run build

This runs two Vite builds:

  • vite build: popup (popup.html, popup.js) and background service worker (background.js).
  • vite build --config vite.config.content.ts: content script (content.js) and ChatGPT bridge (chatgptBridge.js).

Output lands in dist/.

4. Load in Chrome

  1. Open chrome://extensions/
  2. Enable Developer mode (top-right toggle)
  3. Click Load unpacked
  4. Select the dist/ folder

Development

# Watch mode: rebuilds popup/background + content script on change
npm run dev

# Type-check
npm run check

Usage

Recording a Flow

  1. Click the extension icon to open the popup.
  2. Control tab → click Start.
  3. Interact with any web page.
  4. Click Stop when done.
  5. Flow tab: browse events, view selectors, or Export JSON.

AI Analysis

  1. AI tab → view auto-generated summary, LLM prompt, metrics, and test point suggestions.
  2. Copy the structured prompt to clipboard or export as Markdown for use with any LLM.

Live Agent Panel

The floating panel appears automatically on every page once the extension is loaded. It:

  • Displays top-3 predicted next actions with confidence and score breakdown.
  • Lets you Run a prediction or Fill Form when a form is active.
  • Can be toggled from the Control tab in the popup.

Architecture Overview

Message Protocol

The background service worker brokers all communication:

Action           Sender   Effect
START_RECORDING  Popup    Persists state; broadcasts to all tabs
STOP_RECORDING   Popup    Persists state; broadcasts to all tabs
GET_EVENTS       Popup    Returns stored events
CLEAR_EVENTS     Popup    Clears events from storage
SAVE_SESSION     Popup    Appends current events as a saved session
TOGGLE_AGENT     Popup    Enables/disables the agent panel on all tabs
EVENT_RECORDED   Content  Keeps service worker alive
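
The protocol above maps naturally onto a discriminated union and a single handler. The sketch below keeps state in memory so that it runs on its own. In the extension, the handler would presumably be registered through `chrome.runtime.onMessage` and persist to `chrome.storage.local`. The `enabled` payload on `TOGGLE_AGENT`, the `BrokerState` and `BrokerReply` shapes, and the function name are assumptions.

```typescript
// One variant per action in the table above.
type BrokerMessage =
  | { action: "START_RECORDING" }
  | { action: "STOP_RECORDING" }
  | { action: "GET_EVENTS" }
  | { action: "CLEAR_EVENTS" }
  | { action: "SAVE_SESSION" }
  | { action: "TOGGLE_AGENT"; enabled: boolean } // payload field is an assumption
  | { action: "EVENT_RECORDED" };

// In-memory stand-in for the persisted extension state.
interface BrokerState {
  isRecording: boolean;
  agentEnabled: boolean;
  events: unknown[];
  sessions: unknown[][];
}

interface BrokerReply {
  ok: boolean;
  broadcast: boolean; // whether the change should be relayed to all tabs
  events?: unknown[];
}

function handleBrokerMessage(state: BrokerState, msg: BrokerMessage): BrokerReply {
  switch (msg.action) {
    case "START_RECORDING":
      state.isRecording = true;
      return { ok: true, broadcast: true };
    case "STOP_RECORDING":
      state.isRecording = false;
      return { ok: true, broadcast: true };
    case "GET_EVENTS":
      return { ok: true, broadcast: false, events: state.events.slice() };
    case "CLEAR_EVENTS":
      state.events = [];
      return { ok: true, broadcast: false };
    case "SAVE_SESSION":
      // Snapshot the current events as one saved session.
      state.sessions.push(state.events.slice());
      return { ok: true, broadcast: false };
    case "TOGGLE_AGENT":
      state.agentEnabled = msg.enabled;
      return { ok: true, broadcast: true };
    case "EVENT_RECORDED":
      // Per the table, this message only keeps the service worker alive.
      return { ok: true, broadcast: false };
  }
}
```

Because the `switch` is exhaustive over the union, adding a new action without handling it becomes a compile-time error.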

Storage Keys

All stored in chrome.storage.local:

Key                          Value
flowRecorder_events          Array of RecordedEvent
flowRecorder_sessionId       Current session ID
flowRecorder_isRecording     Boolean
flowRecorder_sessions        Saved sessions
flowRecorder_lastUserAction  Last event (for agent context)
flowRecorder_agentEnabled    Whether the agent panel is on
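
As an illustration of how these keys might be consumed, the sketch below normalises a raw key/value record into typed state with defaults. Such a record could come from `chrome.storage.local.get(Object.values(STORAGE_KEYS))`. The key strings are taken from the table; `StoredStateSketch`, `parseStoredState`, and the default values (including the agent defaulting to on) are assumptions, not the contents of `storage.ts`.

```typescript
// Key names as listed in the table above.
const STORAGE_KEYS = {
  events: "flowRecorder_events",
  sessionId: "flowRecorder_sessionId",
  isRecording: "flowRecorder_isRecording",
  sessions: "flowRecorder_sessions",
  lastUserAction: "flowRecorder_lastUserAction",
  agentEnabled: "flowRecorder_agentEnabled",
} as const;

interface StoredStateSketch {
  events: unknown[];
  sessionId: string | null;
  isRecording: boolean;
  sessions: unknown[];
  lastUserAction: unknown;
  agentEnabled: boolean;
}

// Normalise a raw storage record into typed state.
// Missing or malformed values fall back to defaults (assumed, not documented).
function parseStoredState(raw: Record<string, unknown>): StoredStateSketch {
  const events = raw[STORAGE_KEYS.events];
  const sessions = raw[STORAGE_KEYS.sessions];
  const sessionId = raw[STORAGE_KEYS.sessionId];
  return {
    events: Array.isArray(events) ? events : [],
    sessionId: typeof sessionId === "string" ? sessionId : null,
    isRecording: raw[STORAGE_KEYS.isRecording] === true,
    sessions: Array.isArray(sessions) ? sessions : [],
    lastUserAction: raw[STORAGE_KEYS.lastUserAction] ?? null,
    agentEnabled: raw[STORAGE_KEYS.agentEnabled] !== false, // assumed default: on
  };
}
```

Validating on read means a fresh install, where storage is empty, and a corrupted value both yield a usable state.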

Prediction Pipeline

mousemove / interaction event
        ↓
  buildPageContext()
        ↓
  generatePredictions(context)   ← deterministic weighted scoring
        ↓
  confidence < 0.2 AND cooldown passed?
        ↓ yes
  maybeUseAI(context)            ← Gemini / ChatGPT
        ↓
  renderAgentPanel(topThree, confidence)
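
The gating step in this pipeline, together with the semantic-similarity check mentioned under Features, can be sketched as below. The 0.2 threshold is from this README. The 5-second cooldown, the 0.3 acceptance threshold, and the use of Jaccard token overlap as a stand-in for "semantic similarity" are assumptions made for illustration only.

```typescript
// Threshold comes from the pipeline above; the cooldown length is an assumed value.
const AI_CONFIDENCE_THRESHOLD = 0.2;
const AI_COOLDOWN_MS = 5000;

interface AiGateSketch {
  shouldCall(confidence: number, now: number): boolean;
  markCalled(now: number): void;
}

// The gate opens only when confidence is low AND the cooldown has elapsed.
function createAiGate(threshold = AI_CONFIDENCE_THRESHOLD, cooldownMs = AI_COOLDOWN_MS): AiGateSketch {
  let lastCallAt = Number.NEGATIVE_INFINITY;
  return {
    shouldCall: (confidence, now) => confidence < threshold && now - lastCallAt >= cooldownMs,
    markCalled: (now) => {
      lastCallAt = now;
    },
  };
}

// Lower-cased alphanumeric tokens of a label.
function tokenSet(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

// Jaccard overlap of two labels: |A ∩ B| / |A ∪ B|. A simple stand-in metric.
function jaccard(a: string, b: string): number {
  const ta = tokenSet(a);
  const tb = tokenSet(b);
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared += 1;
  });
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Accept an AI suggestion only if it resembles at least one on-page candidate.
function acceptAiLabel(aiLabel: string, candidateLabels: string[], minSimilarity = 0.3): boolean {
  return candidateLabels.some((label) => jaccard(aiLabel, label) >= minSimilarity);
}
```

Rejecting AI output that matches no visible candidate guards against suggestions for elements that do not exist on the page.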

Security & Privacy

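  • Local storage only: recorded flows stay in chrome.storage.local. Data leaves the browser only when the AI fallback sends page context to Gemini or ChatGPT, or when you paste an exported prompt into an external LLM.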
  • API keys are embedded at build time via Vite's import.meta.env; they exist only in the extension bundle.
  • The agent panel is isolated in Shadow DOM, which keeps host-page stylesheets from restyling it. Shadow DOM is an encapsulation mechanism rather than a security boundary.
  • Elements marked [data-flow-recorder] are excluded from recording and prediction.
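
The `[data-flow-recorder]` exclusion could be implemented with a check along these lines. Whether the extension tests only the element itself or also its ancestors is not stated here; this sketch assumes ancestors count, using the standard `Element.closest()` method. `ClosestCapable` and `shouldRecord` are hypothetical names.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings;
// a real DOM Element satisfies it through Element.closest().
interface ClosestCapable {
  closest(selector: string): unknown;
}

const EXCLUDE_SELECTOR = "[data-flow-recorder]";

// Record an event only if its target exists and neither it nor any
// ancestor carries the data-flow-recorder attribute.
function shouldRecord(target: ClosestCapable | null): boolean {
  return target !== null && target.closest(EXCLUDE_SELECTOR) === null;
}
```

A listener could call `shouldRecord(event.target as Element | null)` before building a recorded event, so the extension's own UI never appears in a flow.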

Limitations

  • Recordings are scoped to the active tab (no cross-tab recording).
  • API keys are bundled into the extension, so it is for personal/dev use only. Avoid publishing to the Chrome Web Store with live API keys.
  • AI fallback requires a network connection and a valid API key.

Potential Upgrades

  • Move AI calls to the background service worker for better key isolation.
  • Adaptive weight tuning via reinforcement learning.
  • Intent drift detection.
  • Runtime provider switching in the popup UI.
  • Direct test file export (Cypress / Playwright).
  • Multi-tab recording.
  • Prediction heatmap overlay.

Built with TypeScript, React 18, Vite 5, and Chrome Manifest V3
