Skip to content

Conversation

@Diveyam-Mishra
Copy link

@Diveyam-Mishra Diveyam-Mishra commented Oct 25, 2025

Summary Splitting Extension work from PR #87
Records iframe interactions reliably without flooding steps.
Replays deterministic steps inside the correct iframe instead of navigating into the iframe URL.
Filters ad/analytics/recaptcha iframe navigations and about:blank.
Deduplicates steps (including scroll/input/click) to match real user intent.

Extension: Content (src/entrypoints/content.ts)
Enabled all frames and about:blank: allFrames: true, matchAboutBlank: true.
Added frame context on all events: frameUrl, frameIdPath, isTopFrame.
Input noise reduction: isTrusted checks, debounce per element, skip rapid empty repeats.
New-tab intent signal: PREPARE_NEW_TAB emitted on modifier/middle-clicks and target=_blank.

Extension: Background (src/entrypoints/background.ts)
Activated tab gating + new-tab intent correlation; ignore unactivated/background tabs.
Iframe nav filtering:
Tracks interactedFrameUrls per tab; only accepts rrweb Meta navigations from frames the user interacted with.
Drops about:blank; adds time-window gating (defaults to 3s after interaction).
Allow/deny logic (allowlist overrides blocklist), with settings persisted.
rrweb payload handling:
Scroll: merge consecutive updates; suppress chrome://newtab; dedupe identical XY within 200ms regardless of targetId.
Meta: convert to navigation step only if allowed by the above rules.
Step dedupe post-pass: collapse consecutive duplicates for navigation/input/click/scroll/key_press; update timestamp/screenshot instead of appending.
Navigation consolidation: per-tab merge of redirects to reduce churn.
Persisted recording state: default stopped; load/broadcast on startup; save on start/stop.

Extension: Types/Schema
types.ts: Added optional frameIdPath to StoredCustom* and frameUrl to rrweb.
workflow-types.ts: Promoted frameIdPath to Step types (Click/Input/KeyPress/Navigation/Scroll).
@sauravpanda


Summary by cubic

Improved extension recording fidelity with iframe-aware capture and strong noise gating so steps are cleaner and replay reliably in the correct frame. Added settings to control iframe recording and domain filters, and reduced duplicate scroll/input/click/navigation events.

  • New Features

    • Capture in all frames with frameUrl/frameIdPath/isTopFrame; replay inside the right iframe.
    • Gate noise: only record activated tabs, allow/block domain lists, time-window after iframe interaction; drop about:blank/chrome://.
    • Merge and dedupe steps (scroll/input/click/key_press/navigation) and consolidate redirects per tab.
    • Options page to configure iframe recording, window gating, and filters; settings persisted via chrome.storage.
    • Broadened host permissions to inject into iframes on external sites.
  • Refactors

    • Removed extraction steps and semantic targeting fields; simplified workflow schema to core steps (navigation, click, input, key_press, scroll).
    • Streamlined sidepanel UI to focus on recording; added frame context to events and cleaned rrweb handling.

Copilot AI review requested due to automatic review settings October 25, 2025 17:56
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves recording fidelity and reduces noise by implementing iframe interaction gating, new-tab intent correlation, and aggressive deduplication. It removes extraction/semantic targeting features, adds a configurable options page for iframe/blocklist settings, and ensures all frames record events with proper frame context.

Key changes:

  • Activated tab + new-tab intent filtering to suppress ad/tracker noise
  • Iframe meta navigation gating (only allow from user-interacted frames within time window)
  • Input/scroll/navigation deduplication to collapse rapid repeats
  • Extension options page for iframe recording and domain allow/deny lists

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
extension/wxt.config.ts Added storage permission and broader host permissions for iframe injection
extension/src/lib/workflow-types.ts Removed extraction/radio/select/checkbox step types; added frameIdPath to remaining steps
extension/src/lib/types.ts Added frameIdPath/frameUrl to events; removed extraction event type
extension/src/entrypoints/sidepanel/components/recording-view.tsx Removed extraction UI and stats display
extension/src/entrypoints/options.html New options page for iframe settings and domain lists
extension/src/entrypoints/content.ts Enabled all frames; added frame context to all events; removed semantic extraction logic; added input debouncing
extension/src/entrypoints/background.ts Activated tab gating; iframe navigation filtering; step deduplication; removed extraction handling
.github/copilot-instructions.md New contributor/AI agent onboarding guide

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 127 to 133
const frameIdPath = (() => {
try {
let win: any = window; const parts: number[] = [];
while (win !== win.parent) { const parent = win.parent; let idx=0; for (let i=0;i<parent.frames.length;i++){ if(parent.frames[i]===win){idx=i;break;} } parts.unshift(idx); win=parent; if(parts.length>10) break; }
return parts.length ? parts.join('.') : '0';
} catch { return '0'; }
})();
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The frameIdPath calculation logic is duplicated in multiple handler functions (at lines 127-133, 259-275, 305-311, 361, and 443). Extract this into a shared helper function to reduce code duplication and improve maintainability.

Copilot uses AI. Check for mistakes.
const times = interactedFrameTimes[rrEvent.tabId] || {};
const lastTs = times[fUrl];
if (!lastTs) break;
if (Date.now() - lastTs > settings.iframeWindow) break;
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The time comparison uses Date.now() but lastTs comes from interactedFrameTimes which stores Date.now() at interaction time (line 646), while rrEvent.timestamp is used elsewhere. This creates an inconsistency where the window calculation mixes server-side event timestamps with client-side current time. Consider using rrEvent.timestamp - lastTs > settings.iframeWindow for consistent time-based filtering.

Suggested change
if (Date.now() - lastTs > settings.iframeWindow) break;
if (rrEvent.timestamp - lastTs > settings.iframeWindow) break;

Copilot uses AI. Check for mistakes.
Comment on lines 46 to 60
// Hostname patterns for iframe navigation noise we want to suppress
const BLOCKED_IFRAME_HOST_PATTERNS: RegExp[] = [
/doubleclick\.net$/i,
/googlesyndication\.com$/i,
/googleadservices\.com$/i,
/amazon-adsystem\.com$/i,
/recaptcha\.google\.com$/i,
/recaptcha\.net$/i,
/googletagmanager\.com$/i,
/indexww\.com$/i,
/adtrafficquality\.google$/i,
/2mdn\.net$/i,
/gstaticadssl\.googleapis\.com$/i,
];

Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constant BLOCKED_IFRAME_HOST_PATTERNS is defined but never used in the code. The filtering logic instead uses settings.blocklist loaded from chrome.storage (lines 534-537). Either remove this unused constant or integrate it as the default blocklist value.

Suggested change
// Hostname patterns for iframe navigation noise we want to suppress
const BLOCKED_IFRAME_HOST_PATTERNS: RegExp[] = [
/doubleclick\.net$/i,
/googlesyndication\.com$/i,
/googleadservices\.com$/i,
/amazon-adsystem\.com$/i,
/recaptcha\.google\.com$/i,
/recaptcha\.net$/i,
/googletagmanager\.com$/i,
/indexww\.com$/i,
/adtrafficquality\.google$/i,
/2mdn\.net$/i,
/gstaticadssl\.googleapis\.com$/i,
];

Copilot uses AI. Check for mistakes.
allowlist: [] as string[],
};
let settings: { enableIframes: boolean; iframeWindow: number; blocklist: string[]; allowlist: string[] } = { ...DEFAULT_SETTINGS };
chrome.storage.sync.get(DEFAULT_SETTINGS, (s: any) => { settings = { ...settings, ...s }; });
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chrome.storage.sync.get callback executes asynchronously, but the settings variable is immediately accessed by message handlers. This creates a race condition where early events may use default settings instead of user-configured values. Consider awaiting the storage load or deferring event processing until settings are initialized.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 8 files

Prompt for AI agents (all 3 issues)

Understand the root cause of the following 3 issues and fix them.


<file name="extension/src/entrypoints/options.html">

<violation number="1" location="extension/src/entrypoints/options.html:32">
The blocklist placeholder uses a literal &quot;\n&quot;, so the UI shows &quot;example.com\nads.example.org&quot; instead of putting the second domain on a new line. Use an actual newline or an HTML entity to render the placeholder correctly.</violation>
</file>

<file name="extension/src/entrypoints/background.ts">

<violation number="1" location="extension/src/entrypoints/background.ts:364">
Navigation consolidation currently overwrites the previous navigation step for a tab, so later navigations replace earlier history instead of being appended. This drops essential steps for replay.</violation>

<violation number="2" location="extension/src/entrypoints/background.ts:551">
Applying the iframe window using Date.now() causes previously accepted iframe navigations to disappear once enough wall-clock time passes; compare against the rrweb event timestamp instead so stored steps remain stable.</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

</div>
<div class="row">
<label for="blocklist">Blocked domains (newline separated)</label>
<textarea id="blocklist" placeholder="example.com\nads.example.org"></textarea>
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The blocklist placeholder uses a literal "\n", so the UI shows "example.com\nads.example.org" instead of putting the second domain on a new line. Use an actual newline or an HTML entity to render the placeholder correctly.

Prompt for AI agents
Address the following comment on extension/src/entrypoints/options.html at line 32:

<comment>The blocklist placeholder uses a literal &quot;\n&quot;, so the UI shows &quot;example.com\nads.example.org&quot; instead of putting the second domain on a new line. Use an actual newline or an HTML entity to render the placeholder correctly.</comment>

<file context>
@@ -0,0 +1,80 @@
+      &lt;/div&gt;
+      &lt;div class=&quot;row&quot;&gt;
+        &lt;label for=&quot;blocklist&quot;&gt;Blocked domains (newline separated)&lt;/label&gt;
+        &lt;textarea id=&quot;blocklist&quot; placeholder=&quot;example.com\nads.example.org&quot;&gt;&lt;/textarea&gt;
+      &lt;/div&gt;
+      &lt;div class=&quot;row&quot;&gt;
</file context>
Suggested change
<textarea id="blocklist" placeholder="example.com\nads.example.org"></textarea>
<textarea id="blocklist" placeholder="example.com&#10;ads.example.org"></textarea>
Fix with Cubic

const times = interactedFrameTimes[rrEvent.tabId] || {};
const lastTs = times[fUrl];
if (!lastTs) break;
if (Date.now() - lastTs > settings.iframeWindow) break;
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applying the iframe window using Date.now() causes previously accepted iframe navigations to disappear once enough wall-clock time passes; compare against the rrweb event timestamp instead so stored steps remain stable.

Prompt for AI agents
Address the following comment on extension/src/entrypoints/background.ts at line 551:

<comment>Applying the iframe window using Date.now() causes previously accepted iframe navigations to disappear once enough wall-clock time passes; compare against the rrweb event timestamp instead so stored steps remain stable.</comment>

<file context>
@@ -479,38 +519,52 @@ export default defineBackground(() =&gt; {
+                const times = interactedFrameTimes[rrEvent.tabId] || {};
+                const lastTs = times[fUrl];
+                if (!lastTs) break;
+                if (Date.now() - lastTs &gt; settings.iframeWindow) break;
+              }
+            } catch {
</file context>
Suggested change
if (Date.now() - lastTs > settings.iframeWindow) break;
if (rrEvent.timestamp - lastTs > settings.iframeWindow) break;

✅ Addressed in 55c8768

steps[existingIdx].type === "navigation"
) {
// Update existing navigation (redirect / title change)
(steps[existingIdx] as NavigationStep).url = navUrl;
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Navigation consolidation currently overwrites the previous navigation step for a tab, so later navigations replace earlier history instead of being appended. This drops essential steps for replay.

Prompt for AI agents
Address the following comment on extension/src/entrypoints/background.ts at line 364:

<comment>Navigation consolidation currently overwrites the previous navigation step for a tab, so later navigations replace earlier history instead of being appended. This drops essential steps for replay.</comment>

<file context>
@@ -295,119 +319,127 @@ export default defineBackground(() =&gt; {
+            steps[existingIdx].type === &quot;navigation&quot;
           ) {
+            // Update existing navigation (redirect / title change)
+            (steps[existingIdx] as NavigationStep).url = navUrl;
+            steps[existingIdx].timestamp = event.timestamp;
+          } else {
</file context>
Fix with Cubic

@Diveyam-Mishra
Copy link
Author

/copilot review again

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this file

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a screen recording and screenshots before and after for this (will help me merge quickly)

Copilot AI review requested due to automatic review settings October 25, 2025 19:09
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if (!targetElement || !("value" in targetElement)) return;
const isPassword = targetElement.type === "password";

// Ignore programmatic (non user-trusted) input events – these often cause massive duplication
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a hyphen instead of an en-dash in the comment.

Suggested change
// Ignore programmatic (non user-trusted) input events these often cause massive duplication
// Ignore programmatic (non user-trusted) input events - these often cause massive duplication

Copilot uses AI. Check for mistakes.
const times = interactedFrameTimes[rrEvent.tabId] || {};
const lastTs = times[fUrl];
if (!lastTs) break;
const eventTimestamp = typeof (rrEvent as any).timestamp === "number" ? (rrEvent as any).timestamp : Date.now();
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The property (rrEvent as any).timestamp is accessed twice. Store it in a variable to improve readability and avoid redundant type casting.

Suggested change
const eventTimestamp = typeof (rrEvent as any).timestamp === "number" ? (rrEvent as any).timestamp : Date.now();
const rrEventTimestamp = (rrEvent as any).timestamp;
const eventTimestamp = typeof rrEventTimestamp === "number" ? rrEventTimestamp : Date.now();

Copilot uses AI. Check for mistakes.
Comment on lines +225 to +228
if (
type !== "CUSTOM_TAB_ACTIVATED" &&
!activatedTabs.has(tabId) &&
!(payload.openerTabId && recentNewTabIntents[payload.openerTabId] && Date.now() - recentNewTabIntents[payload.openerTabId] < NEW_TAB_INTENT_WINDOW_MS)
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expression recentNewTabIntents[payload.openerTabId] is accessed twice in the same condition. Store it in a variable to improve readability and avoid redundant lookups.

Suggested change
if (
type !== "CUSTOM_TAB_ACTIVATED" &&
!activatedTabs.has(tabId) &&
!(payload.openerTabId && recentNewTabIntents[payload.openerTabId] && Date.now() - recentNewTabIntents[payload.openerTabId] < NEW_TAB_INTENT_WINDOW_MS)
const openerIntentTime = payload.openerTabId ? recentNewTabIntents[payload.openerTabId] : undefined;
if (
type !== "CUSTOM_TAB_ACTIVATED" &&
!activatedTabs.has(tabId) &&
!(payload.openerTabId && openerIntentTime && Date.now() - openerIntentTime < NEW_TAB_INTENT_WINDOW_MS)

Copilot uses AI. Check for mistakes.
@Diveyam-Mishra
Copy link
Author

image the PR version trims meaningless noise, adds the frame metadata we needed, and preserves the real interactions. It’s an improvement over the “before” capture. image
Recording.2025-10-26.004204.mp4
Recording.2025-10-26.004228.mp4

@Diveyam-Mishra
Copy link
Author

@sauravpanda Let me know if this makes it easier

@Diveyam-Mishra
Copy link
Author

@sauravpanda
Copy link
Collaborator

Thanks for the screenshot, does this expects the system to use semantic abstraction? because the previous version used xpath and css selectors to take action when executing the workflow.

| NavigationStep
| ClickStep
| InputStep
| RadioStep
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why have we removed this steps? Any specific reason?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets keep it then, I dont want to break the old version, I will dive deeper and do a cleanup soon. So only make changes which you feel necessary, dont remove any old code.

cssSelector?: string; // Optional in source
elementTag: string;
elementText: string;
targetText?: string; // Semantic targeting text (label, placeholder, aria-label, etc.)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need this for the semantic abstraction, dont remove this

cssSelector?: string; // Optional in source
elementTag: string;
value: string;
targetText?: string; // Semantic targeting text (label, placeholder, aria-label, etc.)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same I need this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok let me see the version update i think i rushed this previous pr i thought i just had to split it

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if (parts.length > 10) break;
}
return parts.length ? parts.join('.') : '0';
} catch {
Copy link

Copilot AI Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The function accesses parent frames without checking cross-origin restrictions. This may throw security errors when frames are cross-origin, but the catch block handles it. Consider logging the error for debugging purposes.

Suggested change
} catch {
} catch (error) {
console.warn("computeFrameIdPath: Failed to access parent frame (possibly cross-origin)", error);

Copilot uses AI. Check for mistakes.
blocklist: [
'doubleclick.net','googlesyndication.com','googleadservices.com',
'amazon-adsystem.com','2mdn.net','recaptcha.google.com','recaptcha.net',
'googletagmanager.com','indexww.com','adtrafficquality.google','gstaticadssl.googleapis.com'
Copy link

Copilot AI Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default blocklist is hardcoded in two places (background.ts and options.html). Consider extracting this to a shared constant to ensure consistency and easier maintenance.

Copilot uses AI. Check for mistakes.
Comment on lines +492 to +494
(lastStep as ScrollStep).scrollX === scrollData.x &&
(lastStep as ScrollStep).scrollY === scrollData.y &&
Math.abs(rrEvent.timestamp - lastStep.timestamp) < 200 &&
Copy link

Copilot AI Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded value '200' (milliseconds) for deduplication should be extracted as a named constant (e.g., SCROLL_DEDUPE_WINDOW_MS) for better maintainability and clarity.

Copilot uses AI. Check for mistakes.
case 'scroll': {
const sameXY = (last as any).scrollX === (step as any).scrollX && (last as any).scrollY === (step as any).scrollY;
const sameUrl = (last as any).url === (step as any).url;
const nearTime = Math.abs(step.timestamp - last.timestamp) < 200;
Copy link

Copilot AI Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded value '200' (milliseconds) appears again here for scroll deduplication. This should use the same named constant suggested in Comment 5 to maintain consistency.

Copilot uses AI. Check for mistakes.
};

// Dedupe rule 1: If value unchanged for this element and within debounce window, skip
const DEBOUNCE_MS_INPUT = 1500;
Copy link

Copilot AI Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The input debounce value is defined locally within the function. Consider defining this at the module level alongside other constants for consistency with the scroll DEBOUNCE_MS constant.

Copilot uses AI. Check for mistakes.
Comment on lines +403 to +408
if (
inputData.value === "" &&
prev &&
prev.value === "" &&
inputData.timestamp - prev.ts < 5000
) {
Copy link

Copilot AI Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded value '5000' (milliseconds) for empty input suppression should be extracted as a named constant for better maintainability.

Copilot uses AI. Check for mistakes.
const prior = lastInputPerKey[key];
const nowTs = inputEvent.timestamp;
const isEmpty = (inputEvent as any).value === "";
if (isEmpty && prior && prior.value === "" && nowTs - prior.ts < 5000) {
Copy link

Copilot AI Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded value '5000' (milliseconds) appears in both content.ts and background.ts for empty input suppression. Extract this as a shared constant to ensure consistency.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants