-
Notifications
You must be signed in to change notification settings - Fork 297
feat(workflow): strengthen iframe-aware execution and extraction #103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…hild tab steps missing. Background/ad/tracker tabs polluted logs. Excessive duplicate navigation events per redirect/loading cycle. Massive explosion of input steps (hundreds of empty, unchanged values). Unnecessary workflow updates when steps unchanged. New Tab Intent Heuristic: Content script emits PREPARE_NEW_TAB on ctrl/cmd/middle click or target=_blank. Background correlates upcoming chrome.tabs.onCreated to mark userInitiated. Activated tabs tracked; only activated or userInitiated tabs produce steps. Tab Filtering: Suppress all events (except activation) from tabs never activated and not correlated with an intent window (4s). Reduces noise from ads/trackers. Navigation Consolidation: Maintain lastNavigationIndexByTab; update existing navigation step instead of appending duplicates during rapid redirects or title/url churn. Input Event Deduplication: Content script: per-xpath cache; skip unchanged value; debounce; skip rapid empty repeats. Background: merge consecutive identical field edits; collapse bursts of empty values within 5s (timestamp refresh only). Track lastInputPerKey (tabId|xpath) to decide merge vs new step.
…ick in iframe itself
|
@sauravpanda Hey I have splitted the backend here can u review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR strengthens iframe-aware workflow execution by adding frame path resolution, safety filtering for recorded workflows, and improved navigation handling. The changes span workflow runner logic, deterministic controller actions, and recorder filtering.
- Enhanced click actions to resolve correct iframe context using
frameIdPathorframeUrlhints - Added backend safety filter to drop spurious navigation steps (about:blank, ad/analytics hosts)
- Improved workflow step validation by skipping pre-wait when next step URL differs from current page
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| workflows/workflow_use/workflow/service.py | Updated LLM integration to LangChain, added fallback logic, improved URL-based step skipping, and refactored tool invocation |
| workflows/workflow_use/recorder/service.py | Added backend filter to remove about:blank and ad/analytics navigation steps, switched to patchright playwright |
| workflows/workflow_use/controller/views.py | Added frameIdPath and url fields to RecorderBase, enabled timestamp and tabId in StepMeta |
| workflows/workflow_use/controller/service.py | Implemented iframe context resolution, increased timeout to 2500ms, added fallback frame search, improved navigation logic |
| workflows/examples/test_iframes.json | New test workflow file demonstrating iframe interaction patterns |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/copilot review again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 5 files
Prompt for AI agents (all 1 issues)
Understand the root cause of the following 1 issues and fix them.
<file name="workflows/workflow_use/workflow/service.py">
<violation number="1" location="workflows/workflow_use/workflow/service.py:637">
Populating missing optional inputs with '' causes Pydantic validation failures for optional numeric/bool fields. Please default them to None so optional non-string parameters remain valid.</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| blocked = any( | ||
| pat in host for pat in ( | ||
| 'doubleclick.net', 'googlesyndication.com', 'googleadservices.com', | ||
| 'amazon-adsystem.com', '2mdn.net', 'recaptcha.google.com', 'recaptcha.net', | ||
| 'googletagmanager.com', 'indexww.com', 'adtrafficquality.google' | ||
| ) | ||
| ) |
Copilot
AI
Oct 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The substring matching approach (e.g., 'doubleclick.net' in host) could incorrectly match legitimate domains like mydoubleclick.net.example.com. Consider using exact domain matching or suffix matching instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make this suffix matching?
|
@sauravpanda Can u review this one as well |
| # If frameUrl or frameIdPath are provided, narrow the search to that frame | ||
| def _select_context(pg): | ||
| try: | ||
| from playwright.async_api import Page, Frame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have removed playwright dependency in Browser Use library, we would love to not use playwright if possible, can you do it by using browser use actions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 14 out of 15 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (2)
extension/src/entrypoints/background.ts:1
- This single-line frameIdPath calculation is extremely difficult to read and maintain. Consider reformatting this code to use multiple lines for better readability, even if it's an inline function.
// import { eventWithTime } from 'rrweb'; // Type not directly available
extension/src/entrypoints/background.ts:1
- Using
anytype forwinvariable bypasses TypeScript's type safety. Consider using a proper type likeWindoworWindow & typeof globalThisto maintain type safety while handling the cross-origin frame access.
// import { eventWithTime } from 'rrweb'; // Type not directly available
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| default = ... if input_def.required else None | ||
| fields[name] = (py_type, default) | ||
|
|
||
| from typing import cast as _cast |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _cast import is defined inside the _build_input_model method at line 608, but it appears after the import section at the top of the file (lines 6-36) where cast was previously imported as _cast but later removed. Consider moving this import back to the top-level imports for consistency and to avoid repeated imports in each method call.
| const frameIdPath = (() => { | ||
| try { | ||
| let win: any = window; const parts: number[] = []; | ||
| while (win !== win.parent) { const parent = win.parent; let idx=0; for (let i=0;i<parent.frames.length;i++){ if(parent.frames[i]===win){idx=i;break;} } parts.unshift(idx); win=parent; if(parts.length>10) break; } |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This frameIdPath calculation logic is duplicated in multiple locations (lines 127-132, 556-571, 700-705, 771, 873). Consider extracting this into a shared utility function to improve maintainability and reduce code duplication.
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 14 out of 15 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
extension/src/entrypoints/background.ts:1
- This single-line frameIdPath calculation is extremely difficult to read and maintain. It should be reformatted with proper line breaks and indentation, or better yet, extracted into a named function.
// import { eventWithTime } from 'rrweb'; // Type not directly available
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const frameIdPath = (() => { | ||
| try { | ||
| let win: any = window; const parts: number[] = []; | ||
| while (win !== win.parent) { const parent = win.parent; let idx=0; for (let i=0;i<parent.frames.length;i++){ if(parent.frames[i]===win){idx=i;break;} } parts.unshift(idx); win=parent; if(parts.length>10) break; } | ||
| return parts.length ? parts.join('.') : '0'; | ||
| } catch { return '0'; } | ||
| })(); |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The frameIdPath calculation logic is duplicated in multiple places (lines 128-132, 556-571, 700-705, 771, 873). This should be extracted into a shared function to reduce duplication and improve maintainability.
| const frameIdPath = (() => { | |
| try { | |
| let win: any = window; const parts: number[] = []; | |
| while (win !== win.parent) { const parent = win.parent; let idx=0; for (let i=0;i<parent.frames.length;i++){ if(parent.frames[i]===win){idx=i;break;} } parts.unshift(idx); win=parent; if(parts.length>10) break; } | |
| return parts.length ? parts.join('.') : '0'; | |
| } catch { return '0'; } | |
| })(); | |
| const frameIdPath = getFrameIdPath(); |
| messages: list[BaseMessage] = [ | ||
| SystemMessage(content=STRUCTURED_OUTPUT_PROMPT), | ||
| UserMessage(content=combined_text), | ||
| HumanMessage(content=combined_text), |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The import change from UserMessage to HumanMessage suggests a switch to LangChain's message types. However, the commented-out code and the pattern of usage should be verified against the LangChain API to ensure this is the correct message type for the intended use case.
| } | ||
| // If top frame, allow | ||
| if ((rrEvent as any).isTopFrame) { | ||
| // allowed |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The empty comment '// allowed' at line 544 serves no purpose and should be removed. If this branch needs explanation, provide meaningful documentation about why top frames are always allowed.
| // allowed | |
| // Top frame navigations are always allowed |
| except Exception: | ||
| # Fall back to default document retrieval if frame-specific query fails |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment states this is a 'fall back' but the exception handling is completely silent. Consider logging the exception (at debug level minimum) to aid troubleshooting when frame-specific queries fail.
| except Exception: | |
| # Fall back to default document retrieval if frame-specific query fails | |
| except Exception as exc: | |
| # Fall back to default document retrieval if frame-specific query fails | |
| logger.debug("Frame-specific getDocument failed for frame_id=%r: %s", frame_ctx.frame_id, exc) |
| if (!targetElement || !("value" in targetElement)) return; | ||
| const isPassword = targetElement.type === "password"; | ||
|
|
||
| // Ignore programmatic (non user-trusted) input events – these often cause massive duplication |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] While filtering non-trusted events is good for preventing duplication, a comment explaining why programmatic events cause 'massive duplication' would help future maintainers understand this filtering decision.
| // Ignore programmatic (non user-trusted) input events – these often cause massive duplication | |
| // Ignore programmatic (non user-trusted) input events. | |
| // Many frameworks, scripts, or browser extensions dispatch synthetic input events | |
| // (e.g., via element.value = "foo" or element.dispatchEvent(new Event("input"))). | |
| // These programmatic events can fire in rapid succession or in response to value changes | |
| // that are not user-driven, resulting in a flood of duplicate or irrelevant input events. | |
| // Filtering to only user-trusted events ensures we record only genuine user input, | |
| // preventing massive duplication and keeping the event log meaningful. |
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if declared_url and declared_url.startswith('http') and not has_frame_hints and declared_url != current_url: | ||
| await browser_session.navigate_to(declared_url) | ||
|
|
||
| handle: ElementHandle = await get_best_element_handle( | ||
| browser_session, | ||
| params.cssSelector, | ||
| params, | ||
| timeout_ms=DEFAULT_ACTION_TIMEOUT_MS, | ||
| ) |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The navigation logic executes before attempting to find the element in the declared frame. If the element exists in an iframe on the current page but the URL differs, this will navigate away unnecessarily. Consider checking for the element in frames first, then navigating only if not found.
| if declared_url and declared_url.startswith('http') and not has_frame_hints and declared_url != current_url: | |
| await browser_session.navigate_to(declared_url) | |
| handle: ElementHandle = await get_best_element_handle( | |
| browser_session, | |
| params.cssSelector, | |
| params, | |
| timeout_ms=DEFAULT_ACTION_TIMEOUT_MS, | |
| ) | |
| handle: ElementHandle = None | |
| # First, try to find the element without navigating | |
| try: | |
| handle = await get_best_element_handle( | |
| browser_session, | |
| params.cssSelector, | |
| params, | |
| timeout_ms=DEFAULT_ACTION_TIMEOUT_MS, | |
| ) | |
| except Exception: | |
| handle = None | |
| # If not found, and navigation is appropriate, navigate and try again | |
| if handle is None and declared_url and declared_url.startswith('http') and not has_frame_hints and declared_url != current_url: | |
| await browser_session.navigate_to(declared_url) | |
| try: | |
| handle = await get_best_element_handle( | |
| browser_session, | |
| params.cssSelector, | |
| params, | |
| timeout_ms=DEFAULT_ACTION_TIMEOUT_MS, | |
| ) | |
| except Exception as e: | |
| raise RuntimeError(f"Element with selector '{params.cssSelector}' not found after navigation to {declared_url}") from e | |
| if handle is None: | |
| raise RuntimeError(f"Element with selector '{params.cssSelector}' not found on the current page or after navigation.") |
| const times = interactedFrameTimes[rrEvent.tabId] || {}; | ||
| const lastTs = times[fUrl]; | ||
| if (!lastTs) break; | ||
| if (Date.now() - lastTs > settings.iframeWindow) break; |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Date.now() in the conversion function creates a time-dependent race condition. The timestamp should be based on the event timestamp (rrEvent.timestamp) to ensure deterministic conversion regardless of when the function is called.
| if (Date.now() - lastTs > settings.iframeWindow) break; | |
| if (rrEvent.timestamp - lastTs > settings.iframeWindow) break; |
| const DEFAULT_SETTINGS = { | ||
| enableIframes: true as boolean, | ||
| iframeWindow: 3000 as number, | ||
| blocklist: [ | ||
| 'doubleclick.net','googlesyndication.com','googleadservices.com', | ||
| 'amazon-adsystem.com','2mdn.net','recaptcha.google.com','recaptcha.net', | ||
| 'googletagmanager.com','indexww.com','adtrafficquality.google' | ||
| ] as string[], |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DEFAULT_SETTINGS object duplicates the same blocklist defined in BLOCKED_IFRAME_HOST_PATTERNS (lines 47-59) and in the options.html defaults (lines 46-49). This creates a maintenance burden where changes must be synchronized across three locations. Consider defining the blocklist once and importing it where needed.
| const DEFAULT_SETTINGS = { | |
| enableIframes: true as boolean, | |
| iframeWindow: 3000 as number, | |
| blocklist: [ | |
| 'doubleclick.net','googlesyndication.com','googleadservices.com', | |
| 'amazon-adsystem.com','2mdn.net','recaptcha.google.com','recaptcha.net', | |
| 'googletagmanager.com','indexww.com','adtrafficquality.google' | |
| ] as string[], | |
| const BLOCKED_IFRAME_HOST_PATTERNS = [ | |
| 'doubleclick.net','googlesyndication.com','googleadservices.com', | |
| 'amazon-adsystem.com','2mdn.net','recaptcha.google.com','recaptcha.net', | |
| 'googletagmanager.com','indexww.com','adtrafficquality.google' | |
| ]; | |
| const DEFAULT_SETTINGS = { | |
| enableIframes: true as boolean, | |
| iframeWindow: 3000 as number, | |
| blocklist: BLOCKED_IFRAME_HOST_PATTERNS as string[], |
| # If the next step declares a URL/frameUrl and it does not match the current page URL, | ||
| # skip waiting for its element on the current page (prevents false failures like step 7). | ||
| curr_url = (await page.get_url() or '').split('#')[0] | ||
| declared_next_url = (getattr(next_step_resolved, 'url', None) or getattr(next_step_resolved, 'frameUrl', None) or '').split('#')[0] |
Copilot
AI
Oct 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The logic for extracting and normalizing URLs (splitting on '#') is duplicated on lines 150-151. Consider extracting this into a helper function like normalize_url(url) to avoid repetition and make the intent clearer.
Summary Splitting PR #87
Backend (workflows/workflow_use)
Recorder safety filter (recorder/service.py): drop about:blank and obvious ad/analytics hosts before saving workflow.
Controller models (controller/views.py): RecorderBase now includes url and frameIdPath so steps carry frame/page hints.
Deterministic controller (controller/service.py):
Clicks execute inside the correct iframe:
Resolve context by frameIdPath (preferred) or frameUrl (origin+prefix).
Do not navigate to iframe URLs; only auto-navigate for top-document clicks without frame hints.
Fallback: search all frames (prioritize target origin) to find selector.
Increased click timeout to 2500ms.
Workflow runner (workflow/service.py):
Skip “pre-wait for next selector” when next step’s declared url/frameUrl differs from current page URL to avoid false failures between pages.
Noise/Correctness
Prevent duplicate steps due to pointer-only variance and consecutive identical content.
Drop spurious iframe navigations (recaptcha/ads/metrics).
State is consistent across reloads; recording does not auto-start unexpectedly.
Additional Notes
Frame resolution uses frameIdPath first; frameUrl fallback matches origin + prefix to disambiguate.
Summary by cubic
Improve deterministic workflow reliability with iframe-aware execution and extraction. Clicks happen in the right frame, iframe content is captured, and noisy ad/blank navigations are filtered out.
New Features
Bug Fixes