Navigating Nested Iframes using observe #870

@ambarc

Description

We're using experimental: true to enable iframe support and testing against an ERP with many nested iframes. Stagehand version: 2.4.1.

Issue: Stagehand's observe({ iframes: true }) returns 0 nested elements on complex iframe structures, even though 14+ frames contain interactive content. We've seen this solved elsewhere by preserving the natural DOM hierarchy. Stagehand's intent might be for builders to re-run observe recursively to reach deeper levels, but that doesn't feel intuitive.

Starting this issue to understand whether what I'm seeing is expected behavior.

This is the structure of the page we're looking at. We're trying to target the search element.


https://app.example.com/main (Frame 0) [MAIN]
├── 📄 Main page content
├── 🖼️ iframe#navigation → https://app.example.com/navigation (Frame 1) [SAME-ORIGIN]
│   ├── 🔍 input#searchBar [INTERACTIVE] ← Target element
│   ├── 🔘 button#submitBtn [INTERACTIVE]
│   └── 📍 Various nav links [INTERACTIVE]
├── 🖼️ iframe#workspace → https://app.example.com/workspace (Frame 2) [SAME-ORIGIN]
│   ├── 📊 Data grid with buttons [INTERACTIVE]
│   ├── ➕ Add buttons [INTERACTIVE]
│   └── 🖼️ nested iframe → https://app.example.com/reports (Frame 3) [SAME-ORIGIN]
│       └── 📈 Report controls [INTERACTIVE]
└── 🖼️ iframe#status → https://app.example.com/status (Frame 4) [SAME-ORIGIN]
    ├── 📋 select#departmentSelect [INTERACTIVE]
    └── ℹ️ Status indicators [INTERACTIVE]

🚫 PROBLEM: observe({ iframes: true }) → 0 nested elements
✅ REALITY: 15+ interactive elements exist across frames
🔧 SUSPECTED CAUSE: Element resolution breaks during tree merging
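To show what "preserving the natural hierarchy" could look like, here is a minimal sketch that walks a frame tree recursively and records each frame's depth. It is not Stagehand code. `FrameLike` is a hypothetical interface that matches the `url()` and `childFrames()` methods of Playwright's `Frame`, and `listFrames` is an illustrative helper name.

```typescript
// Hypothetical structural subset of Playwright's Frame (url() and childFrames()).
interface FrameLike {
  url(): string;
  childFrames(): FrameLike[];
}

// One entry per frame, with its nesting depth (0 = main frame).
interface FrameEntry {
  url: string;
  depth: number;
}

// Depth-first walk that keeps parent-before-child order,
// so the output mirrors the tree drawn above.
function listFrames(
  frame: FrameLike,
  depth: number = 0,
  out: FrameEntry[] = []
): FrameEntry[] {
  out.push({ url: frame.url(), depth });
  for (const child of frame.childFrames()) {
    listFrames(child, depth + 1, out);
  }
  return out;
}
```

Assuming the Stagehand page exposes Playwright's `mainFrame()`, calling `listFrames(page.mainFrame())` on the structure above should yield five entries, with the reports frame at depth 2.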

experimental: true is required for iframe traversal
observe({ iframes: false }): 4 elements, all marked not supported (generic iframe placeholders)
observe({ iframes: true }): 4 elements (despite 14 nested frames with actions and content)

This might be where the issue is. I haven't dug deep yet, but I can if the maintainers think I'm on the right path.

// In observeHandler.ts - the iframe metadata gets lost
const { combinedTree, combinedXpathMap, discoveredIframes } = await (iframes
  ? getAccessibilityTreeWithFrames(...).then(({ combinedTree, combinedXpathMap }) => ({
      combinedTree,           // ✅ Combined tree exists
      combinedXpathMap,       // ✅ XPath mappings exist
      discoveredIframes: [],  // ❌ Always empty - breaks element resolution
    }))
  : // non-iframe path works correctly
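To make the hypothesis concrete, here is a rough sketch of a merge step that keeps each node tied to the frame it came from instead of dropping the iframe list. Every type and name here (`FrameSnapshot`, `ResolvedNode`, `mergeSnapshots`) is hypothetical and for illustration only. This is not Stagehand's real data model, and I have not confirmed how the actual merge works.

```typescript
// Hypothetical shapes for illustration only; not Stagehand's real types.
interface FrameSnapshot {
  frameId: string;                         // e.g. "frame-1"
  parentFrameId: string | null;            // null for the main frame
  xpathMap: { [nodeId: string]: string };  // per-frame node id -> XPath
}

interface ResolvedNode {
  frameId: string; // which frame the XPath must be evaluated in
  xpath: string;
}

interface MergedResult {
  combinedXpathMap: { [key: string]: ResolvedNode };
  discoveredIframes: string[];
}

// Merge per-frame maps under frame-qualified keys so that node ids
// from different frames cannot collide, and record every non-main frame.
function mergeSnapshots(snapshots: FrameSnapshot[]): MergedResult {
  const combinedXpathMap: { [key: string]: ResolvedNode } = {};
  const discoveredIframes: string[] = [];
  for (const snap of snapshots) {
    if (snap.parentFrameId !== null) {
      discoveredIframes.push(snap.frameId);
    }
    for (const nodeId of Object.keys(snap.xpathMap)) {
      combinedXpathMap[snap.frameId + ":" + nodeId] = {
        frameId: snap.frameId,
        xpath: snap.xpathMap[nodeId],
      };
    }
  }
  return { combinedXpathMap, discoveredIframes };
}
```

The point of the frame-qualified key is that a resolver can look up both the XPath and the frame to evaluate it in, which is the information that appears to go missing when `discoveredIframes` is always empty.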

Sample code - I can't share the ERP we were testing against.

const TEST_URLS = {
 MAIN_FRAME: 'https://app.example.com/main',
 NAV_FRAME: 'https://app.example.com/navigation',
 CONTENT_FRAME: 'https://app.example.com/workspace',
 STATUS_FRAME: 'https://app.example.com/status'
};


const TEST_ELEMENTS = {
 SEARCH_INPUT: 'searchBar', // matches input#searchBar in the structure above
 SUBMIT_BUTTON: 'submitBtn',
 DROPDOWN: 'departmentSelect'
};


// Test iframe observation
console.log('Testing iframe observation...');

// Without iframe traversal: only generic iframe placeholders come back
const basicObs = await page.observe({ iframes: false });
console.log(`Basic: ${basicObs.length} elements`); // 4, all unsupported

// With iframe traversal: still only the surface-level iframes, no nested actions
const iframeObs = await page.observe({ iframes: true });
console.log(`Iframe: ${iframeObs.length} elements`); // 4

// page.frames() is synchronous in Playwright, so no await is needed
const frames = page.frames();
console.log(`Total frames: ${frames.length}`);
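As a sanity check that the frames really do contain interactive content, something like the sketch below can count candidate elements per frame with plain Playwright locators, independent of observe. `CountableFrame` is a hypothetical interface matching the `url()` and `locator(...).count()` methods on Playwright's `Frame`. The selector list is my own assumption of what counts as interactive, not Stagehand's definition.

```typescript
// Hypothetical structural subset of Playwright's Frame used by this sketch.
interface CountableFrame {
  url(): string;
  locator(selector: string): { count(): Promise<number> };
}

// Assumed definition of "interactive"; adjust for the app under test.
const INTERACTIVE_SELECTOR = "input, button, select, textarea, a[href]";

// Count matching elements in each frame, one frame at a time.
async function countInteractivePerFrame(
  frames: CountableFrame[]
): Promise<{ url: string; count: number }[]> {
  const results: { url: string; count: number }[] = [];
  for (const frame of frames) {
    const count = await frame.locator(INTERACTIVE_SELECTOR).count();
    results.push({ url: frame.url(), count });
  }
  return results;
}
```

With a real page this would be called as `await countInteractivePerFrame(page.frames())`. Comparing its totals with `iframeObs.length` gives a rough measure of how much observe is missing.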

I would guess that the fix (if we're aligned) involves observeHandler.ts building deeper trees, preserving frame boundaries, and potentially allowing code to target elements inside iframes. We're currently working off a hand-rolled solution that builds DOMs this way, and it would be great to be able to use Stagehand's fuller feature set. Thanks for the good work on this project!
