Captr can capture DOM snapshots from web browsers and accessibility trees from native applications. These captures provide valuable context for your screen recordings. This guide will help you set up and troubleshoot these features.
For DOM Capture:
- A Chromium-based browser (Chrome, Edge, Brave, Opera, Vivaldi, etc.)
- The browser must be started with remote debugging enabled
For Accessibility Tree Capture:
- macOS only
- Accessibility permissions must be granted to Captr
For Captr to capture DOM snapshots from Chromium-based browsers, at least one browser must be running with remote debugging enabled.
We've included a helper script that can launch any supported browser with debugging enabled:
# From the Captr directory:
python3 launch_chrome_debug.py
Advanced usage:
# List available browsers
python3 launch_chrome_debug.py --list
# Launch specific browser
python3 launch_chrome_debug.py --browser edge
# Use different port (if 9222 is already in use)
python3 launch_chrome_debug.py --port 9223
# Launch browser with specific URL
python3 launch_chrome_debug.py --url https://github.com
If you prefer to launch browsers manually, here are the commands for different platforms:
macOS:
# Chrome
open -a "Google Chrome" --args --remote-debugging-port=9222
# Microsoft Edge
open -a "Microsoft Edge" --args --remote-debugging-port=9222
# Brave Browser
open -a "Brave Browser" --args --remote-debugging-port=9222
Windows:
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
"C:\Program Files\Microsoft\Edge\Application\msedge.exe" --remote-debugging-port=9222
"C:\Program Files\BraveSoftware\Brave-Browser\Application\brave.exe" --remote-debugging-port=9222
Linux:
google-chrome --remote-debugging-port=9222
microsoft-edge --remote-debugging-port=9222
brave-browser --remote-debugging-port=9222
Captr will automatically try to find a suitable browser for DOM captures:
- It first tries the default port (9222)
- If that fails, it checks other common debugging ports (9223, 9224, 9333, 8080)
- It will use the first available browser it finds
This means you don't need to configure anything - as long as at least one browser is running with debugging enabled, Captr should be able to capture DOM snapshots.
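The port search described above can be sketched in a few lines of Python. This is an illustration, not Captr's actual code: the function names `probe_cdp` and `find_debug_browser` are hypothetical, and the sketch assumes the browser listens on 127.0.0.1. It relies on the `/json/version` HTTP endpoint that Chromium-based browsers expose on the remote debugging port.

```python
import http.client
import json
import urllib.request
from typing import Callable, Optional, Tuple

# Ports listed in this guide: the default first, then the common alternates.
CANDIDATE_PORTS = [9222, 9223, 9224, 9333, 8080]


def probe_cdp(port: int, timeout: float = 0.5) -> Optional[dict]:
    """Return the /json/version payload from the port, or None if nothing usable answers."""
    url = f"http://127.0.0.1:{port}/json/version"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-CDP service on the port.
        return None


def find_debug_browser(
    probe: Callable[[int], Optional[dict]] = probe_cdp,
) -> Optional[Tuple[int, dict]]:
    """Return (port, info) for the first port that answers, else None."""
    for port in CANDIDATE_PORTS:
        info = probe(port)
        if info is not None:
            return port, info
    return None
```

Calling `find_debug_browser()` with no arguments probes the local machine. The `probe` parameter exists only so the search order can be exercised without a running browser.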
Captr captures DOM snapshots and accessibility trees at key moments:
DOM snapshots (browsers):
- Mouse Clicks: Immediate capture when clicking in a browser window
- Delayed Capture: 3-second delayed capture after clicks to catch page transitions
- Key Presses: When pressing navigation keys (Enter, Tab, arrows, etc.)
- Page Changes: When navigating to a new URL or site
- Periodic Capture: Automatic capture every 30 seconds
Accessibility trees (native applications):
- Mouse Clicks: When clicking in native applications
- Key Presses: When pressing navigation or modifier keys
- Delayed Key Capture: After key release to capture resulting UI changes
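As a rough illustration of the browser-side triggers listed above, the sketch below maps an input event to the captures it would schedule. Everything here is hypothetical: the event names, the `PlannedCapture` type, and the exact set of navigation keys are assumptions made for the example, and only the 3-second follow-up delay comes from this guide.

```python
from dataclasses import dataclass
from typing import List

# Assumed key set; the guide only says "Enter, Tab, arrows, etc."
NAVIGATION_KEYS = {"enter", "tab", "up", "down", "left", "right"}
CLICK_FOLLOWUP_DELAY = 3.0  # seconds, as stated in this guide


@dataclass(frozen=True)
class PlannedCapture:
    delay: float  # seconds to wait before capturing
    reason: str   # label stored alongside the snapshot


def plan_browser_captures(event_type: str, key: str = "") -> List[PlannedCapture]:
    """Return the captures a single browser event would trigger."""
    if event_type == "click":
        # One immediate capture, plus a delayed one to catch page transitions.
        return [
            PlannedCapture(0.0, "click"),
            PlannedCapture(CLICK_FOLLOWUP_DELAY, "click_delayed"),
        ]
    if event_type == "key_press" and key.lower() in NAVIGATION_KEYS:
        return [PlannedCapture(0.0, "key")]
    if event_type == "url_change":
        return [PlannedCapture(0.0, "navigation")]
    return []
```

A caller could hand each planned capture with a non-zero delay to `threading.Timer`. The 30-second periodic capture is independent of input events and is left out of this sketch.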
To prevent excessive storage use, Captr implements smart deduplication:
- Content-based hashing to avoid saving identical snapshots
- Cooldown periods between captures (2-5 seconds)
- Detection of similar URLs to prevent near-duplicate captures
- Levenshtein distance comparison for URLs to identify similar pages
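The sketch below shows one way these checks could fit together. It is not Captr's implementation: the `SnapshotDeduplicator` class, the 2-second cooldown default, the distance threshold of 3, and the rule that URL similarity is only checked during the cooldown are all assumptions chosen for illustration.

```python
import hashlib
import time
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


class SnapshotDeduplicator:
    def __init__(self, cooldown: float = 2.0, max_url_distance: int = 3):
        self.cooldown = cooldown                  # assumed default, in seconds
        self.max_url_distance = max_url_distance  # assumed threshold
        self._seen_hashes = set()
        self._last_url: Optional[str] = None
        self._last_time: Optional[float] = None

    def should_save(self, url: str, content: bytes, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        digest = hashlib.sha256(content).hexdigest()
        if digest in self._seen_hashes:
            return False  # identical content was already stored
        in_cooldown = (self._last_time is not None
                       and now - self._last_time < self.cooldown)
        if in_cooldown and self._last_url is not None:
            if levenshtein(url, self._last_url) <= self.max_url_distance:
                return False  # near-identical URL captured moments ago
        self._seen_hashes.add(digest)
        self._last_url, self._last_time = url, now
        return True
```

Hashing the content catches exact repeats regardless of timing, while the cooldown plus URL distance suppresses bursts of captures from pages that differ only slightly, such as pagination links.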
For Captr to capture accessibility trees from native apps on macOS:
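- Go to System Settings > Privacy & Security > Accessibility (on macOS 12 and earlier: System Preferences > Security & Privacy > Privacy, then select "Accessibility" from the left panel)
- If the list is locked, click the lock to make changes (requires admin password)
- Enable the entry for Captr.app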
- Restart Captr if it's already running
Check if any browser has debugging enabled:
python3 debug_chrome_cdp.py
This will attempt to connect to any available browser.
Common issues:
- No browsers started with the --remote-debugging-port flag
- Firewalls blocking access to the debugging ports
- Antivirus software preventing the connections
- Custom browser configurations that disable remote debugging
Check the logs:
- Look for messages indicating connection attempts to different ports
- Check for "CDP connection failed" or similar error messages
- Verify that the browser is detected as the focused application
Check if your app has accessibility permissions:
python3 debug_accessibility.py
This will test if Captr can access the accessibility API.
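For a quick manual check outside the bundled script, the permission state can be queried through PyObjC. This is a minimal sketch and not necessarily what debug_accessibility.py does. The function name `accessibility_trusted` is hypothetical, and the sketch assumes the pyobjc-framework-ApplicationServices package is installed. The result applies to the process that runs the check, for example your terminal, which may differ from Captr.app.

```python
import sys
from typing import Optional


def accessibility_trusted() -> Optional[bool]:
    """Return True/False on macOS with PyObjC, or None if the check cannot run."""
    if sys.platform != "darwin":
        return None  # the Accessibility API only exists on macOS
    try:
        # Provided by the pyobjc-framework-ApplicationServices package.
        from ApplicationServices import AXIsProcessTrusted
    except ImportError:
        return None  # PyObjC is missing or not working
    return bool(AXIsProcessTrusted())


status = accessibility_trusted()
if status is None:
    print("Accessibility check unavailable (not macOS, or PyObjC missing)")
elif status:
    print("Accessibility permission granted to this process")
else:
    print("Accessibility permission NOT granted to this process")
```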
Common issues:
- Accessibility permissions not granted to Captr
- The application you're trying to capture doesn't properly support accessibility
- The PyObjC library is not properly installed or working
Check the logs:
- Look for warnings about "Accessibility permissions"
- Check for specific errors related to "AXUIElement" or accessibility API functions
By default, the captures are stored in a dom_snaps directory within each recording folder:
[recording_timestamp]/
├── dom_snaps/
│ ├── a11y_click_123456.json
│ ├── dom_click_123456.mhtml
│ └── ...
├── events.jsonl
└── ...
If Captr cannot create this directory, it falls back to ~/Captr_dom_snaps/.
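The fallback behavior can be pictured with the following sketch. The helper name `resolve_snapshot_dir` and its signature are hypothetical. Only the two locations, dom_snaps inside the recording folder and ~/Captr_dom_snaps/, come from this guide.

```python
from pathlib import Path
from typing import Optional


def resolve_snapshot_dir(recording_dir: Path, fallback: Optional[Path] = None) -> Path:
    """Return the first snapshot directory that can be created."""
    preferred = Path(recording_dir) / "dom_snaps"
    if fallback is None:
        fallback = Path.home() / "Captr_dom_snaps"
    for candidate in (preferred, fallback):
        try:
            candidate.mkdir(parents=True, exist_ok=True)
        except OSError:
            continue  # e.g. read-only location or invalid path; try the next one
        return candidate
    raise OSError("could not create a snapshot directory")
```

If captures seem to be missing from a recording folder, checking whether ~/Captr_dom_snaps/ exists is a quick way to tell whether the fallback was used.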
We've included several debugging tools to help troubleshoot capture issues:
- debug_chrome_cdp.py: Tests Chrome DevTools Protocol connectivity with any available browser
- debug_accessibility.py: Tests macOS Accessibility API functionality
- launch_chrome_debug.py: Launches any Chromium-based browser with debugging enabled
- check_recording.py: Examines recordings to check if DOM and accessibility captures are working
If you're still having issues with DOM or accessibility tree capture, please check the application logs or submit an issue on our GitHub repository with the specific error messages you're seeing.
Some websites and applications employ security mechanisms that intentionally limit what can be captured via the Chrome DevTools Protocol (CDP):
Google Docs and similar web-based document editors use:
- Canvas-based rendering instead of standard DOM text elements
- Content Security Policies (CSP) that restrict access to the actual document content
- Custom rendering engines that don't expose content in the standard DOM
When capturing these sites, you'll typically see the UI framework but minimal or no actual document content. This is expected behavior and is part of these applications' security design.
Financial websites often implement:
- Anti-scraping technologies
- Dynamic content obfuscation
- Strict Content Security Policies
- Input field masking
From a technical perspective, these limitations exist because:
- Canvas Rendering: Content rendered to HTML5 canvas elements isn't accessible via the DOM tree
- Shadow DOM Isolation: Components using Shadow DOM may not expose their internal structure
- iframes with Different Origins: Content in cross-origin iframes is protected by the same-origin policy
- JavaScript Obfuscation: Some sites dynamically generate and transform content via obfuscated JavaScript
- Browser Security Features: Browsers intentionally limit what information is exposed via debugging interfaces
These limitations are privacy and security features, not bugs in Captr's capture mechanism.
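To tell an expected limitation from a genuine capture failure, a simple heuristic can flag snapshots that contain canvas elements but very little text. The sketch below is an illustrative assumption, not a Captr feature: the helper `looks_canvas_rendered` and its 200-character threshold are invented for the example, and it expects the HTML part of a snapshot as a plain string (the HTML would first need to be extracted from an .mhtml file).

```python
from html.parser import HTMLParser


class _SnapshotStats(HTMLParser):
    """Counts <canvas> elements and visible text characters in an HTML string."""

    def __init__(self):
        super().__init__()
        self.canvas_count = 0
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "canvas":
            self.canvas_count += 1
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def looks_canvas_rendered(html: str, min_text_chars: int = 200) -> bool:
    """True when the page has canvas elements but little text outside scripts and styles."""
    stats = _SnapshotStats()
    stats.feed(html)
    stats.close()
    return stats.canvas_count > 0 and stats.text_chars < min_text_chars
```

A positive result suggests the page draws its content to a canvas, in which case a sparse DOM snapshot is expected and the capture itself is not at fault.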