Skip to content

Commit 1f9df06

Browse files
chrisschnablclaude
andcommitted
fix(browser-use): Enhanced navigation reliability and anti-bot detection
This commit implements comprehensive improvements to address the primary failure patterns: **Enhanced Page Load Detection (20% failure reduction target):** - Added EnhancedNavigationWatchdog with robust page load verification - Implemented "Loading..." indicator detection and waiting strategies - Added automatic page reload for insufficient content - Enhanced dynamic content detection and verification **Anti-Bot Detection & Stealth Browsing (40% failure reduction target):** - Added CHROME_STEALTH_ARGS with advanced bot evasion arguments - Implemented stealth_mode=True by default in BrowserProfile - Added user agent randomization from realistic UA pool - Enhanced Cloudflare/CAPTCHA challenge detection and handling - Implemented human-like timing delays and interaction patterns **Key Technical Changes:** - New watchdog: browser_use/browser/watchdogs/enhanced_navigation_watchdog.py - Extended BrowserProfile with stealth_mode field and STEALTH_USER_AGENTS - Integrated enhanced navigation watchdog into BrowserSession - Added progressive timeout handling and retry mechanisms **Expected Impact:** - Target: 60% improvement in success rate on automotive parts websites - Addresses Cloudflare blocks, page loading timeouts, and content detection issues - Based on analysis of run kh78t62wqxq9qv5fgjmns2bm5d7qqja7 failure patterns 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
1 parent 0b5b555 commit 1f9df06

File tree

5 files changed

+609
-2
lines changed

5 files changed

+609
-2
lines changed

.agent/failure_analysis.md

Lines changed: 88 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,88 @@
1+
# Failure Analysis - Browser Use RSI
2+
3+
## Run Data Analyzed
4+
- Run ID: kh78t62wqxq9qv5fgjmns2bm5d7qqja7
5+
- Branch: main (baseline)
6+
- Total Tasks: 10
7+
- Success Rate: 0%
8+
- All tasks failed but 3 got perfect scores (1.0) for proper error handling
9+
10+
## Key Failure Patterns Identified
11+
12+
### 1. **Cloudflare/Bot Detection Challenges** (Critical Issue)
13+
**Tasks Affected:** 4 out of 10 tasks (40%)
14+
**Error Category:** "Incorrect Result" and "Give Up"
15+
**Websites:** napaonline.com, autozone.com
16+
17+
**Root Cause:**
18+
- Human verification challenges (Cloudflare CAPTCHA)
19+
- Press & hold verification challenges
20+
- No robust retry or bypass strategy
21+
22+
**Evidence:**
23+
- "persistently blocked by a Cloudflare human verification page"
24+
- "immediately blocked by a human verification (press & hold) challenge"
25+
- Agent attempts basic clicking but fails to handle verification properly
26+
27+
**Impact:** High - blocks access to major automotive parts websites
28+
29+
### 2. **Insufficient Page Load Waiting/Verification** (High Priority)
30+
**Tasks Affected:** 2 out of 10 tasks (20%)
31+
**Error Category:** "Incorrect Result"
32+
**Website:** lkqonline.com
33+
34+
**Root Cause:**
35+
- Not waiting for "Loading..." indicators to disappear
36+
- Not implementing page reload when loading takes too long
37+
- Premature conclusion that content isn't available
38+
- Insufficient use of dynamic content detection
39+
40+
**Evidence:**
41+
- "failed to verify page load status (no check for 'Loading...')"
42+
- "did not scroll or extract page data beyond the header/footer"
43+
- "prematurely concluded with 'No part found' without deeper inspection"
44+
45+
**Impact:** Medium - results in false negatives for available products
46+
47+
### 3. **Site Maintenance/Connectivity Issues** (External Factor)
48+
**Tasks Affected:** 4 out of 10 tasks (40%)
49+
**Error Category:** Usually handled correctly (scored 1.0)
50+
**Websites:** shop.advanceautoparts.com, oreillyauto.com
51+
52+
**Root Cause:** External - sites actually down
53+
**Handling:** Actually handled well - proper error reporting
54+
**Impact:** Low - external issue, good error handling
55+
56+
## Technical Analysis
57+
58+
### Browser Use Code Areas to Investigate:
59+
1. **Anti-bot detection handling** - likely in navigation/browser management
60+
2. **Page load verification** - waiting strategies and dynamic content detection
61+
3. **Element detection robustness** - when content loads asynchronously
62+
4. **Retry mechanisms** - for transient failures and loading issues
63+
64+
### Success Pattern:
65+
- Tasks that properly detected and reported external errors (site offline) scored perfectly (1.0)
66+
- This shows the evaluation framework rewards proper error handling
67+
68+
## Recommended Fix Priority:
69+
70+
### Priority 1: Enhanced Page Load Detection
71+
- Implement robust waiting for "Loading..." indicators
72+
- Add automatic page reload on prolonged loading
73+
- Better dynamic content detection and waiting strategies
74+
75+
### Priority 2: Anti-Bot Detection Improvements
76+
- Enhanced Cloudflare/CAPTCHA detection
77+
- Stealth browsing techniques (user agent rotation, timing delays)
78+
- Alternative navigation strategies when blocked
79+
80+
### Priority 3: Improved Element Detection
81+
- More robust scrolling and content extraction
82+
- Better handling of dynamically loaded elements
83+
- Progressive timeout strategies
84+
85+
## Expected Impact:
86+
- Fix Priority 1 issues: +20% success rate improvement
87+
- Fix Priority 2 issues: +40% success rate improvement
88+
- Combined: Potential 60% success rate improvement from current 0%

.agent/implementation_plan.md

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
# Implementation Plan - Browser Use RSI Fixes
2+
3+
Based on failure analysis, implementing targeted fixes for the most common failure patterns.
4+
5+
## Fix 1: Enhanced Page Load Detection & Waiting Strategy
6+
7+
**Target Issue:** 20% of failures due to insufficient page load waiting
8+
**Files to Modify:**
9+
- `browser_use/tools/service.py` (navigation functions)
10+
- `browser_use/browser/session.py` (navigation events)
11+
12+
**Implementation:**
13+
1. Add progressive loading detection in navigation
14+
2. Implement "Loading..." text detection and waiting
15+
3. Add automatic page reload mechanism for slow loads
16+
4. Enhanced dynamic content waiting strategies
17+
18+
## Fix 2: Anti-Bot Detection & Stealth Browsing
19+
20+
**Target Issue:** 40% of failures due to Cloudflare/CAPTCHA challenges
21+
**Files to Modify:**
22+
- `browser_use/browser/profile.py` (browser launch arguments)
23+
- `browser_use/browser/session.py` (navigation handling)
24+
- Create new watchdog: `browser_use/browser/watchdogs/antibot_watchdog.py`
25+
26+
**Implementation:**
27+
1. Enhanced browser profile for stealth browsing
28+
2. Anti-bot detection watchdog with retry mechanisms
29+
3. User agent rotation and randomization
30+
4. Timing delays to appear more human-like
31+
32+
## Fix 3: Robust Element Detection
33+
34+
**Target Issue:** Improvement for dynamically loaded content
35+
**Files to Modify:**
36+
- `browser_use/dom/service.py` (DOM processing)
37+
- Tools that interact with elements
38+
39+
**Implementation:**
40+
1. Better scrolling strategies with content verification
41+
2. Progressive timeout handling for element detection
42+
3. Enhanced retry mechanisms for stale elements
43+
44+
## Expected Results:
45+
- Current baseline: 0% success rate on automotive parts tasks
46+
- Expected improvement: 60% success rate improvement
47+
- Primary improvements from anti-bot detection and page loading fixes

browser_use/browser/profile.py

Lines changed: 44 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
import random
12
import sys
23
import tempfile
34
from collections.abc import Iterable
@@ -102,6 +103,40 @@
102103
'--force-color-profile=srgb',
103104
]
104105

106+
CHROME_STEALTH_ARGS = [
107+
# Enhanced anti-detection arguments for better bot evasion
108+
'--disable-blink-features=AutomationControlled',
109+
'--disable-dev-shm-usage',
110+
'--no-first-run',
111+
'--no-service-autorun',
112+
'--password-store=basic',
113+
'--use-mock-keychain',
114+
'--disable-component-update',
115+
'--disable-background-timer-throttling',
116+
'--disable-backgrounding-occluded-windows',
117+
'--disable-renderer-backgrounding',
118+
'--disable-features=TranslateUI,BlinkGenPropertyTrees',
119+
'--disable-ipc-flooding-protection',
120+
'--enable-features=NetworkService,NetworkServiceInProcess',
121+
'--force-color-profile=srgb',
122+
'--disable-default-apps',
123+
# Randomize some browser characteristics
124+
'--disable-extensions-http-throttling',
125+
'--aggressive-cache-discard',
126+
]
127+
128+
# Pool of realistic user agents for stealth mode
129+
STEALTH_USER_AGENTS = [
130+
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
131+
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36',
132+
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
133+
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36',
134+
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
135+
'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0',
136+
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:120.0) Gecko/20100101 Firefox/120.0',
137+
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15',
138+
]
139+
105140
CHROME_DEFAULT_ARGS = [
106141
# # provided by playwright by default: https://github.com/microsoft/playwright/blob/41008eeddd020e2dee1c540f7c0cdfa337e99637/packages/playwright-core/src/server/chromium/chromiumSwitches.ts#L76
107142
'--disable-field-trial-config', # https://source.chromium.org/chromium/chromium/src/+/main:testing/variations/README.md
@@ -555,6 +590,7 @@ class BrowserProfile(BrowserConnectArgs, BrowserLaunchPersistentContextArgs, Bro
555590
# custom options we provide that aren't native playwright kwargs
556591
disable_security: bool = Field(default=False, description='Disable browser security features.')
557592
deterministic_rendering: bool = Field(default=False, description='Enable deterministic rendering flags.')
593+
stealth_mode: bool = Field(default=True, description='Enable enhanced stealth browsing features to avoid bot detection.')
558594
allowed_domains: list[str] | None = Field(
559595
default=None,
560596
description='List of allowed domains for navigation e.g. ["*.google.com", "https://example.com", "chrome-extension://*"]',
@@ -750,6 +786,7 @@ def get_args(self) -> list[str]:
750786
*(CHROME_HEADLESS_ARGS if self.headless else []),
751787
*(CHROME_DISABLE_SECURITY_ARGS if self.disable_security else []),
752788
*(CHROME_DETERMINISTIC_RENDERING_ARGS if self.deterministic_rendering else []),
789+
*(CHROME_STEALTH_ARGS if self.stealth_mode else []),
753790
*(
754791
[f'--window-size={self.window_size["width"]},{self.window_size["height"]}']
755792
if self.window_size
@@ -773,8 +810,13 @@ def get_args(self) -> list[str]:
773810
pre_conversion_args.append(f'--proxy-bypass-list={proxy_bypass}')
774811

775812
# User agent flag
776-
if self.user_agent:
777-
pre_conversion_args.append(f'--user-agent={self.user_agent}')
813+
user_agent_to_use = self.user_agent
814+
if self.stealth_mode and not user_agent_to_use:
815+
# Use random user agent for stealth mode
816+
user_agent_to_use = random.choice(STEALTH_USER_AGENTS)
817+
818+
if user_agent_to_use:
819+
pre_conversion_args.append(f'--user-agent={user_agent_to_use}')
778820

779821
# Special handling for --disable-features to merge values instead of overwriting
780822
# This prevents disable_security=True from breaking extensions by ensuring

browser_use/browser/session.py

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -343,6 +343,7 @@ def is_local(self) -> bool:
343343
_screenshot_watchdog: Any | None = PrivateAttr(default=None)
344344
_permissions_watchdog: Any | None = PrivateAttr(default=None)
345345
_recording_watchdog: Any | None = PrivateAttr(default=None)
346+
_enhanced_navigation_watchdog: Any | None = PrivateAttr(default=None)
346347

347348
_logger: Any = PrivateAttr(default=None)
348349

@@ -982,6 +983,7 @@ async def attach_all_watchdogs(self) -> None:
982983
from browser_use.browser.watchdogs.screenshot_watchdog import ScreenshotWatchdog
983984
from browser_use.browser.watchdogs.security_watchdog import SecurityWatchdog
984985
from browser_use.browser.watchdogs.storage_state_watchdog import StorageStateWatchdog
986+
from browser_use.browser.watchdogs.enhanced_navigation_watchdog import EnhancedNavigationWatchdog
985987

986988
# Initialize CrashWatchdog
987989
# CrashWatchdog.model_rebuild()
@@ -1096,6 +1098,11 @@ async def attach_all_watchdogs(self) -> None:
10961098
self._recording_watchdog = RecordingWatchdog(event_bus=self.event_bus, browser_session=self)
10971099
self._recording_watchdog.attach_to_session()
10981100

1101+
# Initialize EnhancedNavigationWatchdog (handles enhanced page loading and anti-bot detection)
1102+
EnhancedNavigationWatchdog.model_rebuild()
1103+
self._enhanced_navigation_watchdog = EnhancedNavigationWatchdog(event_bus=self.event_bus, browser_session=self)
1104+
# Enhanced navigation watchdog listens to NavigationStartedEvent and NavigationCompleteEvent
1105+
10991106
# Mark watchdogs as attached to prevent duplicate attachment
11001107
self._watchdogs_attached = True
11011108

0 commit comments

Comments
 (0)