Commit 20d7082

chrisschnabl and claude committed
fix(browser-use): Enhanced error classification and anti-bot detection

- Add JavaScript execution error patterns for better retry logic
- Add dynamic content loading detection patterns
- Add comprehensive anti-bot detection patterns (Cloudflare, CAPTCHA, etc.)
- Enhance navigation watchdog with improved text-based challenge detection
- Remove duplicate challenge detection logic for better maintainability

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

1 parent dc36acd commit 20d7082

3 files changed: +159 -19 lines changed

.agent/analysis_20250916.md

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+# Browser-Use RSI Analysis - September 16, 2025
+
+## Overview
+Analysis of recent evaluation runs shows consistent 0% success rates across all runs, despite some runs having decent comprehensive scores (52-62) and self-report success rates (63-76%). This suggests fundamental issues in either the evaluation criteria or core browser automation functionality.
+
+## Key Issues Identified
+
+### 1. Evaluation System Issues
+- All recent runs show 0% success rate regardless of comprehensive scores
+- API parsing errors preventing detailed failure analysis
+- Suggests success criteria may be too strict or broken
+
+### 2. Error Handling & Classification
+Current implementation in `browser_use/tools/error_classifier.py`:
+- Good error categorization system (RETRYABLE_NETWORK, RETRYABLE_TIMING, etc.)
+- Pattern-based classification with retry strategies
+- However, may not be catching all failure patterns effectively
+
+### 3. Element Detection & Staleness
+From code analysis, potential issues:
+- Stale element references in DOM interactions
+- Element detection timeouts
+- Lack of intelligent element recovery strategies
+
+### 4. Navigation & Page Load Detection
+- Navigation timeout issues
+- Incomplete page load detection
+- Missing robust document ready monitoring
+
+## Critical Areas for Improvement
+
+### High Priority Fixes:
+1. **Element Staleness Recovery**: Implement intelligent element re-detection when elements become stale
+2. **Navigation Reliability**: Improve page load detection with multiple validation strategies
+3. **Error Recovery**: Enhance error classification to catch more edge cases
+
+### Current Error Classification Gaps:
+- Missing patterns for JavaScript execution failures
+- Limited handling of anti-bot detection
+- Insufficient Cloudflare/CAPTCHA handling
+- Missing patterns for dynamic content loading failures
+
+## Recommended Fixes
+
+### Fix 1: ElementStalenessWatchdog
+- Add watchdog to automatically re-detect stale elements
+- Implement intelligent element recovery strategies
+- Add element reference caching with refresh mechanisms
+
+### Fix 2: Enhanced Page Load Detection
+- Multi-layered page load verification (document.readyState, network idle, DOM stable)
+- Better handling of Single Page Applications (SPAs)
+- Dynamic content detection and waiting
+
+### Fix 3: Anti-Bot & Security Handling
+- Better detection of anti-bot measures
+- Cloudflare challenge detection and handling
+- CAPTCHA detection with appropriate user feedback
+
+## Implementation Strategy
+1. Focus on most common failure patterns first
+2. Implement minimal viable fixes, not overengineered solutions
+3. Ensure backward compatibility
+4. Test each fix with representative tasks before committing
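
Note on "Fix 2" in the analysis file: this commit does not implement the multi-layered page load verification it recommends. As a rough sketch of the idea only, the helper below treats each layer (readyState, network idle, DOM stable) as an injected boolean probe and reports success once all probes pass on consecutive polls. The function name `wait_until_stable`, its parameters, and the probe names are hypothetical and do not come from the browser-use codebase.

```python
import time
from typing import Callable, Mapping


def wait_until_stable(
    checks: Mapping[str, Callable[[], bool]],
    required_passes: int = 2,
    poll_interval: float = 0.1,
    timeout: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once every check passes on `required_passes` consecutive polls.

    `checks` maps a layer name (hypothetical, e.g. "ready_state") to a probe
    that returns True when that layer considers the page loaded.
    """
    deadline = time.monotonic() + timeout
    streak = 0
    while time.monotonic() < deadline:
        if all(check() for check in checks.values()):
            streak += 1
            if streak >= required_passes:
                return True
        else:
            streak = 0  # any failing layer resets the streak
        sleep(poll_interval)
    return False
```

Requiring consecutive passes rather than a single one is one way to avoid declaring a page loaded during a brief quiet moment; the right number of passes and the timeout would need tuning against real pages.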

browser_use/browser/watchdogs/enhanced_navigation_watchdog.py

Lines changed: 24 additions & 19 deletions
@@ -189,6 +189,7 @@ async def _detect_and_handle_antibot(self, target_id: str, url: str) -> None:
     let challengeFound = false;
     let challengeType = '';
 
+    // Check by selectors
     for (const selector of allSelectors) {
         const element = document.querySelector(selector);
         if (element) {
@@ -201,27 +202,31 @@ async def _detect_and_handle_antibot(self, target_id: str, url: str) -> None:
         }
     }
 
-    // Check for challenge text content
-    const pageText = document.body.textContent.toLowerCase();
-    const challengeTexts = [
-        'cloudflare',
-        'checking your browser',
-        'human verification',
-        'please wait while we verify',
-        'security check',
-        'press & hold',
-        'click to verify',
-        'i am human'
-    ];
-
-    let textChallenge = '';
-    for (const text of challengeTexts) {
-        if (pageText.includes(text)) {
-            challengeFound = true;
-            textChallenge = text;
-            break;
+    // Additional text-based detection
+    if (!challengeFound) {
+        const bodyText = document.body.innerText.toLowerCase();
+        const challengeTexts = [
+            'checking your browser',
+            'verify you are human',
+            'complete the security check',
+            'please wait while we verify',
+            'cloudflare security challenge',
+            'ddos protection by cloudflare',
+            'ray id:',
+            'just a moment',
+            'please solve the captcha',
+            'prove you are not a robot'
+        ];
+
+        for (const text of challengeTexts) {
+            if (bodyText.includes(text)) {
+                challengeFound = true;
+                challengeType = 'text:' + text;
+                break;
+            }
         }
     }
+}
 
     return {
         challengeFound,
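
For readers who want to exercise the new text-based fallback outside a browser, the same logic can be sketched in Python. This is an illustration only: `detect_text_challenge` and its `already_found` parameter are hypothetical names, and the phrase list is copied from the `challengeTexts` array added in this diff.

```python
from typing import Optional

# Phrases copied from the challengeTexts array added in this diff.
CHALLENGE_TEXTS = (
    "checking your browser",
    "verify you are human",
    "complete the security check",
    "please wait while we verify",
    "cloudflare security challenge",
    "ddos protection by cloudflare",
    "ray id:",
    "just a moment",
    "please solve the captcha",
    "prove you are not a robot",
)


def detect_text_challenge(body_text: str, already_found: bool = False) -> Optional[str]:
    """Mirror the JS fallback: scan page text only if selectors found nothing.

    Returns 'text:<phrase>' for the first matching phrase, else None.
    """
    if already_found:
        return None  # selector-based detection takes precedence
    lowered = body_text.lower()
    for phrase in CHALLENGE_TEXTS:
        if phrase in lowered:
            return "text:" + phrase
    return None
```

Because this is plain substring matching, a short phrase such as "just a moment" can also match ordinary page copy, so callers may want to treat a text-only hit as weaker evidence than a selector hit.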

browser_use/tools/error_classifier.py

Lines changed: 71 additions & 0 deletions
@@ -133,6 +133,43 @@ def __init__(self):
             (r'incompatible.*version', "Incompatible version")
         ]
 
+        # JavaScript execution error patterns (retryable with DOM refresh)
+        self.javascript_patterns = [
+            (r'javascript.*error', "JavaScript execution error"),
+            (r'script.*error', "Script execution failed"),
+            (r'uncaught.*exception', "Uncaught JavaScript exception"),
+            (r'reference.*error', "JavaScript reference error"),
+            (r'type.*error.*javascript', "JavaScript type error"),
+            (r'cannot.*read.*propert', "JavaScript property access error"),
+            (r'function.*not.*defined', "JavaScript function not defined")
+        ]
+
+        # Anti-bot and security challenge patterns
+        self.antibot_patterns = [
+            (r'cloudflare.*challenge', "Cloudflare security challenge"),
+            (r'captcha.*required', "CAPTCHA challenge detected"),
+            (r'recaptcha.*challenge', "reCAPTCHA challenge detected"),
+            (r'bot.*detect', "Bot detection triggered"),
+            (r'rate.*limit.*exceed', "Rate limiting activated"),
+            (r'suspicious.*activity', "Suspicious activity detected"),
+            (r'please.*verify.*human', "Human verification required"),
+            (r'security.*check.*required', "Security verification required"),
+            (r'access.*denied.*bot', "Bot access denied"),
+            (r'automated.*traffic.*detect', "Automated traffic detected")
+        ]
+
+        # Dynamic content loading patterns
+        self.dynamic_content_patterns = [
+            (r'content.*still.*loading', "Dynamic content still loading"),
+            (r'ajax.*request.*pending', "AJAX request in progress"),
+            (r'react.*component.*mounting', "React component mounting"),
+            (r'spa.*router.*navigating', "SPA navigation in progress"),
+            (r'virtual.*dom.*updating', "Virtual DOM update in progress"),
+            (r'lazy.*load.*pending', "Lazy loading in progress"),
+            (r'infinite.*scroll.*loading', "Infinite scroll loading"),
+            (r'skeleton.*loader.*active', "Skeleton loader active")
+        ]
+
     def classify_error(
         self,
         error: Exception,
@@ -196,6 +233,30 @@ def classify_error(
                     technical_details=error_str
                 )
 
+        # JavaScript execution errors - retry with DOM refresh
+        for pattern, description in self.javascript_patterns:
+            if re.search(pattern, error_str, re.IGNORECASE):
+                return ErrorClassificationResult(
+                    category=ErrorCategory.RETRYABLE_TIMING,
+                    should_retry=True,
+                    retry_delay=1.0,
+                    max_retries=2,
+                    user_message=f"JavaScript error: {description}. Waiting for page to stabilize and retrying...",
+                    technical_details=error_str
+                )
+
+        # Dynamic content loading - retry with longer wait
+        for pattern, description in self.dynamic_content_patterns:
+            if re.search(pattern, error_str, re.IGNORECASE):
+                return ErrorClassificationResult(
+                    category=ErrorCategory.RETRYABLE_TIMING,
+                    should_retry=True,
+                    retry_delay=3.0,
+                    max_retries=2,
+                    user_message=f"Dynamic content loading: {description}. Waiting for content to load...",
+                    technical_details=error_str
+                )
+
         # Check for permanent error patterns
 
         # Invalid input - don't retry
@@ -238,6 +299,16 @@ def classify_error(
                     technical_details=error_str
                 )
 
+        # Anti-bot detection - don't retry (needs human intervention)
+        for pattern, description in self.antibot_patterns:
+            if re.search(pattern, error_str, re.IGNORECASE):
+                return ErrorClassificationResult(
+                    category=ErrorCategory.PERMANENT_ACCESS_DENIED,
+                    should_retry=False,
+                    user_message=f"Anti-bot protection: {description}. Manual intervention required.",
+                    technical_details=error_str
+                )
+
         # Special handling for specific exception types
         if isinstance(error, TimeoutError):
             return ErrorClassificationResult(
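
The hunks above place the retryable JavaScript and dynamic-content checks before the permanent anti-bot check, and each check returns on its first match. A minimal, self-contained sketch of that first-match behaviour follows. `Classification`, `RULES`, and `classify` are hypothetical stand-ins (not the real `ErrorClassificationResult` API), the patterns are a subset of those in the diff, and the `0.0`/`0` values on the anti-bot rule are placeholders because the diff sets no delay or retry count there.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Classification:
    """Hypothetical stand-in for ErrorClassificationResult."""
    category: str
    should_retry: bool
    retry_delay: float
    max_retries: int
    description: str


# (category, should_retry, retry_delay, max_retries, patterns), checked in order.
# Patterns are a subset of those added in this diff.
RULES: List[Tuple[str, bool, float, int, List[Tuple[str, str]]]] = [
    ("RETRYABLE_TIMING", True, 1.0, 2, [
        (r"script.*error", "Script execution failed"),
        (r"uncaught.*exception", "Uncaught JavaScript exception"),
    ]),
    ("RETRYABLE_TIMING", True, 3.0, 2, [
        (r"lazy.*load.*pending", "Lazy loading in progress"),
    ]),
    ("PERMANENT_ACCESS_DENIED", False, 0.0, 0, [  # placeholder delay/retries
        (r"cloudflare.*challenge", "Cloudflare security challenge"),
        (r"bot.*detect", "Bot detection triggered"),
    ]),
]


def classify(error_str: str) -> Optional[Classification]:
    """Return the classification of the first matching rule, or None."""
    for category, should_retry, delay, retries, patterns in RULES:
        for pattern, description in patterns:
            if re.search(pattern, error_str, re.IGNORECASE):
                return Classification(category, should_retry, delay, retries, description)
    return None
```

One consequence worth checking in review: under this ordering, a message such as "Bot detection script error" matches the broad `script.*error` pattern first and is classified as retryable, even though `bot.*detect` would also match.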
