|
26 | 26 |
|
27 | 27 | Picture this: you need to automate browser tasks. Maybe it's testing your web application, scraping data from websites, or automating repetitive processes. Traditionally, this meant dealing with external drivers, complex configurations, and a host of compatibility issues that seemed to appear out of nowhere. |
28 | 28 |
|
| 29 | +But there's another challenge that's even more frustrating: **modern web protection systems**. Cloudflare Turnstile captchas, reCAPTCHA v3, and sophisticated bot detection algorithms can quickly identify and block traditional automation tools. Your perfectly written automation script fails not because of bugs, but because websites can tell it's not human. |
| 30 | + |
29 | 31 | **Pydoll was born to change that.** |
30 | 32 |
|
31 | | -Built from the ground up with a different philosophy, Pydoll connects directly to the Chrome DevTools Protocol (CDP), eliminating the need for external drivers entirely. This isn't just a technical change - it's a revolution in how you interact with browsers through Python. |
| 33 | +Built from the ground up with a different philosophy, Pydoll connects directly to the Chrome DevTools Protocol (CDP), eliminating the need for external drivers entirely. More importantly, it incorporates advanced human behavior simulation and intelligent captcha bypass capabilities that make your automations virtually indistinguishable from real human interactions. |
32 | 34 |
|
33 | | -We believe that powerful automation shouldn't require you to become a configuration expert. With Pydoll, you focus on what matters: your automation logic, not the underlying complexity. |
| 35 | +We believe that powerful automation shouldn't require you to become a configuration expert or constantly battle with anti-bot systems. With Pydoll, you focus on what matters: your automation logic, not the underlying complexity or protection bypassing. |
34 | 36 |
|
35 | 37 | ## What Makes Pydoll Special |
36 | 38 |
|
37 | | -**Genuine Simplicity**: We don't want you wasting time configuring drivers or dealing with compatibility issues. With Pydoll, you install and you're ready to automate. |
| 39 | +**Intelligent Captcha Bypass**: Built-in automatic solving for Cloudflare Turnstile and reCAPTCHA v3 captchas without external services, API keys, or complex configurations. Your automations continue seamlessly even when encountering protection systems. |
38 | 40 |
|
39 | | -**Truly Human Interactions**: Our algorithms simulate real human behavior patterns - from timing between clicks to how the mouse moves across the screen. |
| 41 | +**Truly Human Interactions**: Advanced algorithms simulate authentic human behavior patterns - from realistic timing between actions to natural mouse movements, scroll patterns, and typing rhythms that fool even sophisticated bot detection systems. |
40 | 42 |
|
41 | | -**Native Async Performance**: Built from the ground up with `asyncio`, Pydoll doesn't just support asynchronous operations - it was designed for them. |
| 43 | +**Genuine Simplicity**: We don't want you wasting time configuring drivers or dealing with compatibility issues. With Pydoll, you install and you're ready to automate, even on protected sites. |
42 | 44 |
|
43 | | -**Integrated Intelligence**: Automatic bypass of Cloudflare Turnstile and reCAPTCHA v3 captchas, without external services or complex configurations. |
| 45 | +**Native Async Performance**: Built from the ground up with `asyncio`, Pydoll doesn't just support asynchronous operations - it was designed for them, enabling concurrent processing of multiple protected sites. |
44 | 46 |
|
45 | | -**Powerful Network Monitoring**: Intercept, modify, and analyze all network traffic with ease, giving you complete control over requests. |
| 47 | +**Powerful Network Monitoring**: Intercept, modify, and analyze all network traffic with ease, giving you complete control over requests and responses - perfect for bypassing additional protection layers. |
46 | 48 |
|
47 | | -**Event-Driven Architecture**: React to page events, network requests, and user interactions in real-time. |
| 49 | +**Event-Driven Architecture**: React to page events, network requests, and user interactions in real-time, enabling sophisticated automation flows that adapt to dynamic protection systems. |
48 | 50 |
|
49 | | -**Intuitive Element Finding**: Modern `find()` and `query()` methods that make sense and work as you'd expect. |
| 51 | +**Intuitive Element Finding**: Modern `find()` and `query()` methods that make sense and work as you'd expect, even with dynamically loaded content from protection systems. |
50 | 52 |
|
51 | | -**Robust Type Safety**: Comprehensive type system for better IDE support and error prevention. |
| 53 | +**Robust Type Safety**: Comprehensive type system for better IDE support and error prevention in complex automation scenarios. |
52 | 54 |
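
To make the "Native Async Performance" point concrete, here is a minimal sketch of bounded concurrency with plain `asyncio`. It does not use Pydoll at all: `process_site`, `process_all`, and `max_concurrent` are hypothetical names chosen for illustration, and the `asyncio.sleep` call stands in for whatever browser work a real task would do.

```python
import asyncio


async def process_site(url: str, limiter: asyncio.Semaphore) -> str:
    """Stand-in for one automation task (hypothetical; a real task would drive a tab)."""
    async with limiter:            # cap how many tasks run at the same time
        await asyncio.sleep(0.01)  # placeholder for navigation and page work
        return f"done: {url}"


async def process_all(urls: list[str], max_concurrent: int = 2) -> list[str]:
    """Run one task per URL, never more than max_concurrent at once."""
    limiter = asyncio.Semaphore(max_concurrent)
    # gather() returns results in the same order as the awaitables passed in
    return await asyncio.gather(*(process_site(u, limiter) for u in urls))


if __name__ == "__main__":
    print(asyncio.run(process_all(["https://example.com/a", "https://example.com/b"])))
```

The semaphore is the design choice worth noting: launching every task at once is easy with `gather()`, but a cap keeps the number of simultaneously open pages predictable.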
|
53 | 55 | ## Installation |
54 | 56 |
|
@@ -120,31 +122,126 @@ asyncio.run(custom_automation()) |
120 | 122 |
|
121 | 123 | ### Intelligent Captcha Bypass |
122 | 124 |
|
123 | | -One of Pydoll's most impressive features is its ability to automatically handle Cloudflare Turnstile captchas. This means fewer interruptions and smoother automations: |
| 125 | +One of Pydoll's most revolutionary features is its ability to automatically handle modern captcha systems that typically block automation tools. This isn't just about solving captchas - it's about letting your automations pass through protection systems without interruption. |
| 126 | + |
| 127 | +**Supported Captcha Types:** |
| 128 | +- **Cloudflare Turnstile** - Cloudflare's alternative to traditional CAPTCHAs |
| 129 | +- **reCAPTCHA v3** - Google's invisible captcha system |
| 130 | +- **Custom implementations** - Extensible framework for new captcha types |
124 | 131 |
|
125 | 132 | ```python |
126 | 133 | import asyncio |
127 | 134 | from pydoll.browser import Chrome |
128 | 135 |
|
129 | | -async def bypass_cloudflare(): |
| 136 | +async def advanced_captcha_bypass(): |
130 | 137 |     async with Chrome() as browser: |
131 | 138 |         tab = await browser.start() |
132 | 139 |
|
133 | 140 |         # Method 1: Context manager (waits for captcha completion) |
134 | 141 |         async with tab.expect_and_bypass_cloudflare_captcha(): |
135 | 142 |             await tab.go_to('https://site-with-cloudflare.com') |
136 | | -        print("Captcha automatically solved!") |
| 143 | +        print("Cloudflare Turnstile automatically solved!") |
| 144 | + |
| 145 | +        # Continue with your automation - captcha is handled |
| 146 | +        await (await tab.find(id='username')).type('[email protected]') |
| 147 | +        await (await tab.find(id='password')).type('password123') |
| 148 | +        await (await tab.find(tag_name='button', text='Login')).click() |
137 | 149 |
|
138 | | -        # Method 2: Background processing |
| 150 | +        # Method 2: Background processing (non-blocking) |
139 | 151 |         await tab.enable_auto_solve_cloudflare_captcha() |
140 | 152 |         await tab.go_to('https://another-protected-site.com') |
141 | | -        # Captcha solved in background while code continues |
| 153 | +        # Captcha solved automatically in background while code continues |
| 154 | + |
| 155 | +        # Method 3: Custom captcha selector for specific implementations |
| 156 | +        await tab.enable_auto_solve_cloudflare_captcha( |
| 157 | +            custom_selector=(By.CLASS_NAME, 'custom-captcha-widget'), |
| 158 | +            time_before_click=3,  # Wait 3 seconds before solving |
| 159 | +            time_to_wait_captcha=10  # Timeout after 10 seconds |
| 160 | +        ) |
142 | 161 |
|
143 | 162 |         await tab.disable_auto_solve_cloudflare_captcha() |
144 | 163 |
|
145 | | -asyncio.run(bypass_cloudflare()) |
| 164 | +asyncio.run(advanced_captcha_bypass()) |
146 | 165 | ``` |
147 | 166 |
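
The difference between the blocking context manager and the background mode can be illustrated with a generic `asyncio` pattern. This is a sketch of the general "arm, run the body, wait on exit" idea, not Pydoll's implementation: `FakePage`, `expect_challenge`, and the `solved` event are hypothetical stand-ins, and the challenge is simulated with a timer.

```python
import asyncio
from contextlib import asynccontextmanager


class FakePage:
    """Hypothetical stand-in for a tab that signals when a challenge is handled."""

    def __init__(self) -> None:
        self.solved = asyncio.Event()
        self.log: list[str] = []

    async def go_to(self, url: str) -> None:
        self.log.append(f"navigate {url}")
        # Simulate the challenge being handled shortly after navigation starts.
        asyncio.get_running_loop().call_later(0.01, self.solved.set)


@asynccontextmanager
async def expect_challenge(page: FakePage, timeout: float = 1.0):
    page.solved.clear()  # arm the signal before the body runs
    yield
    # On exit, block until the signal arrives or the timeout expires.
    await asyncio.wait_for(page.solved.wait(), timeout)
    page.log.append("challenge handled")


async def demo() -> list[str]:
    page = FakePage()
    async with expect_challenge(page):
        await page.go_to("https://example.com")
    page.log.append("continue")  # runs only after the wait above completes
    return page.log
```

A background mode would register the same kind of handler once and skip the wait on exit, so code after navigation proceeds immediately while the handler fires whenever the event occurs.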
|
| 167 | +**Why This Matters:** |
| 168 | +- **No External Dependencies**: No need for captcha solving services or API keys |
| 169 | +- **Cost Effective**: Eliminate monthly captcha solving service fees |
| 170 | +- **Reliable**: Works consistently without depending on third-party availability |
| 171 | +- **Fast**: No network round trips to external solving services |
| 172 | +- **Seamless Integration**: Captcha bypass happens transparently in your automation flow |
| 173 | + |
| 174 | +### Human-Like Interactions |
| 175 | + |
| 176 | +Pydoll's secret weapon against bot detection is its sophisticated human behavior simulation. Modern websites use advanced algorithms to detect automation by analyzing interaction patterns, timing, and mouse movements. Pydoll counters this with realistic human simulation. |
| 177 | + |
| 178 | +**What Makes Interactions Human-Like:** |
| 179 | +- **Natural Timing Variations**: Random delays between actions that mimic human hesitation and thinking time |
| 180 | +- **Realistic Mouse Movements**: Curved, natural mouse paths instead of straight lines |
| 181 | +- **Human Typing Patterns**: Variable typing speeds with realistic pauses and occasional typos |
| 182 | +- **Scroll Behavior**: Natural scrolling patterns with momentum and easing |
| 183 | +- **Focus and Attention Simulation**: Realistic tab switching and window focus patterns |
| 184 | + |
| 185 | +```python |
| 186 | +import asyncio |
| 187 | +import random |
| 188 | +from pydoll.browser import Chrome |
| 189 | + |
| 190 | +async def human_like_automation(): |
| 191 | +    async with Chrome() as browser: |
| 192 | +        tab = await browser.start() |
| 193 | +        await tab.go_to('https://example.com') |
| 194 | + |
| 195 | +        # Human-like typing with natural variations |
| 196 | +        search_box = await tab.find(id='search') |
| 197 | +        await search_box.type('web automation', human_like=True) |
| 198 | +        # Automatically includes: random typing speed, occasional pauses, |
| 199 | +        # natural timing between keystrokes |
| 200 | + |
| 201 | +        # Human-like clicking with realistic mouse movement |
| 202 | +        search_button = await tab.find(tag_name='button', text='Search') |
| 203 | +        await search_button.click(human_like=True) |
| 204 | +        # Automatically includes: curved mouse movement, natural click timing, |
| 205 | +        # slight position variations |
| 206 | + |
| 207 | +        # Human-like scrolling behavior |
| 208 | +        await tab.scroll_to_element( |
| 209 | +            await tab.find(class_name='results'), |
| 210 | +            smooth=True, |
| 211 | +            human_like=True |
| 212 | +        ) |
| 213 | +        # Includes: momentum scrolling, natural easing, realistic speed |
| 214 | + |
| 215 | +        # Simulate human reading/scanning time |
| 216 | +        results = await tab.find(class_name='result-item', find_all=True) |
| 217 | +        for result in results: |
| 218 | +            # Simulate human scanning time before clicking |
| 219 | +            await asyncio.sleep(random.uniform(0.5, 2.0)) |
| 220 | + |
| 221 | +            # Human-like hover before clicking (common human behavior) |
| 222 | +            await result.hover(human_like=True) |
| 223 | +            await asyncio.sleep(random.uniform(0.2, 0.8)) |
| 224 | + |
| 225 | +            if await result.find(text='relevant content', raise_exc=False): |
| 226 | +                await result.click(human_like=True) |
| 227 | +                break |
| 229 | +asyncio.run(human_like_automation()) |
| 230 | +``` |
| 231 | + |
| 232 | +**Advanced Human Simulation Features:** |
| 233 | +- **Behavioral Fingerprinting Resistance**: Varies interaction patterns to avoid detection |
| 234 | +- **Attention Simulation**: Realistic focus patterns and tab switching behavior |
| 235 | +- **Error Simulation**: Occasional "human mistakes" like misclicks or typos that are corrected |
| 236 | +- **Reading Pattern Simulation**: Natural eye movement and reading time simulation |
| 237 | +- **Multi-tab Behavior**: Realistic tab management and switching patterns |
| 238 | + |
| 239 | +**Detection Evasion Techniques:** |
| 240 | +- **Canvas Fingerprinting Protection**: Randomizes canvas rendering signatures |
| 241 | +- **WebGL Fingerprinting Protection**: Varies WebGL parameters to avoid tracking |
| 242 | +- **Font Fingerprinting Resistance**: Randomizes font rendering characteristics |
| 243 | +- **Timezone and Locale Variation**: Realistic geographic and temporal variations |
| 244 | +- **Browser Fingerprint Randomization**: Varies browser characteristics between sessions |
148 | 245 |
|
149 | 246 | ### Advanced Element Finding |
150 | 247 |
|
@@ -305,13 +402,17 @@ asyncio.run(iframe_interaction()) |
305 | 402 |
|
306 | 403 | ## The Philosophy Behind Pydoll |
307 | 404 |
|
308 | | -Pydoll isn't just another automation library. It represents a different approach to solving real problems that developers face daily. |
| 405 | +Pydoll isn't just another automation library. It represents a fundamental shift in how we approach browser automation in an era of sophisticated anti-bot systems and advanced protection mechanisms. |
| 406 | + |
| 407 | +**Human-First Automation**: We believe automation should be indistinguishable from human behavior. Pydoll's core philosophy is that the best automation is the one that websites can't detect, achieved through sophisticated human behavior simulation rather than trying to outsmart detection systems. |
| 408 | + |
| 409 | +**Simplicity Without Sacrificing Power**: Powerful captcha bypass and human simulation shouldn't require complex configurations or external services. Pydoll offers advanced anti-detection functionality through a clean and intuitive API that works out of the box. |
309 | 410 |
|
310 | | -**Simplicity Without Sacrificing Power**: We believe that powerful tools don't need to be complex. Pydoll offers advanced functionality through a clean and intuitive API. |
| 411 | +**Performance That Matters**: In a world where every millisecond counts and protection systems analyze timing patterns, Pydoll's native asynchronous design ensures your automations are not just functional and efficient, but also naturally varied in timing to avoid detection. |
311 | 412 |
|
312 | | -**Performance That Matters**: In a world where every millisecond counts, Pydoll's native asynchronous design ensures your automations are not just functional, but efficient. |
| 413 | +**Constant Evolution**: The web ecosystem and its protection systems are always evolving, and Pydoll evolves with them. New challenges like advanced captchas, behavioral analysis, and fingerprinting techniques are met with innovative solutions integrated directly into the library. |
313 | 414 |
|
314 | | -**Constant Evolution**: The web ecosystem is always changing, and Pydoll evolves with it. New challenges like advanced captchas are met with innovative solutions integrated into the library. |
| 415 | +**Privacy and Independence**: Your automation shouldn't depend on external captcha solving services or send your data to third parties. Pydoll's built-in capabilities ensure your automations remain private and independent while being more reliable and cost-effective. |
315 | 416 |
|
316 | 417 | ## Documentation |
317 | 418 |
|
|