Commit fdf17b2

Merge branch 'main' into feat/tab-request-api-issue-171
2 parents b2a14ba + e6b4e95 commit fdf17b2

16 files changed: +863 -36 lines changed

.github/ISSUE_TEMPLATE/config.yml

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-blank_issues_enabled: false
+blank_issues_enabled: true
 contact_links:
   - name: Questions & Discussions
     url: https://github.com/thalissonvs/pydoll/discussions

CHANGELOG.md

Lines changed: 23 additions & 0 deletions
@@ -1,3 +1,26 @@
+## 2.4.0 (2025-08-01)
+
+### Feat
+
+- changing bool prefs to properties and adding support to user-data-dir preferences
+- adding prefs options customization
+- add overloads for find and query methods in FindElementsMixin
+- add method to retrieve parent element and its attributes
+- implements start_timeout option
+
+### Fix
+
+- adding typehint and fixing some codes
+- removing options preferences private attributes
+- set default URL to 'about:blank' in create_target method
+- change navigation when creating a new tab
+- add type hinting support and update project description
+
+### Refactor
+
+- remove redundant asterisk from find method overloads and reorganize query method overloads
+- refine type hint for response parameter and improve key check
+
 ## 2.3.1 (2025-07-12)
 
 ### Fix

README.md

Lines changed: 58 additions & 4 deletions
@@ -4,6 +4,7 @@
 <h1 align="center">Pydoll: Automate the Web, Naturally</h1>
 
 <p align="center">
+<a href="https://github.com/autoscrape-labs/pydoll/stargazers"><img src="https://img.shields.io/github/stars/autoscrape-labs/pydoll?style=social"></a>
 <a href="https://codecov.io/gh/autoscrape-labs/pydoll" >
 <img src="https://codecov.io/gh/autoscrape-labs/pydoll/graph/badge.svg?token=40I938OGM9"/>
 </a>
@@ -31,15 +32,64 @@ Built from scratch with a different philosophy, Pydoll connects directly to the
 
 We believe that powerful automation shouldn't require you to become an expert in configuration or constantly fight with bot protection systems. With Pydoll, you can focus on what really matters: your automation logic, not the underlying complexity or protection systems.
 
+<div>
+<h4>Be a good human. Give it a star ⭐</h4>
+No stars, no bugs fixed. Just kidding (maybe)
+</div>
+
 ## 🌟 What makes Pydoll special?
 
 - **Zero Webdrivers**: Say goodbye to webdriver compatibility issues
-- **Native Captcha Bypass**: Handles Cloudflare Turnstile and reCAPTCHA v3*
+- **Human-like Interaction Engine**: Capable of passing behavioral CAPTCHAs like reCAPTCHA v3 or Turnstile, depending on IP reputation and interaction patterns
 - **Asynchronous Performance**: For high-speed automation and multiple simultaneous tasks
 - **Humanized Interactions**: Mimic real user behavior
 - **Simplicity**: With Pydoll, you install and you're ready to automate.
 
->⚠️ The effectiveness of captcha bypass depends on various factors, such as IP address reputation. Pydoll can achieve scores comparable to real users, but cannot handle restrictive configurations or IP blocks.
+## What's New in 2.4.0
+
+### Advanced browser preferences support (thanks to [@LucasAlvws](https://github.com/LucasAlvws))
+You can now customize Chromium browser preferences through the `browser_preferences` dict in ChromiumOptions.<br><br>
+Set things like download directory, language, notification blocking, PDF handling, and more.
+Helper methods and properties such as `set_default_download_directory`, `set_accept_languages`, and `prompt_for_download` were added for convenience.
+Preferences are merged automatically, so there is no need to redefine everything.<br><br>
+Here's an example:
+
+```python
+options = ChromiumOptions()
+options.browser_preferences = {  # you can set the entire dict
+    'download': {
+        'default_directory': '/tmp/downloads',
+        'prompt_for_download': False
+    },
+    'intl': {
+        'accept_languages': 'en-US,en,pt-BR'
+    },
+    'profile': {
+        'default_content_setting_values': {
+            'notifications': 2  # Block notifications
+        }
+    }
+}
+
+options.set_default_download_directory('/tmp/downloads')  # or just the individual properties
+options.set_accept_languages('en-US,en,pt-BR')
+options.prompt_for_download = False
+```
+See [docs/features.md](docs/features.md#custom-browser-preferences) for more details.
+
+### New `get_parent_element()` method
+Retrieve the parent of any WebElement, making it easier to navigate the DOM structure:
+```python
+element = await tab.find(id='button')
+parent = await element.get_parent_element()
+```
+### New start_timeout option (thanks to [@j0j1j2](https://github.com/j0j1j2))
+Added to ChromiumOptions to control how long the browser can take to start. Useful on slower machines or CI environments.
+
+```python
+options = ChromiumOptions()
+options.start_timeout = 20  # wait 20 seconds
+```
 
 ## 📦 Installation

@@ -360,13 +410,17 @@ Please make sure to:
 If you find Pydoll useful, consider [supporting me on GitHub](https://github.com/sponsors/thalissonvs).
 You'll get access to exclusive benefits like priority support, custom features and much more!
 
-Can't sponsor right now? No problem you can still help a lot by:
+Can't sponsor right now? No problem, you can still help a lot by:
 - Starring the repository
 - Sharing on social media
 - Writing posts or tutorials
 - Giving feedback or reporting issues
 
-Every bit of support makes a difference — thank you!
+Every bit of support makes a difference.
+
+## 💬 Spread the word
+
+If Pydoll saved you time, mental health, or a keyboard from being smashed, give it a ⭐, share it, or tell your weird dev friends.
 
 ## 📄 License

cz.yaml

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@
 commitizen:
   name: cz_conventional_commits
   tag_format: $version
-  version: 2.3.1
+  version: 2.4.0
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+# Deep Dive: Custom Browser Preferences in Pydoll
+
+## Overview
+The `browser_preferences` feature (PR #204) enables direct, fine-grained control over Chromium browser settings via the `ChromiumOptions` API. This is essential for advanced automation, testing, and scraping scenarios where default browser behavior must be customized.
+
+## How It Works
+- `ChromiumOptions.browser_preferences` is a dictionary that maps directly to Chromium's internal preferences structure.
+- Preferences are merged: setting new keys updates only those keys, preserving others.
+- Helper methods (`set_default_download_directory`, `set_accept_languages`, etc.) are provided for common scenarios.
+- Preferences are applied before browser launch, ensuring all settings take effect from the start of the session.
+- Validation ensures only dictionaries are accepted; invalid structures raise clear errors.
+
+## Example
+```python
+options = ChromiumOptions()
+options.browser_preferences = {
+    'download': {'default_directory': '/tmp', 'prompt_for_download': False},
+    'intl': {'accept_languages': 'en-US,en'},
+    'profile': {'default_content_setting_values': {'notifications': 2}}
+}
+```
+
+## Advanced Usage
+- **Merging:** Multiple assignments merge keys, so you can incrementally build your preferences.
+- **Validation:** If you pass a non-dict or use the reserved 'prefs' key, an error is raised.
+- **Internals:** Preferences are set via a recursive setter that creates nested dictionaries as needed.
+- **Integration:** Used by the browser process manager to initialize the user data directory with your custom settings.
+
+## Best Practices
+- Use helper methods for common patterns; set `browser_preferences` directly for advanced needs.
+- Check Chromium documentation for available preferences: https://chromium.googlesource.com/chromium/src/+/4aaa9f29d8fe5eac55b8632fa8fcb05a68d9005b/chrome/common/pref_names.cc
+- Avoid setting experimental or undocumented preferences unless you know their effects.
+
+## References
+- See `pydoll/browser/options.py` for implementation details.
+- See tests in `tests/test_browser/test_browser_chrome.py` for usage examples.
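
Reviewer note: the merging behavior described in the document above ("setting new keys updates only those keys, preserving others") can be illustrated with a small recursive merge. This is a minimal sketch only; `deep_merge` is a hypothetical helper written for this note, not the setter in `pydoll/browser/options.py`, whose implementation is not shown in this diff.

```python
from copy import deepcopy


def deep_merge(base: dict, updates: dict) -> dict:
    """Return a new dict with `updates` merged recursively into `base`.

    Hypothetical helper for illustration; not part of Pydoll.
    """
    merged = deepcopy(base)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend instead of overwriting.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


existing = {'download': {'default_directory': '/tmp', 'prompt_for_download': False}}
incoming = {
    'download': {'default_directory': '/data'},
    'intl': {'accept_languages': 'en-US,en'},
}

merged = deep_merge(existing, incoming)
print(merged['download'])
```

With a merge of this shape, `prompt_for_download` survives even though only `default_directory` was reassigned, and the `existing` dict is left untouched.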

docs/features.md

Lines changed: 49 additions & 0 deletions
@@ -662,6 +662,55 @@ These network analysis capabilities make Pydoll ideal for:
 - **Debugging**: Identify failed requests and network issues
 - **Security Testing**: Analyze request/response patterns
 
+## Custom Browser Preferences
+
+Pydoll now supports advanced browser customization through the `ChromiumOptions.browser_preferences` property. This allows you to set any Chromium browser preference for your automation session.
+
+### What You Can Customize
+- Download directory and prompt behavior
+- Accepted languages
+- Notification blocking
+- PDF handling
+- Any other Chromium-supported preference
+
+### Example: Setting Preferences
+```python
+from pydoll.browser.chromium import Chrome
+from pydoll.browser.options import ChromiumOptions
+
+options = ChromiumOptions()
+options.browser_preferences = {
+    'download': {
+        'default_directory': '/tmp/downloads',
+        'prompt_for_download': False
+    },
+    'intl': {
+        'accept_languages': 'en-US,en,pt-BR'
+    },
+    'profile': {
+        'default_content_setting_values': {
+            'notifications': 2  # Block notifications
+        }
+    }
+}
+
+# Helper methods for common preferences
+options.set_default_download_directory('/tmp/downloads')
+options.set_accept_languages('en-US,en,pt-BR')
+options.prompt_for_download = False
+
+browser = Chrome(options=options)
+```
+
+You can assign to `browser_preferences` multiple times; new keys are merged in rather than replacing the existing preferences.
+
+### Why is this useful?
+- Fine-grained control for scraping, testing, or automation
+- Avoid popups or unwanted prompts
+- Match user locale or automate downloads
+
+---
+
 ## File Upload Support
 
 Seamlessly handle file uploads in your automation:

pydoll/browser/chromium/base.py

Lines changed: 68 additions & 7 deletions
@@ -1,7 +1,12 @@
 import asyncio
+import json
+import os
+import shutil
 from abc import ABC, abstractmethod
+from contextlib import suppress
 from functools import partial
 from random import randint
+from tempfile import TemporaryDirectory
 from typing import Any, Callable, Optional
 
 from pydoll.browser.interfaces import BrowserOptionsManager
@@ -77,13 +82,20 @@ def __init__(
         self._browser_process_manager = BrowserProcessManager()
         self._temp_directory_manager = TempDirectoryManager()
         self._connection_handler = ConnectionHandler(self._connection_port)
+        self._backup_preferences_dir = ''
 
     async def __aenter__(self) -> 'Browser':
         """Async context manager entry."""
         return self
 
     async def __aexit__(self, exc_type, exc_val, exc_tb):
         """Async context manager exit with cleanup."""
+        if self._backup_preferences_dir:
+            user_data_dir = self._get_user_data_dir()
+            shutil.copy2(
+                self._backup_preferences_dir,
+                os.path.join(user_data_dir, 'Default', 'Preferences'),
+            )
         if await self._is_browser_running(timeout=2):
             await self.stop()
 
@@ -113,9 +125,7 @@ async def start(self, headless: bool = False) -> Tab:
         proxy_config = self._proxy_manager.get_proxy_credentials()
 
         self._browser_process_manager.start_browser_process(
-            binary_location,
-            self._connection_port,
-            self.options.arguments,
+            binary_location, self._connection_port, self.options.arguments
         )
         await self._verify_browser_running()
         await self._configure_proxy(proxy_config[0], proxy_config[1])
@@ -199,12 +209,13 @@ async def new_tab(self, url: str = '', browser_context_id: Optional[str] = None)
         """
         response: CreateTargetResponse = await self._execute_command(
             TargetCommands.create_target(
-                url=url,
                 browser_context_id=browser_context_id,
             )
         )
         target_id = response['result']['targetId']
-        return Tab(self, self._connection_port, target_id, browser_context_id)
+        tab = Tab(self, self._connection_port, target_id, browser_context_id)
+        if url: await tab.go_to(url)
+        return tab
 
     async def get_targets(self) -> list[TargetInfo]:
         """
@@ -584,10 +595,60 @@ async def _execute_command(
 
     def _setup_user_dir(self):
         """Setup temporary user data directory if not specified in options."""
-        if '--user-data-dir' not in [arg.split('=')[0] for arg in self.options.arguments]:
-            # For all browsers, use a temporary directory
+        user_data_dir = self._get_user_data_dir()
+        if user_data_dir and self.options.browser_preferences:
+            self._set_browser_preferences_in_user_data_dir(user_data_dir)
+        elif not user_data_dir:
             temp_dir = self._temp_directory_manager.create_temp_dir()
+            # For all browsers, use a temporary directory
             self.options.arguments.append(f'--user-data-dir={temp_dir.name}')
+            if self.options.browser_preferences:
+                self._set_browser_preferences_in_temp_dir(temp_dir)
+
+    def _set_browser_preferences_in_temp_dir(self, temp_dir: TemporaryDirectory):
+        os.mkdir(os.path.join(temp_dir.name, 'Default'))
+        preferences = self.options.browser_preferences
+        with open(
+            os.path.join(temp_dir.name, 'Default', 'Preferences'), 'w', encoding='utf-8'
+        ) as json_file:
+            json.dump(preferences, json_file)
+
+    def _set_browser_preferences_in_user_data_dir(self, user_data_dir: str):
+        """
+        Set browser preferences in the user data directory.
+
+        This function will:
+        1. Create a backup of the existing Preferences file if it exists
+        2. Create Default directory if it doesn't exist
+        3. Write the new preferences to the Preferences file
+
+        Args:
+            user_data_dir: Path to the user data directory
+        """
+        default_dir = os.path.join(user_data_dir, 'Default')
+        os.makedirs(default_dir, exist_ok=True)
+
+        preferences_path = os.path.join(default_dir, 'Preferences')
+        self._backup_preferences_dir = os.path.join(default_dir, 'Preferences.backup')
+
+        if os.path.exists(preferences_path):
+            # Backup existing Preferences file
+            shutil.copy2(preferences_path, self._backup_preferences_dir)
+
+        preferences = {}
+        if os.path.exists(preferences_path):
+            with suppress(json.JSONDecodeError):
+                with open(preferences_path, 'r', encoding='utf-8') as preferences_file:
+                    preferences = json.load(preferences_file)
+        preferences.update(self.options.browser_preferences)
+        with open(preferences_path, 'w', encoding='utf-8') as json_file:
+            json.dump(preferences, json_file, indent=2)
+
+    def _get_user_data_dir(self) -> Optional[str]:
+        for arg in self.options.arguments:
+            if arg.startswith('--user-data-dir='):
+                return arg.split('=', 1)[1]
+        return None
 
     @abstractmethod
     def _get_default_binary_location(self) -> str:
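
Reviewer note: the backup, write, and restore flow added to `base.py` can be tried in isolation with the standard library. The sketch below is an approximation under stated assumptions: `write_preferences` and `restore_preferences` are hypothetical free functions, not Pydoll APIs. It differs from the diff in one deliberate way: the backup path is only recorded when a `Preferences` file actually existed. In the diff, `_backup_preferences_dir` appears to be assigned unconditionally, so the `shutil.copy2` call in `__aexit__` would presumably raise `FileNotFoundError` for a fresh profile directory.

```python
import json
import os
import shutil
import tempfile


def write_preferences(user_data_dir: str, new_prefs: dict) -> str:
    """Merge new_prefs into Default/Preferences; return the backup path ('' if none).

    Hypothetical helper for illustration; not part of Pydoll.
    """
    default_dir = os.path.join(user_data_dir, 'Default')
    os.makedirs(default_dir, exist_ok=True)
    prefs_path = os.path.join(default_dir, 'Preferences')

    backup_path = ''
    current: dict = {}
    if os.path.exists(prefs_path):
        # Record a backup only when there is something to back up.
        backup_path = prefs_path + '.backup'
        shutil.copy2(prefs_path, backup_path)
        try:
            with open(prefs_path, 'r', encoding='utf-8') as fh:
                current = json.load(fh)
        except json.JSONDecodeError:
            current = {}  # unreadable file: start from an empty dict

    # Same shallow merge as the diff: top-level keys are replaced wholesale.
    current.update(new_prefs)
    with open(prefs_path, 'w', encoding='utf-8') as fh:
        json.dump(current, fh, indent=2)
    return backup_path


def restore_preferences(user_data_dir: str, backup_path: str) -> None:
    """Copy the backup over Preferences, if a backup was actually created."""
    if backup_path and os.path.exists(backup_path):
        shutil.copy2(backup_path, os.path.join(user_data_dir, 'Default', 'Preferences'))


with tempfile.TemporaryDirectory() as profile_dir:
    prefs_file = os.path.join(profile_dir, 'Default', 'Preferences')
    first_backup = write_preferences(profile_dir, {'download': {'prompt_for_download': False}})
    second_backup = write_preferences(profile_dir, {'intl': {'accept_languages': 'en-US,en'}})
    with open(prefs_file, encoding='utf-8') as fh:
        after_write = json.load(fh)
    restore_preferences(profile_dir, second_backup)
    with open(prefs_file, encoding='utf-8') as fh:
        after_restore = json.load(fh)
```

Because `dict.update` is shallow, a nested section such as `download` in an existing profile is replaced as a whole rather than merged key by key; whether that is intended here is a question for the author.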

pydoll/browser/interfaces.py

Lines changed: 5 additions & 0 deletions
@@ -21,6 +21,11 @@ def start_timeout(self) -> int:
     def add_argument(self, argument: str):
         pass
 
+    @property
+    @abstractmethod
+    def browser_preferences(self) -> dict:
+        pass
+
 
 class BrowserOptionsManager(ABC):
     @abstractmethod
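
Reviewer note: stacking `@property` over `@abstractmethod`, as the interface change does, forces every concrete options class to provide `browser_preferences`. A minimal sketch of what that means for implementers, assuming hypothetical class names (`OptionsInterface`, `DictBackedOptions`) and a hypothetical `ValueError` on invalid input; the real validation lives in `pydoll/browser/options.py`, which this diff does not show.

```python
from abc import ABC, abstractmethod


class OptionsInterface(ABC):  # hypothetical stand-in for the interface in the diff
    @property
    @abstractmethod
    def browser_preferences(self) -> dict:
        """Preferences to write into the browser profile."""


class DictBackedOptions(OptionsInterface):  # hypothetical concrete class
    def __init__(self) -> None:
        self._prefs: dict = {}

    @property
    def browser_preferences(self) -> dict:
        return self._prefs

    @browser_preferences.setter
    def browser_preferences(self, value: dict) -> None:
        # Assumed validation: only dicts are accepted.
        if not isinstance(value, dict):
            raise ValueError('browser_preferences must be a dict')
        self._prefs.update(value)


try:
    OptionsInterface()  # abstract property left unimplemented
    instantiated = True
except TypeError:
    instantiated = False

opts = DictBackedOptions()
opts.browser_preferences = {'intl': {'accept_languages': 'en-US,en'}}
opts.browser_preferences = {'download': {'prompt_for_download': False}}
```

The abstract class cannot be instantiated, while the concrete class accumulates keys across assignments.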
