Commit 9d2a44b

Merge branch 'main' of github.com-lucas:LucasAlvws/pydoll into issue-214
2 parents 8c63ecf + e8ba901 commit 9d2a44b

140 files changed, +19887 -58 lines changed

.github/workflows/deploy-docs.yml

Lines changed: 29 additions & 25 deletions
@@ -1,37 +1,41 @@
-name: Deploy MkDocs to GitHub Pages
+name: Deploy site + docs
 
 on:
   push:
-    branches:
-      - main
+    branches: [main]
 
 jobs:
   deploy:
     runs-on: ubuntu-latest
-
     steps:
-    - name: Code Checkout
-      uses: actions/checkout@v3
+      - name: Code Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.x'
 
-    - name: Setup Python
-      uses: actions/setup-python@v4
-      with:
-        python-version: '3.x'
+      - name: Install Dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install mkdocs mkdocs-material pymdown-extensions mkdocstrings[python] mkdocs-static-i18n
 
-    - name: Install Dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install mkdocs
-        pip install mkdocs-material
-        pip install pymdown-extensions
-        pip install mkdocstrings[python]
-        pip install mkdocs-static-i18n
+      # Build MkDocs into a temporary folder
+      - name: Build MkDocs into temp folder
+        run: mkdocs build --site-dir temp_docs
 
-    - name: Build the documentation
-      run: mkdocs build
+      # Create the final site structure
+      - name: Prepare final site
+        run: |
+          mkdir -p site/docs
+          mkdir -p site/images
+          cp -r temp_docs/* site/docs/
+          cp -r public/* site/
 
-    - name: Deploy to GitHub Pages
-      uses: peaceiris/actions-gh-pages@v3
-      with:
-        github_token: ${{ secrets.GITHUB_TOKEN }}
-        publish_dir: ./site
+      - name: Deploy to GitHub Pages
+        uses: peaceiris/actions-gh-pages@v3
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          publish_dir: ./site
+          cname: pydoll.tech
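
The "Prepare final site" step in the workflow above places the MkDocs build under `site/docs` and the landing page from `public/` at the site root. A minimal Python sketch of the same layout follows, assuming `temp_docs` and `public` both exist under a common root. The function name `prepare_final_site` and the `demo` helper are hypothetical and not part of the repository; unlike the shell globs in the workflow, `shutil.copytree` also copies dotfiles.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_final_site(root: Path) -> Path:
    """Hypothetical helper: build the same layout as the 'Prepare final site' step."""
    site = root / "site"
    (site / "docs").mkdir(parents=True, exist_ok=True)    # mkdir -p site/docs
    (site / "images").mkdir(parents=True, exist_ok=True)  # mkdir -p site/images
    # cp -r temp_docs/* site/docs/  (copytree also copies dotfiles, the glob does not)
    shutil.copytree(root / "temp_docs", site / "docs", dirs_exist_ok=True)
    # cp -r public/* site/
    shutil.copytree(root / "public", site, dirs_exist_ok=True)
    return site


def demo() -> list[str]:
    """Build a throwaway tree and return the files that end up in site/."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "temp_docs").mkdir()
        (root / "temp_docs" / "index.html").write_text("docs home")
        (root / "public").mkdir()
        (root / "public" / "index.html").write_text("landing page")
        site = prepare_final_site(root)
        return sorted(
            p.relative_to(site).as_posix() for p in site.rglob("*") if p.is_file()
        )


if __name__ == "__main__":
    print(demo())
```

With this layout the landing page and the documentation can share one `publish_dir`, which matches the README links in this commit moving from the GitHub Pages URL to `https://pydoll.tech/` and `https://pydoll.tech/docs/...`.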

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,3 +1,15 @@
+## 2.6.0 (2025-08-10)
+
+### Feat
+
+- add DownloadTimeout exception for file download timeouts
+- add context manager for handling file downloads in Tab class
+
+### Refactor
+
+- add type checking for connection handler in mixin class
+- add type overloads for event callback in Browser class
+
 ## 2.5.0 (2025-08-07)
 
 ### Feat

README.md

Lines changed: 39 additions & 6 deletions
@@ -8,16 +8,16 @@
 <a href="https://codecov.io/gh/autoscrape-labs/pydoll" >
 <img src="https://codecov.io/gh/autoscrape-labs/pydoll/graph/badge.svg?token=40I938OGM9"/>
 </a>
-<img src="https://github.com/thalissonvs/pydoll/actions/workflows/tests.yml/badge.svg" alt="Tests">
-<img src="https://github.com/thalissonvs/pydoll/actions/workflows/ruff-ci.yml/badge.svg" alt="Ruff CI">
-<img src="https://github.com/thalissonvs/pydoll/actions/workflows/mypy.yml/badge.svg" alt="MyPy CI">
+<img src="https://github.com/autoscrape-labs/pydoll/actions/workflows/tests.yml/badge.svg" alt="Tests">
+<img src="https://github.com/autoscrape-labs/pydoll/actions/workflows/ruff-ci.yml/badge.svg" alt="Ruff CI">
+<img src="https://github.com/autoscrape-labs/pydoll/actions/workflows/mypy.yml/badge.svg" alt="MyPy CI">
 <img src="https://img.shields.io/badge/python-%3E%3D3.10-blue" alt="Python >= 3.10">
 <a href="https://deepwiki.com/autoscrape-labs/pydoll"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>
 </p>
 
 
 <p align="center">
-📖 <a href="https://autoscrape-labs.github.io/pydoll/">Documentation</a> •
+📖 <a href="https://pydoll.tech/">Documentation</a> •
 🚀 <a href="#-getting-started">Getting Started</a> •
 ⚡ <a href="#-advanced-features">Advanced Features</a> •
 🤝 <a href="#-contributing">Contributing</a> •
@@ -97,6 +97,39 @@ await tab.request.get('https://api.example.com/data', headers=headers)
 
 This opens up incredible possibilities for automation scenarios where you need both browser interaction AND API efficiency!
 
+### New expect_download() context manager — robust file downloads made easy!
+Tired of fighting with flaky download flows, missing files, or racy event listeners? Meet `tab.expect_download()`, a delightful, reliable way to handle file downloads.
+
+- Automatically sets the browser’s download behavior
+- Works with your own directory or a temporary folder (auto-cleaned!)
+- Waits for completion with a timeout (so your tests don’t hang)
+- Gives you a handy handle to read bytes/base64 or check `file_path`
+
+Tiny example that just works:
+
+```python
+import asyncio
+from pathlib import Path
+from pydoll.browser import Chrome
+
+async def download_report():
+    async with Chrome() as browser:
+        tab = await browser.start()
+        await tab.go_to('https://example.com/reports')
+
+        target_dir = Path('/tmp/my-downloads')
+        async with tab.expect_download(keep_file_at=target_dir, timeout=10) as download:
+            # Trigger the download in the page (button/link/etc.)
+            await (await tab.find(text='Download latest report')).click()
+            # Wait until finished and read the content
+            data = await download.read_bytes()
+            print(f"Downloaded {len(data)} bytes to: {download.file_path}")
+
+asyncio.run(download_report())
+```
+
+Want zero-hassle cleanup? Omit `keep_file_at` and we’ll create a temp folder and remove it automatically after the context exits. Perfect for tests.
+
 ### Total browser control with custom preferences! (thanks to [@LucasAlvws](https://github.com/LucasAlvws))
 Want to completely customize how Chrome behaves? **Now you can control EVERYTHING!**<br>
 The new `browser_preferences` system gives you access to hundreds of internal Chrome settings that were previously impossible to change programmatically. We're talking about deep browser customization that goes way beyond command-line flags!
@@ -176,7 +209,7 @@ options.browser_preferences = {
 
 This level of control was previously only available to Chrome extension developers - now it's in your automation toolkit!
 
-Check the [documentation](https://autoscrape-labs.github.io/pydoll/features/#custom-browser-preferences/) for more details.
+Check the [documentation](https://pydoll.tech/docs/features/#custom-browser-preferences/) for more details.
 
 ### New `get_parent_element()` method
 Retrieve the parent of any WebElement, making it easier to navigate the DOM structure:
@@ -487,7 +520,7 @@ options.add_argument('--disable-dev-shm-usage')
 
 ## 📚 Documentation
 
-For complete documentation, detailed examples and deep dives into all Pydoll functionalities, visit our [official documentation](https://autoscrape-labs.github.io/pydoll/).
+For complete documentation, detailed examples and deep dives into all Pydoll functionalities, visit our [official documentation](https://pydoll.tech/).
 
 The documentation includes:
 - **Getting Started Guide** - Step-by-step tutorials
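
The README text added in this file says that omitting `keep_file_at` makes `expect_download()` use a temporary folder that is removed when the context exits, while a caller-supplied directory is kept. The sketch below illustrates only that ownership rule with the standard library. `download_dir` and `demo` are hypothetical names, and this is not pydoll's implementation, which is not shown in this part of the commit.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path
from typing import Iterator, Optional


@contextlib.contextmanager
def download_dir(keep_file_at: Optional[Path] = None) -> Iterator[Path]:
    """Hypothetical helper: yield a download directory, removing it only if we created it."""
    if keep_file_at is not None:
        keep_file_at.mkdir(parents=True, exist_ok=True)
        yield keep_file_at  # caller-owned directory: left in place afterwards
        return
    tmp = Path(tempfile.mkdtemp(prefix="download-"))
    try:
        yield tmp
    finally:
        # temporary directory: removed even if the body raised
        shutil.rmtree(tmp, ignore_errors=True)


def demo() -> tuple[bool, bool]:
    """Return (temp dir was removed, kept file still exists)."""
    with download_dir() as tmp:
        (tmp / "report.csv").write_text("a,b\n")
    temp_removed = not tmp.exists()

    base = Path(tempfile.mkdtemp())
    try:
        kept = base / "my-downloads"
        with download_dir(kept) as target:
            (target / "report.csv").write_text("a,b\n")
        kept_survives = (kept / "report.csv").exists()
    finally:
        shutil.rmtree(base)
    return temp_removed, kept_survives
```

A practical consequence of the temporary-directory mode is that file contents should be read inside the `async with` block, since the path no longer exists once the context has exited.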

README_zh.md

Lines changed: 34 additions & 0 deletions
@@ -195,6 +195,40 @@ options = ChromiumOptions()
 options.start_timeout = 20 # wait 20 seconds
 ```
 
+### New expect_download() context manager: robust, elegant file downloads!
+Still struggling with flaky download flows, missing files, or messy event listeners? `tab.expect_download()` is here: a reliable, concise way to download files.
+
+- Automatically configures the browser's download behavior
+- Supports a custom download directory or a temporary directory (cleaned up automatically!)
+- Built-in timeout so tasks do not hang
+- Provides a convenient handle: read bytes/base64, get `file_path`
+
+A small example that works out of the box:
+
+```python
+import asyncio
+from pathlib import Path
+from pydoll.browser import Chrome
+
+async def download_report():
+    async with Chrome() as browser:
+        tab = await browser.start()
+        await tab.go_to('https://example.com/reports')
+
+        target_dir = Path('/tmp/my-downloads')
+        async with tab.expect_download(keep_file_at=target_dir, timeout=10) as dl:
+            # Trigger the download on the page (button/link, etc.)
+            await (await tab.find(text='Download latest report')).click()
+
+            # Wait for completion and read the content
+            data = await dl.read_bytes()
+            print(f"Downloaded {len(data)} bytes, saved to: {dl.file_path}")
+
+asyncio.run(download_report())
+```
+
+Want zero-cost cleanup? Just omit `keep_file_at`: we create a temporary directory and clean it up automatically after the context exits. Ideal for test scenarios.
+
 ## 📦 Installation
 
 ```bash

cz.yaml

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@
 commitizen:
   name: cz_conventional_commits
   tag_format: $version
-  version: 2.5.0
+  version: 2.6.0

docs/deep-dive/webelement-domain.md

Lines changed: 12 additions & 0 deletions
@@ -73,6 +73,7 @@ classDiagram
         +get_attribute(name: str)
         +set_input_files(files: list[str])
         +scroll_into_view()
+        +wait_until()
         +take_screenshot(path: str)
         +text
         +inner_html
@@ -373,8 +374,19 @@ is_visible = await element._is_element_visible()
 
 # Check if element is the topmost at its position
 is_on_top = await element._is_element_on_top()
+
+# Check if element can be interacted with
+is_interactable = await element._is_element_interactable()
+
+# Wait until the element is ready for interaction
+await element.wait_until(is_visible=True, is_interactable=True, timeout=5)
+
+# Raises ``WaitElementTimeout`` if the conditions aren't met in time.
 ```
 
+If both ``is_visible`` and ``is_interactable`` are set to ``True``, the element
+must satisfy **both** conditions to proceed.
+
 These verifications are crucial for reliable automation, ensuring that elements can be interacted with before attempting operations.
 
 ## Position and Scrolling
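
The documentation added above states that `wait_until` requires every requested condition to hold and raises on timeout. One plausible way to get that behavior is a polling loop with a deadline, sketched below. This is an assumption about the general technique, not pydoll's code: `wait_until_all`, `WaitTimeout`, `make_check`, and the polling interval are all hypothetical.

```python
import asyncio
import time
from typing import Awaitable, Callable


class WaitTimeout(Exception):
    """Hypothetical stand-in for pydoll's WaitElementTimeout."""


async def wait_until_all(
    checks: list[Callable[[], Awaitable[bool]]],
    timeout: float = 5.0,
    interval: float = 0.01,
) -> None:
    """Poll until every check returns True, or raise once the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        results = [await check() for check in checks]
        if all(results):  # every requested condition must hold in the same poll
            return
        if time.monotonic() >= deadline:
            raise WaitTimeout("conditions not met in time")
        await asyncio.sleep(interval)


def make_check(ready_after: int) -> Callable[[], Awaitable[bool]]:
    """Fake condition that starts returning True after `ready_after` calls."""
    calls = 0

    async def check() -> bool:
        nonlocal calls
        calls += 1
        return calls > ready_after

    return check


async def demo() -> tuple[bool, bool]:
    # "visible" after 2 polls, "interactable" after 3: succeeds once both hold.
    await wait_until_all([make_check(2), make_check(3)], timeout=1)
    succeeded = True
    try:
        # One condition effectively never becomes true, so the wait times out.
        await wait_until_all([make_check(0), make_check(10**9)], timeout=0.05)
        timed_out = False
    except WaitTimeout:
        timed_out = True
    return succeeded, timed_out
```

Evaluating all conditions in the same poll avoids reporting success when one condition was true earlier but has since become false.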

docs/features.md

Lines changed: 55 additions & 0 deletions
@@ -209,6 +209,61 @@ asyncio.run(background_bypass_example())
 
 Access websites that actively block automation tools without using third-party captcha solving services. This native captcha handling makes Pydoll suitable for automating previously inaccessible websites.
 
+## Reliable Download Handling with expect_download
+
+The `tab.expect_download()` context manager provides a robust, event-driven way to capture file downloads.
+
+- Configures browser download behavior for you
+- Supports persistent target directory (`keep_file_at`) or temporary directory with auto-cleanup
+- Exposes a `_DownloadHandle` with convenience methods
+- Includes timeout protection to avoid indefinite waits
+
+### API Overview
+
+```python
+async with tab.expect_download(
+    keep_file_at: Optional[str | Path] = None,
+    timeout: Optional[float] = None,
+) as handle:
+    ...  # trigger download action in page
+```
+
+- `keep_file_at`: Target directory to keep the downloaded file. If `None`, a temporary directory is created and removed automatically when the context exits.
+- `timeout`: Maximum seconds to wait for completion (defaults to 60 if not provided).
+
+`handle` exposes:
+
+- `handle.file_path: Optional[str]` — final resolved path after completion
+- `await handle.read_bytes() -> bytes`
+- `await handle.read_base64() -> str`
+- `await handle.wait_started(timeout: Optional[float] = None) -> None`
+- `await handle.wait_finished(timeout: Optional[float] = None) -> None`
+
+### Usage Examples
+
+Persist file in a specific directory:
+
+```python
+async with tab.expect_download(keep_file_at='/tmp/dl', timeout=15) as dl:
+    await (await tab.find(text='Export CSV')).click()
+    data = await dl.read_bytes()
+    print('Saved at:', dl.file_path)
+```
+
+Use a temporary directory (auto-cleanup) for tests:
+
+```python
+async with tab.expect_download() as dl:
+    await (await tab.find(text='Download PDF')).click()
+    pdf_b64 = await dl.read_base64()
+    # temp directory is cleaned automatically when leaving the context
+```
+
+Notes:
+
+- When the page emits no completion event within the configured `timeout`, a `DownloadTimeout` exception is raised.
+- If the browser does not provide a `filePath`, the manager falls back to the suggested filename in the chosen directory.
+
 ## Multi-Tab Management
 
 Pydoll provides sophisticated tab management capabilities with a singleton pattern that ensures efficient resource usage and prevents duplicate Tab instances for the same browser tab.
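
The two notes in the `expect_download` section above describe a timeout that surfaces as `DownloadTimeout` and a fallback to the suggested filename when the browser reports no `filePath`. The sketch below shows one way such behavior could be expressed with `asyncio.wait_for` and `pathlib`. It is an illustration under assumptions, not pydoll's implementation: `DownloadTimeoutSketch`, `wait_finished`, and `resolve_path` are hypothetical names.

```python
import asyncio
from pathlib import Path
from typing import Optional


class DownloadTimeoutSketch(Exception):
    """Hypothetical stand-in for pydoll's DownloadTimeout."""


async def wait_finished(done: asyncio.Event, timeout: float = 60.0) -> None:
    """Wait for a completion event, converting a timeout into a domain exception."""
    try:
        await asyncio.wait_for(done.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        raise DownloadTimeoutSketch(
            f"download did not finish within {timeout}s"
        ) from None


def resolve_path(
    file_path: Optional[str], directory: Path, suggested_filename: str
) -> Path:
    """Prefer the reported path; otherwise fall back to directory/suggested name."""
    if file_path:
        return Path(file_path)
    return directory / suggested_filename


async def demo() -> tuple[bool, str]:
    # A completion event that fires shortly after we start waiting.
    done = asyncio.Event()
    asyncio.get_running_loop().call_later(0.01, done.set)
    await wait_finished(done, timeout=1)

    # A completion event that never fires triggers the timeout path.
    never = asyncio.Event()
    try:
        await wait_finished(never, timeout=0.05)
        timed_out = False
    except DownloadTimeoutSketch:
        timed_out = True

    fallback = resolve_path(None, Path("/tmp/dl"), "report.csv")
    return timed_out, fallback.as_posix()
```

Raising a dedicated exception type instead of the generic timeout lets callers distinguish a stalled download from other timeouts in the same test.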

docs/landing-sitemap.xml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
+  <url>
+    <loc>https://pydoll.tech/</loc>
+    <changefreq>daily</changefreq>
+    <priority>1.0</priority>
+  </url>
+</urlset>
