Commit 15a3470

Merge branch 'main' into codex/add-page_load_state-enum-to-options
2 parents 73e76eb + 7d68de9 commit 15a3470

34 files changed

+2151
-686
lines changed

.github/workflows/tests.yml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,10 @@ jobs:
2222
run: |
2323
python -m pip install poetry
2424
poetry install
25+
- name: Install Chrome
26+
uses: browser-actions/setup-chrome@v1
27+
with:
28+
chrome-version: 132
2529
- name: Run tests with coverage
2630
run: |
2731
poetry run pytest -s -x --cov=pydoll -vv --cov-report=xml

.gitignore

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -161,4 +161,7 @@ cython_debug/
161161
#.idea/
162162

163163
.czrc
164-
.ruff_cache/
164+
.ruff_cache/
165+
166+
# Dev test file
167+
dev_test_file.py

CHANGELOG.md

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,45 @@
1+
## 2.8.2 (2025-10-03)
2+
3+
### Fix
4+
5+
- implement proxy authentication handling for browser tabs
6+
- map exception when trying to take a screenshot of an iframe
7+
8+
## 2.8.1 (2025-09-27)
9+
10+
### Fix
11+
12+
- store the opened tab in the _tabs_opened dictionary
13+
- **elements**: correctly detect parenthesized XPath expressions
14+
15+
### Refactor
16+
17+
- simplify FindElementsMixin._get_expression_type startswith checks into single tuple
18+
19+
## 2.8.0 (2025-08-28)
20+
21+
### Feat
22+
23+
- adding get_siblings_elements method
24+
- adding get_children_elements method
25+
- refactor Tab class to support optional WebSocket address handling
26+
- add WebSocket connection support for existing browser instances
27+
- add optional WebSocket address support in connection handler
28+
29+
### Fix
30+
31+
- add a raise_exc option to the get siblings and get children methods
32+
- improve children and parent retrieval docstrings and create a private generic method for them
33+
- using new execute_script public method
34+
- solving conflicts
35+
- rename pages fixtures files and adding a error test
36+
37+
### Refactor
38+
39+
- refactor Tab class to improve initialization and error handling
40+
- refactor Browser class to manage opened tabs and WebSocket setup
41+
- add new exception classes for connection and WebSocket errors
42+
143
## 2.7.0 (2025-08-22)
244

345
### Feat

LICENSE

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
The MIT License (MIT)
22

3-
Copyright © 2025 <copyright holders>
3+
Copyright © 2025 AutoscrapeLabs
44

55
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
66

README.md

Lines changed: 1 addition & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
<p align="center">
22
<img src="https://github.com/user-attachments/assets/219f2dbc-37ed-4aea-a289-ba39cdbb335d" alt="Pydoll Logo" /> <br>
33
</p>
4-
<h1 align="center">Pydoll: Automate the Web, Naturally</h1>
4+
<h1 align="center">Pydoll: scraping, the easier way</h1>
55

66
<p align="center">
77
<a href="https://github.com/autoscrape-labs/pydoll/stargazers"><img src="https://img.shields.io/github/stars/autoscrape-labs/pydoll?style=social"></a>
@@ -45,43 +45,6 @@ We believe that powerful automation shouldn't require you to become an expert in
4545
- **Humanized Interactions**: Mimic real user behavior
4646
- **Simplicity**: With Pydoll, you install and you're ready to automate.
4747

48-
## What's New
49-
50-
### WebElement: state waiting and new public APIs
51-
52-
- New `wait_until(...)` on `WebElement` to await element states with minimal code:
53-
54-
```python
55-
# Wait until it becomes visible OR the timeout expires
56-
await element.wait_until(is_visible=True, timeout=5)
57-
58-
# Wait until it becomes interactable (visible, on top, receiving pointer events)
59-
await element.wait_until(is_interactable=True, timeout=10)
60-
```
61-
62-
- Methods now public on `WebElement`:
63-
- `is_visible()`
64-
- Checks that the element has a visible area (> 0), isn’t hidden by CSS and is in the viewport (after `scroll_into_view()` when needed). Useful pre-check before interactions.
65-
- `is_interactable()`
66-
- “Click-ready” state: combines visibility, enabledness and pointer-event hit testing. Ideal for robust flows that avoid lost clicks.
67-
- `is_on_top()`
68-
- Verifies the element is the top hit-test target at the intended click point, avoiding overlays.
69-
- `execute_script(script: str, return_by_value: bool = False)`
70-
- Executes JavaScript in the element’s own context (where `this` is the element). Great for fine-tuning and quick inspections.
71-
72-
```python
73-
# Visually outline the element via JS
74-
await element.execute_script("this.style.outline='2px solid #22d3ee'")
75-
76-
# Confirm states
77-
visible = await element.is_visible()
78-
interactable = await element.is_interactable()
79-
on_top = await element.is_on_top()
80-
```
81-
82-
These additions simplify waiting and state validation before clicking/typing, reducing flakiness and making automations more predictable.
83-
84-
8548
## 📦 Installation
8649

8750
```bash
@@ -208,7 +171,6 @@ Pydoll offers a series of advanced features to please even the most
208171
demanding users.
209172

210173

211-
212174
### Advanced Element Search
213175

214176
We have several ways to find elements on the page. No matter how you prefer, we have a way that makes sense for you:

README_zh.md

Lines changed: 40 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -49,6 +49,45 @@ Pydoll takes a brand-new design approach, built from the ground up, and talks directly to Chrome DevTools Pr
4949

5050
## What's New
5151

52+
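### Remote connections via WebSocket: control your browser from anywhere!
53+
54+
You can now connect directly to an already-running instance using the browser's WebSocket address and immediately use the full Pydoll API: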
55+
56+
```python
57+
from pydoll.browser.chromium import Chrome
58+
59+
chrome = Chrome()
60+
tab = await chrome.connect('ws://YOUR_HOST:9222/devtools/browser/XXXX')
61+
62+
# Get straight to work: navigation, element automation, requests, events…
63+
await tab.go_to('https://example.com')
64+
title = await tab.execute_script('return document.title')
65+
print(title)
66+
```
67+
68+
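This makes it easy to hook into remote/CI browsers, containers, or shared debugging targets. No local launch is needed: just point at the WS endpoint and automate.
69+
70+
### Navigate the DOM like a pro: get_children_elements() and get_siblings_elements()
71+
72+
Two small helpers that make traversing complex layouts more elegant: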
73+
74+
```python
75+
# Get the direct children of a container
76+
container = await tab.find(id='cards')
77+
cards = await container.get_children_elements(max_depth=1)
78+
79+
# Want to go deeper? This returns the children of children (and so on)
80+
elements = await container.get_children_elements(max_depth=2)
81+
82+
# Painlessly traverse sibling elements in a horizontal list
83+
active = await tab.find(class_name='item--active')
84+
siblings = await active.get_siblings_elements()
85+
86+
print(len(cards), len(siblings))
87+
```
88+
89+
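Express more intent with less boilerplate. This is especially handy for dynamic grids, lists, and menus, and keeps scraping/automation logic clearer and more readable.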
90+
5291
### WebElement: state waiting and new public APIs
5392

5493
- New `wait_until(...)` to await element states with minimal code:
@@ -212,7 +251,7 @@ options.browser_preferences = {
212251

213252
This level of control was previously available only to Chrome extension developers. Now it is in your automation toolkit!
214253

215-
See the [documentation](https://autoscrape-labs.github.io/pydoll/features/custom-browser-preferences/) for more details.
254+
See the [documentation](https://pydoll.tech/docs/zh/features/#custom-browser-preferences/) for more details.
216255

217256
### New `get_parent_element()` method
218257
Retrieve the parent of any WebElement, making it easier to navigate the DOM structure:

cz.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,4 @@
22
commitizen:
33
name: cz_conventional_commits
44
tag_format: $version
5-
version: 2.7.0
5+
version: 2.8.2

public/docs/api/commands/target.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -100,6 +100,9 @@ incognito_tab = await create_target(
100100
)
101101
```
102102

103+
!!! info "Headless vs Headed: how contexts show up"
104+
Browser contexts are isolated logical environments. In headed mode, the first page created inside a new context will usually open in a new OS window. In headless mode, no window is shown — the isolation remains purely logical (cookies, storage, cache and auth state are still separate per context). Prefer contexts in headless/CI pipelines for performance and clean isolation.
105+
103106
## Advanced Features
104107

105108
### Target Events

public/docs/deep-dive/browser-domain.md

Lines changed: 79 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -327,6 +327,85 @@ Browser contexts are essential for several automation scenarios:
327327
4. **Session Isolation**: Prevent cross-contamination between test scenarios
328328
5. **Parallel Scraping**: Scrape multiple sites with different configurations
329329

330+
### Headless vs Headed: Windows and Best Practices
331+
332+
Browser contexts are a logical isolation layer. What you actually see is the page created inside a context:
333+
334+
- In headed mode (visible UI), creating the first page inside a new browser context will typically open a new OS window. The context is the isolated environment; the page is what renders in a tab or window.
335+
- In headless mode (no visible UI), no windows appear. The isolation still exists logically in the background, keeping cookies, storage, cache and auth state fully separate per context.
336+
337+
Recommendations:
338+
339+
- Prefer using multiple contexts in headless environments (e.g., CI/CD) for cleaner isolation, faster startup, and lower resource usage compared to launching multiple browser processes.
340+
- Use contexts to simulate multiple users or sessions in parallel without cross-contamination.
341+
342+
Why contexts are efficient:
343+
344+
- Creating a new browser context is significantly faster and lighter than starting a whole new browser instance. This makes test suites and scraping jobs more reliable and scalable.
345+
346+
### CDP Hierarchy and Context Window Semantics (Advanced)
347+
348+
To reason precisely about contexts, it's useful to map Pydoll concepts to CDP:
349+
350+
- Browser (process): single Chromium process running the DevTools endpoint.
351+
- BrowserContext: isolated profile inside that process (cookies, storage, cache, permissions).
352+
- Target/Page: an individual top-level page, popup, or background target that you control.
353+
354+
CDP and `browserContextId`:
355+
356+
- When creating a page via `Target.createTarget`, passing `browserContextId` tells the browser which isolated profile the new page should belong to. Without this ID, the target is created in the default context.
357+
- The ID is essential for isolation — it binds the new target to the correct storage/auth/permission boundary.
358+
359+
Why the first page in a context opens a window (headed):
360+
361+
- In headed mode, a page needs a top-level native window to render. A freshly created context initially has no window associated with it — it exists only in memory.
362+
- The first page created in that context implicitly materializes a window for that context. Subsequent pages can open as tabs within that window.
363+
364+
Implications for `new_window`/`newWindow` semantics:
365+
366+
- If you attempt to create a page with "tab-like" behavior (no new top-level window) in a context that has no existing window (first page), the browser may error because there is no host window to attach the tab to.
367+
- Practically: treat the first page in a new context (headed) as requiring a top-level window. Afterwards, you can create additional pages as tabs.
368+
369+
Headless mode makes this distinction moot:
370+
371+
- With no visible UI, windows vs tabs are logical constructs only. Context isolation is enforced the same way, but nothing is rendered, so there is no requirement to bootstrap a native window for the first page.
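
To make the `browserContextId` and `newWindow` discussion concrete, here is a minimal sketch of the raw CDP message a client might send for `Target.createTarget`. The helper name `build_create_target` and the `command_id` argument are hypothetical and are not part of Pydoll; only the CDP method name and the `url`, `browserContextId` and `newWindow` parameters come from the protocol itself.

```python
from __future__ import annotations

from typing import Any, Optional


def build_create_target(
    url: str,
    browser_context_id: Optional[str] = None,
    new_window: bool = False,
    command_id: int = 1,
) -> dict[str, Any]:
    """Build a Target.createTarget message (hypothetical helper, not Pydoll API)."""
    params: dict[str, Any] = {'url': url}
    if browser_context_id is not None:
        # Binds the new target to an isolated context; when omitted,
        # the target is created in the default context.
        params['browserContextId'] = browser_context_id
    if new_window:
        # Ask for a top-level window instead of a tab.
        params['newWindow'] = True
    return {'id': command_id, 'method': 'Target.createTarget', 'params': params}


# First page of a fresh context in headed mode: request a top-level window.
first_page = build_create_target(
    'https://example.com', browser_context_id='CTX-1', new_window=True
)
# Later pages in the same context can be created as tabs.
next_page = build_create_target(
    'https://example.com/next', browser_context_id='CTX-1', command_id=2
)
```

The context ID `'CTX-1'` is a placeholder; in practice it would be whatever `Target.createBrowserContext` returned.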
372+
373+
### Context-specific Proxy: sanitize + auth via Fetch events
374+
375+
When creating a browser context with a private proxy (credentials embedded in the URL), Pydoll follows a two-step strategy to avoid leaking credentials and reliably authenticate:
376+
377+
1) Sanitize the proxy server in the CDP command
378+
379+
- If you pass `proxy_server='http://user:pass@host:port'`, only the credential-free URL is sent to CDP (`http://host:port`).
380+
- Internally, Pydoll extracts and stores the credentials keyed by `browserContextId`.
381+
382+
2) Attach per-context auth handlers on first tab
383+
384+
- When you open a `Tab` inside that context, Pydoll enables Fetch events for that tab and registers two temporary listeners:
385+
- `Fetch.requestPaused`: continues normal requests.
386+
- `Fetch.authRequired`: automatically responds with the stored `user`/`pass`, then disables Fetch to avoid intercepting further requests.
387+
388+
Why this design?
389+
390+
- Prevents credential exposure in command logs and CDP parameters.
391+
- Keeps the auth scope strictly limited to the context that requested the proxy.
392+
- Works in both headed and headless modes (the auth flow is network-level, not UI-dependent).
393+
394+
Code flow highlights (simplified):
395+
396+
```python
397+
# On context creation
398+
context_id = await browser.create_browser_context(proxy_server='http://user:pwd@host:port')
399+
# => sends Target.createBrowserContext with 'http://host:port'
400+
# => stores {context_id: ('user', 'pwd')} internally
401+
402+
# On first tab in that context
403+
tab = await browser.new_tab(browser_context_id=context_id)
404+
# => tab.enable_fetch_events(handle_auth=True)
405+
# => tab.on('Fetch.requestPaused', continue_request)
406+
# => tab.on('Fetch.authRequired', continue_with_auth(user, pwd))
407+
```
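
The sanitize step described above can be sketched with the standard library alone. This illustrates the idea and is not Pydoll's actual implementation: the function name `sanitize_proxy`, the `credentials_by_context` dictionary, and the choice to assume `http` when no scheme is given are all assumptions.

```python
from __future__ import annotations

from typing import Optional
from urllib.parse import unquote, urlsplit

# Hypothetical per-context credential store, keyed by browserContextId.
credentials_by_context: dict[str, tuple[str, str]] = {}


def sanitize_proxy(proxy_server: str) -> tuple[str, Optional[tuple[str, str]]]:
    """Split a proxy URL into a credential-free URL and (user, password)."""
    if '://' not in proxy_server:
        # Assumption: default to http when the scheme is omitted.
        proxy_server = 'http://' + proxy_server
    parts = urlsplit(proxy_server)
    # Everything before the last '@' in the netloc is the userinfo part.
    userinfo, _, host_port = parts.netloc.rpartition('@')
    sanitized = f'{parts.scheme}://{host_port}'
    if not userinfo:
        return sanitized, None
    user, _, password = userinfo.partition(':')
    return sanitized, (unquote(user), unquote(password))


# Only the sanitized URL would be sent to CDP; credentials stay local.
clean_url, creds = sanitize_proxy('http://user:pwd@host:8080')
if creds is not None:
    credentials_by_context['CTX-1'] = creds  # 'CTX-1' is a placeholder ID
```

Keeping the credentials in a local mapping means they never appear in the `Target.createBrowserContext` parameters, which matches the motivation given for this design.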
408+
330409
### Creating and Managing Contexts
331410

332411
```python

public/docs/deep-dive/tab-domain.md

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -368,6 +368,21 @@ These visual capture capabilities are invaluable for:
368368
- Debugging automation scripts
369369
- Archiving page content
370370

371+
!!! warning "Top-level targets vs iFrames for Tab screenshots"
372+
`Tab.take_screenshot()` relies on CDP's `Page.captureScreenshot`, which only works for top-level targets. If you obtained a `Tab` for an iframe using `await tab.get_frame(iframe_element)`, calling `take_screenshot()` on that iframe tab will raise `TopLevelTargetRequired`.
373+
374+
Use `WebElement.take_screenshot()` inside iframes. It captures via the viewport and works within the iframe context.
375+
376+
```python
377+
# Wrong: iframe Tab screenshot (raises TopLevelTargetRequired)
378+
iframe_tab = await tab.get_frame(iframe_element)
379+
await iframe_tab.take_screenshot(as_base64=True) # will raise an exception
380+
381+
# Correct: element screenshot inside iframe (uses viewport)
382+
element = await iframe_tab.find(id='captcha')
383+
await element.take_screenshot('captcha.png') # will work!
384+
```
385+
371386
## Event System Overview
372387

373388
The Tab domain provides a comprehensive event system for monitoring and reacting to browser events:
