Web scraping is a technique for extracting data from websites.
+Understanding Web Scraping
+ + +
+
+
+
+ R$ 1.234,56
+ Original source
+ diff --git a/README.md b/README.md index f29538b9..5ea3142e 100644 --- a/README.md +++ b/README.md @@ -22,9 +22,9 @@ Support
-Pydoll automates Chromium-based browsers (Chrome, Edge) by connecting directly to the Chrome DevTools Protocol over WebSocket. No WebDriver binary, no `navigator.webdriver` flag, no compatibility issues. +Pydoll automates Chromium-based browsers (Chrome, Edge) by connecting directly to the Chrome DevTools Protocol over WebSocket. **No WebDriver binary, no `navigator.webdriver` flag, no compatibility issues.** -It combines a high-level API for common tasks with low-level CDP access for fine-grained control over network, fingerprinting, and browser behavior. The entire codebase is async-native and fully type-checked with mypy. +It combines a high-level API for stealthy automation with low-level CDP access for fine-grained control over network, fingerprinting, and browser behavior. And with its new **Pydantic-powered extraction engine**, it maps the DOM directly to structured Python objects, delivering an unmatched Developer Experience (DX). ### Top Sponsors @@ -48,11 +48,11 @@ It combines a high-level API for common tasks with low-level CDP access for fine ### Why Pydoll -- **Stealth-first**: Human-like mouse movement, realistic typing, and granular [browser preference](https://pydoll.tech/docs/features/configuration/browser-preferences/) control for fingerprint management. +- **Structured extraction**: Define a [Pydantic](https://docs.pydantic.dev/) model, call `tab.extract()`, get typed and validated data back. No manual element-by-element querying. - **Async and typed**: Built on `asyncio` from the ground up, 100% type-checked with `mypy`. Full IDE autocompletion and static error checking. +- **Stealth built in**: Human-like mouse movement, realistic typing, and granular [browser preference](https://pydoll.tech/docs/features/configuration/browser-preferences/) control for fingerprint management. - **Network control**: [Intercept](https://pydoll.tech/docs/features/network/interception/) requests to block ads/trackers, [monitor](https://pydoll.tech/docs/features/network/monitoring/) traffic for API discovery, and make [authenticated HTTP requests](https://pydoll.tech/docs/features/network/http-requests/) that inherit the browser session. - **Shadow DOM and iframes**: Full support for [shadow roots](https://pydoll.tech/docs/deep-dive/architecture/shadow-dom/) (including closed) and cross-origin iframes. Discover, query, and interact with elements inside them using the same API. -- **Ergonomic API**: `tab.find()` for most cases, `tab.query()` for complex [CSS/XPath selectors](https://pydoll.tech/docs/deep-dive/guides/selectors-guide/). ## Installation @@ -62,55 +62,124 @@ pip install pydoll-python No WebDriver binaries or external dependencies required. -## What's New +## Getting Started -Web scraping is a technique for extracting data from websites.
+
+ R$ 1.234,56
+ Original source
+ The only way to do great work is to love what you do.
+ + 2005 +Innovation distinguishes between a leader and a follower.
+ + 2001 +Stay hungry, stay foolish.
+ + 2005 +