feat(document_loaders): add flexible timeout to PlaywrightURLLoader (#104)
## Description
This PR enhances the `PlaywrightURLLoader` by adding configurable
timeout and page load strategy options, making it more flexible for
handling dynamic web pages. This addresses issue #103.
### Changes
- Added `timeout` parameter (default: 30000ms) to control page
navigation timeout
- Added `wait_until` parameter to control when navigation is considered
complete
- Supported `wait_until` options:
  - `"load"` (default): wait for the "load" event
  - `"domcontentloaded"`: wait for the "DOMContentLoaded" event
  - `"networkidle"`: wait until there are no network connections for at
    least 500ms
  - `"commit"`: wait until the network response is received and the
    document starts loading
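
As a rough illustration of how these two options might be validated before being forwarded to Playwright's `page.goto(url, timeout=..., wait_until=...)`, here is a minimal sketch. The helper name `build_goto_kwargs` and the `VALID_WAIT_UNTIL` constant are hypothetical and are not taken from this PR's implementation.

```python
from typing import Any, Dict

# The four navigation strategies listed above.
VALID_WAIT_UNTIL = ("load", "domcontentloaded", "networkidle", "commit")


def build_goto_kwargs(timeout: float = 30000, wait_until: str = "load") -> Dict[str, Any]:
    """Hypothetical helper: validate the options and return kwargs for page.goto().

    `timeout` is in milliseconds; Playwright treats 0 as "no timeout".
    """
    if timeout < 0:
        raise ValueError(f"timeout must be non-negative, got {timeout}")
    if wait_until not in VALID_WAIT_UNTIL:
        raise ValueError(
            f"wait_until must be one of {VALID_WAIT_UNTIL}, got {wait_until!r}"
        )
    return {"timeout": timeout, "wait_until": wait_until}
```

Rejecting an unknown `wait_until` value up front gives a clearer error than letting the navigation call fail later for each URL.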
### Why
The current implementation has a hardcoded 30-second timeout, which can
be insufficient for heavy dynamic pages. This change allows users to:
- Set longer timeouts for complex pages
- Choose appropriate page load strategies based on their needs
- Better handle dynamic content loading
### Real-World Examples
This PR solves timeout issues with various types of websites:
1. Weather websites:
```python
from langchain_community.document_loaders import PlaywrightURLLoader

loader = PlaywrightURLLoader(
    urls=["https://weather.com/en-IN/weather/tenday/l/Chennai+Tamil+Nadu?canonicalCityId=251b7b4afedf19f747b425e048038eb1"],
    timeout=60000,  # 60-second timeout (value is in milliseconds)
    wait_until="domcontentloaded",
)
```
2. Dynamic news sites:
```python
loader = PlaywrightURLLoader(
    urls=["https://www.reuters.com/markets/"],
    timeout=45000,  # 45-second timeout
    wait_until="networkidle",
)
```
3. E-commerce sites:
```python
loader = PlaywrightURLLoader(
    urls=["https://www.amazon.com/dp/B08N5KWB9H"],
    timeout=90000,  # 90-second timeout for complex product pages
    wait_until="load",
)
```
### Testing
- Added new test cases for both sync and async methods
- Maintained backward compatibility
- All existing tests pass
- Tested with various real-world websites
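
For readers who want to see what such a check could look like, the sketch below uses a stubbed page object from `unittest.mock` to confirm that the options reach `page.goto` and that the defaults match the previous behavior. The function `fetch_html` is a hypothetical stand-in for the loader's per-URL navigation step, not the code added in this PR.

```python
from unittest.mock import MagicMock


def fetch_html(page, url, timeout=30000, wait_until="load"):
    """Hypothetical stand-in: navigate with the given options and return the HTML."""
    page.goto(url, timeout=timeout, wait_until=wait_until)
    return page.content()


def test_options_are_forwarded():
    # A MagicMock records calls, so no browser is needed.
    page = MagicMock()
    page.content.return_value = "<html></html>"
    html = fetch_html(
        page, "https://example.com", timeout=60000, wait_until="domcontentloaded"
    )
    page.goto.assert_called_once_with(
        "https://example.com", timeout=60000, wait_until="domcontentloaded"
    )
    assert html == "<html></html>"


def test_defaults_are_backward_compatible():
    # Omitting both options should keep the old 30-second "load" behavior.
    page = MagicMock()
    fetch_html(page, "https://example.com")
    page.goto.assert_called_once_with(
        "https://example.com", timeout=30000, wait_until="load"
    )
```

Both functions can be run with `pytest` or called directly.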
### Related Issues
Closes #103
---------
Co-authored-by: Parth Pathak <[email protected]>
Co-authored-by: Mason Daugherty <[email protected]>