1.0.0
1.0.0 (2025-09-29)
- Check out the Release blog post for more details.
- Check out the Upgrading guide to ensure a smooth update.
🚀 Features
- Add utility for loading and parsing Sitemap and `SitemapRequestLoader` (#1169) (66599f8) by @Mantisus
- Add periodic status logging and `status_message_callback` parameter for customization (#1265) (b992fb2) by @Mantisus
- Add crawlee-cli option to skip project installation (#1294) (4d5aef0) by @Pijukatel
- Improve `Crawlee` CLI help text (#1297) (afbe10f) by @Pijukatel
- Add basic `OpenTelemetry` instrumentation (#1255) (a92d8b3) by @Pijukatel
- Add `ImpitHttpClient` HTTP client using the `impit` library (#1151) (0d0d268) by @Mantisus
- Prevent overloading system memory when running locally (#1270) (30de3bd) by @janbuchar
- Expose `PlaywrightPersistentBrowser` class (#1314) (b5fa955) by @Mantisus
- Add `impit` option for Crawlee CLI (#1312) (508d7ce) by @Mantisus
- Persist RequestList state (#1274) (cc68014) by @janbuchar
- Persist `DefaultRenderingTypePredictor` state (#1340) (fad4c25) by @Mantisus
- Persist the `SitemapRequestLoader` state (#1347) (27ef9ad) by @Mantisus
- Add support for NDU storages (#1401) (5dbd212) by @vdusek
- Add RQ id, name, alias args to `add_requests` and `enqueue_links` methods (#1413) (1cae2bc) by @Mantisus
- Add `SqlStorageClient` based on `sqlalchemy` v2+ (#1339) (07c75a0) by @Mantisus
🐛 Bug Fixes
- Fix memory estimation not working on macOS (#1330) (ab020eb) by @Pijukatel
- Fix retry count to not count the original request (#1328) (74fa1d9) by @Pijukatel
- [breaking] Remove unused "stats" field from RequestQueueMetadata (#1331) (0a63bef) by @vdusek
- Ignore unknown parameters passed in cookies (#1336) (50d3ef7) by @Mantisus
- Fix `timeout` for `stream` method in `ImpitHttpClient` (#1352) (54b693b) by @Mantisus
- Include reason in the session rotation warning logs (#1363) (d6d7a45) by @vdusek
- Improve crawler statistics logging (#1364) (1eb6da5) by @vdusek
- Do not add a request that is already in progress to `MemoryRequestQueueClient` (#1384) (3af326c) by @Mantisus
- Save `RequestQueueState` for `FileSystemRequestQueueClient` in default KVS (#1411) (6ee60a0) by @Mantisus
- Set default desired concurrency for non-browser crawlers to 10 (#1419) (1cc9401) by @vdusek
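
The retry count fix (#1328) changes how attempts are tallied, which a small standalone sketch may make easier to see. This is not Crawlee's implementation: `run_with_retries`, `always_fails`, and `max_retries` are hypothetical names, and the sketch assumes that a limit of N retries means one original attempt plus up to N retries.

```python
from typing import Callable


def run_with_retries(handler: Callable[[], None], max_retries: int) -> tuple[bool, int]:
    """Call handler until it succeeds or the retry budget is used up.

    Hypothetical helper: returns (succeeded, total_attempts). The original
    attempt is not counted as a retry, so at most max_retries + 1 calls happen.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            handler()
            return True, attempts
        except Exception:
            retries_used = attempts - 1  # the first call is not a retry
            if retries_used >= max_retries:
                return False, attempts


def always_fails() -> None:
    """Hypothetical handler that never succeeds."""
    raise RuntimeError('request failed')


# With max_retries=3 the handler runs four times: 1 original + 3 retries.
succeeded, total_attempts = run_with_retries(always_fails, max_retries=3)
```

Under this assumption, a handler that succeeds on the first call reports a single attempt and consumes no retries.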
Refactor
- [breaking] Introduce new storage client system (#1194) (de1c03f) by @vdusek
- [breaking] Split `BrowserType` literal into two different literals based on context (#1070) (72b5698) by @Pijukatel
- [breaking] Change method `HttpResponse.read` from sync to async (#1296) (83fa8a4) by @Mantisus
- [breaking] Replace `HttpxHttpClient` with `ImpitHttpClient` as default HTTP client (#1307) (c803a97) by @Mantisus
- [breaking] Change Dataset unwind parameter to accept list of strings (#1357) (862a203) by @vdusek
- [breaking] Remove `Request.id` field (#1366) (32f3580) by @Pijukatel
- [breaking] Refactor storage creation and caching, configuration and services (#1386) (04649bd) by @Pijukatel
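
For the `HttpResponse.read` change from sync to async (#1296), the practical effect on calling code is that the result must now be awaited. The sketch below illustrates this with a stand-in class. `FakeResponse` and `fetch_body` are hypothetical names used only for illustration, not part of the library; only the async `read` method mirrors the change described above.

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for a response object with an async read()."""

    def __init__(self, body: bytes) -> None:
        self._body = body

    async def read(self) -> bytes:
        # An async read lets the body be fetched without blocking the event loop.
        return self._body


async def fetch_body(response: FakeResponse) -> bytes:
    # With a sync method a caller would write `response.read()`;
    # once the method is async, the call has to be awaited.
    return await response.read()


body = asyncio.run(fetch_body(FakeResponse(b'<html></html>')))
```

Calling `response.read()` without `await` would yield a coroutine object instead of bytes, which is the most likely symptom to look for when upgrading.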