Commit ba4d384

feat: implement _snapshot_client for Snapshotter (#957)
### Description

- implement `_snapshot_client` for `Snapshotter`

### Issues

- Closes: #60
1 parent e05be11 commit ba4d384

2 files changed: 8 additions and 4 deletions


src/crawlee/_autoscaling/snapshotter.py

Lines changed: 4 additions & 4 deletions
@@ -305,11 +305,11 @@ def _snapshot_client(self) -> None:
         Only errors produced by a 2nd retry of the API call are considered for snapshotting since earlier errors may
         just be caused by a random spike in the number of requests and do not necessarily signify API overloading.
         """
-        # TODO: This is just a dummy placeholder. It can be implemented once `StorageClient` is ready.
-        # Attribute `self._client_rate_limit_error_retry_count` will be used here.
-        # https://github.com/apify/crawlee-python/issues/60
+        client = service_locator.get_storage_client()
 
-        error_count = 0
+        rate_limit_errors: dict[int, int] = client.get_rate_limit_errors()
+
+        error_count = rate_limit_errors.get(self._CLIENT_RATE_LIMIT_ERROR_RETRY_COUNT, 0)
         snapshot = ClientSnapshot(error_count=error_count, max_error_count=self._max_client_errors)
 
         snapshots = cast(list[Snapshot], self._client_snapshots)
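The lookup introduced in this hunk can be illustrated in isolation. The sketch below is not the crawlee implementation: it assumes the keys of the `dict[int, int]` are retry attempt numbers and the values are error counts, and it assumes the retry constant is `2` (inferred from the "2nd retry" wording in the docstring). The function name `count_errors_for_snapshot` is hypothetical.

```python
# Assumed value: the docstring says only errors from the 2nd retry are counted.
CLIENT_RATE_LIMIT_ERROR_RETRY_COUNT = 2


def count_errors_for_snapshot(
    rate_limit_errors: dict[int, int],
    retry_count: int = CLIENT_RATE_LIMIT_ERROR_RETRY_COUNT,
) -> int:
    """Return the error count recorded for one retry attempt.

    Assumption: keys are retry attempt numbers, values are error counts.
    A missing key means no errors were recorded for that attempt, so 0 is used.
    """
    return rate_limit_errors.get(retry_count, 0)


if __name__ == "__main__":
    # Errors on the 1st retry are ignored; only the 2nd retry is counted.
    print(count_errors_for_snapshot({1: 7, 2: 3}))
    # The base storage client returns an empty dict, which yields zero errors.
    print(count_errors_for_snapshot({}))
```

Using `dict.get` with a default of `0` means the base client's empty dictionary keeps the previous behaviour, where `error_count` was always `0`.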

src/crawlee/storage_clients/_base/_base_storage_client.py

Lines changed: 4 additions & 0 deletions
@@ -56,3 +56,7 @@ async def purge_on_start(self) -> None:
         It is primarily used to clean up residual data from previous runs to maintain a clean state.
         If the storage client does not support purging, leave it empty.
         """
+
+    def get_rate_limit_errors(self) -> dict[int, int]:
+        """Returns statistics about rate limit errors encountered by the HTTP client in storage client."""
+        return {}
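The base method returns an empty dictionary, so a concrete storage client would presumably override it to report real statistics. The following is a minimal sketch of such an override, not code from this commit: `BaseStorageClientSketch`, `CountingStorageClient`, and `record_rate_limit_error` are hypothetical names, and treating the keys as retry attempt numbers is an assumption.

```python
from collections import Counter


class BaseStorageClientSketch:
    """Hypothetical stand-in for the base class; the default reports no errors."""

    def get_rate_limit_errors(self) -> dict[int, int]:
        return {}


class CountingStorageClient(BaseStorageClientSketch):
    """Hypothetical subclass that tallies rate limit errors per retry attempt."""

    def __init__(self) -> None:
        # Assumption: key = retry attempt number, value = number of errors seen.
        self._rate_limit_errors: Counter = Counter()

    def record_rate_limit_error(self, retry_attempt: int) -> None:
        """Hypothetical hook an HTTP layer could call on a rate limit response."""
        self._rate_limit_errors[retry_attempt] += 1

    def get_rate_limit_errors(self) -> dict[int, int]:
        # Return a plain copy so callers cannot mutate the internal tally.
        return dict(self._rate_limit_errors)


if __name__ == "__main__":
    client = CountingStorageClient()
    client.record_rate_limit_error(1)
    client.record_rate_limit_error(2)
    client.record_rate_limit_error(2)
    print(client.get_rate_limit_errors())
```

Returning a copy rather than the internal counter keeps the snapshotter from accidentally altering the client's statistics.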
