Commit 84575c4

RQ FS client improvements
1 parent 59aa5d1 commit 84575c4

5 files changed: +293 -340 lines changed

Makefile

Lines changed: 3 additions & 3 deletions
@@ -30,13 +30,13 @@ type-check:
 	uv run mypy
 
 unit-tests:
-	uv run pytest --numprocesses=auto --verbose --cov=src/crawlee tests/unit
+	uv run pytest --numprocesses=auto -vv --cov=src/crawlee tests/unit
 
 unit-tests-cov:
-	uv run pytest --numprocesses=auto --verbose --cov=src/crawlee --cov-report=html tests/unit
+	uv run pytest --numprocesses=auto -vv --cov=src/crawlee --cov-report=html tests/unit
 
 e2e-templates-tests $(args):
-	uv run pytest --numprocesses=$(E2E_TESTS_CONCURRENCY) --verbose tests/e2e/project_template "$(args)"
+	uv run pytest --numprocesses=$(E2E_TESTS_CONCURRENCY) -vv tests/e2e/project_template "$(args)"
 
 format:
 	uv run ruff check --fix

src/crawlee/_request.py

Lines changed: 22 additions & 17 deletions
@@ -158,7 +158,23 @@ class Request(BaseModel):
     ```
     """
 
-    model_config = ConfigDict(populate_by_name=True)
+    model_config = ConfigDict(populate_by_name=True, extra='allow')
+
+    id: str
+    """A unique identifier for the request. Note that this is not used for deduplication, and should not be confused
+    with `unique_key`."""
+
+    unique_key: Annotated[str, Field(alias='uniqueKey')]
+    """A unique key identifying the request. Two requests with the same `unique_key` are considered as pointing
+    to the same URL.
+
+    If `unique_key` is not provided, then it is automatically generated by normalizing the URL.
+    For example, the URL of `HTTP://www.EXAMPLE.com/something/` will produce the `unique_key`
+    of `http://www.example.com/something`.
+
+    Pass an arbitrary non-empty text value to the `unique_key` property to override the default behavior
+    and specify which URLs shall be considered equal.
+    """
 
     url: Annotated[str, BeforeValidator(validate_http_url), Field()]
     """The URL of the web page to crawl. Must be a valid HTTP or HTTPS URL, and may include query parameters
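The `unique_key` docstring in this hunk describes a default derived by normalizing the URL, with an explicit value taking precedence. The following is a minimal sketch of that behavior using only the standard library. The names `normalize_url_sketch` and `compute_unique_key_sketch` are hypothetical and are not crawlee's API; the real normalization may apply further rules that this diff does not show.

```python
from __future__ import annotations

from urllib.parse import urlsplit, urlunsplit


def normalize_url_sketch(url: str) -> str:
    """Hypothetical normalizer: lowercase the scheme and host, drop a trailing slash."""
    parts = urlsplit(url)
    return urlunsplit(
        (
            parts.scheme.lower(),
            # Simplification: lowercases the whole netloc, including any userinfo.
            parts.netloc.lower(),
            parts.path.rstrip('/'),
            parts.query,
            parts.fragment,
        )
    )


def compute_unique_key_sketch(url: str, unique_key: str | None = None) -> str:
    """Return the caller-supplied key if it is non-empty, else the normalized URL."""
    if unique_key:
        return unique_key
    return normalize_url_sketch(url)


default_key = compute_unique_key_sketch('HTTP://www.EXAMPLE.com/something/')
custom_key = compute_unique_key_sketch('https://example.com/a', unique_key='my-key')
```

Under these assumptions, `default_key` matches the example given in the docstring, while `custom_key` shows how a caller-supplied value decides which URLs count as equal.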
@@ -207,22 +223,6 @@ class Request(BaseModel):
     handled_at: Annotated[datetime | None, Field(alias='handledAt')] = None
     """Timestamp when the request was handled."""
 
-    unique_key: Annotated[str, Field(alias='uniqueKey')]
-    """A unique key identifying the request. Two requests with the same `unique_key` are considered as pointing
-    to the same URL.
-
-    If `unique_key` is not provided, then it is automatically generated by normalizing the URL.
-    For example, the URL of `HTTP://www.EXAMPLE.com/something/` will produce the `unique_key`
-    of `http://www.example.com/something`.
-
-    Pass an arbitrary non-empty text value to the `unique_key` property
-    to override the default behavior and specify which URLs shall be considered equal.
-    """
-
-    id: str
-    """A unique identifier for the request. Note that this is not used for deduplication, and should not be confused
-    with `unique_key`."""
-
     @classmethod
     def from_url(
         cls,
@@ -398,6 +398,11 @@ def forefront(self) -> bool:
     def forefront(self, new_value: bool) -> None:
         self.crawlee_data.forefront = new_value
 
+    @property
+    def was_already_handled(self) -> bool:
+        """Indicates whether the request was handled."""
+        return self.handled_at is not None
+
 
 class RequestWithLock(Request):
     """A crawling request with information about locks."""
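The new `was_already_handled` property derives its value from `handled_at` and does not store a separate flag. A minimal stand-alone sketch of that pattern follows. `RequestSketch` is a hypothetical dataclass used only for illustration; it is not the pydantic `Request` model from this commit and carries only the two fields needed here.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class RequestSketch:
    """Hypothetical stand-in for `Request` with only the fields needed here."""

    url: str
    handled_at: datetime | None = None

    @property
    def was_already_handled(self) -> bool:
        # Same check as the property added in the diff: handled means a timestamp exists.
        return self.handled_at is not None


pending = RequestSketch(url='https://crawlee.dev')
done = RequestSketch(url='https://crawlee.dev', handled_at=datetime.now(timezone.utc))
```

Because the property is computed on access, setting `handled_at` on `pending` later would flip `was_already_handled` to `True` without any extra bookkeeping.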
