
Commit d90ac9e

Merge branch 'master' into feat/storage-sign-content-url
2 parents eec0233 + 1cec1ea commit d90ac9e


41 files changed: +277 -208 lines changed

.github/workflows/check_pr_title.yaml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -9,6 +9,6 @@ jobs:
99
name: Check PR title
1010
runs-on: ubuntu-latest
1111
steps:
12-
- uses: amannn/action-semantic-pull-request@v5.5.3
12+
- uses: amannn/action-semantic-pull-request@v6.0.1
1313
env:
1414
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -8,10 +8,19 @@ All notable changes to this project will be documented in this file.
88
### 🚀 Features
99

1010
- Extend status parameter to an array of possible statuses ([#455](https://github.com/apify/apify-client-python/pull/455)) ([76f6769](https://github.com/apify/apify-client-python/commit/76f676973d067ce8af398d8e6ceea55595da5ecf)) by [@JanHranicky](https://github.com/JanHranicky)
11+
- Expose apify_client.errors module ([#468](https://github.com/apify/apify-client-python/pull/468)) ([c0cc147](https://github.com/apify/apify-client-python/commit/c0cc147fd0c5a60e5a025db6b6c761e811efe1da)) by [@Mantisus](https://github.com/Mantisus), closes [#158](https://github.com/apify/apify-client-python/issues/158)
12+
13+
### Chore
14+
15+
- [**breaking**] Bump minimum Python version to 3.10 ([#469](https://github.com/apify/apify-client-python/pull/469)) ([92b4789](https://github.com/apify/apify-client-python/commit/92b47895eb48635e2d573b99d59bb077999c5b27)) by [@vdusek](https://github.com/vdusek)
1116

1217
### Refactor
1318

1419
- [**breaking**] Remove support for passing a single string to the `unwind` parameter in `DatasetClient` ([#467](https://github.com/apify/apify-client-python/pull/467)) ([e8aea2c](https://github.com/apify/apify-client-python/commit/e8aea2c8f3833082bf78562f3fa981a1f8e88b26)) by [@Mantisus](https://github.com/Mantisus), closes [#255](https://github.com/apify/apify-client-python/issues/255)
20+
- [**breaking**] Remove deprecated constant re-exports from `consts.py` ([#466](https://github.com/apify/apify-client-python/pull/466)) ([7731f0b](https://github.com/apify/apify-client-python/commit/7731f0b3a4ca8c99be9392517d36f841cb293ed5)) by [@Mantisus](https://github.com/Mantisus), closes [#163](https://github.com/apify/apify-client-python/issues/163)
21+
- [**breaking**] Replace `httpx` HTTP client with `impit` ([#456](https://github.com/apify/apify-client-python/pull/456)) ([1df6792](https://github.com/apify/apify-client-python/commit/1df6792386398b28eb565dfbc58c7eba13f451a4)) by [@Mantisus](https://github.com/Mantisus)
22+
- [**breaking**] Remove deprecated `as_bytes` and `as_file` parameters from `KeyValueStoreClient.get_record` ([#463](https://github.com/apify/apify-client-python/pull/463)) ([b880231](https://github.com/apify/apify-client-python/commit/b88023125a41d02f95f687b8fd6090e7080efe3e)) by [@Mantisus](https://github.com/Mantisus)
23+
- [**breaking**] Remove `parse_response` arg from the `call` method ([#462](https://github.com/apify/apify-client-python/pull/462)) ([840d51a](https://github.com/apify/apify-client-python/commit/840d51af12a7e53decf9d3294d0e0c3c848e9c08)) by [@Mantisus](https://github.com/Mantisus), closes [#166](https://github.com/apify/apify-client-python/issues/166)
1524

1625

1726
<!-- git-cliff-unreleased-end -->

docs/04_upgrading/upgrading_to_v2.md

Lines changed: 31 additions & 6 deletions
@@ -3,16 +3,41 @@ id: upgrading-to-v2
33
title: Upgrading to v2
44
---
55

6-
This page summarizes the breaking changes between Apify Python API client v1.x and v2.0.
6+
This page summarizes the breaking changes between Apify Python API Client v1.x and v2.0.
77

88
## Python version support
99

10-
<!-- TODO -->
10+
Support for Python 3.9 has been dropped. The Apify Python API Client v2.x now requires Python 3.10 or later. Make sure your environment is running a compatible version before upgrading.
1111

12-
## Change underlying HTTP library
12+
## New underlying HTTP library
1313

14-
In v2.0, the Apify Python API client switched from using `httpx` to [`impit`](https://github.com/apify/impit) as the underlying HTTP library.
14+
In v2.0, the Apify Python API client switched from using [`httpx`](https://www.python-httpx.org/) to [`impit`](https://github.com/apify/impit) as the underlying HTTP library. However, this change shouldn't have much impact on the end user.
1515

16-
## Update signature of methods
16+
## API method changes
1717

18-
<!-- TODO -->
18+
Several public methods have changed their signatures or behavior.
19+
20+
### Removed parameters and attributes
21+
22+
- The `parse_response` parameter has been removed from the `HTTPClient.call()` method. This was an internal parameter that added a private attribute to the `Response` object.
23+
- The private `_maybe_parsed_body` attribute has been removed from the `Response` object.
24+
25+
### KeyValueStoreClient
26+
27+
- The deprecated parameters `as_bytes` and `as_file` have been removed from `KeyValueStoreClient.get_record()`. Use the dedicated methods `get_record_as_bytes()` and `stream_record()` instead.
28+
29+
### DatasetClient
30+
31+
- The `unwind` parameter no longer accepts a single string value. Use a list of strings instead: `unwind=['items']` rather than `unwind='items'`.
32+
33+
## Module reorganization
34+
35+
Some modules have been restructured.
36+
37+
### Constants
38+
39+
- Deprecated constant re-exports from `consts.py` have been removed. Constants should now be imported from the [apify-shared-python](https://github.com/apify/apify-shared-python) package if needed.
40+
41+
### Errors
42+
43+
- Error classes are now accessible from the public `apify_client.errors` module. See the [API documentation](https://docs.apify.com/api/client/python/reference/class/ApifyApiError) for a complete list of available error classes.
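
As a rough illustration of the `unwind` change listed under DatasetClient above, code that still passes a single string can be adapted with a small shim. The helper `normalize_unwind` below is hypothetical and is not part of `apify-client`; it only wraps a lone string in a list so that the result can be passed as `unwind=...` to a v2 client.

```python
from __future__ import annotations


def normalize_unwind(unwind: str | list[str] | None) -> list[str] | None:
    """Hypothetical helper: coerce a v1-style `unwind` value into the list form v2 expects."""
    if unwind is None:
        return None
    if isinstance(unwind, str):
        # v1 accepted a bare string; v2 expects a list of field names.
        return [unwind]
    # Copy the list so the caller's value is never mutated.
    return list(unwind)


print(normalize_unwind('items'))
print(normalize_unwind(['items', 'tags']))
```

A shim like this is only a stopgap for call sites that cannot be updated at once; writing `unwind=['items']` directly, as the guide suggests, is the simpler fix.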

pyproject.toml

Lines changed: 2 additions & 2 deletions
@@ -24,9 +24,9 @@ classifiers = [
2424
]
2525
keywords = ["apify", "api", "client", "automation", "crawling", "scraping"]
2626
dependencies = [
27-
"apify-shared>=1.5.0,<2.0.0",
27+
"apify-shared>=2.0.0,<3.0.0",
2828
"colorama>=0.4.0",
29-
"impit>=0.5.2",
29+
"impit>=0.5.3",
3030
"more_itertools>=10.0.0",
3131
]
3232

src/apify_client/_http_client.py

Lines changed: 1 addition & 3 deletions
@@ -11,7 +11,6 @@
1111
from urllib.parse import urlencode
1212

1313
import impit
14-
from apify_shared.utils import ignore_docs
1514

1615
from apify_client._logging import log_context, logger_name
1716
from apify_client._statistics import Statistics
@@ -21,7 +20,7 @@
2120
if TYPE_CHECKING:
2221
from collections.abc import Callable
2322

24-
from apify_shared.types import JSONSerializable
23+
from apify_client._types import JSONSerializable
2524

2625
DEFAULT_BACKOFF_EXPONENTIAL_FACTOR = 2
2726
DEFAULT_BACKOFF_RANDOM_FACTOR = 1
@@ -30,7 +29,6 @@
3029

3130

3231
class _BaseHTTPClient:
33-
@ignore_docs
3432
def __init__(
3533
self,
3634
*,

src/apify_client/_types.py

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
1+
from __future__ import annotations
2+
3+
from typing import Any, Generic, TypeVar
4+
5+
JSONSerializable = str | int | float | bool | None | dict[str, Any] | list[Any]
6+
"""Type for representing json-serializable values. It's close enough to the real thing supported
7+
by json.dumps, and the best we can do until mypy supports recursive types. It was suggested in
8+
a discussion with (and approved by) Guido van Rossum, so I'd consider it correct enough.
9+
"""
10+
11+
T = TypeVar('T')
12+
13+
14+
class ListPage(Generic[T]):
15+
"""A single page of items returned from a list() method."""
16+
17+
items: list[T]
18+
"""List of returned objects on this page."""
19+
20+
count: int
21+
"""Count of the returned objects on this page."""
22+
23+
offset: int
24+
"""The offset of the first object specified in the API call."""
25+
26+
limit: int
27+
"""The limit on the number of returned objects specified in the API call."""
28+
29+
total: int
30+
"""Total number of objects matching the API call criteria."""
31+
32+
desc: bool
33+
"""Whether the listing is descending or not."""
34+
35+
def __init__(self: ListPage, data: dict) -> None:
36+
"""Initialize a ListPage instance from the API response data."""
37+
self.items = data.get('items', [])
38+
self.offset = data.get('offset', 0)
39+
self.limit = data.get('limit', 0)
40+
self.count = data['count'] if 'count' in data else len(self.items)
41+
self.total = data.get('total', self.offset + self.count)
42+
self.desc = data.get('desc', False)
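
The fallback logic in `ListPage.__init__` is easier to follow with a concrete payload. The sketch below is a trimmed copy of the class from the diff above (class-level annotations and docstrings omitted, indentation reconstructed); the sample dictionary is made up for illustration and is not a real API response.

```python
from __future__ import annotations

from typing import Any, Generic, TypeVar

T = TypeVar('T')


class ListPage(Generic[T]):
    """Trimmed copy of the class from the diff, for illustration only."""

    def __init__(self, data: dict[str, Any]) -> None:
        self.items = data.get('items', [])
        self.offset = data.get('offset', 0)
        self.limit = data.get('limit', 0)
        # When the payload has no 'count', fall back to the number of items.
        self.count = data['count'] if 'count' in data else len(self.items)
        # When the payload has no 'total', assume this page is the last one.
        self.total = data.get('total', self.offset + self.count)
        self.desc = data.get('desc', False)


# Made-up partial payload: only 'items' and 'offset' are present.
page = ListPage({'items': ['a', 'b', 'c'], 'offset': 10})
print(page.count, page.total, page.limit, page.desc)  # 3 13 0 False
```

Note that a missing `limit` defaults to `0` here, so callers should not read that value as "no items requested".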

src/apify_client/_utils.py

Lines changed: 98 additions & 8 deletions
@@ -2,21 +2,20 @@
22

33
import asyncio
44
import base64
5+
import contextlib
6+
import io
7+
import json
58
import json as jsonlib
69
import random
10+
import re
711
import time
812
from collections.abc import Callable
13+
from datetime import datetime, timezone
14+
from enum import Enum
915
from http import HTTPStatus
1016
from typing import TYPE_CHECKING, Any, TypeVar, cast
1117

1218
import impit
13-
from apify_shared.utils import (
14-
is_content_type_json,
15-
is_content_type_text,
16-
is_content_type_xml,
17-
is_file_or_bytes,
18-
maybe_extract_enum_member_value,
19-
)
2019

2120
from apify_client.errors import InvalidResponseBodyError
2221

@@ -29,11 +28,102 @@
2928

3029
PARSE_DATE_FIELDS_MAX_DEPTH = 3
3130
PARSE_DATE_FIELDS_KEY_SUFFIX = 'At'
32-
3331
RECORD_NOT_FOUND_EXCEPTION_TYPES = ['record-not-found', 'record-or-token-not-found']
3432

3533
T = TypeVar('T')
3634
StopRetryingType = Callable[[], None]
35+
ListOrDict = TypeVar('ListOrDict', list, dict)
36+
37+
38+
def filter_out_none_values_recursively(dictionary: dict) -> dict:
39+
"""Return copy of the dictionary, recursively omitting all keys for which values are None."""
40+
return cast('dict', filter_out_none_values_recursively_internal(dictionary))
41+
42+
43+
def filter_out_none_values_recursively_internal(
44+
dictionary: dict,
45+
*,
46+
remove_empty_dicts: bool | None = None,
47+
) -> dict | None:
48+
"""Recursively filters out None values from a dictionary.
49+
50+
Unfortunately, it's necessary to have an internal function for the correct result typing,
51+
without having to create complicated overloads
52+
"""
53+
result = {}
54+
for k, v in dictionary.items():
55+
if isinstance(v, dict):
56+
v = filter_out_none_values_recursively_internal( # noqa: PLW2901
57+
v, remove_empty_dicts=remove_empty_dicts is True or remove_empty_dicts is None
58+
)
59+
if v is not None:
60+
result[k] = v
61+
if not result and remove_empty_dicts:
62+
return None
63+
return result
64+
65+
66+
def parse_date_fields(data: ListOrDict, max_depth: int = PARSE_DATE_FIELDS_MAX_DEPTH) -> ListOrDict:
67+
"""Recursively parse date fields in a list or dictionary up to the specified depth."""
68+
if max_depth < 0:
69+
return data
70+
71+
if isinstance(data, list):
72+
return [parse_date_fields(item, max_depth - 1) for item in data]
73+
74+
if isinstance(data, dict):
75+
76+
def parse(key: str, value: object) -> object:
77+
parsed_value = value
78+
if key.endswith(PARSE_DATE_FIELDS_KEY_SUFFIX) and isinstance(value, str):
79+
with contextlib.suppress(ValueError):
80+
parsed_value = datetime.strptime(value, '%Y-%m-%dT%H:%M:%S.%fZ').replace(tzinfo=timezone.utc)
81+
elif isinstance(value, dict):
82+
parsed_value = parse_date_fields(value, max_depth - 1)
83+
elif isinstance(value, list):
84+
parsed_value = parse_date_fields(value, max_depth)
85+
return parsed_value
86+
87+
return {key: parse(key, value) for (key, value) in data.items()}
88+
89+
return data
90+
91+
92+
def is_content_type_json(content_type: str) -> bool:
93+
"""Check if the given content type is JSON."""
94+
return bool(re.search(r'^application/json', content_type, flags=re.IGNORECASE))
95+
96+
97+
def is_content_type_xml(content_type: str) -> bool:
98+
"""Check if the given content type is XML."""
99+
return bool(re.search(r'^application/.*xml$', content_type, flags=re.IGNORECASE))
100+
101+
102+
def is_content_type_text(content_type: str) -> bool:
103+
"""Check if the given content type is text."""
104+
return bool(re.search(r'^text/', content_type, flags=re.IGNORECASE))
105+
106+
107+
def is_file_or_bytes(value: Any) -> bool:
108+
"""Check if the input value is a file-like object or bytes.
109+
110+
The check for IOBase is not ideal, it would be better to use duck typing,
111+
but then the check would be super complex, judging from how the 'requests' library does it.
112+
This way should be good enough for the vast majority of use cases; if it causes issues, we can improve it later.
113+
"""
114+
return isinstance(value, (bytes, bytearray, io.IOBase))
115+
116+
117+
def json_dumps(obj: Any) -> str:
118+
"""Dump JSON to a string with the correct settings and serializer."""
119+
return json.dumps(obj, ensure_ascii=False, indent=2, default=str)
120+
121+
122+
def maybe_extract_enum_member_value(maybe_enum_member: Any) -> Any:
123+
"""Extract the value of an enumeration member if it is an Enum, otherwise return the original value."""
124+
if isinstance(maybe_enum_member, Enum):
125+
return maybe_enum_member.value
126+
return maybe_enum_member
37127

38128

39129
def to_safe_id(id: str) -> str:
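
Two of the helpers moved into `_utils.py` have behavior that is not obvious from the signatures alone. The sketch below copies `filter_out_none_values_recursively` and `parse_date_fields` from the diff above (indentation reconstructed, type hints shortened) and runs them on made-up inputs. It assumes nothing beyond the standard library.

```python
from __future__ import annotations

import contextlib
from datetime import datetime, timezone


def filter_out_none_values_recursively_internal(dictionary, *, remove_empty_dicts=None):
    result = {}
    for k, v in dictionary.items():
        if isinstance(v, dict):
            # Nested dicts are always filtered with remove_empty_dicts=True.
            v = filter_out_none_values_recursively_internal(
                v, remove_empty_dicts=remove_empty_dicts is True or remove_empty_dicts is None
            )
        if v is not None:
            result[k] = v
    if not result and remove_empty_dicts:
        return None
    return result


def filter_out_none_values_recursively(dictionary):
    return filter_out_none_values_recursively_internal(dictionary)


def parse_date_fields(data, max_depth=3):
    if max_depth < 0:
        return data
    if isinstance(data, list):
        return [parse_date_fields(item, max_depth - 1) for item in data]
    if isinstance(data, dict):

        def parse(key, value):
            parsed_value = value
            if key.endswith('At') and isinstance(value, str):
                # Strings that do not match the format are left untouched.
                with contextlib.suppress(ValueError):
                    parsed_value = datetime.strptime(value, '%Y-%m-%dT%H:%M:%S.%fZ').replace(tzinfo=timezone.utc)
            elif isinstance(value, dict):
                parsed_value = parse_date_fields(value, max_depth - 1)
            elif isinstance(value, list):
                parsed_value = parse_date_fields(value, max_depth)
            return parsed_value

        return {key: parse(key, value) for (key, value) in data.items()}
    return data


# Nested dicts that become empty are dropped; the top-level dict is always returned.
print(filter_out_none_values_recursively({'a': 1, 'b': None, 'c': {'d': None}, 'e': {'f': 2, 'g': None}}))
print(filter_out_none_values_recursively({'a': None}))

# Only keys ending in 'At' with fractional-second timestamps are converted.
parsed = parse_date_fields({'startedAt': '2024-01-02T03:04:05.678Z', 'finishedAt': '2024-01-02T03:04:05Z'})
print(type(parsed['startedAt']).__name__, type(parsed['finishedAt']).__name__)
```

The second timestamp stays a string because the format requires a fractional-seconds part, and the resulting `ValueError` is suppressed.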

src/apify_client/client.py

Lines changed: 0 additions & 3 deletions
@@ -1,7 +1,5 @@
11
from __future__ import annotations
22

3-
from apify_shared.utils import ignore_docs
4-
53
from apify_client._http_client import HTTPClient, HTTPClientAsync
64
from apify_client._statistics import Statistics
75
from apify_client.clients import (
@@ -61,7 +59,6 @@
6159
class _BaseApifyClient:
6260
http_client: HTTPClient | HTTPClientAsync
6361

64-
@ignore_docs
6562
def __init__(
6663
self,
6764
token: str | None = None,

src/apify_client/clients/base/actor_job_base_client.py

Lines changed: 1 addition & 4 deletions
@@ -7,9 +7,8 @@
77
from datetime import datetime, timezone
88

99
from apify_shared.consts import ActorJobStatus
10-
from apify_shared.utils import ignore_docs, parse_date_fields
1110

12-
from apify_client._utils import catch_not_found_or_throw, pluck_data
11+
from apify_client._utils import catch_not_found_or_throw, parse_date_fields, pluck_data
1312
from apify_client.clients.base.resource_client import ResourceClient, ResourceClientAsync
1413
from apify_client.errors import ApifyApiError
1514

@@ -19,7 +18,6 @@
1918
DEFAULT_WAIT_WHEN_JOB_NOT_EXIST_SEC = 3
2019

2120

22-
@ignore_docs
2321
class ActorJobBaseClient(ResourceClient):
2422
"""Base sub-client class for Actor runs and Actor builds."""
2523

@@ -74,7 +72,6 @@ def _abort(self, *, gracefully: bool | None = None) -> dict:
7472
return parse_date_fields(pluck_data(jsonlib.loads(response.text)))
7573

7674

77-
@ignore_docs
7875
class ActorJobBaseClientAsync(ResourceClientAsync):
7976
"""Base async sub-client class for Actor runs and Actor builds."""
8077

src/apify_client/clients/base/base_client.py

Lines changed: 0 additions & 6 deletions
@@ -2,8 +2,6 @@
22

33
from typing import TYPE_CHECKING, Any
44

5-
from apify_shared.utils import ignore_docs
6-
75
from apify_client._logging import WithLogDetailsClient
86
from apify_client._utils import to_safe_id
97

@@ -45,14 +43,12 @@ def _sub_resource_init_options(self, **kwargs: Any) -> dict:
4543
}
4644

4745

48-
@ignore_docs
4946
class BaseClient(_BaseBaseClient):
5047
"""Base class for sub-clients."""
5148

5249
http_client: HTTPClient
5350
root_client: ApifyClient
5451

55-
@ignore_docs
5652
def __init__(
5753
self,
5854
*,
@@ -88,14 +84,12 @@ def __init__(
8884
self.url = f'{self.url}/{self.safe_id}'
8985

9086

91-
@ignore_docs
9287
class BaseClientAsync(_BaseBaseClient):
9388
"""Base class for async sub-clients."""
9489

9590
http_client: HTTPClientAsync
9691
root_client: ApifyClientAsync
9792

98-
@ignore_docs
9993
def __init__(
10094
self,
10195
*,
