Commit daef1c5

Fix Zyte API support (#112)
1 parent c3c35d2 commit daef1c5

6 files changed: 31 additions and 29 deletions
docs/headers.rst

Lines changed: 4 additions & 5 deletions
@@ -1,13 +1,13 @@
 Headers
 =======
 
-The Zyte proxy API services that you can use with this downloader middleware
-each support a different set of HTTP request and response headers that give
-you access to additional features. You can find more information about those
+The Zyte proxy services that you can use with this downloader middleware each
+support a different set of HTTP request and response headers that give you
+access to additional features. You can find more information about those
 headers in the documentation of each service, `Zyte API’s <zyte-api-headers>`_
 and `Zyte Smart Proxy Manager’s <spm-headers>`_.
 
-.. _zyte-api-headers: https://docs.zyte.com/zyte-api/usage/proxy-api.html
+.. _zyte-api-headers: https://docs.zyte.com/zyte-api/usage/proxy-mode.html
 .. _spm-headers: https://docs.zyte.com/smart-proxy-manager.html#request-headers
 
 If you try to use a header for one service while using the other service, this
@@ -24,7 +24,6 @@ Translation is supported for the following headers:
 ========================= ===========================
 Zyte API                  Zyte Smart Proxy Manager
 ========================= ===========================
-``Zyte-Client``           ``X-Crawlera-Client``
 ``Zyte-Device``           ``X-Crawlera-Profile``
 ``Zyte-Error``            ``X-Crawlera-Error``
 ``Zyte-Geolocation``      ``X-Crawlera-Region``
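
As a rough illustration of the table after this change, the sketch below maps header names in both directions. It covers only the three rows visible in this hunk, and `translate_header` is a hypothetical helper, not part of the middleware's API. Returning unknown names unchanged is a simplification made for the example.

```python
# Rows visible in the table above, after ``Zyte-Client`` was removed.
ZYTE_API_TO_SPM = {
    "zyte-device": "x-crawlera-profile",
    "zyte-error": "x-crawlera-error",
    "zyte-geolocation": "x-crawlera-region",
}
SPM_TO_ZYTE_API = {spm: zyte for zyte, spm in ZYTE_API_TO_SPM.items()}


def translate_header(name, targets_zyte_api):
    """Return the header name to use for the targeted service.

    Hypothetical helper: names without a known counterpart are
    returned unchanged.
    """
    table = SPM_TO_ZYTE_API if targets_zyte_api else ZYTE_API_TO_SPM
    return table.get(name.lower(), name)
```

With this sketch, `Zyte-Client` no longer has a Smart Proxy Manager counterpart, so it passes through untranslated.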

docs/index.rst

Lines changed: 9 additions & 9 deletions
@@ -10,10 +10,11 @@ scrapy-zyte-smartproxy |version| documentation
    news
 
 scrapy-zyte-smartproxy is a `Scrapy downloader middleware`_ to use one of
-Zyte’s proxy APIs: either the proxy API of `Zyte API`_ or `Zyte Smart Proxy
-Manager`_ (formerly Crawlera).
+Zyte’s proxy services: either the `proxy mode`_ of `Zyte API`_ or `Zyte Smart
+Proxy Manager`_ (formerly Crawlera).
 
 .. _Scrapy downloader middleware: https://doc.scrapy.org/en/latest/topics/downloader-middleware.html
+.. _proxy mode: https://docs.zyte.com/zyte-api/usage/proxy-mode.html
 .. _Zyte API: https://docs.zyte.com/zyte-api/get-started.html
 .. _Zyte Smart Proxy Manager: https://www.zyte.com/smart-proxy-manager/
 
@@ -52,7 +53,7 @@ Configuration
 
 #. Set the ``ZYTE_SMARTPROXY_URL`` Scrapy setting as needed:
 
-   - To use the proxy API of Zyte API, set it to
+   - To use the `proxy mode`_ of `Zyte API`_, set it to
      ``http://api.zyte.com:8011``:
 
      .. code-block:: python
@@ -76,14 +77,13 @@ Usage
 =====
 
 Once the downloader middleware is properly configured, every request goes
-through the configured Zyte proxy API.
+through the configured Zyte proxy service.
 
 .. _override:
 
-Although the plugin configuration only allows defining a single proxy API
-endpoint and API key, it is possible to override them for specific requests, so
-that you can use different combinations for different requests within the same
-spider.
+Although the plugin configuration only allows defining a single proxy endpoint
+and API key, it is possible to override them for specific requests, so that you
+can use different combinations for different requests within the same spider.
 
 To **override** which combination of endpoint and API key is used for a given
 request, set ``proxy`` in the request metadata to a URL indicating both the
@@ -128,7 +128,7 @@ or using the DEFAULT_REQUEST_HEADERS_ setting. For example:
         },
     )
 
-.. _Zyte API proxy headers: https://docs.zyte.com/zyte-api/usage/proxy-api.html
+.. _Zyte API proxy headers: https://docs.zyte.com/zyte-api/usage/proxy-mode.html
 .. _Zyte Smart Proxy Manager headers: https://docs.zyte.com/smart-proxy-manager.html#request-headers
 .. _Scrapy headers: https://doc.scrapy.org/en/latest/topics/request-response.html#scrapy.http.Request.headers
 .. _DEFAULT_REQUEST_HEADERS: https://doc.scrapy.org/en/latest/topics/settings.html#default-request-headers
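
The configuration and override text touched by this file could look roughly like the sketch below. The setting names and endpoint URLs come from the diff. The API key values are placeholders, `proxy_url_with_key` is a hypothetical helper, and placing the key in the user-name part of the URL is an assumption, since the hunk cuts off before the documentation shows the exact format.

```python
from urllib.parse import urlsplit, urlunsplit

# Project-wide defaults, as they might appear in a Scrapy settings module.
ZYTE_SMARTPROXY_URL = "http://api.zyte.com:8011"
ZYTE_SMARTPROXY_APIKEY = "YOUR_ZYTE_API_KEY"  # placeholder value


def proxy_url_with_key(endpoint, api_key):
    """Embed an API key in a proxy endpoint URL.

    Assumption: the key is carried as the URL user name.
    """
    parts = urlsplit(endpoint)
    netloc = "%s@%s" % (api_key, parts.netloc)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Per-request override: this dict would be passed as ``meta`` to a request
# that should use Smart Proxy Manager instead of the project default.
meta = {"proxy": proxy_url_with_key("http://proxy.zyte.com:8011", "YOUR_SPM_API_KEY")}
```

Because the two services use different API keys, an override of the endpoint generally needs a matching key as well.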

docs/news.rst

Lines changed: 8 additions & 1 deletion
@@ -3,11 +3,18 @@
 Changes
 =======
 
+v2.3.1 (2023-11-20)
+-------------------
+
+Fixed `Zyte API`_ `proxy mode`_ support by removing the mapping of unsupported
+headers ``Zyte-Client`` and ``Zyte-No-Bancheck``.
+
 v2.3.0 (2023-10-20)
 -------------------
 
-Added support for the upcoming proxy API of `Zyte API`_.
+Added support for the upcoming `proxy mode`_ of `Zyte API`_.
 
+.. _proxy mode: https://docs.zyte.com/zyte-api/usage/proxy-mode.html
 .. _Zyte API: https://docs.zyte.com/zyte-api/get-started.html
 
 Added a BSD-3-Clause license file.

docs/settings.rst

Lines changed: 6 additions & 6 deletions
@@ -3,14 +3,14 @@ Settings
 ========
 
 This Scrapy downloader middleware adds some settings to configure how to work
-with your Zyte proxy API.
+with your Zyte proxy service.
 
 ZYTE_SMARTPROXY_APIKEY
 ----------------------
 
 Default: ``None``
 
-Default API key for your Zyte proxy API service.
+Default API key for your Zyte proxy service.
 
 Note that Zyte API and Zyte Smart Proxy Manager have different API keys.
 
@@ -22,7 +22,7 @@ ZYTE_SMARTPROXY_URL
 
 Default: ``'http://proxy.zyte.com:8011'``
 
-Default endpoint for your Zyte proxy API service.
+Default endpoint for your Zyte proxy service.
 
 For guidelines on setting a value, see the :ref:`initial configuration
 instructions <ZYTE_SMARTPROXY_URL>`.
@@ -79,9 +79,9 @@ ZYTE_SMARTPROXY_FORCE_ENABLE_ON_HTTP_CODES
 
 Default: ``[]``
 
-List of HTTP response status codes that warrant enabling your Zyte proxy API
+List of HTTP response status codes that warrant enabling your Zyte proxy
 service for the corresponding domain.
 
 When a response with one of these HTTP status codes is received after an
-unproxied request, the request is retried with your Zyte proxy API service, and
-any new request to the same domain is also proxied.
+unproxied request, the request is retried with your Zyte proxy service, and any
+new request to the same domain is also proxied.
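
The behaviour described for ``ZYTE_SMARTPROXY_FORCE_ENABLE_ON_HTTP_CODES`` can be pictured with the simplified sketch below. It is not the middleware's code: `should_retry_with_proxy` is a hypothetical function, the status codes are example values, and the per-domain dictionary only mimics the idea of remembering which domains need the proxy.

```python
from urllib.parse import urlsplit

FORCE_ENABLE_ON_HTTP_CODES = [403, 503]  # example values; the default is []
enabled_for_domain = {}


def should_retry_with_proxy(url, status, was_proxied):
    """Return True when the response warrants a proxied retry.

    As a side effect, remember the domain so that later requests to it
    are proxied as well.
    """
    if was_proxied or status not in FORCE_ENABLE_ON_HTTP_CODES:
        return False
    enabled_for_domain[urlsplit(url).hostname] = True
    return True
```

Only responses to unproxied requests trigger the switch, so a 503 that already came through the proxy leaves the domain state untouched.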

scrapy_zyte_smartproxy/middleware.py

Lines changed: 3 additions & 5 deletions
@@ -37,11 +37,9 @@ class ZyteSmartProxyMiddleware(object):
     enabled_for_domain = {}
     apikey = ""
     zyte_api_to_spm_translations = {
-        b"zyte-client": b"x-crawlera-client",
         b"zyte-device": b"x-crawlera-profile",
         b"zyte-geolocation": b"x-crawlera-region",
         b"zyte-jobid": b"x-crawlera-jobid",
-        b"zyte-no-bancheck": b"x-crawlera-no-bancheck",
         b"zyte-override-headers": b"x-crawlera-profile-pass",
     }
     spm_to_zyte_api_translations = {v: k for k, v in zyte_api_to_spm_translations.items()}
@@ -222,9 +220,9 @@ def process_request(self, request, spider):
             if self.job_id:
                 job_header = 'Zyte-JobId' if targets_zyte_api else 'X-Crawlera-JobId'
                 request.headers[job_header] = self.job_id
-            client_header = 'Zyte-Client' if targets_zyte_api else 'X-Crawlera-Client'
-            from scrapy_zyte_smartproxy import __version__
-            request.headers[client_header] = 'scrapy-zyte-smartproxy/%s' % __version__
+            if not targets_zyte_api:
+                from scrapy_zyte_smartproxy import __version__
+                request.headers['X-Crawlera-Client'] = 'scrapy-zyte-smartproxy/%s' % __version__
             self.crawler.stats.inc_value('zyte_smartproxy/request')
             self.crawler.stats.inc_value('zyte_smartproxy/request/method/%s' % request.method)
             self._translate_headers(request, targets_zyte_api=targets_zyte_api)
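
The second hunk changes when the client identification header is sent. A standalone sketch of that rule, detached from Scrapy, might look as follows. `client_headers` is a hypothetical function, and the version string used with it is just the release named in the changelog.

```python
def client_headers(targets_zyte_api, version):
    """Return the client identification header to add, if any."""
    if targets_zyte_api:
        # Zyte API proxy mode does not support Zyte-Client, so send nothing.
        return {}
    return {"X-Crawlera-Client": "scrapy-zyte-smartproxy/%s" % version}
```

Before this commit, the equivalent logic always produced a header and only varied its name, which is what broke requests sent through Zyte API proxy mode.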

tests/test_all.py

Lines changed: 1 addition & 3 deletions
@@ -983,7 +983,7 @@ def test_client_header(self):
         )
         self.assertEqual(mw.process_request(req2, self.spider), None)
         self.assertEqual(req2.headers.get('X-Crawlera-Client'), None)
-        self.assertEqual(req2.headers.get('Zyte-Client'), client)
+        self.assertEqual(req2.headers.get('Zyte-Client'), None)
 
     def test_scrapy_httpproxy_integration(self):
         self.spider.zyte_smartproxy_enabled = True
@@ -1062,11 +1062,9 @@ def test_header_translation(self):
         value = b"foo"
 
         zyte_api_to_spm_translations = {
-            b"Zyte-Client": b"X-Crawlera-Client",
             b"Zyte-Device": b"X-Crawlera-Profile",
             b"Zyte-Geolocation": b"X-Crawlera-Region",
             b"Zyte-JobId": b"X-Crawlera-JobId",
-            b"Zyte-No-Bancheck": b"X-Crawlera-No-Bancheck",
             b"Zyte-Override-Headers": b"X-Crawlera-Profile-Pass",
         }
         for header, translation in zyte_api_to_spm_translations.items():
