Merged
25 commits
0692f65
x402 support: initial draft
Gallaecio Jul 23, 2025
b0be889
x402: optimize out initial requests where possible
Gallaecio Jul 23, 2025
0df1f92
Add a n_x402_req stat
Gallaecio Jul 23, 2025
2f3c9c4
Raise non-402 response as RequestError
Gallaecio Jul 23, 2025
53f9a4c
Handle import errors from x402
Gallaecio Jul 23, 2025
33dcc3e
Add x402 envs to tox and CI
Gallaecio Jul 23, 2025
8970a7f
Keep mypy happy
Gallaecio Jul 23, 2025
115abc5
Complete branch coverage
Gallaecio Jul 23, 2025
27a3b13
Use a custom retry policy for x402 initial requests
Gallaecio Jul 23, 2025
e20d1ed
Remove unneeded skip
Gallaecio Jul 23, 2025
9b42f98
Test command call with an env key
Gallaecio Jul 23, 2025
12ec3a4
Test command when an Ethereum private key is set as an env var
Gallaecio Jul 23, 2025
ad3a1bb
Keep mypy happy
Gallaecio Jul 23, 2025
cae7b64
Fix min-x402 in CI
Gallaecio Jul 23, 2025
3720c9b
Clarify required Python version
Gallaecio Jul 23, 2025
1f80f57
Let --eth-key take precedence over ZYTE_API_KEY
Gallaecio Jul 25, 2025
dec667f
Handle race conditions and unexpected server responses
Gallaecio Jul 25, 2025
a5b5e40
Improve error mapping
Gallaecio Jul 25, 2025
433dbc8
Complete test coverage, fix backward compatibility, and add client.au…
Gallaecio Jul 28, 2025
9d5115b
Expose AuthInfo
Gallaecio Jul 28, 2025
5b00725
AuthInfo.key_type → type
Gallaecio Jul 28, 2025
e3931c7
Update tests/mockserver.py
wRAR Jul 29, 2025
3557ff1
Update tests/mockserver.py
wRAR Jul 29, 2025
852fb1c
Add release notes
Gallaecio Aug 7, 2025
46a3bdd
Prepare release notes, set the right endpoint for x402
Gallaecio Aug 7, 2025
15 changes: 13 additions & 2 deletions .github/workflows/test.yml
Expand Up @@ -16,7 +16,18 @@ jobs:
strategy:
fail-fast: false
matrix:
-python-version: ['3.9', '3.10', '3.11', '3.12', '3.13']
+include:
+- python-version: "3.9"
+  tox: min
+- python-version: "3.9"
+- python-version: "3.10"
+  tox: min-x402
+- python-version: "3.10"
+- python-version: "3.11"
+- python-version: "3.12"
+- python-version: "3.13"
+- python-version: "3.13"
+  tox: x402

steps:
- uses: actions/checkout@v4
Expand All @@ -30,7 +41,7 @@ jobs:
python -m pip install tox
- name: tox
run: |
-tox -e py
+tox -e ${{ matrix.tox || 'py' }}
- name: coverage
if: ${{ success() }}
uses: codecov/codecov-action@v4.0.1
Expand Down
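The `matrix.tox || 'py'` expression in this hunk falls back to the default `py` environment whenever a matrix entry defines no `tox` key. A minimal Python sketch of that resolution follows; the matrix is copied from the hunk, while the helper name `tox_env` is hypothetical and only mimics the workflow expression.

```python
# Matrix entries copied from the workflow hunk above.
MATRIX = [
    {"python-version": "3.9", "tox": "min"},
    {"python-version": "3.9"},
    {"python-version": "3.10", "tox": "min-x402"},
    {"python-version": "3.10"},
    {"python-version": "3.11"},
    {"python-version": "3.12"},
    {"python-version": "3.13"},
    {"python-version": "3.13", "tox": "x402"},
]


def tox_env(entry):
    """Hypothetical helper mimicking ``${{ matrix.tox || 'py' }}``."""
    # Use the entry's own tox environment if set, otherwise fall back to "py".
    return entry.get("tox") or "py"


if __name__ == "__main__":
    for entry in MATRIX:
        print(entry["python-version"], "->", "tox -e", tox_env(entry))
```

Under this reading, Python 3.9, 3.10 and 3.13 each run twice: once with the plain `py` environment and once with a dedicated one (`min`, `min-x402`, `x402`).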
15 changes: 13 additions & 2 deletions README.rst
Expand Up @@ -35,14 +35,22 @@ Installation

pip install zyte-api

-.. note:: Python 3.9+ is required.
+Or, to use x402_:
+
+.. _x402: https://www.x402.org/
+
+.. code-block:: shell
+
+pip install zyte-api[x402]
+
+.. note:: Python 3.9+ is required; Python 3.10+ if using x402.

.. install-end

Basic usage
===========

-.. basic-start
+.. basic-key-start

Set your API key
----------------
Expand All @@ -54,6 +62,9 @@ After you `sign up for a Zyte API account
<https://app.zyte.com/o/zyte-api/api-access>`_.

.. key-get-end
.. basic-key-end

.. basic-start


Use the command-line client
Expand Down
1 change: 1 addition & 0 deletions docs/index.rst
Expand Up @@ -18,6 +18,7 @@ python-zyte-api
:maxdepth: 1

use/key
use/x402
use/cli
use/api

Expand Down
8 changes: 8 additions & 0 deletions docs/intro/basic.rst
Expand Up @@ -4,6 +4,14 @@
Basic usage
===========

.. include:: /../README.rst
:start-after: basic-key-start
:end-before: basic-key-end

To use x402_ instead, see :ref:`x402`.

.. _x402: https://www.x402.org/

.. include:: /../README.rst
:start-after: basic-start
:end-before: basic-end
5 changes: 3 additions & 2 deletions docs/use/api.rst
Expand Up @@ -6,8 +6,9 @@
Python client library
=====================

-Once you have :ref:`installed python-zyte-api <install>` and :ref:`configured
-your API key <api-key>`, you can use one of its APIs from Python code:
+Once you have :ref:`installed python-zyte-api <install>` and configured your
+:ref:`API key <api-key>` or :ref:`Ethereum private key <x402>`, you can use one
+of its APIs from Python code:

- The :ref:`sync API <sync>` can be used to build simple, proof-of-concept or
debugging Python scripts.
Expand Down
5 changes: 3 additions & 2 deletions docs/use/cli.rst
Expand Up @@ -4,8 +4,9 @@
Command-line client
===================

-Once you have :ref:`installed python-zyte-api <install>` and :ref:`configured
-your API key <api-key>`, you can use the ``zyte-api`` command-line client.
+Once you have :ref:`installed python-zyte-api <install>` and configured your
+:ref:`API key <api-key>` or :ref:`Ethereum private key <x402>`, you can use the
+``zyte-api`` command-line client.

To use ``zyte-api``, pass an :ref:`input file <input-file>` as the first
parameter and specify an :ref:`output file <output-file>` with ``--output``.
Expand Down
66 changes: 66 additions & 0 deletions docs/use/x402.rst
@@ -0,0 +1,66 @@
.. _x402:

====
x402
====

It is possible to use :ref:`Zyte API <zyte-api>` without a Zyte API account by
using the x402_ protocol to handle payments:

#. Read the `Zyte Terms of Service`_. By using Zyte API, you are accepting
them.

.. _Zyte Terms of Service: https://www.zyte.com/terms-policies/terms-of-service/

#. During :ref:`installation <install>`, make sure to install the ``x402`` extra.

#. :ref:`Configure <eth-key>` the *private* key of your Ethereum_ account to
authorize payments.

.. _Ethereum: https://ethereum.org/


.. _eth-key:

Configuring your Ethereum private key
=====================================

It is recommended to configure your Ethereum private key through an environment
variable, so that it can be picked up by both the :ref:`command-line client
<command_line>` and the :ref:`Python client library <api>`:

- On Windows’ CMD:

.. code-block:: shell

> set ZYTE_API_ETH_KEY=YOUR_ETH_PRIVATE_KEY

- On macOS and Linux:

.. code-block:: shell

$ export ZYTE_API_ETH_KEY=YOUR_ETH_PRIVATE_KEY

Alternatively, you may pass your Ethereum private key to the clients directly:

- To pass your Ethereum private key directly to the command-line client, use
the ``--eth-key`` switch:

.. code-block:: shell

zyte-api --eth-key YOUR_ETH_PRIVATE_KEY …

- To pass your Ethereum private key directly to the Python client classes,
use the ``eth_key`` parameter when creating a client object:

.. code-block:: python

from zyte_api import ZyteAPI

client = ZyteAPI(eth_key="YOUR_ETH_PRIVATE_KEY")

.. code-block:: python

from zyte_api import AsyncZyteAPI

client = AsyncZyteAPI(eth_key="YOUR_ETH_PRIVATE_KEY")
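The two configuration routes described in this file (the ``ZYTE_API_ETH_KEY`` environment variable and an explicit ``eth_key`` value) can be pictured with the sketch below. It assumes that an explicitly passed key wins over the environment variable, which mirrors how ``get_apikey`` is tested in this PR but is not shown here for the Ethereum key. ``resolve_eth_key`` is a hypothetical helper, not part of python-zyte-api.

```python
import os


def resolve_eth_key(eth_key=None, environ=None):
    """Hypothetical helper: explicit argument first, then ZYTE_API_ETH_KEY.

    Assumption: an explicit key takes precedence over the environment.
    """
    # Allow a custom mapping to be injected so the lookup is easy to test.
    environ = os.environ if environ is None else environ
    return eth_key or environ.get("ZYTE_API_ETH_KEY")


if __name__ == "__main__":
    # Prints None unless ZYTE_API_ETH_KEY is set in the current shell.
    print(resolve_eth_key())
```

Returning ``None`` when neither source is set leaves it to the caller to decide whether to fall back to an API key or to fail.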
3 changes: 3 additions & 0 deletions pyproject.toml
Expand Up @@ -154,6 +154,9 @@ ignore = [
"zyte_api/errors.py" = ["UP007"]
"zyte_api/stats.py" = ["UP007"]

[tool.ruff.lint.flake8-pytest-style]
parametrize-values-type = "tuple"

[tool.ruff.lint.flake8-type-checking]
runtime-evaluated-decorators = ["attr.s"]

Expand Down
20 changes: 13 additions & 7 deletions setup.py
Expand Up @@ -16,14 +16,20 @@
"console_scripts": ["zyte-api=zyte_api.__main__:_main"],
},
install_requires=[
-"aiohttp >= 3.8.0",
-"attrs",
-"brotli",
-"runstats",
-"tenacity",
-"tqdm",
-"w3lib >= 2.1.1",
+"aiohttp>=3.8.0",
+"attrs>=20.1.0",
+"brotli>=0.5.2",
+"runstats>=0.0.1",
+"tenacity>=8.2.0",
+"tqdm>=4.16.0",
+"w3lib>=2.1.1",
],
extras_require={
"x402": [
"eth-account>=0.13.7",
"x402>=0.1.1",
]
},
classifiers=[
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
Expand Down
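Because ``eth-account`` and ``x402`` are declared as an optional extra, code that relies on them has to cope with their absence. The following is a minimal sketch of such a check using only the standard library; the helper names ``extras_available`` and ``require_x402`` are hypothetical and do not reflect how python-zyte-api actually handles the import error.

```python
from importlib.util import find_spec

# Import names of the packages listed under the "x402" extra in setup.py.
X402_MODULES = ("eth_account", "x402")


def extras_available(modules=X402_MODULES):
    """Return True if every named top-level module can be imported."""
    return all(find_spec(name) is not None for name in modules)


def require_x402():
    """Hypothetical guard: fail early with an actionable message."""
    if not extras_available():
        raise ImportError(
            "x402 support needs optional dependencies; "
            'install them with: pip install "zyte-api[x402]"'
        )


if __name__ == "__main__":
    print("x402 extra installed:", extras_available())
```

Checking with ``find_spec`` avoids importing the packages just to learn whether they exist.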
104 changes: 101 additions & 3 deletions tests/mockserver.py
Expand Up @@ -4,6 +4,7 @@
import sys
import time
from base64 import b64encode
from collections import defaultdict
from importlib import import_module
from subprocess import PIPE, Popen
from typing import Any
Expand All @@ -14,6 +15,11 @@
from twisted.web.resource import Resource
from twisted.web.server import NOT_DONE_YET, Site

SCREENSHOT = (
"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAACklEQVR4nGMAAQAABQABDQott"
"AAAAABJRU5ErkJggg=="
)


# https://github.com/scrapy/scrapy/blob/02b97f98e74a994ad3e4d74e7ed55207e508a576/tests/mockserver.py#L27C1-L33C19
def getarg(request, name, default=None, type=None):
Expand Down Expand Up @@ -62,15 +68,35 @@ def _delayedRender(self, request):
request.finish()


RESPONSE_402 = {
"x402Version": 1,
"accepts": [
{
"scheme": "exact",
"network": "base-sepolia",
"maxAmountRequired": "1000",
"resource": "https://api.zyte.com/v1/extract",
"description": "",
"mimeType": "",
"payTo": "0xFACEdD967ea0592bbb9410fA4877Df9AeB628CB7",
"maxTimeoutSeconds": 130,
"asset": "0xFACEbD53842c5426634e7929541eC2318f3dCF7e",
"extra": {"name": "USDC", "version": "2"},
}
],
"error": "Use basic auth or x402",
}

WORKFLOWS: defaultdict[str, dict[str, Any]] = defaultdict(dict)


class DefaultResource(Resource):
request_count = 0

def getChild(self, path, request):
return self

def render_POST(self, request):
-request_data = json.loads(request.content.read())

request.responseHeaders.setRawHeaders(
b"Content-Type",
[b"application/json"],
Expand All @@ -80,12 +106,17 @@ def render_POST(self, request):
[b"abcd1234"],
)

+request_data = json.loads(request.content.read())

url = request_data["url"]
domain = urlparse(url).netloc
if domain == "e429.example":
request.setResponseCode(429)
response_data = {"status": 429, "type": "/limits/over-user-limit"}
return json.dumps(response_data).encode()
if domain == "e404.example":
request.setResponseCode(404)
return b""
if domain == "e500.example":
request.setResponseCode(500)
return ""
Expand Down Expand Up @@ -119,6 +150,70 @@ def render_POST(self, request):
request.setResponseCode(500)
return b'["foo"]'

auth_header = request.getHeader("Authorization")
payment_header = request.getHeader("X-Payment")
if not auth_header and not payment_header:
request.setResponseCode(402)
return json.dumps(RESPONSE_402).encode()

echo_data = request_data.get("echoData")
if echo_data:
session_data = WORKFLOWS.setdefault(echo_data, {})
if echo_data in {"402-payment-retry", "402-payment-retry-2"}:
assert request.getHeader("X-Payment")
# Return 402 on the first request, then 200 on the second
if not session_data:
session_data["payment_attempts"] = 1
request.setResponseCode(402)
return json.dumps(RESPONSE_402).encode()
elif echo_data == "402-payment-retry-exceeded":
assert request.getHeader("X-Payment")
# Return 402 on the first 2 requests, then 200 on the third
# (the client will give up after 2 attempts, so there will be no
# third in practice)
if not session_data:
session_data["payment_attempts"] = 1
session_data["payment"] = request.getHeader("X-Payment")
request.setResponseCode(402)
return json.dumps(RESPONSE_402).encode()
if session_data["payment_attempts"] == 1:
session_data["payment_attempts"] = 2
# Make sure the client refreshed the payment header
assert session_data["payment"] != request.getHeader("X-Payment")
request.setResponseCode(402)
return json.dumps(RESPONSE_402).encode()
elif echo_data == "402-no-payment-retry":
assert not request.getHeader("X-Payment")
# Return 402 on the first request, then 200 on the second
if not session_data:
session_data["payment_attempts"] = 1
request.setResponseCode(402)
return json.dumps(RESPONSE_402).encode()
elif echo_data == "402-no-payment-retry-exceeded":
assert not request.getHeader("X-Payment")
# Return 402 on the first 2 requests, then 200 on the third
# (the client will give up after 2 attempts, so there will be no
# third in practice)
if not session_data:
session_data["payment_attempts"] = 1
request.setResponseCode(402)
return json.dumps(RESPONSE_402).encode()
if session_data["payment_attempts"] == 1:
session_data["payment_attempts"] = 2
request.setResponseCode(402)
return json.dumps(RESPONSE_402).encode()
elif echo_data == "402-long-error":
request.setResponseCode(402)
response_data = {
**RESPONSE_402,
"error": (
"This is a long error message that exceeds the 32 "
"character limit for the error type prefix. It should "
"not be parsed as an error type."
),
}
return json.dumps(response_data).encode()

response_data: dict[str, Any] = {
"url": url,
}
Expand All @@ -127,9 +222,12 @@ def render_POST(self, request):
if "httpResponseBody" in request_data:
body = b64encode(html.encode()).decode()
response_data["httpResponseBody"] = body
-else:
+if "browserHtml" in request_data:
assert "browserHtml" in request_data
response_data["browserHtml"] = html
if "screenshot" in request_data:
assert "screenshot" in request_data
response_data["screenshot"] = SCREENSHOT

return json.dumps(response_data).encode()

Expand Down
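The ``echoData`` branches added to the mock server all follow one pattern: remember per-key state in ``WORKFLOWS`` and answer 402 a fixed number of times before letting the request through. A stripped-down sketch of that pattern is shown below. ``status_for`` and its ``max_402`` parameter are hypothetical simplifications; the real handler also asserts on the ``X-Payment`` header and returns a JSON body.

```python
from collections import defaultdict

# Per-key session state, as in the mock server's WORKFLOWS mapping.
WORKFLOWS = defaultdict(dict)


def status_for(echo_data, max_402=1):
    """Return 402 for the first ``max_402`` calls with a key, then 200.

    Simplified stand-in for the mock server branches; header checks omitted.
    """
    session = WORKFLOWS[echo_data]
    attempts = session.get("payment_attempts", 0)
    if attempts < max_402:
        # Record the rejected attempt so the next call sees updated state.
        session["payment_attempts"] = attempts + 1
        return 402
    return 200


if __name__ == "__main__":
    print([status_for("402-payment-retry") for _ in range(2)])
    print([status_for("402-payment-retry-exceeded", max_402=2) for _ in range(3)])
```

Keying the state on ``echoData`` lets each test use its own counter against a single shared server.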
15 changes: 15 additions & 0 deletions tests/test_apikey.py
@@ -0,0 +1,15 @@
import pytest

from zyte_api.apikey import NoApiKey, get_apikey


def test_get_apikey(monkeypatch):
assert get_apikey("a") == "a"
with pytest.raises(NoApiKey):
get_apikey()
with pytest.raises(NoApiKey):
get_apikey(None)
monkeypatch.setenv("ZYTE_API_KEY", "b")
assert get_apikey("a") == "a"
assert get_apikey() == "b"
assert get_apikey(None) == "b"
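The behaviour this test pins down can be summarised as: an explicit key wins, otherwise ``ZYTE_API_KEY`` is read from the environment, otherwise ``NoApiKey`` is raised. The sketch below is one way to satisfy those assertions; it is not the actual ``zyte_api.apikey`` code, and both ``get_apikey_sketch`` and the local ``NoApiKey`` class are stand-ins introduced for illustration.

```python
import os


class NoApiKey(Exception):
    """Stand-in for zyte_api.apikey.NoApiKey."""


def get_apikey_sketch(key=None, environ=None):
    """Hypothetical re-implementation matching the assertions in the test."""
    # Allow a custom mapping to be injected instead of monkeypatching os.environ.
    environ = os.environ if environ is None else environ
    if key is not None:
        return key
    try:
        return environ["ZYTE_API_KEY"]
    except KeyError:
        raise NoApiKey("No API key passed and ZYTE_API_KEY is not set") from None


if __name__ == "__main__":
    print(get_apikey_sketch("a", environ={"ZYTE_API_KEY": "b"}))
```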