
Commit 7eba7f1

d-v-b and maxrjones authored

add benchmarks using pytest-benchmark and codspeed (#3562)

* add benchmarks
* remove failing zipstore
* don't do benchmarking in default pytest runs
* changelog
* codspeed workflow
* lint
* remove pedantic mode
* only run benchmarks in one environment
* use better string id for test params, make test data 1MB, and simplify params
* move layout to an external file
* get workloads to resemble recent sharding perf tests
* test ids
* tweak tests
* tweak tests
* fix typo
* add slice indexing benchmarks
* remove readme
* add docs documentation
* simplify pytest benchmark options
* use --codspeed flag in benchmark ci
* measure walltime in ci
* Update .github/workflows/codspeed.yml
* Apply suggestion from @maxrjones
* add --ignore option to main test and gpu test invocations
* add comment
* ignore codspeed warnings
* update workflow

Co-authored-by: Max Jones <[email protected]>

1 parent c7b166e commit 7eba7f1

File tree

9 files changed: +203 −4 lines

.github/workflows/codspeed.yml

Lines changed: 35 additions & 0 deletions (new file)

```yaml
name: CodSpeed Benchmarks

on:
  push:
    branches:
      - "main"
  pull_request:
  # `workflow_dispatch` allows CodSpeed to trigger backtest
  # performance analysis in order to generate initial data.
  workflow_dispatch:

permissions:
  contents: read

jobs:
  benchmarks:
    name: Run benchmarks
    runs-on: codspeed-macro
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0 # grab all branches and tags
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.11"
      - name: Install Hatch
        run: |
          python -m pip install --upgrade pip
          pip install hatch
      - name: Run the benchmarks
        uses: CodSpeedHQ/action@v4
        with:
          mode: walltime
          run: hatch run test.py3.11-2.0-minimal:pytest tests/benchmarks --codspeed
```

changes/3562.misc.md

Lines changed: 1 addition & 0 deletions (new file)

Add continuous performance benchmarking infrastructure.

docs/contributing.md

Lines changed: 10 additions & 1 deletion

```diff
@@ -264,4 +264,13 @@ If an existing Zarr format version changes, or a new version of the Zarr format
 ## Release procedure

 Open an issue on GitHub announcing the release using the release checklist template:
-[https://github.com/zarr-developers/zarr-python/issues/new?template=release-checklist.md](https://github.com/zarr-developers/zarr-python/issues/new?template=release-checklist.md>). The release checklist includes all steps necessary for the release.
+[https://github.com/zarr-developers/zarr-python/issues/new?template=release-checklist.md](https://github.com/zarr-developers/zarr-python/issues/new?template=release-checklist.md). The release checklist includes all steps necessary for the release.
+
+## Benchmarks
+
+Zarr uses [pytest-benchmark](https://pytest-benchmark.readthedocs.io/en/latest/) for running
+performance benchmarks as part of our test suite. The benchmarks can be found in `tests/benchmarks`.
+By default pytest is configured to run these benchmarks as plain tests (i.e., no benchmarking). To run
+a benchmark with timing measurements, use the `--benchmark-enable` flag when invoking `pytest`.
+
+The benchmarks are run as part of the continuous integration suite through [CodSpeed](https://codspeed.io/zarr-developers/zarr-python).
```

pyproject.toml

Lines changed: 9 additions & 3 deletions

```diff
@@ -82,6 +82,8 @@ test = [
     'numpydoc',
     "hypothesis",
     "pytest-xdist",
+    "pytest-benchmark",
+    "pytest-codspeed",
     "packaging",
     "tomlkit",
     "uv",
@@ -175,11 +177,12 @@ matrix.deps.dependencies = [
 run-coverage = "pytest --cov-config=pyproject.toml --cov=src --cov-append --cov-report xml --junitxml=junit.xml -o junit_family=legacy"
 run-coverage-html = "pytest --cov-config=pyproject.toml --cov=src --cov-append --cov-report html"
 run-coverage-gpu = "pip install cupy-cuda12x && pytest -m gpu --cov-config=pyproject.toml --cov=src --cov-append --cov-report xml --junitxml=junit.xml -o junit_family=legacy"
-run = "run-coverage --no-cov"
+run = "run-coverage --no-cov --ignore tests/benchmarks"
 run-pytest = "run"
 run-verbose = "run-coverage --verbose"
 run-mypy = "mypy src"
 run-hypothesis = "run-coverage -nauto --run-slow-hypothesis tests/test_properties.py tests/test_store/test_stateful*"
+run-benchmark = "pytest --benchmark-enable tests/benchmarks"
 list-env = "pip list"

 [tool.hatch.envs.gputest]
@@ -196,7 +199,7 @@ numpy = ["2.0", "2.2"]
 version = ["minimal"]

 [tool.hatch.envs.gputest.scripts]
-run-coverage = "pytest -m gpu --cov-config=pyproject.toml --cov=pkg --cov-report xml --cov=src --junitxml=junit.xml -o junit_family=legacy"
+run-coverage = "pytest -m gpu --cov-config=pyproject.toml --cov=pkg --cov-report xml --cov=src --junitxml=junit.xml -o junit_family=legacy --ignore tests/benchmarks"
 run = "run-coverage --no-cov"
 run-verbose = "run-coverage --verbose"
 run-mypy = "mypy src"
@@ -405,7 +408,10 @@ doctest_optionflags = [
     "IGNORE_EXCEPTION_DETAIL",
 ]
 addopts = [
-    "--durations=10", "-ra", "--strict-config", "--strict-markers",
+    "--benchmark-columns", "min,mean,stddev,outliers,rounds,iterations",
+    "--benchmark-disable", # benchmark routines run as tests without benchmarking instrumentation
+    "--durations", "10",
+    "-ra", "--strict-config", "--strict-markers",
 ]
 filterwarnings = [
     "error",
```

tests/benchmarks/__init__.py

Whitespace-only changes.

tests/benchmarks/common.py

Lines changed: 8 additions & 0 deletions (new file)

```python
from dataclasses import dataclass


@dataclass(kw_only=True, frozen=True)
class Layout:
    shape: tuple[int, ...]
    chunks: tuple[int, ...]
    shards: tuple[int, ...] | None
```

tests/benchmarks/conftest.py

Lines changed: 15 additions & 0 deletions (new file)

```python
"""Pytest configuration for benchmark tests."""

import pytest

# Filter CodSpeed instrumentation warnings that can occur intermittently
# when registering benchmark results. This is a known issue with the
# CodSpeed walltime instrumentation hooks.
# See: https://github.com/CodSpeedHQ/pytest-codspeed


def pytest_configure(config: pytest.Config) -> None:
    config.addinivalue_line(
        "filterwarnings",
        "ignore:Failed to set executed benchmark:RuntimeWarning",
    )
```
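The filter string uses the standard `action:message:category` form from Python's warnings machinery, where the message part is a regex matched against the start of the warning text. A minimal sketch of the same rule applied directly with the `warnings` module:

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    # Equivalent to the project's filterwarnings = ["error"] default:
    warnings.simplefilter("error")
    # Same ignore rule as in conftest.py; the message is a regex matched
    # against the beginning of the warning text.
    warnings.filterwarnings(
        "ignore", message="Failed to set executed benchmark", category=RuntimeWarning
    )
    warnings.warn("Failed to set executed benchmark for test", RuntimeWarning)

# The matching warning was suppressed rather than raised or recorded.
assert caught == []
```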

tests/benchmarks/test_e2e.py

Lines changed: 82 additions & 0 deletions (new file)

```python
"""
Benchmarks for end-to-end read/write performance of Zarr
"""

from __future__ import annotations

from typing import TYPE_CHECKING

from tests.benchmarks.common import Layout

if TYPE_CHECKING:
    from pytest_benchmark.fixture import BenchmarkFixture

    from zarr.abc.store import Store
    from zarr.core.common import NamedConfig

from operator import getitem, setitem
from typing import Any, Literal

import pytest

from zarr import create_array

CompressorName = Literal["gzip"] | None

compressors: dict[CompressorName, NamedConfig[Any, Any] | None] = {
    None: None,
    "gzip": {"name": "gzip", "configuration": {"level": 1}},
}


layouts: tuple[Layout, ...] = (
    # No shards, just 1000 chunks
    Layout(shape=(1_000_000,), chunks=(1000,), shards=None),
    # 1:1 chunk:shard shape, should measure overhead of sharding
    Layout(shape=(1_000_000,), chunks=(1000,), shards=(1000,)),
    # One shard with all the chunks, should measure overhead of handling inner shard chunks
    Layout(shape=(1_000_000,), chunks=(100,), shards=(10000 * 100,)),
)


@pytest.mark.parametrize("compression_name", [None, "gzip"])
@pytest.mark.parametrize("layout", layouts, ids=str)
@pytest.mark.parametrize("store", ["memory", "local"], indirect=["store"])
def test_write_array(
    store: Store, layout: Layout, compression_name: CompressorName, benchmark: BenchmarkFixture
) -> None:
    """
    Test the time required to fill an array with a single value
    """
    arr = create_array(
        store,
        dtype="uint8",
        shape=layout.shape,
        chunks=layout.chunks,
        shards=layout.shards,
        compressors=compressors[compression_name],  # type: ignore[arg-type]
        fill_value=0,
    )

    benchmark(setitem, arr, Ellipsis, 1)


@pytest.mark.parametrize("compression_name", [None, "gzip"])
@pytest.mark.parametrize("layout", layouts, ids=str)
@pytest.mark.parametrize("store", ["memory", "local"], indirect=["store"])
def test_read_array(
    store: Store, layout: Layout, compression_name: CompressorName, benchmark: BenchmarkFixture
) -> None:
    """
    Test the time required to read an entire array
    """
    arr = create_array(
        store,
        dtype="uint8",
        shape=layout.shape,
        chunks=layout.chunks,
        shards=layout.shards,
        compressors=compressors[compression_name],  # type: ignore[arg-type]
        fill_value=0,
    )
    arr[:] = 1
    benchmark(getitem, arr, Ellipsis)
```
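The `benchmark(setitem, arr, Ellipsis, 1)` pattern uses `operator.setitem`/`operator.getitem` so that only the subscript operation itself sits inside the timed call. A minimal sketch of the equivalence with plain Python lists (zarr arrays additionally accept `Ellipsis` as the key):

```python
from operator import getitem, setitem

data = [0, 0, 0, 0]

# setitem(obj, key, value) is equivalent to obj[key] = value
setitem(data, slice(None), [1, 1, 1, 1])
assert data == [1, 1, 1, 1]

# getitem(obj, key) is equivalent to obj[key]
assert getitem(data, slice(1, 3)) == [1, 1]
```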

tests/benchmarks/test_indexing.py

Lines changed: 43 additions & 0 deletions (new file)

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from pytest_benchmark.fixture import BenchmarkFixture

    from zarr.abc.store import Store

from operator import getitem

import pytest

from zarr import create_array

indexers = (
    (0,) * 3,
    (slice(None),) * 3,
    (slice(0, None, 4),) * 3,
    (slice(10),) * 3,
    (slice(10, -10, 4),) * 3,
    (slice(None), slice(0, 3, 2), slice(0, 10)),
)


@pytest.mark.parametrize("store", ["memory"], indirect=["store"])
@pytest.mark.parametrize("indexer", indexers, ids=str)
def test_slice_indexing(
    store: Store, indexer: tuple[int | slice, ...], benchmark: BenchmarkFixture
) -> None:
    data = create_array(
        store=store,
        shape=(105,) * 3,
        dtype="uint8",
        chunks=(10,) * 3,
        shards=None,
        compressors=None,
        filters=None,
        fill_value=0,
    )

    data[:] = 1
    benchmark(getitem, data, indexer)
```
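To illustrate what each indexer selects, here are the same tuples applied to a NumPy array of the benchmark's shape (an illustration only; the benchmark applies them to a zarr array backed by a store):

```python
import numpy as np

data = np.ones((105,) * 3, dtype="uint8")

# Each indexer tuple exercises a different access pattern over the 105^3 array.
assert data[(0,) * 3] == 1                                   # single element
assert data[(slice(None),) * 3].shape == (105, 105, 105)     # full read
assert data[(slice(0, None, 4),) * 3].shape == (27, 27, 27)  # strided read
assert data[(slice(10),) * 3].shape == (10, 10, 10)          # corner block
assert data[(slice(10, -10, 4),) * 3].shape == (22, 22, 22)  # strided interior
assert data[slice(None), slice(0, 3, 2), slice(0, 10)].shape == (105, 2, 10)
```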

0 commit comments