Commit fec68a6

Merge branch 'main' into return-scalar-for-zero-dim-indexing
# Conflicts:
#	tests/test_api.py
2 parents 75a267b + 9e8b50a commit fec68a6

Some content is hidden: only part of this large commit's diff is shown below.

45 files changed, with 2931 additions and 314 deletions.

.github/ISSUE_TEMPLATE/config.yml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ contact_links:
  about: A new major feature should be discussed in the Zarr specifications repository.
  - name: Discuss something on ZulipChat
  url: https://ossci.zulipchat.com/
- about: For questions like "How do I do X with Zarr?", you can move to our ZulipChat.
+ about: For questions like "How do I do X with Zarr?", consider posting your question to our developer chat.
  - name: Discuss something on GitHub Discussions
  url: https://github.com/zarr-developers/zarr-python/discussions
  about: For questions like "How do I do X with Zarr?", you can move to GitHub Discussions.

.pre-commit-config.yaml

Lines changed: 9 additions & 8 deletions
@@ -6,11 +6,11 @@ ci:
  default_stages: [pre-commit, pre-push]
  repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
- rev: v0.9.4
+ rev: v0.9.9
  hooks:
- - id: ruff
- args: ["--fix", "--show-fixes"]
- - id: ruff-format
+ - id: ruff
+ args: ["--fix", "--show-fixes"]
+ - id: ruff-format
  - repo: https://github.com/codespell-project/codespell
  rev: v2.4.1
  hooks:
@@ -19,10 +19,10 @@ repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v5.0.0
  hooks:
- - id: check-yaml
- - id: trailing-whitespace
+ - id: check-yaml
+ - id: trailing-whitespace
  - repo: https://github.com/pre-commit/mirrors-mypy
- rev: v1.14.1
+ rev: v1.15.0
  hooks:
  - id: mypy
  files: src|tests
@@ -31,9 +31,10 @@ repos:
  - packaging
  - donfig
  - numcodecs[crc32c]
- - numpy==2.1 # until https://github.com/numpy/numpy/issues/28034 is resolved
+ - numpy==2.1 # until https://github.com/numpy/numpy/issues/28034 is resolved
  - typing_extensions
  - universal-pathlib
+ - obstore>=0.5.1
  # Tests
  - pytest
  - repo: https://github.com/scientific-python/cookie

README.md

Lines changed: 8 additions & 8 deletions
@@ -70,7 +70,7 @@
  </td>
  </tr>
  <tr>
- <td>Zulip</td>
+ <td>Developer Chat</td>
  <td>
  <a href="https://ossci.zulipchat.com/">
  <img src="https://img.shields.io/badge/zulip-join_chat-brightgreen.svg" />
@@ -101,13 +101,13 @@ Zarr is a Python package providing an implementation of compressed, chunked, N-d

  ## Main Features

- - [**Create**](https://zarr.readthedocs.io/en/stable/tutorial.html#creating-an-array) N-dimensional arrays with any NumPy `dtype`.
- - [**Chunk arrays**](https://zarr.readthedocs.io/en/stable/tutorial.html#chunk-optimizations) along any dimension.
- - [**Compress**](https://zarr.readthedocs.io/en/stable/tutorial.html#compressors) and/or filter chunks using any NumCodecs codec.
- - [**Store arrays**](https://zarr.readthedocs.io/en/stable/tutorial.html#tutorial-storage) in memory, on disk, inside a zip file, on S3, etc...
- - [**Read**](https://zarr.readthedocs.io/en/stable/tutorial.html#reading-and-writing-data) an array [**concurrently**](https://zarr.readthedocs.io/en/stable/tutorial.html#parallel-computing-and-synchronization) from multiple threads or processes.
- - Write to an array concurrently from multiple threads or processes.
- - Organize arrays into hierarchies via [**groups**](https://zarr.readthedocs.io/en/stable/tutorial.html#groups).
+ - [**Create**](https://zarr.readthedocs.io/en/stable/user-guide/arrays.html#creating-an-array) N-dimensional arrays with any NumPy `dtype`.
+ - [**Chunk arrays**](https://zarr.readthedocs.io/en/stable/user-guide/performance.html#chunk-optimizations) along any dimension.
+ - [**Compress**](https://zarr.readthedocs.io/en/stable/user-guide/arrays.html#compressors) and/or filter chunks using any NumCodecs codec.
+ - [**Store arrays**](https://zarr.readthedocs.io/en/stable/user-guide/storage.html) in memory, on disk, inside a zip file, on S3, etc...
+ - [**Read**](https://zarr.readthedocs.io/en/stable/user-guide/arrays.html#reading-and-writing-data) an array [**concurrently**](https://zarr.readthedocs.io/en/stable/user-guide/performance.html#parallel-computing-and-synchronization) from multiple threads or processes.
+ - [**Write**](https://zarr.readthedocs.io/en/stable/user-guide/arrays.html#reading-and-writing-data) to an array concurrently from multiple threads or processes.
+ - Organize arrays into hierarchies via [**groups**](https://zarr.readthedocs.io/en/stable/quickstart.html#hierarchical-groups).

  ## Where to get it

changes/1661.feature.rst

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+ Add experimental ObjectStore storage class based on obstore.

changes/2796.chore.rst

Lines changed: 0 additions & 1 deletion
This file was deleted.

changes/2924.chore.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+ Define a new versioning policy based on Intended Effort Versioning (EffVer). This replaces the old
+ Semantic Versioning-based policy.

docs/conf.py

Lines changed: 1 addition & 0 deletions
@@ -369,6 +369,7 @@ def setup(app: sphinx.application.Sphinx) -> None:
  "python": ("https://docs.python.org/3/", None),
  "numpy": ("https://numpy.org/doc/stable/", None),
  "numcodecs": ("https://numcodecs.readthedocs.io/en/stable/", None),
+ "obstore": ("https://developmentseed.org/obstore/latest/", None),
  }


docs/developers/contributing.rst

Lines changed: 68 additions & 73 deletions
@@ -261,90 +261,83 @@ Merging pull requests
  ~~~~~~~~~~~~~~~~~~~~~

  Pull requests submitted by an external contributor should be reviewed and approved by at least
- one core developers before being merged. Ideally, pull requests submitted by a core developer
- should be reviewed and approved by at least one other core developers before being merged.
+ one core developer before being merged. Ideally, pull requests submitted by a core developer
+ should be reviewed and approved by at least one other core developer before being merged.

  Pull requests should not be merged until all CI checks have passed (GitHub Actions
  Codecov) against code that has had the latest main merged in.

  Compatibility and versioning policies
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Because Zarr is a data storage library, there are two types of compatibility to
- consider: API compatibility and data format compatibility.
-
- API compatibility
- """""""""""""""""
-
- All functions, classes and methods that are included in the API
- documentation (files under ``docs/api/*.rst``) are considered as part of the Zarr **public API**,
- except if they have been documented as an experimental feature, in which case they are part of
- the **experimental API**.
-
- Any change to the public API that does **not** break existing third party
- code importing Zarr, or cause third party code to behave in a different way, is a
- **backwards-compatible API change**. For example, adding a new function, class or method is usually
- a backwards-compatible change. However, removing a function, class or method; removing an argument
- to a function or method; adding a required argument to a function or method; or changing the
- behaviour of a function or method, are examples of **backwards-incompatible API changes**.
-
- If a release contains no changes to the public API (e.g., contains only bug fixes or
- other maintenance work), then the micro version number should be incremented (e.g.,
- 2.2.0 -> 2.2.1). If a release contains public API changes, but all changes are
- backwards-compatible, then the minor version number should be incremented
- (e.g., 2.2.1 -> 2.3.0). If a release contains any backwards-incompatible public API changes,
- the major version number should be incremented (e.g., 2.3.0 -> 3.0.0).
-
- Backwards-incompatible changes to the experimental API can be included in a minor release,
- although this should be minimised if possible. I.e., it would be preferable to save up
- backwards-incompatible changes to the experimental API to be included in a major release, and to
- stabilise those features at the same time (i.e., move from experimental to public API), rather than
- frequently tinkering with the experimental API in minor releases.
+ Versioning
+ """"""""""
+ Versions of this library are identified by a triplet of integers with the form
+ ``<major>.<minor>.<patch>``, for example ``3.0.4``. A release of ``zarr-python`` is associated with a new
+ version identifier. That new identifier is generated by incrementing exactly one of the components of
+ the previous version identifier by 1. When incrementing the ``major`` component of the version identifier,
+ the ``minor`` and ``patch`` components are reset to 0. When incrementing the minor component,
+ the patch component is reset to 0.
+
+ Releases are classified by the library changes contained in that release. This classification
+ determines which component of the version identifier is incremented on release.
+
+ * ``major`` releases (for example, ``2.18.0`` -> ``3.0.0``) are for changes that will
+ require extensive adaptation efforts from many users and downstream projects.
+ For example, breaking changes to widely-used user-facing APIs should only be applied in a major release.
+
+
+ Users and downstream projects should carefully consider the impact of a major release before
+ adopting it.
+ In advance of a major release, developers should communicate the scope of the upcoming changes,
+ and help users prepare for them.
+
+ * ``minor`` releases (for example, ``3.0.0`` -> ``3.1.0``) are for changes that do not require
+ significant effort from most users or downstream projects to respond to. API changes
+ are possible in minor releases if the burden on users imposed by those changes is sufficiently small.
+
+ For example, a recently released API may need fixes or refinements that are breaking, but low impact
+ due to the recency of the feature. Such API changes are permitted in a minor release.
+
+
+ Minor releases are safe for most users and downstream projects to adopt.
+
+
+ * ``patch`` releases (for example, ``3.1.0`` -> ``3.1.1``) are for changes that contain no breaking
+ or behaviour changes for downstream projects or users. Examples of changes suitable for a patch release are
+ bugfixes and documentation improvements.
+
+
+ Users should always feel safe upgrading to the latest patch release.
+
+ Note that this versioning scheme is not consistent with `Semantic Versioning <https://semver.org/>`_.
+ Contrary to SemVer, the Zarr library may release breaking changes in ``minor`` releases, or even
+ ``patch`` releases under exceptional circumstances. But we should strive to avoid doing so.
+
+ A better model for our versioning scheme is `Intended Effort Versioning <https://jacobtomlinson.dev/effver/>`_,
+ or "EffVer". The guiding principle of EffVer is to categorize releases based on the *expected effort
+ required to upgrade to that release*.
+
+ Zarr developers should make changes as smooth as possible for users. This means making
+ backwards-compatible changes wherever possible. When a backwards-incompatible change is necessary,
+ users should be notified well in advance, e.g. via informative deprecation warnings.

  Data format compatibility
- """""""""""""""""""""""""
-
- The data format used by Zarr is defined by a specification document, which should be
- platform-independent and contain sufficient detail to construct an interoperable
- software library to read and/or write Zarr data using any programming language. The
- latest version of the specification document is available on the
- `Zarr specifications website <https://zarr-specs.readthedocs.io>`_.
-
- Here, **data format compatibility** means that all software libraries that implement a
- particular version of the Zarr storage specification are interoperable, in the sense
- that data written by any one library can be read by all others. It is obviously
- desirable to maintain data format compatibility wherever possible. However, if a change
- is needed to the storage specification, and that change would break data format
- compatibility in any way, then the storage specification version number should be
- incremented (e.g., 2 -> 3).
-
- The versioning of the Zarr software library is related to the versioning of the storage
- specification as follows. A particular version of the Zarr library will
- implement a particular version of the storage specification. For example, Zarr version
- 2.2.0 implements the Zarr storage specification version 2. If a release of the Zarr
- library implements a different version of the storage specification, then the major
- version number of the Zarr library should be incremented. E.g., if Zarr version 2.2.0
- implements the storage spec version 2, and the next release of the Zarr library
- implements storage spec version 3, then the next library release should have version
- number 3.0.0. Note however that the major version number of the Zarr library may not
- always correspond to the spec version number. For example, Zarr versions 2.x, 3.x, and
- 4.x might all implement the same version of the storage spec and thus maintain data
- format compatibility, although they will not maintain API compatibility.
-
- When to make a release
- ~~~~~~~~~~~~~~~~~~~~~~
+ ^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The Zarr library is an implementation of a file format standard defined externally --
+ see the `Zarr specifications website <https://zarr-specs.readthedocs.io>`_ for the list of
+ Zarr file format specifications.

- Ideally, any bug fixes that don't change the public API should be released as soon as
- possible. It is fine for a micro release to contain only a single bug fix.

- When to make a minor release is at the discretion of the core developers. There are no
- hard-and-fast rules, e.g., it is fine to make a minor release to make a single new
- feature available; equally, it is fine to make a minor release that includes a number of
- changes.
+ If an existing Zarr format version changes, or a new version of the Zarr format is released, then
+ the Zarr library will generally require changes. It is very likely that a new Zarr format will
+ require extensive breaking changes to the Zarr library, and so support for a new Zarr format in the
+ Zarr library will almost certainly come in a new ``major`` release.
+ When the Zarr library adds support for a new Zarr format, there may be a period of accelerated
+ changes as developers refine newly added APIs and deprecate old APIs. In such a transitional phase
+ breaking changes may be more frequent than usual.

- Major releases obviously need to be given careful consideration, and should be done as
- infrequently as possible, as they will break existing code and/or affect data
- compatibility in some way.

  Release procedure
  ~~~~~~~~~~~~~~~~~
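
The version-increment rule added to docs/developers/contributing.rst in the hunk above (increment exactly one component by 1, and reset the lower components to 0) can be illustrated with a short Python sketch. The `Version` class and `bump` function are hypothetical names chosen for this illustration; they are not part of zarr-python or its release tooling.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A <major>.<minor>.<patch> triplet (hypothetical helper, not a zarr API)."""

    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def bump(version: Version, component: str) -> Version:
    """Increment exactly one component by 1 and reset the lower components to 0."""
    if component == "major":
        return Version(version.major + 1, 0, 0)
    if component == "minor":
        return Version(version.major, version.minor + 1, 0)
    if component == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown component: {component!r}")


# The examples quoted in the policy text.
print(bump(Version(2, 18, 0), "major"))  # 3.0.0
print(bump(Version(3, 0, 0), "minor"))   # 3.1.0
print(bump(Version(3, 1, 0), "patch"))   # 3.1.1
```

Which component to bump is the judgment call the policy describes (expected upgrade effort); the sketch only covers the arithmetic once that decision has been made.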
@@ -387,5 +380,7 @@ pre-releases will be available under
  Post-release
  """"""""""""

- - Review and merge the pull request on the `conda-forge feedstock <https://github.com/conda-forge/zarr-feedstock>`_ that will be automatically generated.
+ - Review and merge the pull request on the
+ `conda-forge feedstock <https://github.com/conda-forge/zarr-feedstock>`_ that will be
+ automatically generated.
  - Create a new "Unreleased" section in the release notes
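
The versioning policy added to docs/developers/contributing.rst in this commit asks that users be notified of backwards-incompatible changes well in advance, for example through informative deprecation warnings. A minimal sketch of that pattern, using only the standard-library `warnings` module, might look like the following. The function names `old_api` and `new_api` are hypothetical and do not refer to any real zarr-python function.

```python
import warnings


def new_api(x: int) -> int:
    """The replacement function (hypothetical)."""
    return x * 2


def old_api(x: int) -> int:
    """Deprecated alias kept for a transition period (hypothetical)."""
    warnings.warn(
        "old_api() is deprecated and will be removed in a future release; "
        "use new_api() instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to old_api itself
    )
    return new_api(x)
```

Keeping the old name as a thin wrapper lets downstream code continue to work while the warning message tells users what to change and that a removal is coming.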

docs/index.rst

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ Zarr-Python
  **Useful links**:
  `Source Repository <https://github.com/zarr-developers/zarr-python>`_ |
  `Issue Tracker <https://github.com/zarr-developers/zarr-python/issues>`_ |
- `Zulip Chat <https://ossci.zulipchat.com/>`_ |
+ `Developer Chat <https://ossci.zulipchat.com/>`_ |
  `Zarr specifications <https://zarr-specs.readthedocs.io>`_

  Zarr-Python is a Python library for reading and writing Zarr groups and arrays. Highlights include:

docs/quickstart.rst

Lines changed: 22 additions & 0 deletions
@@ -119,6 +119,28 @@ Zarr allows you to create hierarchical groups, similar to directories::

  This creates a group with two datasets: ``foo`` and ``bar``.

+ Batch Hierarchy Creation
+ ~~~~~~~~~~~~~~~~~~~~~~~~
+
+ Zarr provides tools for creating a collection of arrays and groups with a single function call.
+ Suppose we want to copy existing groups and arrays into a new storage backend:
+
+ >>> # Create nested groups and add arrays
+ >>> root = zarr.group("data/example-3.zarr", attributes={'name': 'root'})
+ >>> foo = root.create_group(name="foo")
+ >>> bar = root.create_array(
+ ... name="bar", shape=(100, 10), chunks=(10, 10), dtype="f4"
+ ... )
+ >>> nodes = {'': root.metadata} | {k: v.metadata for k,v in root.members()}
+ >>> print(nodes)
+ >>> from zarr.storage import MemoryStore
+ >>> new_nodes = dict(zarr.create_hierarchy(store=MemoryStore(), nodes=nodes))
+ >>> new_root = new_nodes['']
+ >>> assert new_root.attrs == root.attrs
+
+ Note that :func:`zarr.create_hierarchy` will only initialize arrays and groups -- copying array data must
+ be done in a separate step.
+
  Persistent Storage
  ------------------
