Commit ac3bb0a

committed
merge upstream changes
2 parents 9b23137 + 62551c7 commit ac3bb0a

29 files changed: 1928 additions and 207 deletions

.github/workflows/gpu_test.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,8 @@ jobs:
3030

3131
steps:
3232
- uses: actions/checkout@v5
33+
with:
34+
fetch-depth: 0 # grab all branches and tags
3335
# - name: cuda-toolkit
3436
# uses: Jimver/[email protected]
3537
# id: cuda-toolkit

changes/1798.feature.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
Add a command-line interface to migrate v2 Zarr metadata to v3. Corresponding functions are also
2+
provided under ``zarr.metadata``.

changes/2992.bugfix.rst

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
Fix a bug preventing ``ones_like``, ``full_like``, ``empty_like``, ``zeros_like`` and ``open_like`` functions from accepting
2+
an explicit specification of array attributes such as ``shape``, ``dtype``, and ``chunks``. The functions ``full_like``,
3+
``empty_like``, and ``open_like`` now also more consistently infer a ``fill_value`` parameter from the provided array.

changes/3436.feature.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
Add a registry for chunk key encodings to make them extensible.
2+
Users can now implement a custom ``ChunkKeyEncoding`` and register it either via ``register_chunk_key_encoding`` or as an entry point under ``zarr.chunk_key_encoding``.
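
The registry idea described in this changelog entry can be sketched in plain Python. This is only an illustration of the general pattern (a name-to-class mapping with an entry-point fallback), not Zarr-Python's implementation: ``register_encoding``, ``get_encoding``, and ``DashEncoding`` are hypothetical names, and only the entry-point group ``zarr.chunk_key_encoding`` comes from the text above. It assumes Python 3.10+ for ``entry_points(group=...)``.

```python
from importlib.metadata import entry_points

# Hypothetical registry: maps an encoding name to the class implementing it.
_REGISTRY: dict[str, type] = {}


def register_encoding(name: str, cls: type) -> None:
    """Register ``cls`` under ``name`` (hypothetical helper, not Zarr's API)."""
    _REGISTRY[name] = cls


def get_encoding(name: str, group: str = "zarr.chunk_key_encoding") -> type:
    """Look up ``name``, falling back to entry points advertised under ``group``."""
    if name not in _REGISTRY:
        # Lazily load a class that an installed package advertises as an entry point.
        for ep in entry_points(group=group):
            if ep.name == name:
                _REGISTRY[name] = ep.load()
    try:
        return _REGISTRY[name]
    except KeyError:
        raise KeyError(f"no chunk key encoding registered under {name!r}") from None


class DashEncoding:
    """Toy encoding that joins chunk coordinates with '-' (illustrative only)."""

    def encode(self, coords: tuple[int, ...]) -> str:
        return "c-" + "-".join(map(str, coords))


register_encoding("dash", DashEncoding)
print(get_encoding("dash")().encode((0, 1, 2)))  # c-0-1-2
```

Explicit registration suits application code, while the entry-point route lets a separately installed package contribute an encoding without the application importing it directly.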

docs/user-guide/cli.rst

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
1+
.. _user-guide-cli:
2+
3+
Command-line interface
4+
========================
5+
6+
Zarr-Python provides a command-line interface that enables:
7+
8+
- migration of Zarr v2 metadata to v3
9+
- removal of v2 or v3 metadata
10+
11+
To see the available commands, run the following in a terminal:
12+
13+
.. code-block:: bash
14+
15+
$ zarr --help
16+
17+
To get help on an individual command, run:
18+
19+
.. code-block:: bash
20+
21+
$ zarr migrate --help
22+
23+
$ zarr remove-metadata --help
24+
25+
26+
Migrate metadata from v2 to v3
27+
------------------------------
28+
29+
Migrate to a separate location
30+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
31+
32+
To migrate the metadata of a Zarr array or group from v2 to v3, run:
33+
34+
.. code-block:: bash
35+
36+
$ zarr migrate v3 path/to/input.zarr path/to/output.zarr
37+
38+
This will write new ``zarr.json`` files to ``output.zarr``, leaving ``input.zarr`` untouched.
39+
Note that this migrates the entire Zarr hierarchy: if ``input.zarr`` contains multiple groups or arrays,
40+
a new ``zarr.json`` file is created for each of them.
41+
42+
Migrate in-place
43+
~~~~~~~~~~~~~~~~
44+
45+
If you would prefer to migrate the metadata in place, run:
46+
47+
.. code-block:: bash
48+
49+
$ zarr migrate v3 path/to/input.zarr
50+
51+
This will write new ``zarr.json`` files to ``input.zarr``, leaving the existing v2 metadata untouched.
52+
53+
To open the array or group using the new metadata, use:
54+
55+
.. code-block:: python
56+
57+
>>> import zarr
58+
>>> zarr_with_v3_metadata = zarr.open('path/to/input.zarr', zarr_format=3)
59+
60+
Once you are happy with the conversion, you can run the following to remove the old v2 metadata:
61+
62+
.. code-block:: bash
63+
64+
$ zarr remove-metadata v2 path/to/input.zarr
65+
66+
Note there is also a shortcut to migrate and remove v2 metadata in one step:
67+
68+
.. code-block:: bash
69+
70+
$ zarr migrate v3 path/to/input.zarr --remove-v2-metadata
71+
72+
73+
Remove metadata
74+
----------------
75+
76+
Remove v2 metadata using:
77+
78+
.. code-block:: bash
79+
80+
$ zarr remove-metadata v2 path/to/input.zarr
81+
82+
or v3 with:
83+
84+
.. code-block:: bash
85+
86+
$ zarr remove-metadata v3 path/to/input.zarr
87+
88+
By default, this will only allow removal of metadata if a valid alternative exists. For example, you can't
89+
remove v2 metadata unless v3 metadata exists at that location.
90+
91+
To override this behaviour, use ``--force``:
92+
93+
.. code-block:: bash
94+
95+
$ zarr remove-metadata v3 path/to/input.zarr --force
96+
97+
98+
Dry run
99+
--------
100+
All commands provide a ``--dry-run`` option that logs the changes a real run would make, without creating
101+
or modifying any files.
102+
103+
.. code-block:: bash
104+
105+
$ zarr migrate v3 path/to/input.zarr --dry-run
106+
107+
Dry run enabled - no new files will be created or changed. Log of files that would be created on a real run:
108+
Saving metadata to path/to/input.zarr/zarr.json
109+
110+
111+
Verbose
112+
--------
113+
You can also add ``--verbose`` **before** any command to see a full log of its actions:
114+
115+
.. code-block:: bash
116+
117+
$ zarr --verbose migrate v3 path/to/input.zarr
118+
119+
$ zarr --verbose remove-metadata v2 path/to/input.zarr
120+
121+
122+
Equivalent functions
123+
--------------------
124+
All features of the command-line interface are also available via functions under
125+
:mod:`zarr.metadata`.
126+
127+
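
When the CLI is driven from a script, the option ordering documented above (``--verbose`` before the subcommand, other flags after the paths) is easy to get wrong. The following is a minimal sketch of a helper that assembles the argument list; ``build_migrate_command`` is a hypothetical name, and it uses only the commands and flags shown in this guide. Whether every combination of flags is accepted by the real CLI is an assumption that should be checked with ``zarr migrate --help``.

```python
from typing import List, Optional


def build_migrate_command(
    input_path: str,
    output_path: Optional[str] = None,
    *,
    dry_run: bool = False,
    remove_v2_metadata: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Assemble the argv for ``zarr migrate v3`` (hypothetical helper)."""
    cmd = ["zarr"]
    if verbose:
        # --verbose is documented as going before the subcommand.
        cmd.append("--verbose")
    cmd += ["migrate", "v3", input_path]
    if output_path is not None:
        # A second path migrates to a separate location instead of in place.
        cmd.append(output_path)
    if dry_run:
        cmd.append("--dry-run")
    if remove_v2_metadata:
        cmd.append("--remove-v2-metadata")
    return cmd


cmd = build_migrate_command("path/to/input.zarr", dry_run=True, verbose=True)
print(" ".join(cmd))
# zarr --verbose migrate v3 path/to/input.zarr --dry-run
```

The resulting list can be passed to ``subprocess.run`` once the ``cli`` extra is installed; starting with ``dry_run=True`` gives a log of the files that would be written before anything changes on disk.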

docs/user-guide/consolidated_metadata.rst

Lines changed: 35 additions & 38 deletions
@@ -49,44 +49,41 @@ that can be used.:
4949
>>> from pprint import pprint
5050
>>> pprint(dict(consolidated_metadata.items()))
5151
{'a': ArrayV3Metadata(shape=(1,),
52-
data_type=Float64(endianness='little'),
53-
chunk_grid=RegularChunkGrid(chunk_shape=(1,)),
54-
chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
55-
separator='/'),
56-
fill_value=np.float64(0.0),
57-
codecs=(BytesCodec(endian=<Endian.little: 'little'>),
58-
ZstdCodec(level=0, checksum=False)),
59-
attributes={},
60-
dimension_names=None,
61-
zarr_format=3,
62-
node_type='array',
63-
storage_transformers=()),
64-
'b': ArrayV3Metadata(shape=(2, 2),
65-
data_type=Float64(endianness='little'),
66-
chunk_grid=RegularChunkGrid(chunk_shape=(2, 2)),
67-
chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
68-
separator='/'),
69-
fill_value=np.float64(0.0),
70-
codecs=(BytesCodec(endian=<Endian.little: 'little'>),
71-
ZstdCodec(level=0, checksum=False)),
72-
attributes={},
73-
dimension_names=None,
74-
zarr_format=3,
75-
node_type='array',
76-
storage_transformers=()),
77-
'c': ArrayV3Metadata(shape=(3, 3, 3),
78-
data_type=Float64(endianness='little'),
79-
chunk_grid=RegularChunkGrid(chunk_shape=(3, 3, 3)),
80-
chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
81-
separator='/'),
82-
fill_value=np.float64(0.0),
83-
codecs=(BytesCodec(endian=<Endian.little: 'little'>),
84-
ZstdCodec(level=0, checksum=False)),
85-
attributes={},
86-
dimension_names=None,
87-
zarr_format=3,
88-
node_type='array',
89-
storage_transformers=())}
52+
data_type=Float64(endianness='little'),
53+
chunk_grid=RegularChunkGrid(chunk_shape=(1,)),
54+
chunk_key_encoding=DefaultChunkKeyEncoding(separator='/'),
55+
fill_value=np.float64(0.0),
56+
codecs=(BytesCodec(endian=<Endian.little: 'little'>),
57+
ZstdCodec(level=0, checksum=False)),
58+
attributes={},
59+
dimension_names=None,
60+
zarr_format=3,
61+
node_type='array',
62+
storage_transformers=()),
63+
'b': ArrayV3Metadata(shape=(2, 2),
64+
data_type=Float64(endianness='little'),
65+
chunk_grid=RegularChunkGrid(chunk_shape=(2, 2)),
66+
chunk_key_encoding=DefaultChunkKeyEncoding(separator='/'),
67+
fill_value=np.float64(0.0),
68+
codecs=(BytesCodec(endian=<Endian.little: 'little'>),
69+
ZstdCodec(level=0, checksum=False)),
70+
attributes={},
71+
dimension_names=None,
72+
zarr_format=3,
73+
node_type='array',
74+
storage_transformers=()),
75+
'c': ArrayV3Metadata(shape=(3, 3, 3),
76+
data_type=Float64(endianness='little'),
77+
chunk_grid=RegularChunkGrid(chunk_shape=(3, 3, 3)),
78+
chunk_key_encoding=DefaultChunkKeyEncoding(separator='/'),
79+
fill_value=np.float64(0.0),
80+
codecs=(BytesCodec(endian=<Endian.little: 'little'>),
81+
ZstdCodec(level=0, checksum=False)),
82+
attributes={},
83+
dimension_names=None,
84+
zarr_format=3,
85+
node_type='array',
86+
storage_transformers=())}
9087

9188
Operations on the group to get children automatically use the consolidated metadata.:
9289

docs/user-guide/index.rst

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ User guide
1313
storage
1414
config
1515
v3_migration
16+
cli
1617

1718
Advanced Topics
1819
---------------

pyproject.toml

Lines changed: 5 additions & 1 deletion
@@ -68,6 +68,7 @@ remote = [
6868
gpu = [
6969
"cupy-cuda12x",
7070
]
71+
cli = ["typer"]
7172
# Development extras
7273
test = [
7374
"coverage>=7.10",
@@ -114,6 +115,9 @@ docs = [
114115
'pytest'
115116
]
116117

118+
[project.scripts]
119+
zarr = "zarr._cli.cli:app"
120+
117121

118122
[project.urls]
119123
issues = "https://github.com/zarr-developers/zarr-python/issues"
@@ -164,7 +168,7 @@ deps = ["minimal", "optional"]
164168

165169
[tool.hatch.envs.test.overrides]
166170
matrix.deps.dependencies = [
167-
{value = "zarr[remote, remote_tests, test, optional]", if = ["optional"]}
171+
{value = "zarr[remote, remote_tests, test, optional, cli]", if = ["optional"]}
168172
]
169173

170174
[tool.hatch.envs.test.scripts]

src/zarr/__init__.py

Lines changed: 58 additions & 0 deletions
@@ -1,3 +1,7 @@
1+
import functools
2+
import logging
3+
from typing import Literal
4+
15
from zarr._version import version as __version__
26
from zarr.api.synchronous import (
37
array,
@@ -37,6 +41,8 @@
3741
# in case setuptools-scm screws up and reports the version as 0.0.0
3842
assert not __version__.startswith("0.0.0")
3943

44+
_logger = logging.getLogger(__name__)
45+
4046

4147
def print_debug_info() -> None:
4248
"""
@@ -85,6 +91,58 @@ def print_packages(packages: list[str]) -> None:
8591
print_packages(optional)
8692

8793

94+
# The decorator ensures this always returns the same handler (and it is only
95+
# attached once).
96+
@functools.cache
97+
def _ensure_handler() -> logging.Handler:
98+
"""
99+
The first time this function is called, attach a `StreamHandler` using the
100+
same format as `logging.basicConfig` to the Zarr-Python root logger.
101+
102+
Return this handler every time this function is called.
103+
"""
104+
handler = logging.StreamHandler()
105+
handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
106+
_logger.addHandler(handler)
107+
return handler
108+
109+
110+
def set_log_level(
111+
level: Literal["NOTSET", "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
112+
) -> None:
113+
"""Set the logging level for Zarr-Python.
114+
115+
Zarr-Python uses the standard library `logging` framework under the root
116+
logger 'zarr'. This is a helper function to:
117+
118+
- set Zarr-Python's root logger level
119+
- set the root logger handler's level, creating the handler
120+
if it does not exist yet
121+
122+
Parameters
123+
----------
124+
level : str
125+
The logging level to set.
126+
"""
127+
_logger.setLevel(level)
128+
_ensure_handler().setLevel(level)
129+
130+
131+
def set_format(log_format: str) -> None:
132+
"""Set the format of logging messages from Zarr-Python.
133+
134+
Zarr-Python uses the standard library `logging` framework under the root
135+
logger 'zarr'. This sets the format of log messages from the root logger's StreamHandler.
136+
137+
Parameters
138+
----------
139+
log_format : str
140+
A string determining the log format (as defined in the standard library's `logging` module
141+
for logging.Formatter)
142+
"""
143+
_ensure_handler().setFormatter(logging.Formatter(fmt=log_format))
144+
145+
88146
__all__ = [
89147
"Array",
90148
"AsyncArray",

src/zarr/_cli/__init__.py

Whitespace-only changes.
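
The logging helpers this commit adds to ``src/zarr/__init__.py`` rely on ``functools.cache`` so that the handler is created and attached only once, however many times the level is set. The standalone sketch below reproduces that pattern outside Zarr so it can be run in isolation. The logger name ``demo_pkg`` and the in-memory stream are assumptions made for the demonstration; the real code logs under ``zarr`` and uses the default stream.

```python
import functools
import io
import logging

logger = logging.getLogger("demo_pkg")  # hypothetical package logger


@functools.cache
def ensure_handler() -> logging.Handler:
    """Attach a StreamHandler on the first call; return the same one afterwards."""
    handler = logging.StreamHandler(io.StringIO())  # in-memory stream for the demo
    handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
    logger.addHandler(handler)
    return handler


def set_log_level(level: str) -> None:
    """Set the level on both the logger and its single cached handler."""
    logger.setLevel(level)
    ensure_handler().setLevel(level)


set_log_level("INFO")
set_log_level("DEBUG")  # second call reuses the cached handler
logger.debug("hello")

print(len(logger.handlers))                # 1
print(ensure_handler().stream.getvalue())  # DEBUG:demo_pkg:hello
```

Without the cache, each call to ``set_log_level`` would attach another handler and every message would be emitted once per handler.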
