Commit 9603b0e

committed
Merge remote-tracking branch 'upstream/v3' into tom/fix/dtype-str-special-case
2 parents d8f24a8 + 3964eab commit 9603b0e

16 files changed

+2071
-235
lines changed

docs/consolidated_metadata.rst

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
1+
Consolidated Metadata
2+
=====================
3+
4+
Zarr-Python implements the `Consolidated Metadata`_ extension to the Zarr Spec.
5+
Consolidated metadata can reduce the time needed to load the metadata for an
6+
entire hierarchy, especially when the metadata is being served over a network.
7+
Consolidated metadata essentially stores all the metadata for a hierarchy in the
8+
metadata of the root Group.
9+
10+
Usage
11+
-----
12+
13+
If consolidated metadata is present in a Zarr Group's metadata then it is used
14+
by default. The initial read to open the group will need to communicate with
15+
the store (reading from a file for a :class:`zarr.store.LocalStore`, making a
16+
network request for a :class:`zarr.store.RemoteStore`). After that, any subsequent
17+
metadata reads to get child Group or Array nodes will *not* require reads from the store.
18+
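The saving in store reads can be illustrated with a toy model. This is a minimal sketch and not Zarr-Python's implementation: ``CountingStore``, ``read_children``, and the simplified document layout (a ``consolidated_metadata`` key holding the child metadata directly) are assumptions made for illustration only.

```python
class CountingStore:
    """Hypothetical in-memory store that counts how many documents are read."""

    def __init__(self, documents):
        self._documents = dict(documents)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._documents[key]


# Illustrative metadata documents; keys and fields are simplified assumptions.
CHILDREN = {
    "a": {"node_type": "array", "shape": [1]},
    "b": {"node_type": "array", "shape": [2, 2]},
    "c": {"node_type": "array", "shape": [3, 3, 3]},
}


def read_children(store, names):
    """Open the root, then resolve each child's metadata."""
    root = store.get("zarr.json")  # the initial read always hits the store
    consolidated = root.get("consolidated_metadata")
    if consolidated is not None:
        # Child metadata is already embedded in the root document.
        return {name: consolidated[name] for name in names}
    # Without consolidated metadata, every child costs one more read.
    return {name: store.get(f"{name}/zarr.json") for name in names}


child_docs = {f"{name}/zarr.json": meta for name, meta in CHILDREN.items()}
plain = CountingStore({"zarr.json": {"node_type": "group"}, **child_docs})
consolidated = CountingStore(
    {"zarr.json": {"node_type": "group", "consolidated_metadata": CHILDREN}, **child_docs}
)

plain_result = read_children(plain, ["a", "b", "c"])
consolidated_result = read_children(consolidated, ["a", "b", "c"])
print(plain.reads, consolidated.reads)  # prints: 4 1
```

In this model both stores return the same child metadata, but the consolidated one answers from the single root read. Over a network, each avoided read is an avoided request.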
19+
In Python, the consolidated metadata is available on the ``.consolidated_metadata``
20+
attribute of the ``GroupMetadata`` object.
21+
22+
.. code-block:: python
23+
24+
>>> import zarr
25+
>>> store = zarr.store.MemoryStore({}, mode="w")
26+
>>> group = zarr.open_group(store=store)
27+
>>> group.create_array(shape=(1,), name="a")
28+
>>> group.create_array(shape=(2, 2), name="b")
29+
>>> group.create_array(shape=(3, 3, 3), name="c")
30+
>>> zarr.consolidate_metadata(store)
31+
32+
If we open that group, the Group's metadata has a :class:`zarr.ConsolidatedMetadata`
33+
whose ``metadata`` attribute maps each child's name to that child's metadata.
34+
35+
.. code-block:: python
36+
37+
>>> consolidated = zarr.open_group(store=store)
38+
>>> consolidated.metadata.consolidated_metadata.metadata
39+
{'b': ArrayV3Metadata(shape=(2, 2), fill_value=np.float64(0.0), ...),
40+
'a': ArrayV3Metadata(shape=(1,), fill_value=np.float64(0.0), ...),
41+
'c': ArrayV3Metadata(shape=(3, 3, 3), fill_value=np.float64(0.0), ...)}
42+
43+
Operations on the group to get children automatically use the consolidated metadata.
44+
45+
.. code-block:: python
46+
47+
>>> consolidated["a"] # no read / HTTP request to the Store is required
48+
<Array memory://.../a shape=(1,) dtype=float64>
49+
50+
With nested groups, the consolidated metadata is available on the children, recursively.
51+
52+
.. code-block:: python
53+
54+
>>> child = group.create_group("child", attributes={"kind": "child"})
55+
>>> grandchild = child.create_group("child", attributes={"kind": "grandchild"})
56+
>>> consolidated = zarr.consolidate_metadata(store)
57+
58+
>>> consolidated["child"].metadata.consolidated_metadata
59+
ConsolidatedMetadata(metadata={'child': GroupMetadata(attributes={'kind': 'grandchild'}, zarr_format=3, )}, ...)
60+
61+
Synchronization and Concurrency
62+
-------------------------------
63+
64+
Consolidated metadata is intended for read-heavy use cases on slowly changing
65+
hierarchies. For hierarchies where new nodes are constantly being added,
66+
removed, or modified, consolidated metadata may not be desirable.
67+
68+
1. It will add some overhead to each update operation, since the metadata
69+
would need to be re-consolidated to keep it in sync with the store.
70+
2. Readers using consolidated metadata will regularly see a "past" version
71+
of the metadata, as of the time they read the root node with its consolidated
72+
metadata. Later changes stay invisible to them until they re-read the root.
73+
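The staleness described in the second point can be sketched with plain dictionaries. This is a toy model under stated assumptions, not the library's behavior: ``consolidate`` and ``open_consolidated`` are hypothetical helpers, and the store layout is simplified to a flat ``dict``.

```python
import copy

# Toy store: a flat mapping from metadata keys to documents (an assumption).
store = {
    "zarr.json": {"node_type": "group"},
    "a/zarr.json": {"node_type": "array", "shape": [1]},
}


def consolidate(store):
    """Hypothetical helper: copy every child document into the root document."""
    children = {
        key.rsplit("/", 1)[0]: copy.deepcopy(doc)
        for key, doc in store.items()
        if key != "zarr.json"
    }
    store["zarr.json"]["consolidated_metadata"] = children


def open_consolidated(store):
    """Hypothetical helper: snapshot the child metadata embedded in the root."""
    return copy.deepcopy(store["zarr.json"]["consolidated_metadata"])


consolidate(store)
early_reader = open_consolidated(store)

# A writer adds a node but does not re-consolidate.
store["b/zarr.json"] = {"node_type": "array", "shape": [2, 2]}
stale_reader = open_consolidated(store)

# Re-consolidating costs an extra pass over the hierarchy on every update.
consolidate(store)
fresh_reader = open_consolidated(store)

print(sorted(stale_reader), sorted(fresh_reader))  # prints: ['a'] ['a', 'b']
```

Here ``early_reader`` never learns about ``b`` because it holds a snapshot, and even a reader opening after the write misses ``b`` until the writer re-consolidates.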
74+
.. _Consolidated Metadata: https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html#consolidated-metadata

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ Zarr-Python
10 10

11 11
getting_started
12 12
tutorial
13+
consolidated_metadata
13 14
api/index
14 15
spec
15 16
release
