Commit 813839f

Merge pull request #211 from zarr-developers/storage-updates
Various storage updates
2 parents 0790546 + 8d43cfc commit 813839f

4 files changed

+1664
-60
lines changed

docs/_static/js/mermaid.js

Lines changed: 1589 additions & 0 deletions
Some generated files are not rendered by default.

docs/conf.py

Lines changed: 5 additions & 0 deletions
@@ -34,6 +34,11 @@
3434
'sphinx_reredirects',
3535
]
3636

37+
mermaid_version = ""
38+
html_js_files = [
39+
'js/mermaid.js', # v9.4.0
40+
]
41+
3742
# Display todos by setting to True
3843
todo_include_todos = True
3944

docs/v3/core/v3.0.rst

Lines changed: 54 additions & 60 deletions
@@ -18,8 +18,8 @@ Editors:
1818
Corresponding ZEP:
1919
`ZEP 1 — Zarr specification version 3 <https://zarr.dev/zeps/draft/ZEP0001.html>`_
2020

21-
Issue tracking and discussion overview:
22-
`GitHub project board <https://github.com/orgs/zarr-developers/projects/2>`_
21+
Issue tracking:
22+
`GitHub issues <https://github.com/zarr-developers/zarr-specs/labels/core-protocol-v3.0>`_
2323

2424
Suggest an edit for this spec:
2525
`GitHub editor <https://github.com/zarr-developers/zarr-specs/blob/main/docs/v3/core/v3.0.rst>`_
@@ -1124,6 +1124,9 @@ interface`_ subsection. The store interface can be implemented using a
11241124
variety of underlying storage technologies, described in the
11251125
subsection on `Store implementations`_.
11261126

1127+
Additionally, a store should specify a canonical URI format that can be used to
1128+
identify nodes in this store. Implementations should use the specified formats
1129+
when opening a Zarr hierarchy to automatically determine the appropriate store.
11271130

11281131
.. _abstract-store-interface:
11291132

@@ -1391,125 +1394,112 @@ array, the prefix is the empty string. For a non-root array with hierarchy path
13911394
- `(1, 0)`
13921395
- `foo/baz/c/1/0`
13931396

1397+
It is recommended that the root of a Zarr fileset ends with ``.zarr``
1398+
to indicate the start of a hierarchy to users.
13941399

13951400

13961401
Operations
13971402
----------
13981403

1399-
.. todo::
1400-
The following section describes possible operations of an implementation
1401-
as a guide-line. Those descriptions are not yet finalized.
1404+
The following section describes possible operations of an implementation as a
1405+
non-normative guide-line.
14021406

14031407
Let `P` be an arbitrary hierarchy path.
14041408

1405-
Let ``array_meta_key(P)`` be the array metadata key for `P`. Let
1406-
``group_meta_key(P)`` be the group metadata key for `P`.
1409+
Let ``meta_key(P)`` be the metadata key for `P`, ``P/zarr.json``.
14071410

14081411
Let ``data_key(P, j, i ...)`` be the data key for `P` for the chunk
14091412
with grid coordinates (`j`, `i`, ...).
14101413

14111414
Let "+" be the string concatenation operator.
14121415

1413-
.. note::
1414-
1415-
Store and implementation can assume that a client will not try to
1416-
create both an *array* and *group* at the same path, and thus
1417-
may skip check of existence of a group/array of the same name.
14181416

14191417
**Create a group**
14201418

14211419
To create an explicit group at hierarchy path `P`, perform
1422-
``set(group_meta_key(P), value)``, where `value` is the
1420+
``set(meta_key(P), value)``, where `value` is the
14231421
serialization of a valid group metadata document.
14241422

1425-
If `P` is a non-root path then it is **not** necessary to create
1426-
or check for the existence of metadata documents for groups at any
1427-
of the ancestor paths of `P`. Creating a group at path `P` implies
1423+
Creating a group at path `P` implies
14281424
the existence of groups at all ancestor paths of `P`.
14291425

14301426
**Create an array**
14311427

14321428
To create an array at hierarchy path `P`, perform
1433-
``set(array_meta_key(P), value)``, where `value` is the
1434-
serialisation of a valid array metadata document.
1429+
``set(meta_key(P), value)``, where `value` is the serialisation of a valid
1430+
array metadata document.
14351431

1436-
If `P` is a non-root path then it is **not** necessary to create
1437-
or check for the existence of metadata documents for groups at any
1438-
of the ancestor paths of `P`. Creating an array at path `P`
1439-
implies the existence of groups at all ancestor paths of `P`.
1432+
Creating an array at path `P` implies the existence of groups at all
1433+
ancestor paths of `P`.
14401434

14411435
**Store chunk data in an array**
14421436

14431437
To store chunk data in an array at path `P` and chunk coordinate (`j`, `i`,
1444-
...), perform ``set(data_key(P, j, i, ...), value)``, where
1445-
`value` is the serialisation of the corresponding chunk, encoded
1446-
according to the information in the array metadata stored under
1447-
the key ``array_meta_key(P)``.
1438+
...), perform ``set(data_key(P, j, i, ...), value)``, where `value` is the
1439+
serialisation of the corresponding chunk, encoded according to the
1440+
information in the array metadata stored under the key ``meta_key(P)``.
14481441

14491442
**Retrieve chunk data in an array**
14501443

14511444
To retrieve chunk data in an array at path `P` and chunk coordinate (`i`,
14521445
`j`, ...), perform ``get(data_key(P, j, i, ...), value)``. The returned
1453-
value is the serialisation of the corresponding chunk, encoded
1454-
according to the array metadata stored at ``array_meta_key(P)``.
1446+
value is the serialisation of the corresponding chunk, encoded according to
1447+
the array metadata stored at ``meta_key(P)``.
14551448

14561449
**Discover children of a group**
14571450

14581451
To discover the children of a group at hierarchy path `P`, perform
1459-
``list_dir(P + "/")``. Any returned prefix not being ``c`` or
1460-
starting with ``__`` indicates a child group implied by some
1461-
descendant group or array.
1452+
``list_dir(P + "/")``. Any returned prefix ``Q`` not starting with ``__``
1453+
indicates a child array or group. To determine whether the child is
1454+
an array or group, the document ``meta_key(Q)`` must be checked.
14621455

14631456
For example, if a group is created at path "/foo/bar" and an array
14641457
is created at path "/foo/baz/qux", then the store will contain the
14651458
keys "foo/bar/zarr.json" and "foo/baz/qux/zarr.json".
14661459
Groups at paths "/", "/foo" and "/foo/baz" have not been explicitly
14671460
created but are implied by their descendants. To list the children
1468-
of the group at path "/foo", perform ``list_dir("meta/foo/")``,
1469-
which will return the prefixes "meta/foo/bar" and "meta/foo/baz".
1461+
of the group at path "/foo", perform ``list_dir("/foo/")``,
1462+
which will return the prefixes "foo/bar" and "foo/baz".
14701463
From this it can be inferred that child groups or arrays
14711464
"/foo/bar" and "/foo/baz" are present.
14721465

1473-
If a store does not support any of the list operations then
1474-
discovery of group children is not possible, and the contents of
1475-
the hierarchy must be communicated by some other means, such as
1476-
via an extension, or via some out of band communication.
1466+
If a store does not support any of the list operations then discovery of
1467+
group children is not possible, and the contents of the hierarchy must be
1468+
communicated by some other means, such as via an extension (see
1469+
https://github.com/zarr-developers/zarr-specs/issues/15) or via some out of
1470+
band communication.
14771471

14781472
**Discover all nodes in a hierarchy**
14791473

1480-
To discover all nodes in a hierarchy, one can call
1481-
``list_prefix("meta/")``. All keys represent either explicit group or
1482-
arrays. All intermediate prefixes ending in a ``/`` are implicit
1474+
To discover all nodes in a hierarchy, one should discover the children of
1475+
the root of the hierarchy and then recursively list children of child
1476+
groups.
1477+
1478+
For hierarchies without group storage transformers one may also call
1479+
``list_prefix("/")``. All ``zarr.json`` keys represent either explicit
1480+
groups or arrays. All intermediate prefixes ending in a ``/`` are implicit
14831481
groups.
14841482

14851483
**Erase a group or array**
14861484

1487-
To erase an array at path `P`:
1488-
- erase the metadata document for the array, ``erase(array_meta_key(P))``
1489-
- erase all data keys which prefix have path pointing to this array,
1490-
``erase_prefix("data" + P + "/")``
1485+
To erase an array at path `P`, erase the metadata document and array data
1486+
for the array, ``erase_prefix(P + "/")``.
14911487

1492-
To erase an implicit group at path `P`:
1493-
- erase all nodes under this group - it should be sufficient to
1494-
perform ``erase_prefix("meta" + P + "/")`` and
1495-
``erase_prefix("data" + P + "/")``.
1496-
1497-
To erase an explicit group at path `P`:
1498-
- erase the metadata document for the group, ``erase(group_meta_key(P))``
1499-
- erase all nodes under this group - it should be sufficient to
1500-
perform ``erase_prefix("meta" + P + "/")`` and
1501-
``erase_prefix("data" + P + "/")``.
1488+
To erase an explicit or implicit group at path `P`: erase all nodes under
1489+
this group and its metadata document - it should be sufficient to perform
1490+
``erase_prefix(P + "/")``
15021491

15031492
**Determine if a node exists**
15041493

1505-
To determine if a node exists at path ``P``, try in the following
1506-
order ``get(array_meta_key(P))`` (success implies an array at
1507-
``P``); ``get(group_meta_key(P))`` (success implies an explicit
1508-
group at ``P``); ``list_dir("meta" + P + "/")`` (non-empty
1509-
result set implies an implicit group at ``P``).
1494+
To determine if a node exists at path ``P``, try in the following order
1495+
1496+
- ``get(meta_key(P))``
1497+
(success implies an array or explicit group at ``P``);
1498+
- ``list_dir(P + "/")``
1499+
(non-empty result set implies an implicit group at ``P``).
15101500

15111501
.. note::
1512-
For listable store, ``list_dir(parent(P))`` can be an alternative.
1502+
For listable stores, ``list_dir(parent(P))`` can be an alternative.
15131503

15141504

15151505
Storage transformers
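
Note on the hunk above: the revised operations (a single ``meta_key(P)``, child discovery via ``list_dir``, erasure via ``erase_prefix``, and the two-step existence check) can be sketched roughly as follows. This is a minimal illustration, not an implementation from the spec. ``MemoryStore``, ``key_prefix``, ``create_node``, ``node_kind`` and ``children`` are hypothetical names, the metadata documents are placeholders rather than complete valid documents, and ``list_dir`` is simplified to return prefixes only.

```python
import json


class MemoryStore:
    """Hypothetical in-memory stand-in for the abstract store interface."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)  # None when the key is absent

    def list_dir(self, prefix):
        # Simplification: return only child prefixes, not plain keys.
        found = set()
        for key in self._data:
            if key.startswith(prefix):
                rest = key[len(prefix):]
                if "/" in rest:
                    found.add(prefix + rest.split("/", 1)[0])
        return sorted(found)

    def erase_prefix(self, prefix):
        for key in [k for k in self._data if k.startswith(prefix)]:
            del self._data[key]


def key_prefix(path):
    """Assumed mapping from a hierarchy path ("/foo") to a key prefix ("foo/")."""
    stripped = path.strip("/")
    return stripped + "/" if stripped else ""


def meta_key(path):
    """Metadata key for a node: its key prefix followed by zarr.json."""
    return key_prefix(path) + "zarr.json"


def create_node(store, path, node_type):
    # Placeholder document; a real array document needs many more fields.
    doc = {"zarr_format": 3, "node_type": node_type}
    store.set(meta_key(path), json.dumps(doc).encode())


def node_kind(store, path):
    """Check the metadata document first, then fall back to list_dir."""
    raw = store.get(meta_key(path))
    if raw is not None:
        return json.loads(raw)["node_type"]
    if store.list_dir(key_prefix(path)):
        return "implicit group"
    return None


def children(store, path):
    """List child paths, skipping names that start with "__"."""
    result = []
    for prefix in store.list_dir(key_prefix(path)):
        if not prefix.rsplit("/", 1)[-1].startswith("__"):
            result.append("/" + prefix)
    return result


# The example from the text: a group at /foo/bar and an array at /foo/baz/qux.
store = MemoryStore()
create_node(store, "/foo/bar", "group")
create_node(store, "/foo/baz/qux", "array")
```

With this store, ``children(store, "/foo")`` yields the two child paths from the example in the text, and ``store.erase_prefix(key_prefix("/foo/baz"))`` removes the array together with its implied parent group.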
@@ -1607,6 +1597,10 @@ a prefix will erase all the implicit group in the prefix.
16071597
Care must thus be taken when erasing an array or a group if the parent needs to
16081598
be converted into an explicit group.
16091599

1600+
A race-condition arises if a client writes an array at path ``P``,
1601+
and another concurrently assumes ``P`` is an implicit group and writes subgroups or arrays into it.
1602+
Implementations may choose to never use implicit groups to avoid this.
1603+
16101604
Resizing
16111605
--------
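
Note on the race condition added in this hunk: one way to read "never use implicit groups" is to write an explicit group document at every ancestor path whenever a node is created. The sketch below assumes a plain ``dict`` as the store and uses hypothetical helper names (``ancestor_paths``, ``create_array_with_explicit_parents``). The metadata documents are placeholders. The check-then-write step is not atomic, so this only removes the reliance on implicit groups and does not by itself serialize concurrent writers.

```python
import json


def ancestor_paths(path):
    """Return all ancestor hierarchy paths of `path`, root first."""
    parts = [p for p in path.split("/") if p]
    return ["/" + "/".join(parts[:i]) for i in range(len(parts))]


def meta_key(path):
    """Assumed key layout: hierarchy path without leading slash, plus zarr.json."""
    stripped = path.strip("/")
    return (stripped + "/" if stripped else "") + "zarr.json"


def create_array_with_explicit_parents(store, path, array_doc):
    """Write explicit group documents for every ancestor, then the array."""
    # Placeholder group document, not necessarily complete.
    group_doc = json.dumps({"zarr_format": 3, "node_type": "group"}).encode()
    for ancestor in ancestor_paths(path):
        key = meta_key(ancestor)
        if key not in store:  # not atomic on a real, shared store
            store[key] = group_doc
    store[meta_key(path)] = json.dumps(array_doc).encode()


store = {}
create_array_with_explicit_parents(store, "/foo/baz/qux", {"node_type": "array"})
```

After the call, ``store`` holds a ``zarr.json`` key for the root, ``/foo``, ``/foo/baz`` and the array itself, so a second client inspecting ``/foo/baz`` finds an explicit document instead of inferring an implicit group.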
16121606

docs/v3/stores/filesystem/v1.0.rst

Lines changed: 16 additions & 0 deletions
@@ -182,6 +182,19 @@ in the section above.
182182
directory path ``dp``.
183183

184184

185+
Canonical URI
186+
=============
187+
188+
The canonical URI format for this store follows the file URI scheme of the base
189+
directory path, as defined in [RFC8089]_. For a Windows base directory path
190+
"c:\\my data" the canonical URI would be "file:///c:/my%20data", for a Posix
191+
base directory "/my data" it would be "file:///my%20data".
192+
193+
When expecting a URI string, but no scheme is present, implementations may
194+
assume a filesystem store with the (supposedly URI) string as the base directory
195+
path.
196+
197+
185198
Store limitations
186199
=================
187200

@@ -199,6 +212,9 @@ References
199212
Requirement Levels. March 1997. Best Current Practice. URL:
200213
https://tools.ietf.org/html/rfc2119
201214
215+
.. [RFC8089] M. Kerwin. The "file" URI Scheme. February 2017. Proposed Standard.
216+
URL: https://tools.ietf.org/html/rfc8089
217+
202218
203219
Change log
204220
==========
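
Note on the "Canonical URI" section added in this file: the two examples and the no-scheme fallback can be reproduced with the Python standard library, as sketched below. ``canonical_file_uri`` and ``resolve_location`` are hypothetical names. Treating a single-letter scheme as a Windows drive letter is an assumption of this sketch and is not stated in the spec text.

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import quote, urlsplit


def canonical_file_uri(base_dir, windows=False):
    """Build a file URI for an absolute base directory path."""
    path = PureWindowsPath(base_dir) if windows else PurePosixPath(base_dir)
    if not path.is_absolute():
        raise ValueError("base directory path must be absolute")
    posix = path.as_posix()  # e.g. "c:/my data" or "/my data"
    if not posix.startswith("/"):
        posix = "/" + posix  # a drive letter becomes the first path segment
    # Percent-encode everything except path separators and the drive colon.
    return "file://" + quote(posix, safe="/:")


def resolve_location(location):
    """Return (scheme, location), falling back to "file" without a scheme."""
    scheme = urlsplit(location).scheme
    # Assumption: a one-letter scheme is a Windows drive such as "c:".
    if len(scheme) <= 1:
        return ("file", location)
    return (scheme, location)


posix_uri = canonical_file_uri("/my data")
windows_uri = canonical_file_uri("c:\\my data", windows=True)
```

Here ``posix_uri`` and ``windows_uri`` match the two URIs quoted in the section, and ``resolve_location("/my data")`` falls back to the filesystem store because no scheme is present.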
