From 65bc69fe139da34dc8cc9591cf7fbd3335b3e5d5 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 09:32:18 +0100 Subject: [PATCH 01/64] Merge Davis proposal with ZEP0009 Remaining text blocks are likely to be re-used under the more general "Extension points" section. see: https://github.com/zarr-developers/zarr-specs/pull/312 --- docs/v3/codecs.rst | 4 ++- docs/v3/core/v3.0.rst | 65 ++++++++++++++++++++++--------------------- 2 files changed, 37 insertions(+), 32 deletions(-) diff --git a/docs/v3/codecs.rst b/docs/v3/codecs.rst index 0bb25363..e1f34bec 100644 --- a/docs/v3/codecs.rst +++ b/docs/v3/codecs.rst @@ -2,7 +2,9 @@ Codecs ====== -Under construction. +The following documents specify codecs which are defined by the maintainers +of the Zarr specification. Being listed below does not imply that a codec is +required to be implemented by implementations. .. toctree:: :glob: diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index 92c227d8..89fcb1ca 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -295,7 +295,7 @@ The following figure illustrates the first part of the terminology: *Codec* - The list of *codecs* specified for an array_ determine the encoded byte + The list of *codecs* specified for an array_ determines the encoded byte representation of each chunk in the store_. .. _metadata document: @@ -632,12 +632,9 @@ mandatory names: ^^^^^^^^^^ Specifies a list of codecs to be used for encoding and decoding chunks. The - value must be an array of objects, each object containing a member with - ``name`` whose value is a string referring to a v3 codec specification. The - codec object may also contain a ``configuration`` object which consists of - the parameter names and values as defined by the corresponding codec - specification. Since an ``array -> bytes`` codec must be specified, the - list cannot be empty. + value MUST be an array of extension definitions as defined under TODO. 
+ Because ``codecs`` MUST contain an ``array + -> bytes`` codec, the list cannot be empty (See :ref:`codecs `). The following members are optional: @@ -1204,19 +1201,37 @@ the following procedure: 4. The chunk array ``A`` is equal to ``EC[0]``. -Specifying codecs ------------------ +.. _codec-specification: -To allow for flexibility to define and implement new codecs, this -specification does not define any codecs, nor restrict the set of -codecs that may be used. Each codec must be defined via a separate -specification. In order to refer to codecs in array metadata -documents, each codec must have a unique identifier, which is a URI -that dereferences to a human-readable specification of the codec. A -codec specification must declare the codec identifier, and describe +Core codecs +----------- + +This spec defines a set of well-known codecs ("core codecs") which all Zarr implementations SHOULD implement in +order to ensure a minimal level of interoperability between Zarr implementations. +The list of core codecs is part of the Zarr v3 specification. +Changes to the list of core codecs MUST be made via the same protocol used for +changing the Zarr v3 specification. Changes to the list of core codecs SHOULD be made +in close collaboration with extant Zarr v3 implementations. A new core codec SHOULD be added to the +list when a sufficient number of Zarr implementations support or intend to support that codec. +An existing core codec SHOULD be removed from the list when a sufficient number of implementation +developers and Zarr users deem the codec worth removing, e.g. because of a technical flaw in the +algorithm underlying the codec. + +Extension codecs +---------------- + +To allow for flexibility to define and implement new codecs, the +list of codecs defined for an array MAY contain codecs which are +defined in separate specifications. 
In order to refer to codecs in array metadata +documents, each codec must have a unique identifier, which is either +a known "raw name" or as a URI as defined under :ref:`extensions_section`. +For ease of discovery, it is +recommended that codec specifications are contributed to the +registry of extensions (TODO). + +A codec specification must declare the codec identifier, and describe (or cite documents that describe) the encoding and decoding algorithms and the format of the encoded data. - A codec may have configuration parameters which modify the behaviour of the codec in some way. For example, a compression codec may have a compression level parameter, which is an integer that affects the @@ -1224,20 +1239,8 @@ resulting compression ratio of the data. Configuration parameters must be declared in the codec specification, including a definition of how configuration parameters are represented as JSON. -The Zarr core development team maintains a repository of codec -specifications, which are hosted alongside this specification in the -`zarr-specs GitHub repository`_, and which are -published on the `zarr-specs documentation Web site -`_. For ease of discovery, it is -recommended that codec specifications are contributed to the -zarr-specs GitHub repository. However, codec specifications may be -maintained by any group or organisation and published in any location -on the Web. For further details of the process for contributing a -codec specification to the zarr-specs GitHub repository, see -`ZEP 0 `_ which describes -the process for Zarr specification changes. - -Further details of how codecs are configured for an array are given in the `Array metadata`_ section. +Further details of how codecs are configured for an array are given in the +`Array metadata`_ section. 
Stores ====== From b109fb77c02b32ab0d3a9e9cd2b408770debcb0c Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 09:42:05 +0100 Subject: [PATCH 02/64] Start changelog --- docs/v3/core/v3.0.rst | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index 89fcb1ca..d52b8cbd 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -1697,6 +1697,12 @@ All notable and possibly implementation-affecting changes to this specification are documented in this section, grouped by the specification status and ordered by time. +3.1 +--- + +- Clarification of extensions. `PR #TODO + `_ + Changes after Provisional Acceptance ------------------------------------ - Support for implicit groups was removed. `PR #292 @@ -1713,7 +1719,7 @@ Changes after Provisional Acceptance `_ Draft Changes --------------------------- +------------- - Removed `extensions` field and clarified extension point behaviour, changing the config format of data-types, chunk-grid, storage-transformers and codecs. `PR #204 @@ -1751,14 +1757,4 @@ Draft Changes - The changelog is incomplete before 2022, please refer to the commits on GitHub. -@@tag@@ -------- - -Links: `view spec -`_; -`view source -`_ - -@@TODO summary of changes since previous tag. - .. _zarr-specs GitHub repository: https://github.com/zarr-developers/zarr-specs From 4c0e49431cadf477625b8b93d07bbf29f80b740c Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 14:43:03 +0100 Subject: [PATCH 03/64] Add definitions --- docs/v3/core/v3.0.rst | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index d52b8cbd..fea457ab 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -349,6 +349,24 @@ terminology for a use case of reading from an array: .. 
image:: terminology-read.excalidraw.png :width: 600 +*Extension points* + + Locations within a `metadata_document_` where extension-related + metadata can be found. Current extension points are listed in the core spec, + e.g. `codecs`, `data_type`. See :ref:`extension points ` below. + +*Extensions* + + Components defined in a `metadata_document`_ to + configure how metadata are interpreted by implementations. These + components include codecs, data types, chunk key encodings, chunk grids and + storage transformers. See :ref:`extension points ` below. + +*Core* + + Core concepts refers to the Zarr v3 core specification as defined in this document + not necessarily if something is a MUST for implementations. + .. _stored-representation: Stored representation From 05b4fa4ad9b10df9a266e246f74bbf8bb8b14204 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 14:56:20 +0100 Subject: [PATCH 04/64] Fix definitions --- docs/v3/core/v3.0.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index fea457ab..98ba7fb5 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -351,13 +351,13 @@ terminology for a use case of reading from an array: *Extension points* - Locations within a `metadata_document_` where extension-related + Locations within a `metadata document_` where extension-related metadata can be found. Current extension points are listed in the core spec, e.g. `codecs`, `data_type`. See :ref:`extension points ` below. *Extensions* - Components defined in a `metadata_document`_ to + Components defined in a `metadata document`_ to configure how metadata are interpreted by implementations. These components include codecs, data types, chunk key encodings, chunk grids and storage transformers. See :ref:`extension points ` below. 
From f9508d4a486330302dc77d818b6cde6b7f0e4e7a Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 14:58:28 +0100 Subject: [PATCH 05/64] slightly longer change log --- docs/v3/core/v3.0.rst | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index 98ba7fb5..0d77a626 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -1719,7 +1719,9 @@ by time. --- - Clarification of extensions. `PR #TODO - `_ + `_. With this change, + it is now possible to register new names for extension objects as well as use + URI. Changes after Provisional Acceptance ------------------------------------ From 34ac2826c41a12390a52f8ed459a3c2f9d8cad6a Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 15:29:42 +0100 Subject: [PATCH 06/64] New extensions section --- docs/v3/core/v3.0.rst | 116 +++++++++++++++++++++++++++++++++++++----- 1 file changed, 102 insertions(+), 14 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index 0d77a626..fff787db 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -1619,9 +1619,14 @@ Storage transformers may be stacked to combine different functionalities: .. _extensions_section: -Extension points -================ +Extensions +========== +This section describes how additional functionality can be defined +for Zarr datasets by extending the `metadata documents`_. + +Extension points +---------------- Different types of extensions can exist and they can be grouped as follows: @@ -1637,20 +1642,103 @@ array storage transformer `storage_transformers`_ If such extension points are used by groups or arrays, they are required. -See https://github.com/zarr-developers/zarr-specs/issues/49 for a list of -potential extensions. +New extension points may be proposed to the Zarr community through the ZEP +process. See `ZEP 0 `_ for more information. 
+ +Extension definition +-------------------- + +Extensions are defined in `metadata documents`_ either as objects or as +short-hand names. If using an object definition, the following pattern +MUST be followed:: + + { + "name": "", + "configuration": { ... } # optional + } + +If such an object is present, the field `must_understand` is implicitly set to +`True` and an object can explicitly set `must_understand=False` if +implementations can ignore its presence, following the current guidelines in +the v3 specification. + +Instead of extension objects, short-hand names may continue to be used if no +configuration metadata is required. They would be equivalent to extension +objects with just a `name` key. + +Extension naming +---------------- + +The identifier used in the `name` field of the extension definition can follow one of two forms: + +1. **Raw names** MUST be assigned within a central repository and follow the + compatibility and versioning [v3 stability + policy](https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html#stability-policy). + The name assignment is managed through the [`zarr-developers` Github + organization](https://github.com/zarr-developers), where each extension is + listed and either contains a spec document or links to a spec document. + Names are never unassigned or reassigned. The ZSC or by delegation a + maintainer team reserves the right to refuse name assignment at its own + discretion. + + - **Example:** `zstd` + - **Acceptd regex:** `^[a-z0-9-_.]+$` + +2. **URI-based names** can be used by anyone without further coordination + though the assumption is that users reasonably "own" the URI. The URL SHOULD + resolve to a human-readable explanation of the extension, but + implementations SHOULD NOT attempt to resolve the URL during processing. + There are no guarantees in terms of versioning or compatibility. However, + preserving backwards-compatibility is strongly encouraged. 
See the + [versioning section](#Versioning-and-spec-evolution) below. + + - **Example:** `https://example.com/zarr3/consolidated-metadata` + - **Accepted regex:** `^https?:\/\/[^/?#]+[^?#]*$` + +Extension example +----------------- + +The following example represents an Array showing many of the proposed changes +described above:: + + { + "zarr_format": 3, + "data_type": "https://example.com/zarr/string", // URI-based name, short-hand name + "chunk_key_encoding": { + "name": "default", // core + "configuration": { "separator": "." } + }, + "codecs": [ + { + "name": "https://numcodecs.dev/vlen-utf8" // URI-based name + }, + { + "name": "zstd", // raw name + "configuration": { ... } + } + ], + "chunk_grid": { + "name": "regular", // core + "configuration": { "chunk_shape": [ 32 ] } + }, + "shape": [ 128 ], + "dimension_names": [ "x" ], + "attributes": { ... }, + "storage_transformers": [] + } + +Extension specifications +------------------------ -Specifications for new extensions are recommended to be published in the -https://github.com/zarr-developers/zarr-specs repository via the -`ZEP process `_. If a specification -is published decentralized (e.g. for initial experimentation or due to a very -specialized scope), it must use a URL in the `name` key of its metadata, which -identifies the publishing organization or individual, and should point to the -specification of the extension. +There is no strict requirement for extensions to have a formal specification. +However, for adoption in the community it is STRONGLY RECOMMENDED to write a +specification. -Future versions of this specification may also add new core features by adding new top-level -metadata keys. Such features are required by default. However, if the value of an unknown feature -is an object containing the key-value pair ``"must_understand": false``, it can be ignored. 
+For extensions with raw names, the `zarr-developers/zarr-extensions` repository +SHOULD either contain the specification directly or link to the official location. +For extensions with URI-based names, it is RECOMMENDED to publish the specification +under the URI of the extension. Additionally, URI-based extensions MAY also register +themselves under the `zarr-extensions` repository for better discovery. Implementation Notes ==================== From 16e34ca617290ef6e1be2a1ff4850b941394981d Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 15:29:53 +0100 Subject: [PATCH 07/64] Update array metadata section --- docs/v3/core/v3.0.rst | 49 ++++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 17 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index fff787db..e67dfefc 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -458,23 +458,28 @@ Array metadata -------------- Each Zarr array in a hierarchy must have an array metadata document, named -``zarr.json``. This document must contain a single object with the following +``zarr.json``. + +Mandatory +^^^^^^^^^ + +This document must contain a single object with the following mandatory names: ``zarr_format`` -^^^^^^^^^^^^^^^ +"""""""""""""""" An integer defining the version of the storage specification to which the array store adheres, must be ``3`` here. ``node_type`` -^^^^^^^^^^^^^^^ +""""""""""""""" A string defining the type of hierarchy node element, must be ``array`` here. ``shape`` -^^^^^^^^^ +""""""""" An array of integers providing the length of each dimension of the Zarr array. For example, a value ``[10, 20]`` indicates a @@ -482,7 +487,7 @@ mandatory names: 10 and the second dimension has length 20. ``data_type`` -^^^^^^^^^^^^^ +""""""""""""" The data type of the Zarr array. 
If the data type is defined in this specification, then the value must be the data type @@ -498,7 +503,7 @@ mandatory names: is optional and its value is defined by the extension. ``chunk_grid`` -^^^^^^^^^^^^^^ +"""""""""""""" The chunk grid of the Zarr array. If the chunk grid is a regular chunk grid as defined in this specification, then the value must be an object with the @@ -517,7 +522,7 @@ mandatory names: ``configuration`` is optional and defined by the extension. ``chunk_key_encoding`` -^^^^^^^^^^^^^^^^^^^^^^ +"""""""""""""""""""""" The mapping from chunk grid cell coordinates to keys in the underlying store. @@ -583,7 +588,7 @@ mandatory names: to be used when writing new arrays. ``fill_value`` -^^^^^^^^^^^^^^ +"""""""""""""" Provides an element value to use for uninitialised portions of the Zarr array. @@ -647,17 +652,20 @@ mandatory names: chosen MUST be recorded in the metadata. ``codecs`` -^^^^^^^^^^ +"""""""""" Specifies a list of codecs to be used for encoding and decoding chunks. The value MUST be an array of extension definitions as defined under TODO. Because ``codecs`` MUST contain an ``array -> bytes`` codec, the list cannot be empty (See :ref:`codecs `). +Optional +^^^^^^^^ + The following members are optional: ``attributes`` -^^^^^^^^^^^^^^ +"""""""""""""" The value must be an object. The object may contain any key/value pairs, where the key must be a string and the value can be an arbitrary @@ -673,7 +681,7 @@ The following members are optional: https://github.com/zarr-developers/zeps/pull/28. ``storage_transformers`` -^^^^^^^^^^^^^^^^^^^^^^^^ +"""""""""""""""""""""""" Specifies a stack of `storage transformers`_. Each value in the list must be an object containing the names ``name`` and optionally ``configuration``. @@ -684,7 +692,7 @@ The following members are optional: absent no storage transformer is used, same for an empty list. ``dimension_names`` -^^^^^^^^^^^^^^^^^^^ +""""""""""""""""""" Specifies dimension names, e.g. 
``["x", "y", "z"]``. If specified, must be an array of strings or null objects with the same length as ``shape``. An @@ -700,11 +708,18 @@ The following members are optional: same dimension name across multiple arrays within the same Zarr hierarchy, but extensions or specific applications may do so. -The array metadata object must not contain any other names. -Those are reserved for future versions of this specification. -An implementation must fail to open Zarr hierarchies, groups -or arrays with unknown metadata fields, with the exception of -objects with a ``"must_understand": false`` key-value pair. +Extensions +^^^^^^^^^^ + +All other names found in the metadata object MUST be interpreted +following the `extensions_section`_. +An implementation MUST fail to open Zarr hierarchies, groups +or arrays if any metadata fields are present which (a) the +implementation does not recognize and (b) are not explicitly +set to ``"must_understand": false``. + +Example +^^^^^^^ For example, the array metadata JSON document below defines a two-dimensional array of 64-bit little-endian floating point numbers, From d2f6f9d8e262fb715a2db147c93aa268f1eafcd8 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 19:23:34 +0100 Subject: [PATCH 08/64] Update group metadata section --- docs/v3/core/v3.0.rst | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index e67dfefc..3d1da397 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -810,29 +810,48 @@ above, but using a (currently made up) extension data type:: Group metadata -------------- +Mandatory +^^^^^^^^^ + A Zarr group metadata object must contain the following mandatory key: ``zarr_format`` -^^^^^^^^^^^^^^^ +""""""""""""""" An integer defining the version of the storage specification to which the array store adheres, must be ``3`` here. 
``node_type`` -^^^^^^^^^^^^^^^ +""""""""""""""" A string defining the type of hierarchy node element, must be ``group`` here. +Optional +^^^^^^^^ + Optional keys: ``attributes`` -^^^^^^^^^^^^^^ +"""""""""""""" The value must be an object. The object may contain any key/value pairs, where the key must be a string and the value can be an arbitrary JSON literal. Intended to allow storage of arbitrary user metadata. +Extensions +^^^^^^^^^^ + +All other names found in the metadata object MUST be interpreted +following the `extensions_section`_. +An implementation MUST fail to open Zarr hierarchies, groups +or arrays if any metadata fields are present which (a) the +implementation does not recognize and (b) are not explicitly +set to ``"must_understand": false``. + +Example +^^^^^^^ + For example, the JSON document below defines a group:: { From 1d85e707c80a7475ab2cb6f44dfa9923ceb42d6b Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 19:23:44 +0100 Subject: [PATCH 09/64] Clean the extension listing pages --- docs/v3/array-storage-transformers.rst | 12 ++++++++++-- docs/v3/codecs.rst | 2 +- docs/v3/data-types.rst | 4 +++- docs/v3/stores.rst | 3 ++- docs/v3/stores/filesystem/v1.0.rst | 2 +- 5 files changed, 17 insertions(+), 6 deletions(-) diff --git a/docs/v3/array-storage-transformers.rst b/docs/v3/array-storage-transformers.rst index f0d56221..9e03f306 100644 --- a/docs/v3/array-storage-transformers.rst +++ b/docs/v3/array-storage-transformers.rst @@ -2,12 +2,20 @@ Array Storage Transformers ========================== -Under construction. +.. COMMENT TO BE REMOVED WHEN ONE IS ADDED -.. toctree:: + The following documents specify transformers which are defined by the maintainers of + the Zarr specification. Being listed below does not imply that a transformer is + required to be implemented by implementations. 
+ + toctree:: :glob: :maxdepth: 1 :titlesonly: :caption: Contents: array-storage-transformers/*/* + +Currently, no transformers are defined by the maintainers of +the Zarr specification. Being listed below does not imply that a transformer is +required to be implemented by implementations. diff --git a/docs/v3/codecs.rst b/docs/v3/codecs.rst index e1f34bec..1f8c41ce 100644 --- a/docs/v3/codecs.rst +++ b/docs/v3/codecs.rst @@ -4,7 +4,7 @@ Codecs The following documents specify codecs which are defined by the maintainers of the Zarr specification. Being listed below does not imply that a codec is -required to be implemented by implementations. +required to be implemented by all implementations. .. toctree:: :glob: diff --git a/docs/v3/data-types.rst b/docs/v3/data-types.rst index a8e9a10f..5bbbe971 100644 --- a/docs/v3/data-types.rst +++ b/docs/v3/data-types.rst @@ -2,7 +2,9 @@ Data Types ========== -Under construction. +The following documents specify data types which are defined by the maintainers of +the Zarr specification. Being listed below does not imply that a data type is +required to be implemented by implementations. .. toctree:: :glob: diff --git a/docs/v3/stores.rst b/docs/v3/stores.rst index 9c99f324..164b5637 100644 --- a/docs/v3/stores.rst +++ b/docs/v3/stores.rst @@ -2,7 +2,8 @@ Stores ====== -Under construction. +The following documents specify stores which are defined by the maintainers of the Zarr specification. +Being listed below does not imply that a store is required to be implemented by all implementations. .. toctree:: :glob: diff --git a/docs/v3/stores/filesystem/v1.0.rst b/docs/v3/stores/filesystem/v1.0.rst index 7ff55cee..3408c3cd 100644 --- a/docs/v3/stores/filesystem/v1.0.rst +++ b/docs/v3/stores/filesystem/v1.0.rst @@ -207,4 +207,4 @@ References Change log ========== -@@TODO +No changes yet. 
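Review note on PATCH 07/64 and PATCH 08/64: both add an "Extensions" subsection requiring implementations to fail when a metadata field is present that (a) they do not recognize and (b) is not explicitly set to ``"must_understand": false``. A minimal Python sketch of that rule follows. The function name `find_ignorable_fields` and the two `KNOWN_*_FIELDS` sets are hypothetical helpers written for this note, not part of the specification or of any Zarr library; the field lists are simply the mandatory and optional names quoted in the hunks above.

```python
# Hypothetical helper for illustration; not taken from the spec or any Zarr library.
# The known-field sets list the names that appear in the array/group metadata hunks.
KNOWN_ARRAY_FIELDS = {
    "zarr_format", "node_type", "shape", "data_type", "chunk_grid",
    "chunk_key_encoding", "fill_value", "codecs", "attributes",
    "storage_transformers", "dimension_names",
}
KNOWN_GROUP_FIELDS = {"zarr_format", "node_type", "attributes"}


def find_ignorable_fields(metadata, known_fields):
    """Return unknown field names that may be ignored.

    Raises ValueError for any unknown field that is not an object
    carrying the key-value pair "must_understand": false.
    """
    ignorable = []
    for name, value in metadata.items():
        if name in known_fields:
            continue
        # Only an explicit JSON false (Python False) opts out of the requirement.
        if isinstance(value, dict) and value.get("must_understand") is False:
            ignorable.append(name)
            continue
        raise ValueError(
            f"unknown metadata field that must be understood: {name!r}"
        )
    return ignorable
```

Under these assumptions, a group document with an extra field such as ``"x-hint": {"must_understand": false}`` opens normally and the field is reported as ignorable, while an extra field with any other value makes the open fail.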
From 43e3862bdd66a07d5a4684afdb044d5464069cdc Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 19:46:56 +0100 Subject: [PATCH 10/64] Also list no datatypes as defined --- docs/v3/data-types.rst | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/docs/v3/data-types.rst b/docs/v3/data-types.rst index 5bbbe971..809bd830 100644 --- a/docs/v3/data-types.rst +++ b/docs/v3/data-types.rst @@ -2,14 +2,20 @@ Data Types ========== -The following documents specify data types which are defined by the maintainers of -the Zarr specification. Being listed below does not imply that a data type is -required to be implemented by implementations. +.. COMMENT TO BE REMOVED WHEN ONE IS ADDED + + The following documents specify data types which are defined by the maintainers of + the Zarr specification. Being listed below does not imply that a data type is + required to be implemented by implementations. -.. toctree:: + toctree:: :glob: :maxdepth: 1 :titlesonly: :caption: Contents: data-types/*/* + +Currently, no data types are defined by the maintainers of +the Zarr specification. Being listed below does not imply that a data type is +required to be implemented by implementations. From c1accfecb72d7f584779829ea638920417a4a7c7 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 19:59:46 +0100 Subject: [PATCH 11/64] Link more terms to extensions --- docs/v3/core/v3.0.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index 3d1da397..b7d06623 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -263,7 +263,7 @@ The following figure illustrates the first part of the terminology: contain. For example, the 32-bit signed integer data type defines binary representations for all integers in the range −2,147,483,648 to 2,147,483,647. This specification only defines a limited set of data types, - but extensions may define other data types. 
+ but :ref:`extensions` may define other data types. .. _chunk: .. _chunks: @@ -287,7 +287,7 @@ The following figure illustrates the first part of the terminology: The chunks_ of an array_ are organised into a grid. This specification only considers the case where all chunks_ have the same chunk shape and the chunks form a regular grid. However, - extensions may define other grid types such as + :ref:`extensions` may define other grid types such as rectilinear grids. .. _codec: From 454faaf6123c122ff845834d757b3e7eff2280fc Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 12 Feb 2025 20:11:53 +0100 Subject: [PATCH 12/64] More crosslinks and identifier clarifications --- docs/v3/core/v3.0.rst | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index b7d06623..dfc29b3b 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -494,9 +494,10 @@ mandatory names: identifier provided as a string. For example, ``"float64"`` for little-endian 64-bit floating point number. - The ``data_type`` value is an extension point and may be defined by a data + The ``data_type`` value is an :ref:`extension point` + and may be defined by a data type extension. If the data type is defined by an extension, then the value - may be either a plain string or an object containing the members ``name`` + may be either a plain string (incl. URI) or an object containing the members ``name`` and optionally ``configuration``. A plain string is equivalent to specifying an object with just a ``name`` member. The ``name`` is required and its value must refer to a v3 data type specification. ``configuration`` @@ -516,7 +517,8 @@ mandatory names: means a regular grid where the chunks have length 2 along the first dimension and length 5 along the second dimension. 
- The ``chunk_grid`` value is an extension point and may be defined by an + The ``chunk_grid`` value is an :ref:`extension point` + and may be defined by an extension. If the chunk grid type is defined by an extension, then ``name`` must be a string referring to a v3 chunk grid specification. The ``configuration`` is optional and defined by the extension. @@ -655,7 +657,7 @@ mandatory names: """""""""" Specifies a list of codecs to be used for encoding and decoding chunks. The - value MUST be an array of extension definitions as defined under TODO. + value MUST be an array of :ref:`extension definitions`. Because ``codecs`` MUST contain an ``array -> bytes`` codec, the list cannot be empty (See :ref:`codecs `). @@ -929,7 +931,8 @@ should be interpreted. This core specification defines a limited set of data types to represent boolean values, integers, and floating point -numbers. Extensions may define additional data types. All of the data +numbers. :ref:`Extensions` may define additional +data types. All of the data types defined here have a fixed size, in the sense that all values require the same number of bytes. However, extensions may define variable sized data types. @@ -1020,7 +1023,8 @@ chunk, and there are no gaps or overlaps between chunks. In general there are different possible types of grids. The core specification defines the regular grid type, where all chunks are -hyperrectangles of the same shape. Extensions may define other grid +hyperrectangles of the same shape. +:ref:`Extensions` may define other grid types, such as rectilinear grids where chunks are still hyperrectangles but do not all share the same shape. 
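Review note on the naming rules from PATCH 06/64: raw names and URI-based names are distinguished by the two regular expressions given there, and a short-hand string is stated to be equivalent to an object with just a `name` key. The Python sketch below applies those two patterns verbatim. The function names `classify_extension_name` and `normalize_extension` are hypothetical, chosen for this note only, and the error handling is an assumption rather than anything the patch mandates.

```python
import re

# Patterns copied from the "Extension naming" section of the patch.
RAW_NAME = re.compile(r"^[a-z0-9-_.]+$")
URI_NAME = re.compile(r"^https?:\/\/[^/?#]+[^?#]*$")


def classify_extension_name(name):
    """Return "raw" or "uri" for a valid name; raise ValueError otherwise.

    Hypothetical helper: the spec defines the patterns, not this function.
    """
    if URI_NAME.fullmatch(name):
        return "uri"
    if RAW_NAME.fullmatch(name):
        return "raw"
    raise ValueError(f"not a valid extension name: {name!r}")


def normalize_extension(value):
    """Expand a short-hand string into the equivalent extension object."""
    if isinstance(value, str):
        value = {"name": value}
    if not isinstance(value, dict) or "name" not in value:
        raise ValueError("extension definition needs a 'name' member")
    classify_extension_name(value["name"])  # validates the identifier
    return value
```

For instance, ``zstd`` classifies as a raw name and ``https://example.com/zarr3/consolidated-metadata`` as a URI-based name, while a name with uppercase letters or a URL with a query string is rejected by both patterns. No network access is involved, in line with the statement that implementations SHOULD NOT resolve the URL during processing.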
From ef69ff1fe4245803f337e04da97ef56b0e2d3b7a Mon Sep 17 00:00:00 2001 From: Norman Rzepka Date: Mon, 17 Feb 2025 15:44:24 +0100 Subject: [PATCH 13/64] add zarr-extensions repo --- docs/v3/core/v3.0.rst | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index dfc29b3b..a1652116 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -131,10 +131,10 @@ with implementation B. Therefore, data is only marked with the respective major version, unknown features are auto-discovered via the metadata document. -Notably, this excludes extension points such as codecs, data types, chunk grids +Notably, this excludes extensions such as codecs, data types, chunk grids and storage transformers from the compatibility of the core specification, as -well as store support. However, versioned extension points and stores are also -expected to follow this stability policy. +well as store support. However, extensions and stores are also RECOMMENDED to +follow this stability policy. Document conventions ==================== @@ -1710,17 +1710,17 @@ Extension naming The identifier used in the `name` field of the extension definition can follow one of two forms: 1. **Raw names** MUST be assigned within a central repository and follow the - compatibility and versioning [v3 stability - policy](https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html#stability-policy). - The name assignment is managed through the [`zarr-developers` Github - organization](https://github.com/zarr-developers), where each extension is + compatibility and versioning v3 `stability policy`_. + The name assignment is managed through the `zarr-extensions + `_ Github + repository, where each extension is listed and either contains a spec document or links to a spec document. - Names are never unassigned or reassigned. The ZSC or by delegation a + Names are never unassigned or reassigned. 
The Zarr Steering Council or by delegation a maintainer team reserves the right to refuse name assignment at its own discretion. - - **Example:** `zstd` - - **Acceptd regex:** `^[a-z0-9-_.]+$` + - **Example:** ``zstd`` + - **Accepted regex:** ``^[a-z0-9-_.]+$`` 2. **URI-based names** can be used by anyone without further coordination though the assumption is that users reasonably "own" the URI. The URL SHOULD @@ -1730,8 +1730,8 @@ The identifier used in the `name` field of the extension definition can follow o preserving backwards-compatibility is strongly encouraged. See the [versioning section](#Versioning-and-spec-evolution) below. - - **Example:** `https://example.com/zarr3/consolidated-metadata` - - **Accepted regex:** `^https?:\/\/[^/?#]+[^?#]*$` + - **Example:** ``https://example.com/zarr3/consolidated-metadata`` + - **Accepted regex:** ``^https?:\/\/[^/?#]+[^?#]*$`` Extension example ----------------- @@ -1772,11 +1772,11 @@ There is no strict requirement for extensions to have a formal specification. However, for adoption in the community it is STRONGLY RECOMMENDED to write a specification. -For extensions with raw names, the `zarr-developers/zarr-extensions` repository +For extensions with raw names, the `zarr-extensions `_ repository SHOULD either contain the specification directly or link to the official location. For extensions with URI-based names, it is RECOMMENDED to publish the specification under the URI of the extension. Additionally, URI-based extensions MAY also register -themselves under the `zarr-extensions` repository for better discovery. +themselves under the `zarr-extensions `_ repository for better discovery. 
Implementation Notes ==================== From 3d448c1260452fd4649789f863a99e86ffe0a081 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Tue, 18 Feb 2025 13:35:09 +0100 Subject: [PATCH 14/64] Remove TODOs with PR and repo link --- docs/v3/core/v3.0.rst | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index a1652116..2b806ff6 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -1283,7 +1283,8 @@ documents, each codec must have a unique identifier, which is either a known "raw name" or as a URI as defined under :ref:`extensions_section`. For ease of discovery, it is recommended that codec specifications are contributed to the -registry of extensions (TODO). +registry of extensions +(`zarr-extensions `_). A codec specification must declare the codec identifier, and describe (or cite documents that describe) the encoding and decoding algorithms @@ -1844,8 +1845,8 @@ by time. 3.1 --- -- Clarification of extensions. `PR #TODO - `_. With this change, +- Clarification of extensions. `PR #330 + `_. With this change, it is now possible to register new names for extension objects as well as use URI. From e6200c87091a3225785e7227dc6affca61266023 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Tue, 18 Feb 2025 13:39:20 +0100 Subject: [PATCH 15/64] Move 'core data types' to a subpage --- docs/v3/core/v3.0.rst | 44 +++-------------------------------- docs/v3/data-types.rst | 53 +++++++++++++++++++++++++++++++----------- 2 files changed, 43 insertions(+), 54 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index 2b806ff6..cd3d9f0d 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -931,7 +931,9 @@ should be interpreted. This core specification defines a limited set of data types to represent boolean values, integers, and floating point -numbers. :ref:`Extensions` may define additional +numbers. These can be found under :ref:`Data Types`. 
+ +:ref:`Extensions` may define additional data types. All of the data types defined here have a fixed size, in the sense that all values require the same number of bytes. However, extensions may define @@ -947,46 +949,6 @@ defined in this specification, the identifier is a simple ASCII string. However, extensions may use any JSON value to identify a data type. - -Core data types ---------------- - -.. list-table:: Data types - :header-rows: 1 - - * - Identifier - - Numerical type - * - ``bool`` - - Boolean - * - ``int8`` - - Integer in ``[-2^7, 2^7-1]`` - * - ``int16`` - - Integer in ``[-2^15, 2^15-1]`` - * - ``int32`` - - Integer in ``[-2^31, 2^31-1]`` - * - ``int64`` - - Integer in ``[-2^63, 2^63-1]`` - * - ``uint8`` - - Integer in ``[0, 2^8-1]`` - * - ``uint16`` - - Integer in ``[0, 2^16-1]`` - * - ``uint32`` - - Integer in ``[0, 2^32-1]`` - * - ``uint64`` - - Integer in ``[0, 2^64-1]`` - * - ``float16`` (optionally supported) - - IEEE 754 half-precision floating point: sign bit, 5 bits exponent, 10 bits mantissa - * - ``float32`` - - IEEE 754 single-precision floating point: sign bit, 8 bits exponent, 23 bits mantissa - * - ``float64`` - - IEEE 754 double-precision floating point: sign bit, 11 bits exponent, 52 bits mantissa - * - ``complex64`` - - real and complex components are each IEEE 754 single-precision floating point - * - ``complex128`` - - real and complex components are each IEEE 754 double-precision floating point - * - ``r*`` (Optional) - - raw bits, variable size given by ``*``, limited to be a multiple of 8 - Additionally to these base types, an implementation should also handle the raw/opaque pass-through type designated by the lower-case letter ``r`` followed by the number of bits, multiple of 8. For example, ``r8``, ``r16``, and ``r24`` diff --git a/docs/v3/data-types.rst b/docs/v3/data-types.rst index 809bd830..270b28ef 100644 --- a/docs/v3/data-types.rst +++ b/docs/v3/data-types.rst @@ -2,20 +2,47 @@ Data Types ========== -.. 
COMMENT TO BE REMOVED WHEN ONE IS ADDED +.. _data-types: - The following documents specify data types which are defined by the maintainers of - the Zarr specification. Being listed below does not imply that a data type is - required to be implemented by implementations. +The following data types are defined by the maintainers of +the Zarr specification. Being listed below does not imply that a data type is +required to be implemented by implementations. - toctree:: - :glob: - :maxdepth: 1 - :titlesonly: - :caption: Contents: +Core data types +--------------- - data-types/*/* +.. list-table:: Data types + :header-rows: 1 -Currently, no data types are defined by the maintainers of -the Zarr specification. Being listed below does not imply that a data type is -required to be implemented by implementations. + * - Identifier + - Numerical type + * - ``bool`` + - Boolean + * - ``int8`` + - Integer in ``[-2^7, 2^7-1]`` + * - ``int16`` + - Integer in ``[-2^15, 2^15-1]`` + * - ``int32`` + - Integer in ``[-2^31, 2^31-1]`` + * - ``int64`` + - Integer in ``[-2^63, 2^63-1]`` + * - ``uint8`` + - Integer in ``[0, 2^8-1]`` + * - ``uint16`` + - Integer in ``[0, 2^16-1]`` + * - ``uint32`` + - Integer in ``[0, 2^32-1]`` + * - ``uint64`` + - Integer in ``[0, 2^64-1]`` + * - ``float16`` (optionally supported) + - IEEE 754 half-precision floating point: sign bit, 5 bits exponent, 10 bits mantissa + * - ``float32`` + - IEEE 754 single-precision floating point: sign bit, 8 bits exponent, 23 bits mantissa + * - ``float64`` + - IEEE 754 double-precision floating point: sign bit, 11 bits exponent, 52 bits mantissa + * - ``complex64`` + - real and complex components are each IEEE 754 single-precision floating point + * - ``complex128`` + - real and complex components are each IEEE 754 double-precision floating point + * - ``r*`` (Optional) + - raw bits, variable size given by ``*``, limited to be a multiple of 8 From 0e0a03bf672cfc81f58ed909a3489005fb7f02cc Mon Sep 17 00:00:00 2001 From: Josh 
Moore Date: Tue, 18 Feb 2025 13:42:25 +0100 Subject: [PATCH 16/64] Clarify concept of 'core' --- docs/v3/core/v3.0.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index cd3d9f0d..f1bffbd3 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -364,8 +364,9 @@ terminology for a use case of reading from an array: *Core* - Core concepts refers to the Zarr v3 core specification as defined in this document - not necessarily if something is a MUST for implementations. + Core indicates a feature or concept defined within the Zarr v3 + specification as defined in this repository. Note, however, that certain + core features are explicitly marked as optional for implementations. .. _stored-representation: From 1600ee9d01b026004c2be8e1ad82ff6a1b8b3893 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Tue, 18 Feb 2025 14:31:52 +0100 Subject: [PATCH 17/64] Unify listing of all extensions on subpages No changes to content were made! --- docs/specs.rst | 4 +- docs/v3/array-storage-transformers.rst | 8 ++ docs/v3/chunk-grid.rst | 85 ++++++++++++ docs/v3/chunk-key-encoding.rst | 75 ++++++++++ docs/v3/codecs.rst | 8 ++ docs/v3/core/v3.0.rst | 184 ++++++------------------- docs/v3/data-types.rst | 10 +- docs/v3/stores.rst | 6 + 8 files changed, 235 insertions(+), 145 deletions(-) create mode 100644 docs/v3/chunk-grid.rst create mode 100644 docs/v3/chunk-key-encoding.rst diff --git a/docs/specs.rst b/docs/specs.rst index fec7add6..f9d4ed3e 100644 --- a/docs/specs.rst +++ b/docs/specs.rst @@ -9,8 +9,10 @@ Specifications :caption: v3 Core - v3/data-types v3/codecs + v3/chunk-grid + v3/chunk-key-encoding + v3/data-types v3/stores v3/array-storage-transformers diff --git a/docs/v3/array-storage-transformers.rst b/docs/v3/array-storage-transformers.rst index 9e03f306..0657abb9 100644 --- a/docs/v3/array-storage-transformers.rst +++ b/docs/v3/array-storage-transformers.rst @@ -1,3 +1,5 @@ +..
_storage-transformers-list: + ========================== Array Storage Transformers ========================== @@ -19,3 +21,9 @@ Array Storage Transformers Currently, no transformers are defined by the maintainers of the Zarr specification. Being listed below does not imply that a transformer is required to be implemented by implementations. + +Extensions +---------- + +Registered storage transformer extensions can be found under +`zarr-extensions::storage-transformers `_. diff --git a/docs/v3/chunk-grid.rst b/docs/v3/chunk-grid.rst new file mode 100644 index 00000000..aaf8aa0e --- /dev/null +++ b/docs/v3/chunk-grid.rst @@ -0,0 +1,85 @@ +.. _chunk-grid-list: + +========== +Chunk Grid +========== + +The following chunk grids are defined by the maintainers of +the Zarr specification. Being listed below does not imply that a chunk grid is +required to be implemented by implementations. + +Regular grids +------------- + +A regular grid is a type of grid where an array is divided into chunks +such that each chunk is a hyperrectangle of the same shape. The +dimensionality of the grid is the same as the dimensionality of the +array. Each chunk in the grid can be addressed by a tuple of positive +integers (`k`, `j`, `i`, ...) corresponding to the indices of the +chunk along each dimension. + +The origin element of a chunk has coordinates in the array space (`k` * +`dz`, `j` * `dy`, `i` * `dx`, ...) where (`dz`, `dy`, `dx`, ...) are +the chunk sizes along each dimension. +Thus the origin element of the chunk at grid index (0, 0, 0, +...) is at coordinate (0, 0, 0, ...) in the array space, i.e., the +grid is aligned with the origin of the array. If the length of any +array dimension is not perfectly divisible by the chunk length along +the same dimension, then the grid will overhang the edge of the array +space. + +The shape of the chunk grid will be (ceil(`z` / `dz`), ceil(`y` / +`dy`), ceil(`x` / `dx`), ...) where (`z`, `y`, `x`, ...)
is the array +shape, "/" is the division operator and "ceil" is the ceiling +function. For example, if a 3 dimensional array has shape (10, 200, +3000), and has chunk shape (5, 20, 400), then the shape of the chunk +grid will be (2, 10, 8), meaning that there will be 2 chunks along the +first dimension, 10 along the second dimension, and 8 along the third +dimension. + +.. list-table:: Regular Grid Example + :header-rows: 1 + + * - Array Shape + - Chunk Shape + - Chunk Grid Shape + - Notes + * - (10, 200, 3000) + - (5, 20, 400) + - (2, 10, 8) + - The grid does overhang the edge of the array on the 3rd dimension. + +An element of an array with coordinates (`c`, `b`, `a`, ...) will +occur within the chunk at grid index (`c` // `dz`, `b` // `dy`, `a` // +`dx`, ...), where "//" is the floor division operator. The element +will have coordinates (`c` % `dz`, `b` % `dy`, `a` % `dx`, ...) within +that chunk, where "%" is the modulo operator. For example, if a +3 dimensional array has shape (10, 200, 3000), and has chunk shape +(5, 20, 400), then the element of the array with coordinates (7, 150, 900) +is contained within the chunk at grid index (1, 7, 2) and has coordinates +(2, 10, 100) within that chunk. + +The store key corresponding to a given grid cell is determined based on the +:ref:`array-metadata-chunk-key-encoding` member of the :ref:`array-metadata`. + +Note that this specification does not consider the case where the +chunk grid and the array space are not aligned at the origin vertices +of the array and the chunk at grid index (0, 0, 0, ...). However, +extensions may define variations on the regular grid type +such that the grid indices may include negative integers, and the +origin element of the array may occur at an arbitrary position within +any chunk, which is required to allow arrays to be extended by an +arbitrary length in a "negative" direction along any dimension. + +.. 
note:: Chunks at the border of an array always have the full chunk size, even when + the array only covers parts of it. For example, having an array with ``"shape": [30, 30]`` and + ``"chunk_shape": [16, 16]``, the chunk ``0,1`` would also contain unused values for the indices + ``0-16, 30-31``. When writing such chunks it is recommended to use the current fill value + for elements outside the bounds of the array. + + +Extensions +---------- + +Registered chunk grid extensions can be found under +`zarr-extensions::chunk-grids `_. diff --git a/docs/v3/chunk-key-encoding.rst b/docs/v3/chunk-key-encoding.rst new file mode 100644 index 00000000..58877371 --- /dev/null +++ b/docs/v3/chunk-key-encoding.rst @@ -0,0 +1,75 @@ +.. _chunk-key-encoding-list: + +=================== +Chunk Key Encodings +=================== + +The following chunk key encodings are defined by the maintainers of +the Zarr specification. Being listed below does not imply that a chunk key encoding is +required to be implemented by implementations. + +Core chunk key encodings +------------------------ + +The following encodings are defined: + +``default`` +^^^^^^^^^^^ + +The ``configuration`` object may contain one optional member, +``separator``, which must be either ``"/"`` or ``"."``. If not specified, +``separator`` defaults to ``"/"``. + +The key for a chunk with grid index (``k``, ``j``, ``i``, ...) is +formed by taking the initial prefix ``c``, and appending for each dimension: + +- the ``separator`` character, followed by, + +- the ASCII decimal string representation of the chunk index within that dimension. + +For example, in a 3 dimensional array, with a separator of ``/`` the identifier +for the chunk at grid index (1, 23, 45) is the string ``"c/1/23/45"``. With a +separator of ``.``, the identifier is the string ``"c.1.23.45"``. The initial prefix +``c`` ensures that metadata documents and chunks have separate prefixes. + +.. 
note:: A main difference with spec v2 is that the default chunk separator + changed from ``.`` to ``/``, as in N5. This decreases the maximum number of + items in hierarchical stores like directory stores. + +.. note:: Arrays may have 0 dimensions (when for example representing scalars), + in which case the coordinate of a chunk is the empty tuple, and the chunk key + will consist of the string ``c``. + +``v2`` +^^^^^^ + +The ``configuration`` object may contain one optional member, +``separator``, which must be either ``"/"`` or ``"."``. If not specified, +``separator`` defaults to ``"."``. + +The identifier for chunk with at least one dimension is formed by +concatenating for each dimension: + + - the ASCII decimal string representation of the chunk index within that + dimension, followed by + + - the ``separator`` character, except that it is omitted for the last + dimension. + +For example, in a 3 dimensional array, with a separator of ``.`` the identifier +for the chunk at grid index (1, 23, 45) is the string ``"1.23.45"``. With a +separator of ``/``, the identifier is the string ``"1/23/45"``. + +For chunk grids with 0 dimensions, the single chunk has the key ``"0"``. + +.. note:: + + This encoding is intended only to allow existing v2 arrays to be + converted to v3 without having to rename chunks. It is not recommended + to be used when writing new arrays. + +Extensions +---------- + +Registered chunk key encoding extensions can be found under +`zarr-extensions::chunk-key-encodings `_. diff --git a/docs/v3/codecs.rst b/docs/v3/codecs.rst index 1f8c41ce..4c5c9d1a 100644 --- a/docs/v3/codecs.rst +++ b/docs/v3/codecs.rst @@ -1,3 +1,5 @@ +.. _codecs-list: + ====== Codecs ====== @@ -13,3 +15,9 @@ required to be implemented by all implementations. :caption: Contents: codecs/*/* + +Extensions +---------- + +Registered codec extensions can be found under +`zarr-extensions::codecs `_.
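The ``default`` and ``v2`` chunk key encodings described in ``docs/v3/chunk-key-encoding.rst`` above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a reference implementation: the function name ``encode_chunk_key`` and its signature are hypothetical, and the behaviour follows only the rules quoted in the text (prefix ``c``, per-encoding default separator, and the special keys for 0-dimensional arrays).

```python
def encode_chunk_key(grid_index, encoding="default", separator=None):
    """Build a chunk key from a grid index tuple (hypothetical helper).

    `encoding` is "default" or "v2"; `separator` is "/" or ".", or None
    to use the default separator of the chosen encoding.
    """
    if separator not in (None, "/", "."):
        raise ValueError('separator must be "/" or "."')

    if encoding == "default":
        sep = "/" if separator is None else separator
        # Prefix "c", then the separator and index for each dimension.
        # A 0-dimensional array therefore yields just "c".
        return "c" + "".join(f"{sep}{i}" for i in grid_index)

    if encoding == "v2":
        sep = "." if separator is None else separator
        if len(grid_index) == 0:
            # The single chunk of a 0-dimensional grid has the key "0".
            return "0"
        return sep.join(str(i) for i in grid_index)

    raise ValueError(f"unknown chunk key encoding: {encoding!r}")


print(encode_chunk_key((1, 23, 45)))
print(encode_chunk_key((1, 23, 45), encoding="v2"))
```

The examples given in the text, such as ``"c/1/23/45"`` and ``"1.23.45"`` for grid index (1, 23, 45), serve as a quick check of any such helper.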
diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst index f1bffbd3..252305ee 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/v3.0.rst @@ -12,6 +12,7 @@ Editors: * Alistair Miles (`@alimanfoo `_), Wellcome Sanger Institute * Jonathan Striebel (`@jstriebel `_), Scalable Minds * Jeremy Maitin-Shepard (`@jbms `_), Google + * Josh Moore(`@joshmoore `_), German BioImaging Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ @@ -467,18 +468,24 @@ Mandatory This document must contain a single object with the following mandatory names: +.. _array-metadata-zarr-format: + ``zarr_format`` """""""""""""""" An integer defining the version of the storage specification to which the array store adheres, must be ``3`` here. +.. _array-metadata-node-type: + ``node_type`` """"""""""""""" A string defining the type of hierarchy node element, must be ``array`` here. +.. _array-metadata-shape: + ``shape`` """"""""" @@ -487,6 +494,8 @@ mandatory names: two-dimensional Zarr array, where the first dimension has length 10 and the second dimension has length 20. +.. _array-metadata-data-type: + ``data_type`` """"""""""""" @@ -504,6 +513,8 @@ mandatory names: and its value must refer to a v3 data type specification. ``configuration`` is optional and its value is defined by the extension. +.. _array-metadata-chunk-grid: + ``chunk_grid`` """""""""""""" @@ -524,6 +535,8 @@ mandatory names: must be a string referring to a v3 chunk grid specification. The ``configuration`` is optional and defined by the extension. +.. _array-metadata-chunk-key-encoding: + ``chunk_key_encoding`` """""""""""""""""""""" @@ -535,60 +548,7 @@ mandatory names: type-dependent parameters; the ``configuration`` value must be an object if it is specified. - The following encodings are defined: - - - ``default`` - - The ``configuration`` object may contain one optional member, - ``separator``, which must be either ``"/"`` or ``"."``. If not specified, - ``separator`` defaults to ``"/"``. 
- - The key for a chunk with grid index (``k``, ``j``, ``i``, ...) is - formed by taking the initial prefix ``c``, and appending for each dimension: - - - the ``separator`` character, followed by, - - - the ASCII decimal string representation of the chunk index within that dimension. - - For example, in a 3 dimensional array, with a separator of ``/`` the identifier - for the chunk at grid index (1, 23, 45) is the string ``"c/1/23/45"``. With a - separator of ``.``, the identifier is the string ``"c.1.23.45"``. The initial prefix - ``c`` ensures that metadata documents and chunks have separate prefixes. - - .. note:: A main difference with spec v2 is that the default chunk separator - changed from ``.`` to ``/``, as in N5. This decreases the maximum number of - items in hierarchical stores like directory stores. - - .. note:: Arrays may have 0 dimensions (when for example representing scalars), - in which case the coordinate of a chunk is the empty tuple, and the chunk key - will consist of the string ``c``. - - - ``v2`` - - The ``configuration`` object may contain one optional member, - ``separator``, which must be either ``"/"`` or ``"."``. If not specified, - ``separator`` defaults to ``"."``. - - The identifier for chunk with at least one dimension is formed by - concatenating for each dimension: - - - the ASCII decimal string representation of the chunk index within that - dimension, followed by - - - the ``separator`` character, except that it is omitted for the last - dimension. - - For example, in a 3 dimensional array, with a separator of ``.`` the identifier - for the chunk at grid index (1, 23, 45) is the string ``"1.23.45"``. With a - separator of ``/``, the identifier is the string ``"1/23/45"``. - - For chunk grids with 0 dimensions, the single chunk has the key ``"0"``. - - .. note:: - - This encoding is intended only to allow existing v2 arrays to be - converted to v3 without having to rename chunks. 
It is not recommended - to be used when writing new arrays. +.. _array-metadata-fill-value: ``fill_value`` """""""""""""" @@ -654,6 +614,8 @@ mandatory names: the data type will be chosen. However, the default fill value that is chosen MUST be recorded in the metadata. +.. _array-metadata-codecs: + ``codecs`` """""""""" @@ -667,6 +629,8 @@ Optional The following members are optional: +.. _array-metadata-attributes: + ``attributes`` """""""""""""" @@ -683,6 +647,8 @@ The following members are optional: A proposal to specify metadata conventions (ZEP 4) is being discussed in https://github.com/zarr-developers/zeps/pull/28. +.. _array-metadata-storage-transformers: + ``storage_transformers`` """""""""""""""""""""""" @@ -694,6 +660,8 @@ The following members are optional: storage transformer specification. When the ``storage_transformers`` name is absent no storage transformer is used, same for an empty list. +.. _array-metadata-dimension-names: + ``dimension_names`` """"""""""""""""""" @@ -711,6 +679,8 @@ The following members are optional: same dimension name across multiple arrays within the same Zarr hierarchy, but extensions or specific applications may do so. +.. _array-metadata-extensions: + Extensions ^^^^^^^^^^ @@ -932,7 +902,7 @@ should be interpreted. This core specification defines a limited set of data types to represent boolean values, integers, and floating point -numbers. These can be found under :ref:`Data Types`. +numbers. These can be found under :ref:`Data Types`. :ref:`Extensions` may define additional data types. All of the data @@ -984,9 +954,8 @@ which is a space defined by the dimensionality and shape of the array. This means that every element of the array is a member of one chunk, and there are no gaps or overlaps between chunks. -In general there are different possible types of grids. The core -specification defines the regular grid type, where all chunks are -hyperrectangles of the same shape. 
+In general there are different possible types of grids. Those defined +under the core specification can be found under :ref:`chunk-grid-list`. :ref:`Extensions` may define other grid types, such as rectilinear grids where chunks are still hyperrectangles but do not all share the same shape. @@ -996,75 +965,6 @@ each chunk that is unique within the grid, which is a string of ASCII characters that can be used to construct keys to save and retrieve chunk data in a store, see also the `Storage`_ section. -Regular grids -------------- - -A regular grid is a type of grid where an array is divided into chunks -such that each chunk is a hyperrectangle of the same shape. The -dimensionality of the grid is the same as the dimensionality of the -array. Each chunk in the grid can be addressed by a tuple of positive -integers (`k`, `j`, `i`, ...) corresponding to the indices of the -chunk along each dimension. - -The origin element of a chunk has coordinates in the array space (`k` * -`dz`, `j` * `dy`, `i` * `dx`, ...) where (`dz`, `dy`, `dx`, ...) are -the chunk sizes along each dimension. -Thus the origin element of the chunk at grid index (0, 0, 0, -...) is at coordinate (0, 0, 0, ...) in the array space, i.e., the -grid is aligned with the origin of the array. If the length of any -array dimension is not perfectly divisible by the chunk length along -the same dimension, then the grid will overhang the edge of the array -space. - -The shape of the chunk grid will be (ceil(`z` / `dz`), ceil(`y` / -`dy`), ceil(`x` / `dx`), ...) where (`z`, `y`, `x`, ...) is the array -shape, "/" is the division operator and "ceil" is the ceiling -function. For example, if a 3 dimensional array has shape (10, 200, -3000), and has chunk shape (5, 20, 400), then the shape of the chunk -grid will be (2, 10, 8), meaning that there will be 2 chunks along the -first dimension, 10 along the second dimension, and 8 along the third -dimension. - -.. 
list-table:: Regular Grid Example - :header-rows: 1 - - * - Array Shape - - Chunk Shape - - Chunk Grid Shape - - Notes - * - (10, 200, 3000) - - (5, 20, 400) - - (2, 10, 8) - - The grid does overhang the edge of the array on the 3rd dimension. - -An element of an array with coordinates (`c`, `b`, `a`, ...) will -occur within the chunk at grid index (`c` // `dz`, `b` // `dy`, `a` // -`dx`, ...), where "//" is the floor division operator. The element -will have coordinates (`c` % `dz`, `b` % `dy`, `a` % `dx`, ...) within -that chunk, where "%" is the modulo operator. For example, if a -3 dimensional array has shape (10, 200, 3000), and has chunk shape -(5, 20, 400), then the element of the array with coordinates (7, 150, 900) -is contained within the chunk at grid index (1, 7, 2) and has coordinates -(2, 10, 100) within that chunk. - -The store key corresponding to a given grid cell is determined based on the -`chunk_key_encoding`_ member of the `Array metadata`_. - -Note that this specification does not consider the case where the -chunk grid and the array space are not aligned at the origin vertices -of the array and the chunk at grid index (0, 0, 0, ...). However, -extensions may define variations on the regular grid type -such that the grid indices may include negative integers, and the -origin element of the array may occur at an arbitrary position within -any chunk, which is required to allow arrays to be extended by an -arbitrary length in a "negative" direction along any dimension. - -.. note:: Chunks at the border of an array always have the full chunk size, even when - the array only covers parts of it. For example, having an array with ``"shape": [30, 30]`` and - ``"chunk_shape": [16, 16]``, the chunk ``0,1`` would also contain unused values for the indices - ``0-16, 30-31``. When writing such chunks it is recommended to use the current fill value - for elements outside the bounds of the array. 
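The regular grid arithmetic described in this section (ceiling division for the grid shape, floor division and modulo for locating an element) can be checked with a short sketch. The function names ``chunk_grid_shape`` and ``locate_element`` are hypothetical and chosen only for this illustration. Integer ceiling division is written as ``-(-n // c)`` to avoid floating point.

```python
def chunk_grid_shape(array_shape, chunk_shape):
    """Number of chunks along each dimension: ceil(n / c) per dimension."""
    return tuple(-(-n // c) for n, c in zip(array_shape, chunk_shape))


def locate_element(coords, chunk_shape):
    """Return (grid index of the containing chunk, coordinates within it)."""
    grid_index = tuple(i // c for i, c in zip(coords, chunk_shape))
    within_chunk = tuple(i % c for i, c in zip(coords, chunk_shape))
    return grid_index, within_chunk


# Values taken from the worked example in the text.
print(chunk_grid_shape((10, 200, 3000), (5, 20, 400)))
print(locate_element((7, 150, 900), (5, 20, 400)))
```

With the numbers from the text, the grid shape comes out as (2, 10, 8), and the element at (7, 150, 900) falls in chunk (1, 7, 2) at offset (2, 10, 100), matching the prose.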
- Chunk encoding ============== @@ -1247,7 +1147,7 @@ a known "raw name" or as a URI as defined under :ref:`extensions_section`. For ease of discovery, it is recommended that codec specifications are contributed to the registry of extensions -(`zarr-extensions `_). +(`zarr-extensions`_). A codec specification must declare the codec identifier, and describe (or cite documents that describe) the encoding and decoding algorithms @@ -1632,15 +1532,15 @@ Extension points Different types of extensions can exist and they can be grouped as follows: -=========== ======================= ================================================ -level extension metadata -=========== ======================= ================================================ -array data type `data_type`_ -array chunk grid `chunk_grid`_ -array chunk key encoding `chunk_key_encoding`_ -array codecs `codecs`_ -array storage transformer `storage_transformers`_ -=========== ======================= ================================================ +=========== ======================= ========================= ================================ +level extension metadata core definitions +=========== ======================= ========================= ================================ +array data type `data_type`_ :ref:`data-types-list` +array chunk grid `chunk_grid`_ :ref:`chunk-grid-list` +array chunk key encoding `chunk_key_encoding`_ :ref:`chunk-key-encoding-list` +array codecs `codecs`_ :ref:`codecs-list` +array storage transformer `storage_transformers`_ :ref:`storage-transformers-list` +=========== ======================= ========================= ================================ If such extension points are used by groups or arrays, they are required. @@ -1675,9 +1575,8 @@ The identifier used in the `name` field of the extension definition can follow o 1. **Raw names** MUST be assigned within a central repository and follow the compatibility and versioning v3 `stability policy`_. 
- The name assignment is managed through the `zarr-extensions - `_ Github - repository, where each extension is + The name assignment is managed through the `zarr-extensions`_ + Github repository, where each extension is listed and either contains a spec document or links to a spec document. Names are never unassigned or reassigned. The Zarr Steering Council or by delegation a maintainer team reserves the right to refuse name assignment at its own @@ -1736,11 +1635,11 @@ There is no strict requirement for extensions to have a formal specification. However, for adoption in the community it is STRONGLY RECOMMENDED to write a specification. -For extensions with raw names, the `zarr-extensions `_ repository +For extensions with raw names, the `zarr-extensions`_ repository SHOULD either contain the specification directly or link to the official location. For extensions with URI-based names, it is RECOMMENDED to publish the specification under the URI of the extension. Additionally, URI-based extensions MAY also register -themselves under the `zarr-extensions `_ repository for better discovery. +themselves under the `zarr-extensions`_ repository for better discovery. Implementation Notes ==================== @@ -1868,3 +1767,4 @@ Draft Changes GitHub. .. _zarr-specs GitHub repository: https://github.com/zarr-developers/zarr-specs +.. _zarr-extensions: https://github.com/zarr-developers/zarr-extensions diff --git a/docs/v3/data-types.rst b/docs/v3/data-types.rst index 270b28ef..4708121d 100644 --- a/docs/v3/data-types.rst +++ b/docs/v3/data-types.rst @@ -1,9 +1,9 @@ +.. _data-types-list: + ========== Data Types ========== -.. _data-types: - The following data types are defined by the maintainers of the Zarr specification. Being listed below does not imply that a data type is required to be implemented by implementations. 
@@ -46,3 +46,9 @@ Core data types - real and complex components are each IEEE 754 double-precision floating point * - ``r*`` (Optional) - raw bits, variable size given by ``*``, limited to be a multiple of 8 + +Extensions +---------- + +Registered data type extensions can be found under +`zarr-extensions::data-types `_. diff --git a/docs/v3/stores.rst b/docs/v3/stores.rst index 164b5637..51067cc7 100644 --- a/docs/v3/stores.rst +++ b/docs/v3/stores.rst @@ -1,3 +1,5 @@ +.. _stores-list: + ====== Stores ====== @@ -12,3 +14,7 @@ Being listed below does not imply that a store is required to be implemented by :caption: Contents: stores/*/* + +.. note:: + Stores are *not* extension points since they define the mechanism + for loading metadata documents such that extensions can be loaded. From 429988a3588496b361e5040ddf1a3e78e79dc3ed Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Tue, 18 Feb 2025 14:41:37 +0100 Subject: [PATCH 18/64] Rename core/v3.0 to core/index --- docs/conf.py | 1 + docs/index.rst | 2 +- docs/specs.rst | 2 +- docs/v3/core/{v3.0.rst => index.rst} | 8 ++++---- 4 files changed, 7 insertions(+), 6 deletions(-) rename docs/v3/core/{v3.0.rst => index.rst} (99%) diff --git a/docs/conf.py b/docs/conf.py index dbab3504..67c48a4c 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -87,4 +87,5 @@ redirects = { "index": "specs.html", + "v3/core/v3.0.html": "./index.html", } diff --git a/docs/index.rst b/docs/index.rst index 50182692..78eff822 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -2,7 +2,7 @@ Specs ===== -A good starting point is the :ref:`zarr-core-specification-v3.0`. +A good starting point is the :ref:`zarr-core-specification-v3`. .. 
toctree:: diff --git a/docs/specs.rst b/docs/specs.rst index f9d4ed3e..d1097193 100644 --- a/docs/specs.rst +++ b/docs/specs.rst @@ -8,7 +8,7 @@ Specifications :maxdepth: 1 :caption: v3 - Core + Core v3/codecs v3/chunk-grid v3/chunk-key-encoding diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/index.rst similarity index 99% rename from docs/v3/core/v3.0.rst rename to docs/v3/core/index.rst index 252305ee..1fd0bc43 100644 --- a/docs/v3/core/v3.0.rst +++ b/docs/v3/core/index.rst @@ -1,12 +1,12 @@ .. This file is in restructured text format: https://docutils.sourceforge.io/rst.html -.. _zarr-core-specification-v3.0: +.. _zarr-core-specification-v3: ====================================== Zarr core specification (version 3.0) ====================================== Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html Editors: * Alistair Miles (`@alimanfoo `_), Wellcome Sanger Institute @@ -18,10 +18,10 @@ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: - `GitHub issues `_ + `GitHub issues `_ Suggest an edit for this spec: - `GitHub editor `_ + `GitHub editor `_ Copyright 2019-Present Zarr core development team. 
This work is licensed under a `Creative Commons Attribution 3.0 Unported License From c6fb1506745848f81f8b8c87ea40d2f35390afa4 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Tue, 18 Feb 2025 14:46:48 +0100 Subject: [PATCH 19/64] Correct extensions table links --- docs/v3/core/index.rst | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 1fd0bc43..a4bf6bdc 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1532,15 +1532,15 @@ Extension points Different types of extensions can exist and they can be grouped as follows: -=========== ======================= ========================= ================================ -level extension metadata core definitions -=========== ======================= ========================= ================================ -array data type `data_type`_ :ref:`data-types-list` -array chunk grid `chunk_grid`_ :ref:`chunk-grid-list` -array chunk key encoding `chunk_key_encoding`_ :ref:`chunk-key-encoding-list` -array codecs `codecs`_ :ref:`codecs-list` -array storage transformer `storage_transformers`_ :ref:`storage-transformers-list` -=========== ======================= ========================= ================================ +=========== ======================= ====================================== ================================ +level extension metadata core definitions +=========== ======================= ====================================== ================================ +array data type `array-metadata-data-type`_ :ref:`data-types-list` +array chunk grid `array-metadata-chunk-grid`_ :ref:`chunk-grid-list` +array chunk key encoding `array-metadata-chunk-key-encoding`_ :ref:`chunk-key-encoding-list` +array codecs `array-metadata-codecs`_ :ref:`codecs-list` +array storage transformer `array-metadata-storage-transformers`_ :ref:`storage-transformers-list` +=========== ======================= ====================================== 
================================ If such extension points are used by groups or arrays, they are required. From 46630f71dea75857f669a05fa9cb25c3e9ae484d Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Tue, 18 Feb 2025 14:50:31 +0100 Subject: [PATCH 20/64] Add 'core' to each of the subpages --- docs/v3/array-storage-transformers.rst | 2 +- docs/v3/chunk-grid.rst | 2 +- docs/v3/chunk-key-encoding.rst | 2 +- docs/v3/codecs.rst | 2 +- docs/v3/data-types.rst | 2 +- docs/v3/stores.rst | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/v3/array-storage-transformers.rst b/docs/v3/array-storage-transformers.rst index 0657abb9..68609444 100644 --- a/docs/v3/array-storage-transformers.rst +++ b/docs/v3/array-storage-transformers.rst @@ -18,7 +18,7 @@ Array Storage Transformers array-storage-transformers/*/* -Currently, no transformers are defined by the maintainers of +Currently, no core storage transformers are defined by the maintainers of the Zarr specification. Being listed below does not imply that a transformer is required to be implemented by implementations. diff --git a/docs/v3/chunk-grid.rst b/docs/v3/chunk-grid.rst index aaf8aa0e..2fa7dcf7 100644 --- a/docs/v3/chunk-grid.rst +++ b/docs/v3/chunk-grid.rst @@ -4,7 +4,7 @@ Chunk Grid ========== -The following chunk grids are defined by the maintainers of +The following core chunk grids are defined by the maintainers of the Zarr specification. Being listed below does not imply that a data type is required to be implemented by implementations. diff --git a/docs/v3/chunk-key-encoding.rst b/docs/v3/chunk-key-encoding.rst index 58877371..545fc8d4 100644 --- a/docs/v3/chunk-key-encoding.rst +++ b/docs/v3/chunk-key-encoding.rst @@ -4,7 +4,7 @@ Chunk Key Encodings =================== -The following chunk key encodings are defined by the maintainers of +The following core chunk key encodings are defined by the maintainers of the Zarr specification. 
Being listed below does not imply that a chunk key encoding is required to be implemented by implementations. diff --git a/docs/v3/codecs.rst b/docs/v3/codecs.rst index 4c5c9d1a..9d7ebe68 100644 --- a/docs/v3/codecs.rst +++ b/docs/v3/codecs.rst @@ -4,7 +4,7 @@ Codecs ====== -The following documents specify codecs which are defined by the maintainers +The following documents specify core codecs which are defined by the maintainers of the Zarr specification. Being listed below does not imply that a codec is required to be implemented by all implementations. diff --git a/docs/v3/data-types.rst b/docs/v3/data-types.rst index 4708121d..a15f73de 100644 --- a/docs/v3/data-types.rst +++ b/docs/v3/data-types.rst @@ -4,7 +4,7 @@ Data Types ========== -The following data types are defined by the maintainers of +The following core data types are defined by the maintainers of the Zarr specification. Being listed below does not imply that a data type is required to be implemented by implementations. diff --git a/docs/v3/stores.rst b/docs/v3/stores.rst index 51067cc7..63bbbc4d 100644 --- a/docs/v3/stores.rst +++ b/docs/v3/stores.rst @@ -4,7 +4,7 @@ Stores ====== -The following documents specify stores which are defined by the maintainers of the Zarr specification. +The following documents specify core stores which are defined by the maintainers of the Zarr specification. Being listed below does not imply that a store is required to be implemented by all implementations. .. 
toctree:: From 20d645776a54a1be5af55331117512d02a0ee28a Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 19 Feb 2025 15:50:29 +0100 Subject: [PATCH 21/64] Simplify subpage headers --- docs/v3/array-storage-transformers.rst | 4 ++-- docs/v3/chunk-grid.rst | 4 ++-- docs/v3/chunk-key-encoding.rst | 4 ++-- docs/v3/codecs.rst | 4 ++-- docs/v3/data-types.rst | 4 ++-- docs/v3/stores.rst | 2 +- 6 files changed, 11 insertions(+), 11 deletions(-) diff --git a/docs/v3/array-storage-transformers.rst b/docs/v3/array-storage-transformers.rst index 68609444..10737f5c 100644 --- a/docs/v3/array-storage-transformers.rst +++ b/docs/v3/array-storage-transformers.rst @@ -18,8 +18,8 @@ Array Storage Transformers array-storage-transformers/*/* -Currently, no core storage transformers are defined by the maintainers of -the Zarr specification. Being listed below does not imply that a transformer is +Currently, no core storage transformers are defined by specification. +Being listed below does not imply that a transformer is required to be implemented by implementations. Extensions diff --git a/docs/v3/chunk-grid.rst b/docs/v3/chunk-grid.rst index 2fa7dcf7..d5a60db8 100644 --- a/docs/v3/chunk-grid.rst +++ b/docs/v3/chunk-grid.rst @@ -4,8 +4,8 @@ Chunk Grid ========== -The following core chunk grids are defined by the maintainers of -the Zarr specification. Being listed below does not imply that a data type is +The following core chunk grids are defined by the specification. +Being listed below does not imply that a chunk grid is required to be implemented by implementations. Regular grids diff --git a/docs/v3/chunk-key-encoding.rst b/docs/v3/chunk-key-encoding.rst index 545fc8d4..1a02af66 100644 --- a/docs/v3/chunk-key-encoding.rst +++ b/docs/v3/chunk-key-encoding.rst @@ -4,8 +4,8 @@ Chunk Key Encodings =================== -The following core chunk key encodings are defined by the maintainers of -the Zarr specification. 
Being listed below does not imply that a chunk key encoding is +The following core chunk key encodings are defined by the specification. +Being listed below does not imply that a chunk key encoding is required to be implemented by implementations. Core chunk key encodings diff --git a/docs/v3/codecs.rst b/docs/v3/codecs.rst index 9d7ebe68..7a39775d 100644 --- a/docs/v3/codecs.rst +++ b/docs/v3/codecs.rst @@ -4,8 +4,8 @@ Codecs ====== -The following documents specify core codecs which are defined by the maintainers -of the Zarr specification. Being listed below does not imply that a codec is +The following documents specify core codecs which are defined by the specification. +Being listed below does not imply that a codec is required to be implemented by all implementations. .. toctree:: diff --git a/docs/v3/data-types.rst b/docs/v3/data-types.rst index a15f73de..aa331815 100644 --- a/docs/v3/data-types.rst +++ b/docs/v3/data-types.rst @@ -4,8 +4,8 @@ Data Types ========== -The following core data types are defined by the maintainers of -the Zarr specification. Being listed below does not imply that a data type is +The following core data types are defined by the specification. +Being listed below does not imply that a data type is required to be implemented by implementations. Core data types diff --git a/docs/v3/stores.rst b/docs/v3/stores.rst index 63bbbc4d..6b377a34 100644 --- a/docs/v3/stores.rst +++ b/docs/v3/stores.rst @@ -4,7 +4,7 @@ Stores ====== -The following documents specify core stores which are defined by the maintainers of the Zarr specification. +The following documents specify core stores which are defined the specification. Being listed below does not imply that a store is required to be implemented by all implementations. .. 
toctree:: From a1d52b136a5ee6e1d0db3d05793723b5ebe73034 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 19 Feb 2025 15:57:45 +0100 Subject: [PATCH 22/64] simplify reference to extensions in field definition Co-authored-by: Davis Bennett --- docs/v3/core/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index a4bf6bdc..db494c98 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -264,7 +264,7 @@ The following figure illustrates the first part of the terminology: contain. For example, the 32-bit signed integer data type defines binary representations for all integers in the range −2,147,483,648 to 2,147,483,647. This specification only defines a limited set of data types, - but :ref:`extensions` may define other data types. + but additional data types can be defined as :ref:`extensions`. .. _chunk: .. _chunks: From 2b046614a49dae2794ccbe04825445b77d28c0ec Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 19 Feb 2025 15:59:23 +0100 Subject: [PATCH 23/64] simplify reference to extensions in field definition Co-authored-by: Davis Bennett --- docs/v3/core/index.rst | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index db494c98..99caeaea 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -288,8 +288,7 @@ The following figure illustrates the first part of the terminology: The chunks_ of an array_ are organised into a grid. This specification only considers the case where all chunks_ have the same chunk shape and the chunks form a regular grid. However, - :ref:`extensions` may define other grid types such as - rectilinear grids. + additional chunk grids can be defined as :ref:`extensions.` .. _codec: .. 
_codecs: From 1fa95e8a4fd4e0a78bf321364d465d68d4510e47 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Thu, 20 Feb 2025 15:32:01 +0100 Subject: [PATCH 24/64] Implement suggestions from d-v-b and s/URI/URL/ --- docs/v3/array-storage-transformers.rst | 9 +- docs/v3/chunk-grid.rst | 5 +- docs/v3/chunk-key-encoding.rst | 5 +- docs/v3/codecs.rst | 5 +- docs/v3/core/index.rst | 229 ++++++++++++++----------- docs/v3/data-types.rst | 5 +- docs/v3/stores.rst | 4 +- 7 files changed, 139 insertions(+), 123 deletions(-) diff --git a/docs/v3/array-storage-transformers.rst b/docs/v3/array-storage-transformers.rst index 10737f5c..c095bd6c 100644 --- a/docs/v3/array-storage-transformers.rst +++ b/docs/v3/array-storage-transformers.rst @@ -6,9 +6,8 @@ Array Storage Transformers .. COMMENT TO BE REMOVED WHEN ONE IS ADDED - The following documents specify transformers which are defined by the maintainers of - the Zarr specification. Being listed below does not imply that a transformer is - required to be implemented by implementations. + The following documents specify core storage transformers which SHOULD + be implemented by all implementations. toctree:: :glob: @@ -18,9 +17,7 @@ Array Storage Transformers array-storage-transformers/*/* -Currently, no core storage transformers are defined by specification. -Being listed below does not imply that a transformer is -required to be implemented by implementations. +Currently, no core storage transformers are defined by this specification. Extensions ---------- diff --git a/docs/v3/chunk-grid.rst b/docs/v3/chunk-grid.rst index d5a60db8..e43a7326 100644 --- a/docs/v3/chunk-grid.rst +++ b/docs/v3/chunk-grid.rst @@ -4,9 +4,8 @@ Chunk Grid ========== -The following core chunk grids are defined by the specification. -Being listed below does not imply that a chunk grid is -required to be implemented by implementations. +The following documents specify chunk grids which SHOULD +be implemented by all implementations. 
Regular grids ------------- diff --git a/docs/v3/chunk-key-encoding.rst b/docs/v3/chunk-key-encoding.rst index 1a02af66..a2f94a3f 100644 --- a/docs/v3/chunk-key-encoding.rst +++ b/docs/v3/chunk-key-encoding.rst @@ -4,9 +4,8 @@ Chunk Key Encodings =================== -The following core chunk key encodings are defined by the specification. -Being listed below does not imply that a chunk key encoding is -required to be implemented by implementations. +The following documents specify chunk key encodings which SHOULD +be implemented by all implementations. Core chunk key encodings ------------------------ diff --git a/docs/v3/codecs.rst b/docs/v3/codecs.rst index 7a39775d..52f9407b 100644 --- a/docs/v3/codecs.rst +++ b/docs/v3/codecs.rst @@ -4,9 +4,8 @@ Codecs ====== -The following documents specify core codecs which are defined by the specification. -Being listed below does not imply that a codec is -required to be implemented by all implementations. +The following documents specify codecs which SHOULD +be implemented by all implementations. .. toctree:: :glob: diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 99caeaea..3b56b858 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -288,7 +288,7 @@ The following figure illustrates the first part of the terminology: The chunks_ of an array_ are organised into a grid. This specification only considers the case where all chunks_ have the same chunk shape and the chunks form a regular grid. However, - additional chunk grids can be defined as :ref:`extensions.` + additional chunk grids can be defined as :ref:`extensions`. .. _codec: .. _codecs: @@ -364,9 +364,9 @@ terminology for a use case of reading from an array: *Core* - Core indicates a feature or concepts defined within the Zarr v3 - specification as defined in this repository. Note, however, that certain - core features are explicitly marked as optional for implementations. 
+ Core refers to features or concepts defined within this specification. The + designation of a feature as core does not imply that it is mandatory for + all implementations. .. _stored-representation: @@ -498,26 +498,27 @@ mandatory names: ``data_type`` """"""""""""" - The data type of the Zarr array. If the data type is defined in + The data type of the Zarr array. + + ``data_type`` is an :ref:`extension point` + and MUST conform to the :ref:`extension definition`. + + If the data type is defined in this specification, then the value must be the data type identifier provided as a string. For example, ``"float64"`` for little-endian 64-bit floating point number. - The ``data_type`` value is an :ref:`extension point` - and may be defined by a data - type extension. If the data type is defined by an extension, then the value - may be either a plain string (incl. URI) or an object containing the members ``name`` - and optionally ``configuration``. A plain string is equivalent to - specifying an object with just a ``name`` member. The ``name`` is required - and its value must refer to a v3 data type specification. ``configuration`` - is optional and its value is defined by the extension. - .. _array-metadata-chunk-grid: ``chunk_grid`` """""""""""""" - The chunk grid of the Zarr array. If the chunk grid is a regular chunk grid + The chunk grid of the Zarr array. + + ``chunk_grid`` is an :ref:`extension point` + and MUST conform to the :ref:`extension definition`. + + If the chunk grid is a regular chunk grid as defined in this specification, then the value must be an object with the names ``name`` and ``configuration``. The value of ``name`` must be the string ``"regular"``, and the value of ``configuration`` an object with the @@ -528,11 +529,6 @@ mandatory names: means a regular grid where the chunks have length 2 along the first dimension and length 5 along the second dimension. 
- The ``chunk_grid`` value is an :ref:`extension point` - and may be defined by an - extension. If the chunk grid type is defined by an extension, then ``name`` - must be a string referring to a v3 chunk grid specification. The - ``configuration`` is optional and defined by the extension. .. _array-metadata-chunk-key-encoding: @@ -542,10 +538,8 @@ mandatory names: The mapping from chunk grid cell coordinates to keys in the underlying store. - The value must be an object with required string member ``name``, specifying - the encoding type, and optional member ``configuration`` specifying encoding - type-dependent parameters; the ``configuration`` value must be an object if - it is specified. + ``chunk_key_encoding`` is an :ref:`extension point` + and MUST conform to the :ref:`extension definition`. .. _array-metadata-fill-value: @@ -618,8 +612,11 @@ mandatory names: ``codecs`` """""""""" - Specifies a list of codecs to be used for encoding and decoding chunks. The - value MUST be an array of :ref:`extension definitions`. + Specifies a list of codecs to be used for encoding and decoding chunks. + + Each codec is an :ref:`extension point` + and MUST conform to the :ref:`extension definition`. + Because ``codecs`` MUST contain an ``array -> bytes`` codec, the list cannot be empty (See :ref:`codecs `). @@ -651,12 +648,12 @@ The following members are optional: ``storage_transformers`` """""""""""""""""""""""" - Specifies a stack of `storage transformers`_. Each value in the list must be - an object containing the names ``name`` and optionally ``configuration``. - The ``name`` is required and the value must be a string referring to the - extension. The object may also contain a ``configuration`` object which - consists of the parameter names and values as defined by the corresponding - storage transformer specification. When the ``storage_transformers`` name is + Specifies a list of `storage transformers`_. 
+ + Each storage transformer is an :ref:`extension point` + and MUST conform to the :ref:`extension definition`. + + When the ``storage_transformers`` name is absent no storage transformer is used, same for an empty list. .. _array-metadata-dimension-names: @@ -683,12 +680,8 @@ The following members are optional: Extensions ^^^^^^^^^^ -All other names found in the metadata object MUST be interpreted +All other keys found in the metadata object MUST be interpreted following the `extensions_section`_. -An implementation MUST fail to open Zarr hierarchies, groups -or arrays if any metadata fields are present which (a) the -implementation does not recognize and (b) are not explicitly -set to ``"must_understand": false``. Example ^^^^^^^ @@ -814,12 +807,8 @@ Optional keys: Extensions ^^^^^^^^^^ -All other names found in the metadata object MUST be interpreted +All other keys found in the metadata object MUST be interpreted following the `extensions_section`_. -An implementation MUST fail to open Zarr hierarchies, groups -or arrays if any metadata fields are present which (a) the -implementation does not recognize and (b) are not explicitly -set to ``"must_understand": false``. Example ^^^^^^^ @@ -899,15 +888,15 @@ A data type describes the set of possible binary values that an array element may take, along with some information about how the values should be interpreted. -This core specification defines a limited set of data types to +This specification defines a limited set of data types to represent boolean values, integers, and floating point numbers. These can be found under :ref:`Data Types`. -:ref:`Extensions` may define additional -data types. All of the data -types defined here have a fixed size, in the sense that all values -require the same number of bytes. However, extensions may define -variable sized data types. +All of the data types defined here have a fixed size, in the sense that all values +require the same number of bytes. 
+ +Additional data types may be defined as :ref:`extensions` +which MAY have variable sized data types. Note that the Zarr specification is intended to enable communication of data between a variety of computing environments. The native byte @@ -919,7 +908,7 @@ defined in this specification, the identifier is a simple ASCII string. However, extensions may use any JSON value to identify a data type. -Additionally to these base types, an implementation should also handle the +In addition to these base types, an implementation should also handle the raw/opaque pass-through type designated by the lower-case letter ``r`` followed by the number of bits, multiple of 8. For example, ``r8``, ``r16``, and ``r24`` should be understood as fall-back types of respectively 1, 2, and 3 byte length. @@ -955,8 +944,8 @@ chunk, and there are no gaps or overlaps between chunks. In general there are different possible types of grids. Those defined under the core specification can be found under :ref:`chunk-grid-list`. -:ref:`Extensions` may define other grid -types, such as rectilinear grids where chunks are still +Additional grid types MAY be defined as :ref:`extensions`, +such as rectilinear grids where chunks are still hyperrectangles but do not all share the same shape. A grid type must also define rules for constructing an identifier for @@ -1124,7 +1113,7 @@ the following procedure: Core codecs ----------- -This spec defines a set of well-known codecs ("core codecs") which all Zarr implementations SHOULD implement in +This specification defines a set of codecs ("core codecs") which all Zarr implementations SHOULD implement in order to ensure a minimal level of interoperability between Zarr implementations. The list of core codecs is part of the Zarr v3 specification. 
Changes to the list of core codecs MUST be made via the same protocol used for @@ -1142,7 +1131,8 @@ To allow for flexibility to define and implement new codecs, the list of codecs defined for an array MAY contain codecs which are defined in separate specifications. In order to refer to codecs in array metadata documents, each codec must have a unique identifier, which is either -a known "raw name" or as a URI as defined under :ref:`extensions_section`. +a known "`raw name `_" or +a "`URI-based name `_" as defined under :ref:`extensions_section`. For ease of discovery, it is recommended that codec specifications are contributed to the registry of extensions @@ -1523,8 +1513,8 @@ Storage transformers may be stacked to combine different functionalities: Extensions ========== -This section describes how additional functionality can defined -for Zarr datasets by extended the `metadata documents`_. +This section describes how additional functionality can be defined +for Zarr datasets by the `metadata documents`_. 
Extension points ---------------- @@ -1532,17 +1522,15 @@ Extension points Different types of extensions can exist and they can be grouped as follows: =========== ======================= ====================================== ================================ -level extension metadata core definitions +node_type extension point metadata definition list of core extensions =========== ======================= ====================================== ================================ -array data type `array-metadata-data-type`_ :ref:`data-types-list` -array chunk grid `array-metadata-chunk-grid`_ :ref:`chunk-grid-list` -array chunk key encoding `array-metadata-chunk-key-encoding`_ :ref:`chunk-key-encoding-list` -array codecs `array-metadata-codecs`_ :ref:`codecs-list` -array storage transformer `array-metadata-storage-transformers`_ :ref:`storage-transformers-list` +array data type `array-metadata-data-type`_ :ref:`data-types-list` +array chunk grid `array-metadata-chunk-grid`_ :ref:`chunk-grid-list` +array chunk key encoding `array-metadata-chunk-key-encoding`_ :ref:`chunk-key-encoding-list` +array codecs `array-metadata-codecs`_ :ref:`codecs-list` +array storage transformer `array-metadata-storage-transformers`_ :ref:`storage-transformers-list` =========== ======================= ====================================== ================================ -If such extension points are used by groups or arrays, they are required. - New extension points may be proposed to the Zarr community through the ZEP process. See `ZEP 0 `_ for more information. @@ -1550,67 +1538,102 @@ Extension definition -------------------- Extensions are defined in `metadata documents`_ either as objects or as -short-hand names. If using an objection definition, the following pattern -MUST be followed:: +short-hand names. If using an object definition, the member ``name`` +MUST be a plain string which conforms to :ref:`extension name `. 
+Optionally, the member ``configuration`` MAY be present but if so MUST be +an object. + +For example:: { - "name": "", + "name": "", # "raw name" or URL-based name "configuration": { ... } # optional } +Instead of extension objects, short-hand names MAY be used if no +configuration metadata is required. They are equivalent to extension +objects with just a `name` key. + If such an object is present, the field `must_understand` is implicitly set to -`True` and an object can explicitly set `must_understand=False` if -implementations can ignore its presence, following the current guidelines in -the v3 specification. +`True` and an object MAY explicitly set `must_understand=False` if +implementations can ignore its presence. -Instead of extension objects, short-hand names may continue to be used if no -configuration metadata is required. They would be equivalent to extension -objects with just a `name` key. +An implementation MUST fail to open Zarr groups or arrays if any +metadata fields are present which (a) the +implementation does not recognize and (b) are not explicitly +set to ``"must_understand": false``. + +`must_understand=False` is not supported for the following extension points: +data type, chunk grid, and chunk key encoding. Extension naming ---------------- -The identifier used in the `name` field of the extension definition can follow one of two forms: +The `name` field of an extension can take two forms: **raw names** and **URI-based names**. + +.. _extension-naming-raw-names: + +Raw names +^^^^^^^^^ + +Raw names are centrally registered names which can be used without prefix. -1. **Raw names** MUST be assigned within a central repository and follow the - compatibility and versioning v3 `stability policy`_. - The name assignment is managed through the `zarr-extensions`_ - Github repository, where each extension is - listed and either contains a spec document or links to a spec document. - Names are never unassigned or reassigned. 
The Zarr Steering Council or by delegation a - maintainer team reserves the right to refuse name assignment at its own - discretion. +Raw names MUST be assigned within a central repository. +Raw names are unique and immutable. +Raw names MUST be composed of lower case letters a-z, numerals 0-9, underscores, dashes, and dots. +Raw name assignment is managed through the `zarr-extensions`_ +GitHub repository, where extensions and their specification are listed. +The Zarr Steering Council or by delegation a +maintainer team reserves the right to refuse name assignment at its own +discretion. - - **Example:** ``zstd`` - - **Acceptd regex:** ``^[a-z0-9-_.]+$`` +- **Example:** ``zstd`` +- **Accepted regex:** ``^[a-z0-9-_.]+$`` -2. **URI-based names** can be used by anyone without further coordination - though the assumption is that users reasonably "own" the URI. The URL SHOULD - resolve to a human-readable explanation of the extension, but - implementations SHOULD NOT attempt to resolve the URL during processing. - There are no guarantees in terms of versioning or compatibility. However, - preserving backwards-compatibility is strongly encouraged. See the - [versioning section](#Versioning-and-spec-evolution) below. +.. _extension-naming-url-based-names: - - **Example:** ``https://example.com/zarr3/consolidated-metadata`` - - **Accepted regex:** ``^https?:\/\/[^/?#]+[^?#]*$`` +URL-based names +^^^^^^^^^^^^^^^ + +URL-based names delegate name registration to the domain name system (DNS). + +URL-based names MAY be used by any extension without further coordination. +Entities defining a URL-based name SHOULD have appropriate +authority over the URL. A persistent redirecting URL like PURL MAY be used. +URLs have been chosen due to their potential for being self-documenting. 
+While a URL SHOULD resolve to a human-readable +explanation of the extension, preferably including a JSON schema definition +of the extension metadata, implementations are not expected to resolve +URLs during processing. + +- **Example:** ``https://example.com/zarr3/consolidated-metadata`` +- **Accepted regex:** ``^https?:\/\/[^/?#]+[^?#]*$`` + +Extension versioning +-------------------- + +Extensions with **raw names** SHOULD follow the +compatibility and versioning v3 `stability policy`_. + +For extensions with **URL-based names**, there are no guarantees in terms of +versioning or compatibility. However, preserving backwards-compatibility is +strongly encouraged. Extension example ----------------- -The following example represents an Array showing many of the proposed changes -described above:: +The following example of array metadata demonstrates these extension naming schemes:: { "zarr_format": 3, - "data_type": "https://example.com/zarr/string", // URI-based name, short-hand name + "data_type": "https://example.com/zarr/string", // URL-based name, short-hand name "chunk_key_encoding": { "name": "default", // core "configuration": { "separator": "." } }, "codecs": [ { - "name": "https://numcodecs.dev/vlen-utf8" // URI-based name + "name": "https://numcodecs.dev/vlen-utf8" // URL-based name }, { "name": "zstd", // raw name @@ -1630,14 +1653,14 @@ described above:: Extension specifications ------------------------ -There is no strict requirement for extensions to have a formal specification. -However, for adoption in the community it is STRONGLY RECOMMENDED to write a -specification. +Extensions SHOULD have a published specification. A published specification +facilitates multiple implementations of an extension. For extensions with raw names, the `zarr-extensions`_ repository -SHOULD either contain the specification directly or link to the official location. 
-For extensions with URI-based names, it is RECOMMENDED to publish the specification -under the URI of the extension. Additionally, URI-based extensions MAY also register +SHOULD either contain the specification or link to it. + +For extensions with URL-based names, it is RECOMMENDED that the URL resolve to +a specification of the extension. Additionally, URL-based extensions MAY also register themselves under the `zarr-extensions`_ repository for better discovery. Implementation Notes @@ -1709,7 +1732,7 @@ by time. - Clarification of extensions. `PR #330 `_. With this change, it is now possible to register new names for extension objects as well as use - URI. + URL. Changes after Provisional Acceptance ------------------------------------ diff --git a/docs/v3/data-types.rst b/docs/v3/data-types.rst index aa331815..4ce9adcf 100644 --- a/docs/v3/data-types.rst +++ b/docs/v3/data-types.rst @@ -4,9 +4,8 @@ Data Types ========== -The following core data types are defined by the specification. -Being listed below does not imply that a data type is -required to be implemented by implementations. +The following section specifies data types which SHOULD +be implemented by all implementations. Core data types --------------- diff --git a/docs/v3/stores.rst b/docs/v3/stores.rst index 6b377a34..67ab4733 100644 --- a/docs/v3/stores.rst +++ b/docs/v3/stores.rst @@ -4,8 +4,8 @@ Stores ====== -The following documents specify core stores which are defined the specification. -Being listed below does not imply that a store is required to be implemented by all implementations. +The following documents specify stores which SHOULD +be implemented by all implementations. .. 
toctree:: :glob: From 95eef7679c809c4becd987a685d19d547c837adc Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Thu, 20 Feb 2025 17:52:14 +0100 Subject: [PATCH 25/64] Move all v3 subdocs to index.rst --- docs/conf.py | 7 +++++++ docs/v3/codecs/blosc/{v1.0.rst => index.rst} | 2 +- docs/v3/codecs/bytes/{v1.0.rst => index.rst} | 2 +- docs/v3/codecs/crc32c/{v1.0.rst => index.rst} | 2 +- docs/v3/codecs/gzip/{v1.0.rst => index.rst} | 2 +- docs/v3/codecs/sharding-indexed/{v1.0.rst => index.rst} | 2 +- docs/v3/codecs/transpose/{v1.0.rst => index.rst} | 2 +- docs/v3/stores/filesystem/{v1.0.rst => index.rst} | 2 +- 8 files changed, 14 insertions(+), 7 deletions(-) rename docs/v3/codecs/blosc/{v1.0.rst => index.rst} (99%) rename docs/v3/codecs/bytes/{v1.0.rst => index.rst} (98%) rename docs/v3/codecs/crc32c/{v1.0.rst => index.rst} (98%) rename docs/v3/codecs/gzip/{v1.0.rst => index.rst} (98%) rename docs/v3/codecs/sharding-indexed/{v1.0.rst => index.rst} (99%) rename docs/v3/codecs/transpose/{v1.0.rst => index.rst} (98%) rename docs/v3/stores/filesystem/{v1.0.rst => index.rst} (99%) diff --git a/docs/conf.py b/docs/conf.py index 67c48a4c..1ed61a03 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -88,4 +88,11 @@ redirects = { "index": "specs.html", "v3/core/v3.0.html": "./index.html", + "v3/codecs/blosc/v1.0.rst": "./index.html", + "v3/codecs/bytes/v1.0.rst": "./index.html", + "v3/codecs/crc32c/v1.0.rst": "./index.html", + "v3/codecs/gzip/v1.0.rst": "./index.html", + "v3/codecs/sharding-indexed/v1.0.rst": "./index.html", + "v3/codecs/transpose/v1.0.rst": "./index.html", + "v3/stores/filesystem/v1.0.rst": "./index.html", } diff --git a/docs/v3/codecs/blosc/v1.0.rst b/docs/v3/codecs/blosc/index.rst similarity index 99% rename from docs/v3/codecs/blosc/v1.0.rst rename to docs/v3/codecs/blosc/index.rst index 44f1a655..107cf03b 100644 --- a/docs/v3/codecs/blosc/v1.0.rst +++ b/docs/v3/codecs/blosc/index.rst @@ -11,7 +11,7 @@ Corresponding ZEP: Issue tracking: `GitHub issues `_ Suggest 
an edit for this spec: - `GitHub editor `_ + `GitHub editor `_ Copyright 2020-Present Zarr core development team. This work is licensed under a `Creative Commons Attribution 3.0 Unported License diff --git a/docs/v3/codecs/bytes/v1.0.rst b/docs/v3/codecs/bytes/index.rst similarity index 98% rename from docs/v3/codecs/bytes/v1.0.rst rename to docs/v3/codecs/bytes/index.rst index ec3df685..e204ed32 100644 --- a/docs/v3/codecs/bytes/v1.0.rst +++ b/docs/v3/codecs/bytes/index.rst @@ -13,7 +13,7 @@ Corresponding ZEP: Issue tracking: `GitHub issues `_ Suggest an edit for this spec: - `GitHub editor `_ + `GitHub editor `_ Copyright 2020-Present Zarr core development team. This work is licensed under a `Creative Commons Attribution 3.0 Unported License diff --git a/docs/v3/codecs/crc32c/v1.0.rst b/docs/v3/codecs/crc32c/index.rst similarity index 98% rename from docs/v3/codecs/crc32c/v1.0.rst rename to docs/v3/codecs/crc32c/index.rst index 25bfdd86..a548fe46 100644 --- a/docs/v3/codecs/crc32c/v1.0.rst +++ b/docs/v3/codecs/crc32c/index.rst @@ -15,7 +15,7 @@ Corresponding ZEP: Issue tracking: `GitHub issues `_ Suggest an edit for this spec: - `GitHub editor `_ + `GitHub editor `_ Copyright 2022-Present `Zarr core development team `_. This work diff --git a/docs/v3/codecs/gzip/v1.0.rst b/docs/v3/codecs/gzip/index.rst similarity index 98% rename from docs/v3/codecs/gzip/v1.0.rst rename to docs/v3/codecs/gzip/index.rst index a62a2956..e69b3520 100644 --- a/docs/v3/codecs/gzip/v1.0.rst +++ b/docs/v3/codecs/gzip/index.rst @@ -11,7 +11,7 @@ Corresponding ZEP: Issue tracking: `GitHub issues `_ Suggest an edit for this spec: - `GitHub editor `_ + `GitHub editor `_ Copyright 2020-Present Zarr core development team. 
This work is licensed under a `Creative Commons Attribution 3.0 Unported License diff --git a/docs/v3/codecs/sharding-indexed/v1.0.rst b/docs/v3/codecs/sharding-indexed/index.rst similarity index 99% rename from docs/v3/codecs/sharding-indexed/v1.0.rst rename to docs/v3/codecs/sharding-indexed/index.rst index e7379e28..c2366bdd 100644 --- a/docs/v3/codecs/sharding-indexed/v1.0.rst +++ b/docs/v3/codecs/sharding-indexed/index.rst @@ -15,7 +15,7 @@ Corresponding ZEP: Issue tracking: `GitHub issues `_ Suggest an edit for this spec: - `GitHub editor `_ + `GitHub editor `_ Copyright 2022-Present `Zarr core development team `_. This work diff --git a/docs/v3/codecs/transpose/v1.0.rst b/docs/v3/codecs/transpose/index.rst similarity index 98% rename from docs/v3/codecs/transpose/v1.0.rst rename to docs/v3/codecs/transpose/index.rst index b01e6254..5e01764a 100644 --- a/docs/v3/codecs/transpose/v1.0.rst +++ b/docs/v3/codecs/transpose/index.rst @@ -13,7 +13,7 @@ Corresponding ZEP: Issue tracking: `GitHub issues `_ Suggest an edit for this spec: - `GitHub editor `_ + `GitHub editor `_ Copyright 2020-Present Zarr core development team. This work is licensed under a `Creative Commons Attribution 3.0 Unported License diff --git a/docs/v3/stores/filesystem/v1.0.rst b/docs/v3/stores/filesystem/index.rst similarity index 99% rename from docs/v3/stores/filesystem/v1.0.rst rename to docs/v3/stores/filesystem/index.rst index 3408c3cd..261d2dd8 100644 --- a/docs/v3/stores/filesystem/v1.0.rst +++ b/docs/v3/stores/filesystem/index.rst @@ -11,7 +11,7 @@ Corresponding ZEP: Issue tracking: `GitHub issues `_ Suggest an edit for this spec: - `GitHub editor `_ + `GitHub editor `_ Copyright 2019-Present Zarr core development team. 
This work is licensed under a `Creative Commons Attribution 3.0 Unported License From fa0bdf624e1a07e25a57d6db4cf3e58c02a38fa8 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Thu, 20 Feb 2025 18:12:44 +0100 Subject: [PATCH 26/64] Minor correction to a ref --- docs/v3/core/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 3b56b858..e5e83780 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1539,7 +1539,7 @@ Extension definition Extensions are defined in `metadata documents`_ either as objects or as short-hand names. If using an objection definition, the member ``name`` -MUST be a plain string which conforms to :refs:`extension name `. +MUST be a plain string which conforms to :ref:`extension name `. Optionally, the member ``configuration`` MAY be present but if so MUST be an object. From 3d867755717e837bcd5d4dd705b9f1c6842f3805 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Thu, 20 Feb 2025 18:15:32 +0100 Subject: [PATCH 27/64] Add chunk key and grid subdocuments --- docs/v3/chunk-grid.rst | 74 +---------- docs/v3/chunk-grids/regular-grid/index.rst | 117 ++++++++++++++++++ docs/v3/chunk-key-encoding.rst | 64 +--------- docs/v3/chunk-key-encodings/default/index.rst | 70 +++++++++++ docs/v3/chunk-key-encodings/v2/index.rst | 71 +++++++++++ docs/v3/core/index.rst | 14 ++- 6 files changed, 279 insertions(+), 131 deletions(-) create mode 100644 docs/v3/chunk-grids/regular-grid/index.rst create mode 100644 docs/v3/chunk-key-encodings/default/index.rst create mode 100644 docs/v3/chunk-key-encodings/v2/index.rst diff --git a/docs/v3/chunk-grid.rst b/docs/v3/chunk-grid.rst index e43a7326..ccdaeeb8 100644 --- a/docs/v3/chunk-grid.rst +++ b/docs/v3/chunk-grid.rst @@ -7,75 +7,13 @@ Chunk Grid The following documents specify chunk grids which SHOULD be implemented by all implementations. 
-Regular grids -------------- - -A regular grid is a type of grid where an array is divided into chunks -such that each chunk is a hyperrectangle of the same shape. The -dimensionality of the grid is the same as the dimensionality of the -array. Each chunk in the grid can be addressed by a tuple of positive -integers (`k`, `j`, `i`, ...) corresponding to the indices of the -chunk along each dimension. - -The origin element of a chunk has coordinates in the array space (`k` * -`dz`, `j` * `dy`, `i` * `dx`, ...) where (`dz`, `dy`, `dx`, ...) are -the chunk sizes along each dimension. -Thus the origin element of the chunk at grid index (0, 0, 0, -...) is at coordinate (0, 0, 0, ...) in the array space, i.e., the -grid is aligned with the origin of the array. If the length of any -array dimension is not perfectly divisible by the chunk length along -the same dimension, then the grid will overhang the edge of the array -space. - -The shape of the chunk grid will be (ceil(`z` / `dz`), ceil(`y` / -`dy`), ceil(`x` / `dx`), ...) where (`z`, `y`, `x`, ...) is the array -shape, "/" is the division operator and "ceil" is the ceiling -function. For example, if a 3 dimensional array has shape (10, 200, -3000), and has chunk shape (5, 20, 400), then the shape of the chunk -grid will be (2, 10, 8), meaning that there will be 2 chunks along the -first dimension, 10 along the second dimension, and 8 along the third -dimension. - -.. list-table:: Regular Grid Example - :header-rows: 1 - - * - Array Shape - - Chunk Shape - - Chunk Grid Shape - - Notes - * - (10, 200, 3000) - - (5, 20, 400) - - (2, 10, 8) - - The grid does overhang the edge of the array on the 3rd dimension. - -An element of an array with coordinates (`c`, `b`, `a`, ...) will -occur within the chunk at grid index (`c` // `dz`, `b` // `dy`, `a` // -`dx`, ...), where "//" is the floor division operator. The element -will have coordinates (`c` % `dz`, `b` % `dy`, `a` % `dx`, ...) 
within -that chunk, where "%" is the modulo operator. For example, if a -3 dimensional array has shape (10, 200, 3000), and has chunk shape -(5, 20, 400), then the element of the array with coordinates (7, 150, 900) -is contained within the chunk at grid index (1, 7, 2) and has coordinates -(2, 10, 100) within that chunk. - -The store key corresponding to a given grid cell is determined based on the -:ref:`array-metadata-chunk-key-encoding` member of the :ref:`array-metadata`. - -Note that this specification does not consider the case where the -chunk grid and the array space are not aligned at the origin vertices -of the array and the chunk at grid index (0, 0, 0, ...). However, -extensions may define variations on the regular grid type -such that the grid indices may include negative integers, and the -origin element of the array may occur at an arbitrary position within -any chunk, which is required to allow arrays to be extended by an -arbitrary length in a "negative" direction along any dimension. - -.. note:: Chunks at the border of an array always have the full chunk size, even when - the array only covers parts of it. For example, having an array with ``"shape": [30, 30]`` and - ``"chunk_shape": [16, 16]``, the chunk ``0,1`` would also contain unused values for the indices - ``0-16, 30-31``. When writing such chunks it is recommended to use the current fill value - for elements outside the bounds of the array. +.. toctree:: + :glob: + :maxdepth: 1 + :titlesonly: + :caption: Contents: + chunk-grids/*/* Extensions ---------- diff --git a/docs/v3/chunk-grids/regular-grid/index.rst b/docs/v3/chunk-grids/regular-grid/index.rst new file mode 100644 index 00000000..e167fd38 --- /dev/null +++ b/docs/v3/chunk-grids/regular-grid/index.rst @@ -0,0 +1,117 @@ + +.. 
_regulargrid-chunkgrid-v1: + +====================================== + Regular grid chunk grid (version 1.0) +====================================== + + **Editor's draft 26 July 2019** + +Specification URI: + https://zarr-specs.readthedocs.io/en/latest/v3/chunk-grids/regular-grid/v1.0.html +Corresponding ZEP: + `ZEP0001 — Zarr specification version 3 `_ +Issue tracking: + `GitHub issues `_ +Suggest an edit for this spec: + `GitHub editor `_ + +Copyright 2020-Present Zarr core development team. This work +is licensed under a `Creative Commons Attribution 3.0 Unported License +`_. + +---- + +Abstract +======== + +A regular grid is a type of grid where an array is divided into chunks +such that each chunk is a hyperrectangle of the same shape. The +dimensionality of the grid is the same as the dimensionality of the +array. Each chunk in the grid can be addressed by a tuple of positive +integers (`k`, `j`, `i`, ...) corresponding to the indices of the +chunk along each dimension. + +Description +=========== + +The origin element of a chunk has coordinates in the array space (`k` * +`dz`, `j` * `dy`, `i` * `dx`, ...) where (`dz`, `dy`, `dx`, ...) are +the chunk sizes along each dimension. +Thus the origin element of the chunk at grid index (0, 0, 0, +...) is at coordinate (0, 0, 0, ...) in the array space, i.e., the +grid is aligned with the origin of the array. If the length of any +array dimension is not perfectly divisible by the chunk length along +the same dimension, then the grid will overhang the edge of the array +space. + +The shape of the chunk grid will be (ceil(`z` / `dz`), ceil(`y` / +`dy`), ceil(`x` / `dx`), ...) where (`z`, `y`, `x`, ...) is the array +shape, "/" is the division operator and "ceil" is the ceiling +function. 
For example, if a 3 dimensional array has shape (10, 200, +3000), and has chunk shape (5, 20, 400), then the shape of the chunk +grid will be (2, 10, 8), meaning that there will be 2 chunks along the +first dimension, 10 along the second dimension, and 8 along the third +dimension. + +.. list-table:: Regular Grid Example + :header-rows: 1 + + * - Array Shape + - Chunk Shape + - Chunk Grid Shape + - Notes + * - (10, 200, 3000) + - (5, 20, 400) + - (2, 10, 8) + - The grid does overhang the edge of the array on the 3rd dimension. + +An element of an array with coordinates (`c`, `b`, `a`, ...) will +occur within the chunk at grid index (`c` // `dz`, `b` // `dy`, `a` // +`dx`, ...), where "//" is the floor division operator. The element +will have coordinates (`c` % `dz`, `b` % `dy`, `a` % `dx`, ...) within +that chunk, where "%" is the modulo operator. For example, if a +3 dimensional array has shape (10, 200, 3000), and has chunk shape +(5, 20, 400), then the element of the array with coordinates (7, 150, 900) +is contained within the chunk at grid index (1, 7, 2) and has coordinates +(2, 10, 100) within that chunk. + +The store key corresponding to a given grid cell is determined based on the +:ref:`array-metadata-chunk-key-encoding` member of the :ref:`array-metadata`. + +Note that this specification does not consider the case where the +chunk grid and the array space are not aligned at the origin vertices +of the array and the chunk at grid index (0, 0, 0, ...). However, +extensions may define variations on the regular grid type +such that the grid indices may include negative integers, and the +origin element of the array may occur at an arbitrary position within +any chunk, which is required to allow arrays to be extended by an +arbitrary length in a "negative" direction along any dimension. + +.. note:: Chunks at the border of an array always have the full chunk size, even when + the array only covers parts of it. 
For example, having an array with ``"shape": [30, 30]`` and + ``"chunk_shape": [16, 16]``, the chunk ``0,1`` would also contain unused values for the indices + ``0-16, 30-31``. When writing such chunks it is recommended to use the current fill value + for elements outside the bounds of the array. + + + +Status of this document +======================= + +ZEP0001 was accepted on May 15th, 2023 via https://github.com/zarr-developers/zarr-specs/issues/227. + + +Document conventions +==================== + +Conformance requirements are expressed with a combination of +descriptive assertions and [RFC2119]_ terminology. The key words +"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", +"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative +parts of this document are to be interpreted as described in +[RFC2119]_. However, for readability, these words do not appear in all +uppercase letters in this specification. + +All of the text of this specification is normative except sections +explicitly marked as non-normative, examples, and notes. Examples in diff --git a/docs/v3/chunk-key-encoding.rst b/docs/v3/chunk-key-encoding.rst index a2f94a3f..568a5dd5 100644 --- a/docs/v3/chunk-key-encoding.rst +++ b/docs/v3/chunk-key-encoding.rst @@ -7,65 +7,13 @@ Chunk Key Encodings The following documents specify chunk key encodings which SHOULD be implemented by all implementations. -Core chunk key encodings ------------------------- +.. toctree:: + :glob: + :maxdepth: 1 + :titlesonly: + :caption: Contents: -The following encodings are defined: - -``default`` -^^^^^^^^^^^ - -The ``configuration`` object may contain one optional member, -``separator``, which must be either ``"/"`` or ``"."``. If not specified, -``separator`` defaults to ``"/"``. - -The key for a chunk with grid index (``k``, ``j``, ``i``, ...) 
is -formed by taking the initial prefix ``c``, and appending for each dimension: - -- the ``separator`` character, followed by, - -- the ASCII decimal string representation of the chunk index within that dimension. - -For example, in a 3 dimensional array, with a separator of ``/`` the identifier -for the chunk at grid index (1, 23, 45) is the string ``"c/1/23/45"``. With a -separator of ``.``, the identifier is the string ``"c.1.23.45"``. The initial prefix -``c`` ensures that metadata documents and chunks have separate prefixes. - -.. note:: A main difference with spec v2 is that the default chunk separator - changed from ``.`` to ``/``, as in N5. This decreases the maximum number of - items in hierarchical stores like directory stores. - -.. note:: Arrays may have 0 dimensions (when for example representing scalars), - in which case the coordinate of a chunk is the empty tuple, and the chunk key - will consist of the string ``c``. - -``v2`` -^^^^^^ - -The ``configuration`` object may contain one optional member, -``separator``, which must be either ``"/"`` or ``"."``. If not specified, -``separator`` defaults to ``"."``. - -The identifier for chunk with at least one dimension is formed by -concatenating for each dimension: - - - the ASCII decimal string representation of the chunk index within that - dimension, followed by - - - the ``separator`` character, except that it is omitted for the last - dimension. - -For example, in a 3 dimensional array, with a separator of ``.`` the identifier -for the chunk at grid index (1, 23, 45) is the string ``"1.23.45"``. With a -separator of ``/``, the identifier is the string ``"1/23/45"``. - -For chunk grids with 0 dimensions, the single chunk has the key ``"0"``. - -.. note:: - - This encoding is intended only to allow existing v2 arrays to be - converted to v3 without having to rename chunks. It is not recommended - to be used when writing new arrays. 
+ chunk-key-encodings/*/* Extensions ---------- diff --git a/docs/v3/chunk-key-encodings/default/index.rst b/docs/v3/chunk-key-encodings/default/index.rst new file mode 100644 index 00000000..d17777d7 --- /dev/null +++ b/docs/v3/chunk-key-encodings/default/index.rst @@ -0,0 +1,70 @@ +.. _default-chunkkeyencoding-v1: + +========================================= + Default chunk key encoding (version 1.0) +========================================= + + **Editor's draft 26 July 2019** + +Specification URI: + https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/default/v1.0.html +Corresponding ZEP: + `ZEP0001 — Zarr specification version 3 `_ +Issue tracking: + `GitHub issues `_ +Suggest an edit for this spec: + `GitHub editor `_ + +Copyright 2020-Present Zarr core development team. This work +is licensed under a `Creative Commons Attribution 3.0 Unported License +`_. + +---- + +Description +=========== + +The ``configuration`` object may contain one optional member, +``separator``, which must be either ``"/"`` or ``"."``. If not specified, +``separator`` defaults to ``"/"``. + +The key for a chunk with grid index (``k``, ``j``, ``i``, ...) is +formed by taking the initial prefix ``c``, and appending for each dimension: + +- the ``separator`` character, followed by, + +- the ASCII decimal string representation of the chunk index within that dimension. + +For example, in a 3 dimensional array, with a separator of ``/`` the identifier +for the chunk at grid index (1, 23, 45) is the string ``"c/1/23/45"``. With a +separator of ``.``, the identifier is the string ``"c.1.23.45"``. The initial prefix +``c`` ensures that metadata documents and chunks have separate prefixes. + +.. note:: A main difference with spec v2 is that the default chunk separator + changed from ``.`` to ``/``, as in N5. This decreases the maximum number of + items in hierarchical stores like directory stores. + +.. 
note:: Arrays may have 0 dimensions (when for example representing scalars), + in which case the coordinate of a chunk is the empty tuple, and the chunk key + will consist of the string ``c``. + + +Status of this document +======================= + +ZEP0001 was accepted on May 15th, 2023 via https://github.com/zarr-developers/zarr-specs/issues/227. + + +Document conventions +==================== + +Conformance requirements are expressed with a combination of +descriptive assertions and [RFC2119]_ terminology. The key words +"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", +"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative +parts of this document are to be interpreted as described in +[RFC2119]_. However, for readability, these words do not appear in all +uppercase letters in this specification. + +All of the text of this specification is normative except sections +explicitly marked as non-normative, examples, and notes. Examples in diff --git a/docs/v3/chunk-key-encodings/v2/index.rst b/docs/v3/chunk-key-encodings/v2/index.rst new file mode 100644 index 00000000..9788c044 --- /dev/null +++ b/docs/v3/chunk-key-encodings/v2/index.rst @@ -0,0 +1,71 @@ +.. _v2-chunkkeyencoding-v1: + +========================================= + v2 chunk key encoding (version 1.0) +========================================= + + **Editor's draft 26 July 2019** + +Specification URI: + https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/v2/v1.0.html +Corresponding ZEP: + `ZEP0001 — Zarr specification version 3 `_ +Issue tracking: + `GitHub issues `_ +Suggest an edit for this spec: + `GitHub editor `_ + +Copyright 2020-Present Zarr core development team. This work +is licensed under a `Creative Commons Attribution 3.0 Unported License +`_. + +---- + +Description +=========== + +The ``configuration`` object may contain one optional member, +``separator``, which must be either ``"/"`` or ``"."``. 
If not specified, +``separator`` defaults to ``"."``. + +The identifier for chunk with at least one dimension is formed by +concatenating for each dimension: + + - the ASCII decimal string representation of the chunk index within that + dimension, followed by + + - the ``separator`` character, except that it is omitted for the last + dimension. + +For example, in a 3 dimensional array, with a separator of ``.`` the identifier +for the chunk at grid index (1, 23, 45) is the string ``"1.23.45"``. With a +separator of ``/``, the identifier is the string ``"1/23/45"``. + +For chunk grids with 0 dimensions, the single chunk has the key ``"0"``. + +.. note:: + + This encoding is intended only to allow existing v2 arrays to be + converted to v3 without having to rename chunks. It is not recommended + to be used when writing new arrays. + + +Status of this document +======================= + +ZEP0001 was accepted on May 15th, 2023 via https://github.com/zarr-developers/zarr-specs/issues/227. + + +Document conventions +==================== + +Conformance requirements are expressed with a combination of +descriptive assertions and [RFC2119]_ terminology. The key words +"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", +"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative +parts of this document are to be interpreted as described in +[RFC2119]_. However, for readability, these words do not appear in all +uppercase letters in this specification. + +All of the text of this specification is normative except sections +explicitly marked as non-normative, examples, and notes. Examples in diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index e5e83780..8e1014f0 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -501,7 +501,7 @@ mandatory names: The data type of the Zarr array. ``data_type`` is an :ref:`extension point` - and MUST conform to the :ref:`extension definition`. 
+ and MUST conform to the :ref:`extension-definition`. If the data type is defined in this specification, then the value must be the data type @@ -516,7 +516,7 @@ mandatory names: The chunk grid of the Zarr array. ``chunk_grid`` is an :ref:`extension point` - and MUST conform to the :ref:`extension definition`. + and MUST conform to the :ref:`extension-definition`. If the chunk grid is a regular chunk grid as defined in this specification, then the value must be an object with the @@ -539,7 +539,7 @@ mandatory names: store. ``chunk_key_encoding`` is an :ref:`extension point` - and MUST conform to the :ref:`extension definition`. + and MUST conform to the :ref:`extension-definition`. .. _array-metadata-fill-value: @@ -615,7 +615,7 @@ mandatory names: Specifies a list of codecs to be used for encoding and decoding chunks. Each codec is an :ref:`extension point` - and MUST conform to the :ref:`extension definition`. + and MUST conform to the :ref:`extension-definition`. Because ``codecs`` MUST contain an ``array -> bytes`` codec, the list cannot be empty (See :ref:`codecs `). @@ -651,7 +651,7 @@ The following members are optional: Specifies a list of `storage transformers`_. Each storage transformer is an :ref:`extension point` - and MUST conform to the :ref:`extension definition`. + and MUST conform to the :ref:`extension-definition`. When the ``storage_transformers`` name is absent no storage transformer is used, same for an empty list. @@ -1534,6 +1534,8 @@ array storage transformer `array-metadata-storage-transformers`_ :ref: New extension points may be proposed to the Zarr community through the ZEP process. See `ZEP 0 `_ for more information. +.. _extension-definition: + Extension definition -------------------- @@ -1566,6 +1568,8 @@ set to ``"must_understand": false``. `must_understand=False` is not supported for the following extension points: data type, chunk grid, and chunk key encoding. +.. 
_extension-naming: + Extension naming ---------------- From 7cfa69ba7fd3d144fd6c0a9612866e9f62c63842 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Thu, 20 Feb 2025 19:14:10 +0100 Subject: [PATCH 28/64] Clarify ext pts vs exts --- docs/v3/core/index.rst | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 8e1014f0..8ff0545c 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -349,18 +349,17 @@ terminology for a use case of reading from an array: .. image:: terminology-read.excalidraw.png :width: 600 -*Extension points* +*Extension point* - Locations within a `metadata document_` where extension-related - metadata can be found. Current extension points are listed in the core spec, - e.g. `codecs`, `data_type`. See :ref:`extension points ` below. + A field in a `metadata document`_ that can be extended to allow values + not defined in this specification. + See :ref:`extension points ` below. -*Extensions* +*Extension* - Components defined in a `metadata document`_ to - configure how metadata are interpreted by implementations. These - components include codecs, data types, chunk key encodings, chunk grids and - storage transformers. See :ref:`extension points ` below. + An implementation of an extension point which can be referenced + by :ref:`name `. + See the linked lists of extensions under :ref:`extension points ` below. *Core* @@ -677,11 +676,11 @@ The following members are optional: .. _array-metadata-extensions: -Extensions -^^^^^^^^^^ +Unknown +^^^^^^^ All other keys found in the metadata object MUST be interpreted -following the `extensions_section`_. +following the :ref:`Extensions section `. Example ^^^^^^^ @@ -804,11 +803,11 @@ Optional keys: pairs, where the key must be a string and the value can be an arbitrary JSON literal. Intended to allow storage of arbitrary user metadata. 
-Extensions -^^^^^^^^^^ +Unknown +^^^^^^^ All other keys found in the metadata object MUST be interpreted -following the `extensions_section`_. +following the :ref:`Extensions section `. Example ^^^^^^^ @@ -1539,7 +1538,7 @@ process. See `ZEP 0 `_ for more infor Extension definition -------------------- -Extensions are defined in `metadata documents`_ either as objects or as +In `metadata documents`_, extensions can be encoded either as objects or as short-hand names. If using an objection definition, the member ``name`` MUST be a plain string which conforms to :ref:`extension name `. Optionally, the member ``configuration`` MAY be present but if so MUST be From 9fdbd81c9916a82d5a72ca62190870cc8036b2ba Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Fri, 21 Feb 2025 15:22:34 +0100 Subject: [PATCH 29/64] Clarify version policy applies independently to each page --- docs/v3/core/index.rst | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 8ff0545c..218c513d 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -132,10 +132,8 @@ with implementation B. Therefore, data is only marked with the respective major version, unknown features are auto-discovered via the metadata document. -Notably, this excludes extensions such as codecs, data types, chunk grids -and storage transformers from the compatibility of the core specification, as -well as store support. However, extensions and stores are also RECOMMENDED to -follow this stability policy. +:ref:`Extensions` defined in subpages of this specification +follow the same stability policy but do so with their own version number. 
Document conventions ==================== From 7eef2b30e6d7856bfe3e56e9519fe569df22034b Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Fri, 21 Feb 2025 15:22:47 +0100 Subject: [PATCH 30/64] Make terms in ext list nicer --- docs/v3/core/index.rst | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 218c513d..7c9a5d50 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1518,15 +1518,15 @@ Extension points Different types of extensions can exist and they can be grouped as follows: -=========== ======================= ====================================== ================================ -node_type extension point metadata definition list of core extensions -=========== ======================= ====================================== ================================ -array data type `array-metadata-data-type`_ :ref:`data-types-list` -array chunk grid `array-metadata-chunk-grid`_ :ref:`chunk-grid-list` -array chunk key encoding `array-metadata-chunk-key-encoding`_ :ref:`chunk-key-encoding-list` -array codecs `array-metadata-codecs`_ :ref:`codecs-list` -array storage transformer `array-metadata-storage-transformers`_ :ref:`storage-transformers-list` -=========== ======================= ====================================== ================================ +=========== ======================= ================================================================== ================================ +node_type extension point metadata definition list of core extensions +=========== ======================= ================================================================== ================================ +array data type :ref:`data-type ` :ref:`data-types-list` +array chunk grid :ref:`chunk-grid ` :ref:`chunk-grid-list` +array chunk key encoding :ref:`chunk-key-encoding ` :ref:`chunk-key-encoding-list` +array codecs :ref:`codecs ` :ref:`codecs-list` +array storage transformer 
:ref:`storage-transformers ` :ref:`storage-transformers-list` +=========== ======================= ================================================================== ================================ New extension points may be proposed to the Zarr community through the ZEP process. See `ZEP 0 `_ for more information. From 9e0f9d3b8bbd1c1b9dace4d9035ec77b6d28154b Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Fri, 21 Feb 2025 15:28:52 +0100 Subject: [PATCH 31/64] Drop versions from spec URIs With ZEP9 we've decided to manage the versions *within* the specs rather than via the name of the specs themselves. This means that there will not be multiple files in the repository, one for each minor version of the spec. --- docs/v3/chunk-grids/regular-grid/index.rst | 2 +- docs/v3/chunk-key-encodings/default/index.rst | 2 +- docs/v3/chunk-key-encodings/v2/index.rst | 2 +- docs/v3/codecs/blosc/index.rst | 2 +- docs/v3/codecs/bytes/index.rst | 2 +- docs/v3/codecs/crc32c/index.rst | 2 +- docs/v3/codecs/gzip/index.rst | 2 +- docs/v3/codecs/sharding-indexed/index.rst | 2 +- docs/v3/codecs/transpose/index.rst | 2 +- docs/v3/core/index.rst | 4 ++-- docs/v3/stores/filesystem/index.rst | 2 +- 11 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/v3/chunk-grids/regular-grid/index.rst b/docs/v3/chunk-grids/regular-grid/index.rst index e167fd38..8ad2f125 100644 --- a/docs/v3/chunk-grids/regular-grid/index.rst +++ b/docs/v3/chunk-grids/regular-grid/index.rst @@ -8,7 +8,7 @@ **Editor's draft 26 July 2019** Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/chunk-grids/regular-grid/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/chunk-grids/regular-grid/ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: diff --git a/docs/v3/chunk-key-encodings/default/index.rst b/docs/v3/chunk-key-encodings/default/index.rst index d17777d7..26f261bd 100644 --- a/docs/v3/chunk-key-encodings/default/index.rst +++ 
b/docs/v3/chunk-key-encodings/default/index.rst @@ -7,7 +7,7 @@ **Editor's draft 26 July 2019** Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/default/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/default/ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: diff --git a/docs/v3/chunk-key-encodings/v2/index.rst b/docs/v3/chunk-key-encodings/v2/index.rst index 9788c044..d75e64cf 100644 --- a/docs/v3/chunk-key-encodings/v2/index.rst +++ b/docs/v3/chunk-key-encodings/v2/index.rst @@ -7,7 +7,7 @@ **Editor's draft 26 July 2019** Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/v2/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/v2/ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: diff --git a/docs/v3/codecs/blosc/index.rst b/docs/v3/codecs/blosc/index.rst index 107cf03b..127a54e7 100644 --- a/docs/v3/codecs/blosc/index.rst +++ b/docs/v3/codecs/blosc/index.rst @@ -5,7 +5,7 @@ **Editor's draft 26 July 2019** Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/codecs/blosc/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/codecs/blosc/ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: diff --git a/docs/v3/codecs/bytes/index.rst b/docs/v3/codecs/bytes/index.rst index e204ed32..c9933c0f 100644 --- a/docs/v3/codecs/bytes/index.rst +++ b/docs/v3/codecs/bytes/index.rst @@ -7,7 +7,7 @@ **Editor's draft 26 July 2019** Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/codecs/bytes/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/codecs/bytes/ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: diff --git a/docs/v3/codecs/crc32c/index.rst b/docs/v3/codecs/crc32c/index.rst index a548fe46..4af234d5 100644 --- a/docs/v3/codecs/crc32c/index.rst +++ b/docs/v3/codecs/crc32c/index.rst @@ 
-5,7 +5,7 @@ ==================================== Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/codecs/crc32c/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/codecs/crc32c/ Editors: * Jonathan Striebel (`@jstriebel `_), Scalable Minds * Norman Rzepka (`@normanrz `_), Scalable Minds diff --git a/docs/v3/codecs/gzip/index.rst b/docs/v3/codecs/gzip/index.rst index e69b3520..d9237a40 100644 --- a/docs/v3/codecs/gzip/index.rst +++ b/docs/v3/codecs/gzip/index.rst @@ -5,7 +5,7 @@ **Editor's draft 26 July 2019** Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/codecs/gzip/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/codecs/gzip/ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: diff --git a/docs/v3/codecs/sharding-indexed/index.rst b/docs/v3/codecs/sharding-indexed/index.rst index c2366bdd..e7f0a33d 100644 --- a/docs/v3/codecs/sharding-indexed/index.rst +++ b/docs/v3/codecs/sharding-indexed/index.rst @@ -5,7 +5,7 @@ Sharding codec (version 1.0) ========================================== Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/codecs/sharding-indexed/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/codecs/sharding-indexed/ Editors: * Jonathan Striebel (`@jstriebel `_), Scalable Minds * Norman Rzepka (`@normanrz `_), Scalable Minds diff --git a/docs/v3/codecs/transpose/index.rst b/docs/v3/codecs/transpose/index.rst index 5e01764a..4083bede 100644 --- a/docs/v3/codecs/transpose/index.rst +++ b/docs/v3/codecs/transpose/index.rst @@ -7,7 +7,7 @@ **Editor's draft 26 July 2019** Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/codecs/transpose/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/codecs/transpose/ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 7c9a5d50..2d630318 100644 --- a/docs/v3/core/index.rst +++ 
b/docs/v3/core/index.rst @@ -2,11 +2,11 @@ .. _zarr-core-specification-v3: ====================================== - Zarr core specification (version 3.0) + Zarr core specification (version 3.1) ====================================== Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html + https://zarr-specs.readthedocs.io/en/latest/v3/core/ Editors: * Alistair Miles (`@alimanfoo `_), Wellcome Sanger Institute diff --git a/docs/v3/stores/filesystem/index.rst b/docs/v3/stores/filesystem/index.rst index 261d2dd8..d699673b 100644 --- a/docs/v3/stores/filesystem/index.rst +++ b/docs/v3/stores/filesystem/index.rst @@ -5,7 +5,7 @@ ================================= Specification URI: - https://zarr-specs.readthedocs.io/en/latest/v3/stores/filesystem/v1.0.html + https://zarr-specs.readthedocs.io/en/latest/v3/stores/filesystem/ Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ Issue tracking: From 4e1bec818d136bbfb96fb9af8afd0d95c35a2615 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Fri, 21 Feb 2025 15:39:53 +0100 Subject: [PATCH 32/64] Fix plurality of chunk grids page --- docs/v3/chunk-grid.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/v3/chunk-grid.rst b/docs/v3/chunk-grid.rst index ccdaeeb8..9a4ade0d 100644 --- a/docs/v3/chunk-grid.rst +++ b/docs/v3/chunk-grid.rst @@ -1,8 +1,8 @@ .. _chunk-grid-list: -========== -Chunk Grid -========== +=========== +Chunk Grids +=========== The following documents specify chunk grids which SHOULD be implemented by all implementations. 
From f2c977df553115c1b8339c7b10df3b0a67e2e041 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Fri, 21 Feb 2025 15:47:43 +0100 Subject: [PATCH 33/64] Unify all index pages into subdirectories --- docs/conf.py | 6 ++++++ docs/specs.rst | 12 ++++++------ docs/v3/{chunk-grid.rst => chunk-grids/index.rst} | 2 +- .../index.rst} | 2 +- docs/v3/{codecs.rst => codecs/index.rst} | 2 +- docs/v3/{data-types.rst => data-types/index.rst} | 0 .../index.rst} | 2 +- docs/v3/{stores.rst => stores/index.rst} | 2 +- 8 files changed, 17 insertions(+), 11 deletions(-) rename docs/v3/{chunk-grid.rst => chunk-grids/index.rst} (95%) rename docs/v3/{chunk-key-encoding.rst => chunk-key-encodings/index.rst} (94%) rename docs/v3/{codecs.rst => codecs/index.rst} (96%) rename docs/v3/{data-types.rst => data-types/index.rst} (100%) rename docs/v3/{array-storage-transformers.rst => storage-transformers/index.rst} (94%) rename docs/v3/{stores.rst => stores/index.rst} (96%) diff --git a/docs/conf.py b/docs/conf.py index 1ed61a03..4c9b647b 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -95,4 +95,10 @@ "v3/codecs/sharding-indexed/v1.0.rst": "./index.html", "v3/codecs/transpose/v1.0.rst": "./index.html", "v3/stores/filesystem/v1.0.rst": "./index.html", + "v3/chunk-grid.rst": "chunk-grids/index.rst", + "v3/chunk-key-encoding.rst": "chunk-key-encodings/index.html", + "v3/codecs.rst": "codecs/index.html", + "v3/data-types.rst": "data-types/index.html", + "v3/array-storage-transformers.rst": "storage-transformers/index.html", + "v3/stores.rst": "stores/index.html", } diff --git a/docs/specs.rst b/docs/specs.rst index d1097193..356a7a47 100644 --- a/docs/specs.rst +++ b/docs/specs.rst @@ -9,12 +9,12 @@ Specifications :caption: v3 Core - v3/codecs - v3/chunk-grid - v3/chunk-key-encoding - v3/data-types - v3/stores - v3/array-storage-transformers + v3/codecs/index + v3/chunk-grids/index + v3/chunk-key-encodings/index + v3/data-types/index + v3/stores/index + v3/storage-transformers/index .. 
toctree:: :maxdepth: 1 diff --git a/docs/v3/chunk-grid.rst b/docs/v3/chunk-grids/index.rst similarity index 95% rename from docs/v3/chunk-grid.rst rename to docs/v3/chunk-grids/index.rst index 9a4ade0d..0c9a2176 100644 --- a/docs/v3/chunk-grid.rst +++ b/docs/v3/chunk-grids/index.rst @@ -13,7 +13,7 @@ be implemented by all implementations. :titlesonly: :caption: Contents: - chunk-grids/*/* + */* Extensions ---------- diff --git a/docs/v3/chunk-key-encoding.rst b/docs/v3/chunk-key-encodings/index.rst similarity index 94% rename from docs/v3/chunk-key-encoding.rst rename to docs/v3/chunk-key-encodings/index.rst index 568a5dd5..68587518 100644 --- a/docs/v3/chunk-key-encoding.rst +++ b/docs/v3/chunk-key-encodings/index.rst @@ -13,7 +13,7 @@ be implemented by all implementations. :titlesonly: :caption: Contents: - chunk-key-encodings/*/* + */* Extensions ---------- diff --git a/docs/v3/codecs.rst b/docs/v3/codecs/index.rst similarity index 96% rename from docs/v3/codecs.rst rename to docs/v3/codecs/index.rst index 52f9407b..927b3758 100644 --- a/docs/v3/codecs.rst +++ b/docs/v3/codecs/index.rst @@ -13,7 +13,7 @@ be implemented by all implementations. :titlesonly: :caption: Contents: - codecs/*/* + */* Extensions ---------- diff --git a/docs/v3/data-types.rst b/docs/v3/data-types/index.rst similarity index 100% rename from docs/v3/data-types.rst rename to docs/v3/data-types/index.rst diff --git a/docs/v3/array-storage-transformers.rst b/docs/v3/storage-transformers/index.rst similarity index 94% rename from docs/v3/array-storage-transformers.rst rename to docs/v3/storage-transformers/index.rst index c095bd6c..30b8f19f 100644 --- a/docs/v3/array-storage-transformers.rst +++ b/docs/v3/storage-transformers/index.rst @@ -15,7 +15,7 @@ Array Storage Transformers :titlesonly: :caption: Contents: - array-storage-transformers/*/* + */* Currently, no core storage transformers are defined by this specification. 
diff --git a/docs/v3/stores.rst b/docs/v3/stores/index.rst similarity index 96% rename from docs/v3/stores.rst rename to docs/v3/stores/index.rst index 67ab4733..a3c33491 100644 --- a/docs/v3/stores.rst +++ b/docs/v3/stores/index.rst @@ -13,7 +13,7 @@ be implemented by all implementations. :titlesonly: :caption: Contents: - stores/*/* + */* .. note:: Stores are *not* extension points since they define the mechanism From d8c88ece7f8f4cc0dd6c308c07ce8d957625c011 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Fri, 21 Feb 2025 15:50:24 +0100 Subject: [PATCH 34/64] Catch a few last references of URIs rather than URLs --- docs/v3/core/index.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 2d630318..acaced5b 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1129,7 +1129,7 @@ list of codecs defined for an array MAY contain codecs which are defined in separate specifications. In order to refer to codecs in array metadata documents, each codec must have a unique identifier, which is either a known "`raw name `_" or -a "`URI-based name `_" as defined under :ref:`extensions_section`. +a "`URL-based name `_" as defined under :ref:`extensions_section`. For ease of discovery, it is recommended that codec specifications are contributed to the registry of extensions @@ -1570,7 +1570,7 @@ data type, chunk grid, and chunk key encoding. Extension naming ---------------- -The `name` field of an extension can take two forms: **raw names** and **URI-based names**. +The `name` field of an extension can take two forms: **raw names** and **URL-based names**. .. 
_extension-naming-raw-names: From 3844fc91ff2f7e6eb804fb4f6a5527311e0c3f4c Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Fri, 21 Feb 2025 15:59:53 +0100 Subject: [PATCH 35/64] Improve changelog --- docs/v3/core/index.rst | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index acaced5b..6a1d4eec 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1732,8 +1732,11 @@ by time. - Clarification of extensions. `PR #330 `_. With this change, - it is now possible to register new names for extension objects as well as use - URL. + it is now possible to register new names or even use URLs for extensions. + Additionally, extensions may be marked with `must_understand=False` in case + a non-implementing library can safely ignore them. + Please see the new :ref:`Extensions section ` + for details. Changes after Provisional Acceptance ------------------------------------ From 447ad8c9449d06407ad9a12885b1b9703fe5c0ef Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Fri, 21 Feb 2025 18:00:11 +0100 Subject: [PATCH 36/64] Add Norman --- docs/v3/core/index.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 6a1d4eec..f9517f03 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -11,6 +11,7 @@ Specification URI: Editors: * Alistair Miles (`@alimanfoo `_), Wellcome Sanger Institute * Jonathan Striebel (`@jstriebel `_), Scalable Minds + * Norman Rzepka (`@normanrz `_), Scalable Minds * Jeremy Maitin-Shepard (`@jbms `_), Google * Josh Moore(`@joshmoore `_), German BioImaging From 6de00f1136336a67ba5749465b5ca7c6437ee625 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Mon, 24 Feb 2025 11:08:23 +0100 Subject: [PATCH 37/64] Apply suggestions from code review Co-authored-by: Norman Rzepka --- docs/v3/chunk-grids/regular-grid/index.rst | 2 +- docs/v3/core/index.rst | 10 +++++----- 2 files changed, 6 insertions(+), 6 deletions(-) diff 
--git a/docs/v3/chunk-grids/regular-grid/index.rst b/docs/v3/chunk-grids/regular-grid/index.rst index 8ad2f125..74a089b9 100644 --- a/docs/v3/chunk-grids/regular-grid/index.rst +++ b/docs/v3/chunk-grids/regular-grid/index.rst @@ -2,7 +2,7 @@ .. _regulargrid-chunkgrid-v1: ====================================== - Regular grid chunk grid (version 1.0) + Regular chunk grid (version 1.0) ====================================== **Editor's draft 26 July 2019** diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index f9517f03..9c1875cd 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -13,7 +13,7 @@ Editors: * Jonathan Striebel (`@jstriebel `_), Scalable Minds * Norman Rzepka (`@normanrz `_), Scalable Minds * Jeremy Maitin-Shepard (`@jbms `_), Google - * Josh Moore(`@joshmoore `_), German BioImaging + * Josh Moore (`@joshmoore `_), German BioImaging Corresponding ZEP: `ZEP0001 — Zarr specification version 3 `_ @@ -350,7 +350,7 @@ terminology for a use case of reading from an array: *Extension point* - A field in a `metadata document_` that can be extended to allow values + A field in a `metadata document`_ that can be extended to allow values not defined in this specification. See :ref:`extension points ` below. @@ -1136,10 +1136,10 @@ recommended that codec specifications are contributed to the registry of extensions (`zarr-extensions`_). -A codec specification must declare the codec identifier, and describe +A codec specification MUST declare the codec identifier, and describe (or cite documents that describe) the encoding and decoding algorithms and the format of the encoded data. -A codec may have configuration parameters which modify the behaviour +A codec MAY have configuration parameters which modify the behaviour of the codec in some way. For example, a compression codec may have a compression level parameter, which is an integer that affects the resulting compression ratio of the data. 
Configuration parameters must @@ -1590,7 +1590,7 @@ maintainer team reserves the right to refuse name assignment at its own discretion. - **Example:** ``zstd`` -- **Acceptd regex:** ``^[a-z0-9-_.]+$`` +- **Accepted regex:** ``^[a-z0-9-_.]+$`` .. _extension-naming-url-based-names: From a791ae536d57526de4c45cf7b6e6b943e601cbd0 Mon Sep 17 00:00:00 2001 From: Norman Rzepka Date: Mon, 24 Feb 2025 15:58:03 +0100 Subject: [PATCH 38/64] fill_value --- docs/v3/core/index.rst | 57 ++++++------------------------------ docs/v3/data-types/index.rst | 54 +++++++++++++++++++++++++++++++++- 2 files changed, 62 insertions(+), 49 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 9c1875cd..3ff87fa9 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -547,55 +547,10 @@ mandatory names: Provides an element value to use for uninitialised portions of the Zarr array. - The permitted values depend on the data type: + The permitted values depend on the data type. Fill values for core + data types are listed in :ref:`fill-value-list`. - ``bool`` - The value must be a JSON boolean (``false`` or ``true``). - - Integers (``{uint,int}{8,16,32,64}``) - The value must be a JSON number with no fraction or exponent part that is - within the representable range of the data type. - - IEEE 754 floating point numbers (``float{16,32,64}``) - The value may be either: - - - A JSON number, that will be rounded to the nearest representable value. - - - A JSON string of the form: - - - ``"Infinity"``, denoting positive infinity; - - ``"-Infinity"``, denoting negative infinity; - - ``"NaN"``, denoting thenot-a-number (NaN) value where the sign bit is - 0 (positive), the most significant bit (MSB) of the mantissa is 1, and - all other bits of the mantissa are zero; - - ``"0xYYYYYYYY"``, specifying the byte representation of the floating - point number as an unsigned integer. For example, for ``float32``, - ``"NaN"`` is equivalent to ``"0x7fc00000"``. 
This representation is - the only way to specify a NaN value other than the specific NaN value - denoted by ``"NaN"``. - - .. warning:: - - While this NaN syntax is consistent with the syntax accepted by the - C99 ``strtod`` function, C99 leaves the meaning of the NaN payload - string implementation defined, which may not match the Zarr - definition. - - Complex numbers (``complex{64,128}``) - The value must be a two-element array, specifying the real and imaginary - components respectively, where each component is specified as defined - above for floating point number. - - For example, ``[1, 2]`` indicates ``1 + 2i`` and ``["-Infinity", "NaN"]`` - indicates a complex number with real component of -inf and imaginary - component of NaN. - - Raw data types (``r``) - An array of integers, with length equal to ````, where each integer is - in the range ``[0, 255]``. - - Extensions to the spec that define new data types must also define the JSON - fill value representation. + Extension data types MUST also define the JSON fill value representation. .. note:: @@ -1529,6 +1484,8 @@ array codecs :ref:`codecs ` array storage transformer :ref:`storage-transformers ` :ref:`storage-transformers-list` =========== ======================= ================================================================== ================================ +Note, that ``fill_value`` is not its own extension point, but is dependent on the data type. + New extension points may be proposed to the Zarr community through the ZEP process. See `ZEP 0 `_ for more information. @@ -1665,6 +1622,10 @@ For extensions with URL-based names, it is RECOMMENDED that the URL resolve to a specification of the extension. Additionally, URL-based extensions MAY also register themselves under the `zarr-extensions`_ repository for better discovery. +Because the ``fill_value`` metadata key is dependent on the data type, +extension data types SHOULD specify permitted values for the ``fill_value`` in +their specification. 
+ Implementation Notes ==================== diff --git a/docs/v3/data-types/index.rst b/docs/v3/data-types/index.rst index 4ce9adcf..de65fba3 100644 --- a/docs/v3/data-types/index.rst +++ b/docs/v3/data-types/index.rst @@ -14,7 +14,7 @@ Core data types :header-rows: 1 * - Identifier - - Numerical type + - Numerical Type * - ``bool`` - Boolean * - ``int8`` @@ -46,6 +46,58 @@ Core data types * - ``r*`` (Optional) - raw bits, variable size given by ``*``, limited to be a multiple of 8 +.. _fill-value-list: + +Permitted fill values +^^^^^^^^^^^^^^^^^^^^^ + +The permitted values depend on the data type: + + ``bool`` + The value must be a JSON boolean (``false`` or ``true``). + + Integers (``{uint,int}{8,16,32,64}``) + The value must be a JSON number with no fraction or exponent part that is + within the representable range of the data type. + + IEEE 754 floating point numbers (``float{16,32,64}``) + The value may be either: + + - A JSON number, that will be rounded to the nearest representable value. + + - A JSON string of the form: + + - ``"Infinity"``, denoting positive infinity; + - ``"-Infinity"``, denoting negative infinity; + - ``"NaN"``, denoting thenot-a-number (NaN) value where the sign bit is + 0 (positive), the most significant bit (MSB) of the mantissa is 1, and + all other bits of the mantissa are zero; + - ``"0xYYYYYYYY"``, specifying the byte representation of the floating + point number as an unsigned integer. For example, for ``float32``, + ``"NaN"`` is equivalent to ``"0x7fc00000"``. This representation is + the only way to specify a NaN value other than the specific NaN value + denoted by ``"NaN"``. + + .. warning:: + + While this NaN syntax is consistent with the syntax accepted by the + C99 ``strtod`` function, C99 leaves the meaning of the NaN payload + string implementation defined, which may not match the Zarr + definition. 
+ + Complex numbers (``complex{64,128}``) + The value must be a two-element array, specifying the real and imaginary + components respectively, where each component is specified as defined + above for floating point number. + + For example, ``[1, 2]`` indicates ``1 + 2i`` and ``["-Infinity", "NaN"]`` + indicates a complex number with real component of -inf and imaginary + component of NaN. + + Raw data types (``r``) + An array of integers, with length equal to ````, where each integer is + in the range ``[0, 255]``. + Extensions ---------- From 4ba9e3e1c45d5d8ce4cea5bf6bd877f97afebf54 Mon Sep 17 00:00:00 2001 From: Norman Rzepka Date: Mon, 24 Feb 2025 16:12:29 +0100 Subject: [PATCH 39/64] versioning --- docs/v3/chunk-grids/regular-grid/index.rst | 12 ++++++------ docs/v3/chunk-key-encodings/default/index.rst | 12 ++++++------ docs/v3/chunk-key-encodings/v2/index.rst | 12 ++++++------ docs/v3/codecs/blosc/index.rst | 10 +++++----- docs/v3/codecs/bytes/index.rst | 10 +++++----- docs/v3/codecs/crc32c/index.rst | 10 ++++++---- docs/v3/codecs/gzip/index.rst | 10 +++++----- docs/v3/codecs/sharding-indexed/index.rst | 10 ++++++---- docs/v3/codecs/transpose/index.rst | 10 +++++----- docs/v3/core/index.rst | 8 +++++--- docs/v3/stores/filesystem/index.rst | 8 +++++--- 11 files changed, 60 insertions(+), 52 deletions(-) diff --git a/docs/v3/chunk-grids/regular-grid/index.rst b/docs/v3/chunk-grids/regular-grid/index.rst index 74a089b9..e9a1fa45 100644 --- a/docs/v3/chunk-grids/regular-grid/index.rst +++ b/docs/v3/chunk-grids/regular-grid/index.rst @@ -1,12 +1,12 @@ -.. _regulargrid-chunkgrid-v1: +.. 
_regular-chunkgrid: -====================================== - Regular chunk grid (version 1.0) -====================================== - - **Editor's draft 26 July 2019** +================== +Regular chunk grid +================== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/chunk-grids/regular-grid/ Corresponding ZEP: diff --git a/docs/v3/chunk-key-encodings/default/index.rst b/docs/v3/chunk-key-encodings/default/index.rst index 26f261bd..82c99055 100644 --- a/docs/v3/chunk-key-encodings/default/index.rst +++ b/docs/v3/chunk-key-encodings/default/index.rst @@ -1,11 +1,11 @@ -.. _default-chunkkeyencoding-v1: +.. _default-chunkkeyencoding: -========================================= - Default chunk key encoding (version 1.0) -========================================= - - **Editor's draft 26 July 2019** +========================== +Default chunk key encoding +========================== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/default/ Corresponding ZEP: diff --git a/docs/v3/chunk-key-encodings/v2/index.rst b/docs/v3/chunk-key-encodings/v2/index.rst index d75e64cf..5b7abf9a 100644 --- a/docs/v3/chunk-key-encodings/v2/index.rst +++ b/docs/v3/chunk-key-encodings/v2/index.rst @@ -1,11 +1,11 @@ -.. _v2-chunkkeyencoding-v1: +.. 
_v2-chunkkeyencoding: -========================================= - v2 chunk key encoding (version 1.0) -========================================= - - **Editor's draft 26 July 2019** +===================== +v2 chunk key encoding +===================== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/v2/ Corresponding ZEP: diff --git a/docs/v3/codecs/blosc/index.rst b/docs/v3/codecs/blosc/index.rst index 127a54e7..1c6dc6fc 100644 --- a/docs/v3/codecs/blosc/index.rst +++ b/docs/v3/codecs/blosc/index.rst @@ -1,9 +1,9 @@ -=========================== - Blosc codec (version 1.0) -=========================== - - **Editor's draft 26 July 2019** +=========== +Blosc codec +=========== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/codecs/blosc/ Corresponding ZEP: diff --git a/docs/v3/codecs/bytes/index.rst b/docs/v3/codecs/bytes/index.rst index c9933c0f..3dc339a9 100644 --- a/docs/v3/codecs/bytes/index.rst +++ b/docs/v3/codecs/bytes/index.rst @@ -1,11 +1,11 @@ .. _bytes-codec-v1: -============================ - Bytes codec (version 1.0) -============================ - - **Editor's draft 26 July 2019** +=========== +Bytes codec +=========== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/codecs/bytes/ Corresponding ZEP: diff --git a/docs/v3/codecs/crc32c/index.rst b/docs/v3/codecs/crc32c/index.rst index 4af234d5..8ef05da6 100644 --- a/docs/v3/codecs/crc32c/index.rst +++ b/docs/v3/codecs/crc32c/index.rst @@ -1,9 +1,11 @@ -.. _crc32c-codec-v1: +.. 
_crc32c-codec: -==================================== - CRC32C checksum codec (version 1.0) -==================================== +===================== +CRC32C checksum codec +===================== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/codecs/crc32c/ Editors: diff --git a/docs/v3/codecs/gzip/index.rst b/docs/v3/codecs/gzip/index.rst index d9237a40..62bab8c8 100644 --- a/docs/v3/codecs/gzip/index.rst +++ b/docs/v3/codecs/gzip/index.rst @@ -1,9 +1,9 @@ -========================== - Gzip codec (version 1.0) -========================== - - **Editor's draft 26 July 2019** +========== +Gzip codec +========== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/codecs/gzip/ Corresponding ZEP: diff --git a/docs/v3/codecs/sharding-indexed/index.rst b/docs/v3/codecs/sharding-indexed/index.rst index e7f0a33d..e42ffd29 100644 --- a/docs/v3/codecs/sharding-indexed/index.rst +++ b/docs/v3/codecs/sharding-indexed/index.rst @@ -1,9 +1,11 @@ -.. _sharding-indexed-codec-v1: +.. _sharding-indexed-codec: -========================================== -Sharding codec (version 1.0) -========================================== +============== +Sharding codec +============== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/codecs/sharding-indexed/ Editors: diff --git a/docs/v3/codecs/transpose/index.rst b/docs/v3/codecs/transpose/index.rst index 4083bede..75f75305 100644 --- a/docs/v3/codecs/transpose/index.rst +++ b/docs/v3/codecs/transpose/index.rst @@ -1,11 +1,11 @@ .. 
_transpose-codec-v1: -============================== - Transpose codec (version 1.0) -============================== - - **Editor's draft 26 July 2019** +=============== +Transpose codec +=============== +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/codecs/transpose/ Corresponding ZEP: diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 3ff87fa9..b7f2710c 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1,10 +1,12 @@ .. This file is in restructured text format: https://docutils.sourceforge.io/rst.html .. _zarr-core-specification-v3: -====================================== - Zarr core specification (version 3.1) -====================================== +======================= +Zarr core specification +======================= +Version: + 3.1 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/core/ diff --git a/docs/v3/stores/filesystem/index.rst b/docs/v3/stores/filesystem/index.rst index d699673b..5733bc41 100644 --- a/docs/v3/stores/filesystem/index.rst +++ b/docs/v3/stores/filesystem/index.rst @@ -1,9 +1,11 @@ .. _file-system-store-v1: -================================= - File system store (version 1.0) -================================= +================= +File system store +================= +Version: + 1.0 Specification URI: https://zarr-specs.readthedocs.io/en/latest/v3/stores/filesystem/ Corresponding ZEP: From 6e7da25d9927575b515a7950363b6f0c0f5878e8 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 1 Mar 2025 17:20:14 +0100 Subject: [PATCH 40/64] Make must_understand a section --- docs/v3/core/index.rst | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index b7f2710c..212b5fc8 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -783,6 +783,7 @@ The group metadata object must not contain any other names. Those are reserved for future versions of this specification. 
An implementation must fail to open zarr hierarchies or groups with unknown metadata fields, with the exception of objects with a ``"must_understand": false`` key-value pair. +See :ref:`extension-definition-must-understand` for more information. Node names @@ -1496,8 +1497,15 @@ process. See `ZEP 0 `_ for more infor Extension definition -------------------- +.. _extension-definition-object: + +Objects +^^^^^^^ + In `metadata documents`_, extensions can be encoded either as objects or as -short-hand names. If using an objection definition, the member ``name`` +short-hand names. + +If using an objection definition, the member ``name`` MUST be a plain string which conforms to :ref:`extension name `. Optionally, the member ``configuration`` MAY be present but if so MUST be an object. @@ -1509,10 +1517,20 @@ For example:: "configuration": { ... } # optional } +.. _extension-definition-short-hand-name: + +Short-hand names +^^^^^^^^^^^^^^^^ + Instead of extension objects, short-hand names MAY be used if no configuration metadata is required. They are equivalent to extension objects with just a `name` key. +.. _extension-definition-must-understand: + +`must_understand` +^^^^^^^^^^^^^^^^^ + If such an object is present, the field `must_understand` is implicitly set to `True` and an object MAY explicitly set `must_understand=False` if implementations can ignore its presence. From 8fd39d26c3d5d7d0962beec6e22543da324b4faf Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 1 Mar 2025 17:21:48 +0100 Subject: [PATCH 41/64] start with a lower-case letter --- docs/v3/core/index.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 212b5fc8..22c20e4f 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1559,7 +1559,8 @@ Raw names are centrally registered names which can be used without prefix. Raw names MUST be assigned within a central repository. Raw names are unique and immutable. 
-Raw names MUST be composed of lower case letters a-z, numerals 0-9, underscores, dashes, and dots. +Raw names MUST start with one lower case letter a-z and then be followed +by only lower case letters a-z, numerals 0-9, underscores, dashes, and dots. Raw name assignment is managed through the `zarr-extensions`_ Github repository, where extensions and their specification are listed. The Zarr Steering Council or by delegation a @@ -1567,7 +1568,7 @@ maintainer team reserves the right to refuse name assignment at its own discretion. - **Example:** ``zstd`` -- **Accepted regex:** ``^[a-z0-9-_.]+$`` +- **Accepted regex:** ``^[a-z][a-z0-9-_.]+$`` .. _extension-naming-url-based-names: From 65e46f8592475f41572a7519292bdb3e5e83f5b6 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 1 Mar 2025 17:27:09 +0100 Subject: [PATCH 42/64] unify plurality of ext lists --- docs/v3/codecs/index.rst | 2 +- docs/v3/core/index.rst | 8 ++++---- docs/v3/data-types/index.rst | 2 +- docs/v3/storage-transformers/index.rst | 2 +- docs/v3/stores/index.rst | 2 +- 5 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/v3/codecs/index.rst b/docs/v3/codecs/index.rst index 927b3758..549af4e1 100644 --- a/docs/v3/codecs/index.rst +++ b/docs/v3/codecs/index.rst @@ -1,4 +1,4 @@ -.. _codecs-list: +.. _codec-list: ====== Codecs diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 22c20e4f..a097bd10 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -846,7 +846,7 @@ should be interpreted. This specification defines a limited set of data types to represent boolean values, integers, and floating point -numbers. These can be found under :ref:`Data Types`. +numbers. These can be found under :ref:`Data Types`. All of the data types defined here have a fixed size, in the sense that all values require the same number of bytes. 
@@ -1480,11 +1480,11 @@ Different types of extensions can exist and they can be grouped as follows: =========== ======================= ================================================================== ================================ node_type extension point metadata definition list of core extensions =========== ======================= ================================================================== ================================ -array data type :ref:`data-type ` :ref:`data-types-list` +array data type :ref:`data-type ` :ref:`data-type-list` array chunk grid :ref:`chunk-grid ` :ref:`chunk-grid-list` array chunk key encoding :ref:`chunk-key-encoding ` :ref:`chunk-key-encoding-list` -array codecs :ref:`codecs ` :ref:`codecs-list` -array storage transformer :ref:`storage-transformers ` :ref:`storage-transformers-list` +array codecs :ref:`codecs ` :ref:`codec-list` +array storage transformer :ref:`storage-transformers ` :ref:`storage-transformer-list` =========== ======================= ================================================================== ================================ Note, that ``fill_value`` is not its own extension point, but is dependent on the data type. diff --git a/docs/v3/data-types/index.rst b/docs/v3/data-types/index.rst index de65fba3..5d32f150 100644 --- a/docs/v3/data-types/index.rst +++ b/docs/v3/data-types/index.rst @@ -1,4 +1,4 @@ -.. _data-types-list: +.. _data-type-list: ========== Data Types diff --git a/docs/v3/storage-transformers/index.rst b/docs/v3/storage-transformers/index.rst index 30b8f19f..2041e6fe 100644 --- a/docs/v3/storage-transformers/index.rst +++ b/docs/v3/storage-transformers/index.rst @@ -1,4 +1,4 @@ -.. _storage-transformers-list: +.. _storage-transformer-list: ========================== Array Storage Transformers diff --git a/docs/v3/stores/index.rst b/docs/v3/stores/index.rst index a3c33491..c97dea2c 100644 --- a/docs/v3/stores/index.rst +++ b/docs/v3/stores/index.rst @@ -1,4 +1,4 @@ -.. 
_stores-list: +.. _store-list: ====== Stores From 90243d50113c287e1b272b8952c2516e1955dcb6 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 1 Mar 2025 17:35:04 +0100 Subject: [PATCH 43/64] Improve URL description --- docs/v3/core/index.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index a097bd10..88f63016 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1575,9 +1575,10 @@ discretion. URL-based names ^^^^^^^^^^^^^^^ -URL-based names delegate name registration to the domain name system (DNS). +URL-based names delegate the assignment of unique identifiers to +the well-established addressing mechanism of the web. -URL-based named MAY be used by any extension without further coordination. +URL-based names are decentralized and MAY be used by any extension without further coordination. Entities defining a URL-based name SHOULD have appropriate authority over the URL. A persistent redirecting URL like PURL MAY be used. URLs have been chosen due to their potential for being self-documenting. From 73cce52c69e6cdc97e23ff918193917661db2374 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 1 Mar 2025 17:38:33 +0100 Subject: [PATCH 44/64] discourage top-level must_understand=false --- docs/v3/core/index.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 88f63016..ce473bc0 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1472,6 +1472,8 @@ Extensions This section describes how additional functionality can be defined for Zarr datasets by the `metadata documents`_. +.. _extension-points: + Extension points ---------------- @@ -1543,6 +1545,9 @@ set to ``"must_understand": false``. `must_understand=False` is not supported for the following extension points: data type, chunk grid, and chunk key encoding. 
+Use of `must_understand=False` to add top-level keys is discouraged in favor +of the explicit use of :ref:`extension-points`. + .. _extension-naming: Extension naming From a0c3170b23e8c4162f1cf898434b526c77508a3b Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 1 Mar 2025 19:28:58 +0100 Subject: [PATCH 45/64] Add author guidance --- docs/v3/core/index.rst | 35 ++++++++++++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index ce473bc0..c804f06d 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1653,13 +1653,46 @@ Because the ``fill_value`` metadata key is dependent on the data type, extension data types SHOULD specify permitted values for the ``fill_value`` in their specification. + +.. _extension-guidance: + +Guidance for extension authors +------------------------------ + +*This section is non-normative and provides assistance for the authors of +extensions, especially those who are just getting started.* + +* If you are just getting started, use the URL of your work-in-progress as an + identifier for your extension. The GitHub link, including the branch if you + would like, makes a fine choice. This says to the community that this is a + draft, and if they are interested in the details, they can follow the URL to + find out more. + +* When developing an extension for which you intend to register a short name, + you may wish to test it using the short name even before you have registered + it. However, you MUST register the name before using the extension for + non-test purposes/for purposes where interoperability with other + implementations/users is a concern. + +* If you are implementing a well-known extension like a data type or codec that + is already referred to by name in the community, you may want to check the `zarr-extensions`_ + repository to see if someone has already implemented the extension. 
+ +* For raw names that are coming from well-known projects, use the same prefix followed + by a dot for requesting your raw name, e.g. "numcodecs.". Other examples of prefixes can + be found in the `zarr-extensions`_ repository. + +* If you migrate your URL-based extension to a new location, try to redirect the + previous URL to the new location or document the migration. Similarly, if you + register a raw name extension after having used an URL-based extension in production, + cross-link the two pages. + Implementation Notes ==================== This section is non-normative and presents notes from implementers about cases that need to be carefully considered but do not strictly fall into the spec. - Resizing -------- From dbeb796b09110d6f92a27dfc604eaf617cf93c46 Mon Sep 17 00:00:00 2001 From: Norman Rzepka Date: Sat, 1 Mar 2025 21:58:03 +0100 Subject: [PATCH 46/64] change extension example --- docs/v3/core/index.rst | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index c804f06d..696b4de7 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1612,14 +1612,14 @@ The following example of array metadata demonstrates these extension naming sche { "zarr_format": 3, - "data_type": "https://example.com/zarr/string", // URL-based name, short-hand name + "data_type": "string", // raw name, short-hand name "chunk_key_encoding": { "name": "default", // core "configuration": { "separator": "." } }, "codecs": [ { - "name": "https://numcodecs.dev/vlen-utf8" // URL-based name + "name": "vlen-utf8" // raw name }, { "name": "zstd", // raw name @@ -1632,8 +1632,7 @@ The following example of array metadata demonstrates these extension naming sche }, "shape": [ 128 ], "dimension_names": [ "x" ], - "attributes": { ... }, - "storage_transformers": [] + "attributes": { ... 
} } Extension specifications From 27246486ece3ae24dae2645f1c5d5bcb12f71846 Mon Sep 17 00:00:00 2001 From: Ryan Abernathey Date: Thu, 6 Mar 2025 12:30:48 -0500 Subject: [PATCH 47/64] use namespaced names --- docs/v3/core/index.rst | 72 +++++++++++++++--------------------------- 1 file changed, 25 insertions(+), 47 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 696b4de7..e35722b2 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1087,8 +1087,9 @@ To allow for flexibility to define and implement new codecs, the list of codecs defined for an array MAY contain codecs which are defined in separate specifications. In order to refer to codecs in array metadata documents, each codec must have a unique identifier, which is either -a known "`raw name `_" or -a "`URL-based name `_" as defined under :ref:`extensions_section`. +a known "`raw name `_" (for registered extensions) or +a "`namespaced extension `_" (for private / +experimental extensions)as defined under :ref:`extensions_section`. For ease of discovery, it is recommended that codec specifications are contributed to the registry of extensions @@ -1515,7 +1516,7 @@ an object. For example:: { - "name": "", # "raw name" or URL-based name + "name": "", # "raw name" or namespaced name "configuration": { ... } # optional } @@ -1553,19 +1554,22 @@ of the explicit use of :ref:`extension-points`. Extension naming ---------------- -The `name` field of an extension can take two forms: **raw names** and **URL-based names**. +The `name` field of an extension can take two forms: **raw names** and **namespaced names**. .. _extension-naming-raw-names: Raw names ^^^^^^^^^ -Raw names are centrally registered names which can be used without prefix. +Raw names consist of a single string that is unique within the Zarr ecosystem, with no prefix. +Raw names are intended for well-known extensions aimed at broad adoption and maximum interoperability. 
Raw names MUST be assigned within a central repository. Raw names are unique and immutable. Raw names MUST start with one lower case letter a-z and then be followed -by only lower case letters a-z, numerals 0-9, underscores, dashes, and dots. +by only lower case letters a-z, numerals 0-9, underscores, and dashes. +Raw names MUST NOT use a dot character `.`, to avoid confusion with namespaced extensions. + Raw name assignment is managed through the `zarr-extensions`_ Github repository, where extensions and their specification are listed. The Zarr Steering Council or by delegation a @@ -1573,27 +1577,20 @@ maintainer team reserves the right to refuse name assignment at its own discretion. - **Example:** ``zstd`` -- **Accepted regex:** ``^[a-z][a-z0-9-_.]+$`` +- **Accepted regex:** ``^[a-z][a-z0-9-_]+$`` -.. _extension-naming-url-based-names: +.. _extension-naming-namespaced-names: -URL-based names -^^^^^^^^^^^^^^^ +Namespaced names +^^^^^^^^^^^^^^^^ -URL-based names delegate the assignment of unique identifiers to -the well-established addressing mechanism of the web. +Namespaced names are intended for private extensions and for experimental and development purposes. +Namespaced names start with a prefix of one or more parts, each separated by the `.` character. -URL-based names are decentralized and MAY be used by any extension without further coordination. -Entities defining a URL-based name SHOULD have appropriate -authority over the URL. A persistent redirecting URL like PURL MAY be used. -URLs have been chosen due to their potential for being self-documenting. -While a URL SHOULD resolve to a human-readable -explanation of the extension, preferably including a JSON schema definition -of the extension metadata, implementations are not expected to resolve -URLs during processing. +Namespaced names are not centrally managed and MAY be used by any extension without coordination. 
-- **Example:** ``https://example.com/zarr3/consolidated-metadata`` -- **Accepted regex:** ``^https?:\/\/[^/?#]+[^?#]*$`` +- **Example:** ``myorg.my-private-extension`` +- **Accepted regex:** ``^([a-z][a-z0-9-_]+\.)+[a-z][a-z0-9-_]+$`` Extension versioning -------------------- @@ -1601,7 +1598,7 @@ Extension versioning Extensions with **raw names** SHOULD follow the compatibility and versioning v3 `stability policy`_. -For extensions with **URL-based names**, there are no guarantees in terms of +For extensions with **namespaced names**, there are no guarantees in terms of versioning or compatibility. However, preserving backwards-compatibility is strongly encouraged. @@ -1644,10 +1641,6 @@ facilitates multiple implementations of an extension. For extensions with raw names, the `zarr-extensions`_ repository SHOULD either contain the specification or link to it. -For extensions with URL-based names, it is RECOMMENDED that the URL resolve to -a specification of the extension. Additionally, URL-based extensions MAY also register -themselves under the `zarr-extensions`_ repository for better discovery. - Because the ``fill_value`` metadata key is dependent on the data type, extension data types SHOULD specify permitted values for the ``fill_value`` in their specification. @@ -1661,31 +1654,16 @@ Guidance for extension authors *This section is non-normative and provides assistance for the authors of extensions, especially those who are just getting started.* -* If you are just getting started, use the URL of your work-in-progress as an - identifier for your extension. The GitHub link, including the branch if you - would like, makes a fine choice. This says to the community that this is a - draft, and if they are interested in the details, they can follow the URL to - find out more. +* If you are just getting started, use a namespaced extension for your extension name. + As you extension matures, you may consider registering it using a Raw name. 
-* When developing an extension for which you intend to register a short name, - you may wish to test it using the short name even before you have registered - it. However, you MUST register the name before using the extension for - non-test purposes/for purposes where interoperability with other - implementations/users is a concern. +* If you intend to distribute data widely using your extension, you SHOULD register your + extension using Raw name, rather than a namespaced name, in the extension repository. * If you are implementing a well-known extension like a data type or codec that is already referred to by name in the community, you may want to check the `zarr-extensions`_ repository to see if someone has already implemented the extension. -* For raw names that are coming from well-known projects, use the same prefix followed - by a dot for requesting your raw name, e.g. "numcodecs.". Other examples of prefixes can - be found in the `zarr-extensions`_ repository. - -* If you migrate your URL-based extension to a new location, try to redirect the - previous URL to the new location or document the migration. Similarly, if you - register a raw name extension after having used an URL-based extension in production, - cross-link the two pages. - Implementation Notes ==================== @@ -1753,7 +1731,7 @@ by time. - Clarification of extensions. `PR #330 `_. With this change, - it is now possible to register new names or even use URLs for extensions. + it is now possible to add user-defined extensions. Additionally, extensions may be marked with `must_understand=False` in case a non-implementing library can safely ignore them. 
Please see the new :ref:`Extensions section ` From 5c03a247665bc864b349db4d3c81d95a7e40391a Mon Sep 17 00:00:00 2001 From: Ryan Abernathey Date: Thu, 13 Mar 2025 09:03:57 -0600 Subject: [PATCH 48/64] add URI as names section --- docs/v3/core/index.rst | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index e35722b2..7711f5ef 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1592,6 +1592,14 @@ Namespaced names are not centrally managed and MAY be used by any extension with - **Example:** ``myorg.my-private-extension`` - **Accepted regex:** ``^([a-z][a-z0-9-_]+\.)+[a-z][a-z0-9-_]+$`` +URIs as names +------------- + +In an earlier draft of this spec, the name of an extension codec was required to be a URI that +dereferences to a human-readable codec specification. +That is now discouraged for new extensions; either raw names or namespaced names should be used instead. +However, for backwards compatibility with existing extensions, URI names are permitted. + Extension versioning -------------------- From 1ba5e415d2c51e5e474167f91572bb9c0da47da0 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sun, 16 Mar 2025 14:24:50 +0100 Subject: [PATCH 49/64] Update docs/v3/core/index.rst Co-authored-by: Davis Bennett --- docs/v3/core/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 7711f5ef..041ddd86 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1666,7 +1666,7 @@ extensions, especially those who are just getting started.* As you extension matures, you may consider registering it using a Raw name. * If you intend to distribute data widely using your extension, you SHOULD register your - extension using Raw name, rather than a namespaced name, in the extension repository. + extension using a Raw name, rather than a namespaced name, in the extension repository. 
* If you are implementing a well-known extension like a data type or codec that is already referred to by name in the community, you may want to check the `zarr-extensions`_ From 9eea54e4bdc85b3ca9d7cb501932364d7ae44d37 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sun, 16 Mar 2025 16:38:44 +0100 Subject: [PATCH 50/64] Minor updates to front matter --- docs/v3/core/index.rst | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 041ddd86..1fd28b8b 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -17,8 +17,9 @@ Editors: * Jeremy Maitin-Shepard (`@jbms `_), Google * Josh Moore (`@joshmoore `_), German BioImaging -Corresponding ZEP: - `ZEP0001 — Zarr specification version 3 `_ +Corresponding ZEPs: + * `ZEP0001 — Zarr specification version 3 `_ + * `ZEP0009 — Zarr extension naming `_ Issue tracking: `GitHub issues `_ @@ -42,7 +43,7 @@ This specification defines the Zarr format for N-dimensional typed arrays. Status of this document ======================= -ZEP0001 was accepted on May 15th, 2023 via https://github.com/zarr-developers/zarr-specs/issues/227. + * ZEP0001 was accepted on May 15th, 2023 via https://github.com/zarr-developers/zarr-specs/issues/227. Introduction @@ -125,8 +126,8 @@ implementing a specification ``X.Y`` can be considered compatible with all datasets which only use features contained in version ``X.Y``. For example, spec ``X.1`` adds core feature "foo" compared to ``X.0``. Assuming -implementation A implements ``X.1`` and implementation B implements ``X.0``. -Data using feature "foo" can only be read with implementation A. B fails to open +implementation A implements ``X.1`` and implementation B implements ``X.0``, +data using feature "foo" can only be read with implementation A. B fails to open it, as the key "foo" is unknown. 
Data not using "foo" can be used with both implementations, even if it's written @@ -689,7 +690,7 @@ above, but using a (currently made up) extension data type:: "node_type": "array", "shape": [10000, 1000], "data_type": { - "name": "datetime", + "name": "urn:example:datetime", "configuration": { "unit": "ns" } From 40732c09c937e62e37d26abcd9b2fadce6dea80c Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sun, 16 Mar 2025 17:13:33 +0100 Subject: [PATCH 51/64] Move fill_value to data_type section --- docs/v3/core/index.rst | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 1fd28b8b..b846a9b4 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -509,6 +509,10 @@ mandatory names: identifier provided as a string. For example, ``"float64"`` for little-endian 64-bit floating point number. + Because the ``fill_value`` metadata key is dependent on the data type, + extension data types SHOULD specify permitted values for the ``fill_value`` in + their specification. + .. _array-metadata-chunk-grid: ``chunk_grid`` @@ -1650,11 +1654,6 @@ facilitates multiple implementations of an extension. For extensions with raw names, the `zarr-extensions`_ repository SHOULD either contain the specification or link to it. -Because the ``fill_value`` metadata key is dependent on the data type, -extension data types SHOULD specify permitted values for the ``fill_value`` in -their specification. - - .. 
_extension-guidance: Guidance for extension authors From 7bb1cb1106c892e77901d8033f174dc29bede4be Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Mon, 17 Mar 2025 08:25:48 +0100 Subject: [PATCH 52/64] Rename sections --- docs/v3/core/index.rst | 151 +++++++++++++++++++++++++---------------- 1 file changed, 93 insertions(+), 58 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index b846a9b4..bd407368 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1091,10 +1091,8 @@ Extension codecs To allow for flexibility to define and implement new codecs, the list of codecs defined for an array MAY contain codecs which are defined in separate specifications. In order to refer to codecs in array metadata -documents, each codec must have a unique identifier, which is either -a known "`raw name `_" (for registered extensions) or -a "`namespaced extension `_" (for private / -experimental extensions)as defined under :ref:`extensions_section`. +documents, each codec must have a conformant identifier as specified under +"`extension naming`_" below. For ease of discovery, it is recommended that codec specifications are contributed to the registry of extensions @@ -1521,8 +1519,8 @@ an object. For example:: { - "name": "", # "raw name" or namespaced name - "configuration": { ... } # optional + "name": "", # conformant name + "configuration": { ... } # optional object } .. _extension-definition-short-hand-name: @@ -1539,8 +1537,8 @@ objects with just a `name` key. `must_understand` ^^^^^^^^^^^^^^^^^ -If such an object is present, the field `must_understand` is implicitly set to -`True` and an object MAY explicitly set `must_understand=False` if +An extension object is interpreted to have an implicit field `must_understand` set to +`True`, unless otherwise stated. An extension object MAY explicitly set `must_understand=False` if implementations can ignore its presence. 
An implementation MUST fail to open Zarr groups or arrays if any @@ -1559,56 +1557,111 @@ of the explicit use of :ref:`extension-points`. Extension naming ---------------- -The `name` field of an extension can take two forms: **raw names** and **namespaced names**. +The `name` field of an extension can take two forms: **registered names** (as simple strings) +and **unregistered names** (as URIs). -.. _extension-naming-raw-names: +.. _extension-naming-registered-names: -Raw names -^^^^^^^^^ +Registered names +^^^^^^^^^^^^^^^^ -Raw names consist of a single string that is unique within the Zarr ecosystem, with no prefix. -Raw names are intended for well-known extensions aimed at broad adoption and maximum interoperability. +Registered names consist of a single string that is unique within the Zarr ecosystem, with no prefix. +Registered names are intended for well-known extensions aimed at broad adoption and maximum interoperability. -Raw names MUST be assigned within a central repository. -Raw names are unique and immutable. -Raw names MUST start with one lower case letter a-z and then be followed -by only lower case letters a-z, numerals 0-9, underscores, and dashes. -Raw names MUST NOT use a dot character `.`, to avoid confusion with namespaced extensions. +Registered names MUST be assigned within a central repository. +Registered names are unique and immutable. +Registered names MUST start with one lower case letter a-z and then be followed +by only lower case letters a-z, numerals 0-9, underscores, dots and dashes. -Raw name assignment is managed through the `zarr-extensions`_ +Registered name assignment is managed through the `zarr-extensions`_ Github repository, where extensions and their specification are listed. The Zarr Steering Council or by delegation a maintainer team reserves the right to refuse name assignment at its own discretion. - **Example:** ``zstd`` -- **Accepted regex:** ``^[a-z][a-z0-9-_]+$`` +- **Accepted regex:** ``^[a-z][a-z0-9-_.]+$`` -.. 
_extension-naming-namespaced-names: +.. _extension-naming-unregistered-names: -Namespaced names -^^^^^^^^^^^^^^^^ +Unregistered names +^^^^^^^^^^^^^^^^^^ + +Unregistered names are intended for private extensions and for experimental and development purposes. + +Unregistered names are not centrally managed and MAY be used by any extension without coordination. + +Unregistered names consist of URIs, which +are prefixed with a scheme beginning with a letter and followed by +any number of letters, numbers, plus symbols, dashes or dots and then followed by a colon. + +- **Identifying regex:** ``^([a-z][a-z0-9-_]+\.)+[a-z][a-z0-9-_]+:`` + ``[A-Za-z][A-Za-z0-9+\-.]*`` + +URIs (`Uniform Resource Identifiers `_) +are a well-known mechanism to identify resources on the internet. +(`Uniform Resource Locators `_) are one class of URIs which +provide a mechanism for resolving resources. + +In previous versions of the v3 spec, the name of an extension was required to +be a URI that dereferences to a human-readable codec specification, i.e. a URL. +That is now discouraged for new extensions, though, for backwards compatibility +with existing extensions, URLs names are still permitted. + +Instead, extension names SHOULD either be registered names or simpler URNs. +URNs (`Uniform Resource Names `_) +are persistent identifiers assigned within defined namespaces. + +TODO: The goal of using URI identifiers is to provide a large and flexible namespace which +balances the needs of developers building new extensions with a extensible mechanism +which the Zarr community can make use of in the years to come. We understand there may +be several reasons that someone would not want to register a name. + +.. _extension-guidance: + +Guidance for extension authors +------------------------------ + +*This section is non-normative and provides assistance for the authors of +extensions, especially those who are just getting started.* + +Below will find +guidance how best to get started. 
+ +* **Local development**: Authors looking to define a name for local development + purposes should prefix their extensions with ``urn:x-`` for "experimental". + +* **Proprietary extensions**: Authors looking + +* **UUID**: ``urn:uuid:...`` + +Nevertheless, the Zarr maintainers endeavor to make the registration of names as +straight-forward as possible. We encourage all authors to make use of the extensions +repository to prevent duplicate efforts across the community where possible. -Namespaced names are intended for private extensions and for experimental and development purposes. -Namespaced names start with a prefix of one or more parts, each separated by the `.` character. -Namespaced names are not centrally managed and MAY be used by any extension without coordination. -- **Example:** ``myorg.my-private-extension`` -- **Accepted regex:** ``^([a-z][a-z0-9-_]+\.)+[a-z][a-z0-9-_]+$`` -URIs as names -------------- -In an earlier draft of this spec, the name of an extension codec was required to be a URI that -dereferences to a human-readable codec specification. -That is now discouraged for new extensions; either raw names or namespaced names should be used instead. -However, for backwards compatibility with existing extensions, URI names are permitted. +* If you are just getting started, use a namespaced extension for your extension name. + As you extension matures, you may consider registering it using a registered name. + +* If you intend to distribute data widely using your extension, you SHOULD register your + extension using a registered name, rather than a namespaced name, in the extension repository. + +* If you are implementing a well-known extension like a data type or codec that + is already referred to by name in the community, you may want to check the `zarr-extensions`_ + repository to see if someone has already implemented the extension. + +.. 
note:: + The simple form of the registered names can be thought of as a short-hand + for a URN prefixed with ``urn:zarr:``. Formal registration with + `IANA `_ will not change the validity of the simple form. Extension versioning -------------------- -Extensions with **raw names** SHOULD follow the +Extensions with **registered names** SHOULD follow the compatibility and versioning v3 `stability policy`_. For extensions with **namespaced names**, there are no guarantees in terms of @@ -1622,17 +1675,17 @@ The following example of array metadata demonstrates these extension naming sche { "zarr_format": 3, - "data_type": "string", // raw name, short-hand name + "data_type": "string", // registered, short-hand name "chunk_key_encoding": { "name": "default", // core "configuration": { "separator": "." } }, "codecs": [ { - "name": "vlen-utf8" // raw name + "name": "vlen-utf8" // registered name }, { - "name": "zstd", // raw name + "name": "zstd", // registered name "configuration": { ... } } ], @@ -1651,27 +1704,9 @@ Extension specifications Extensions SHOULD have a published specification. A published specification facilitates multiple implementations of an extension. -For extensions with raw names, the `zarr-extensions`_ repository +For extensions with registered names, the `zarr-extensions`_ repository SHOULD either contain the specification or link to it. -.. _extension-guidance: - -Guidance for extension authors ------------------------------- - -*This section is non-normative and provides assistance for the authors of -extensions, especially those who are just getting started.* - -* If you are just getting started, use a namespaced extension for your extension name. - As you extension matures, you may consider registering it using a Raw name. - -* If you intend to distribute data widely using your extension, you SHOULD register your - extension using a Raw name, rather than a namespaced name, in the extension repository. 
- -* If you are implementing a well-known extension like a data type or codec that - is already referred to by name in the community, you may want to check the `zarr-extensions`_ - repository to see if someone has already implemented the extension. - Implementation Notes ==================== From 7b1baff897308947138741ec26e1640ef1039898 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Tue, 25 Mar 2025 11:15:25 +0100 Subject: [PATCH 53/64] Merge multiple edits --- docs/v3/core/index.rst | 99 +++++++++++++++++++++++++----------------- 1 file changed, 59 insertions(+), 40 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index bd407368..288e78c6 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1473,8 +1473,14 @@ Storage transformers may be stacked to combine different functionalities: Extensions ========== -This section describes how additional functionality can be defined -for Zarr datasets by the `metadata documents`_. +Additional functionality and features can be enabled in Zarr datasets through +extensions defined in `metadata documents`_. Each extension corresponds to a +specific extension point, such as data types or codecs. Extensions may include +optional configuration, which can be provided via structured objects. Proper +naming is essential for cross-implementation interoperability, ensuring +extensions are recognized and used consistently. This section outlines +available extension points, the structural constraints on extensions, and +naming conventions. .. _extension-points: @@ -1557,8 +1563,11 @@ of the explicit use of :ref:`extension-points`. Extension naming ---------------- -The `name` field of an extension can take two forms: **registered names** (as simple strings) -and **unregistered names** (as URIs). +The `name` field of an extension is an identifier taking one of two forms: +**registered names** (as simple strings) and **unregistered names** (as URIs). 
+ +Implementations SHOULD be able to resolve multiple names to the same +implementation to support unregistered names which are subsequently registered. .. _extension-naming-registered-names: @@ -1567,20 +1576,22 @@ Registered names Registered names consist of a single string that is unique within the Zarr ecosystem, with no prefix. Registered names are intended for well-known extensions aimed at broad adoption and maximum interoperability. - -Registered names MUST be assigned within a central repository. Registered names are unique and immutable. Registered names MUST start with one lower case letter a-z and then be followed by only lower case letters a-z, numerals 0-9, underscores, dots and dashes. -Registered name assignment is managed through the `zarr-extensions`_ -Github repository, where extensions and their specification are listed. +Registered names MUST be assigned within a central repository, `zarr-extensions`_ +a Github repository, where extensions and their specification are listed. The Zarr Steering Council or by delegation a maintainer team reserves the right to refuse name assignment at its own discretion. -- **Example:** ``zstd`` - **Accepted regex:** ``^[a-z][a-z0-9-_.]+$`` +- **Valid examples:** + - ``zstd`` + - ``numcodecs.adler32`` +- **Invalid examples:** + - ``foo/bar`` .. _extension-naming-unregistered-names: @@ -1596,26 +1607,33 @@ are prefixed with a scheme beginning with a letter and followed by any number of letters, numbers, plus symbols, dashes or dots and then followed by a colon. - **Identifying regex:** ``^([a-z][a-z0-9-_]+\.)+[a-z][a-z0-9-_]+:`` - ``[A-Za-z][A-Za-z0-9+\-.]*`` + + + TODO: The goal of using URI identifiers is to provide a large and flexible namespace which + balances the needs of developers building new extensions with a extensible mechanism + which the Zarr community can make use of in the years to come. We understand there may + be several reasons that someone would not want to register a name. 
+ URIs (`Uniform Resource Identifiers `_) -are a well-known mechanism to identify resources on the internet. -(`Uniform Resource Locators `_) are one class of URIs which -provide a mechanism for resolving resources. +are a well-known mechanism to identify resources on the internet and extension authors are +encouraged to explore further documentation on which identifiers might best express their intent. +Aware that not all extension developers will want to immediately register a name, +the goal of using URI identifiers is to provide a large and flexible namespace which +balances the needs of developers building new extensions with a extensible mechanism +which the Zarr community can make use of in the years to come. + +URLs (`Uniform Resource Locators `_) are one class of URIs which +provide a mechanism for resolving resources. In previous versions of the v3 spec, the name of an extension was required to be a URI that dereferences to a human-readable codec specification, i.e. a URL. That is now discouraged for new extensions, though, for backwards compatibility with existing extensions, URLs names are still permitted. -Instead, extension names SHOULD either be registered names or simpler URNs. +Instead, extension names SHOULD either be registered names as specified above or URNs. URNs (`Uniform Resource Names `_) -are persistent identifiers assigned within defined namespaces. - -TODO: The goal of using URI identifiers is to provide a large and flexible namespace which -balances the needs of developers building new extensions with a extensible mechanism -which the Zarr community can make use of in the years to come. We understand there may -be several reasons that someone would not want to register a name. +are simpler persistent identifiers assigned within defined namespaces. .. 
_extension-guidance: @@ -1625,33 +1643,34 @@ Guidance for extension authors *This section is non-normative and provides assistance for the authors of extensions, especially those who are just getting started.* -Below will find -guidance how best to get started. +TODO Below will find guidance how best to get started. * **Local development**: Authors looking to define a name for local development - purposes should prefix their extensions with ``urn:x-`` for "experimental". - -* **Proprietary extensions**: Authors looking - -* **UUID**: ``urn:uuid:...`` - -Nevertheless, the Zarr maintainers endeavor to make the registration of names as -straight-forward as possible. We encourage all authors to make use of the extensions -repository to prevent duplicate efforts across the community where possible. - +purposes should prefix their extensions with ``urn:x-``. This prefix defines an +"experimental" name. As such an extension matures, authors might consider registering +a new name for it. Implementations should check both for the unregistered as well +as the registered named. +* **Proprietary extensions**: Authors looking to create proprietary extensions +which are only interpretable within their own institutions are encouraged to +take ownership of their "own" namespace, ``urn:x-company`` or ``urn:x-domain.name``. +* **Complete opaquness**: Authors looking for a prefix which is communicates +*nothing* to implementations MAY use the prefix ``urn:uuid:...`` following +by following by a valid +UUID (`Universally Unique Identifier `_). +* If you are implementing a well-known extension like a data type or codec that +is already referred to by name in the community, you may want to check the `zarr-extensions`_ +repository to see if someone has already implemented the extension. -* If you are just getting started, use a namespaced extension for your extension name. - As you extension matures, you may consider registering it using a registered name. 
+* Authors intending to create significant amounts of data or widely distributed data +should consider registering all extensions in the extension registry to TODO -* If you intend to distribute data widely using your extension, you SHOULD register your - extension using a registered name, rather than a namespaced name, in the extension repository. +The Zarr maintainers endeavor to make the registration of names as +straight-forward as possible. We encourage all authors to make use of the extensions +repository to prevent duplicate efforts across the community where possible. -* If you are implementing a well-known extension like a data type or codec that - is already referred to by name in the community, you may want to check the `zarr-extensions`_ - repository to see if someone has already implemented the extension. .. note:: The simple form of the registered names can be thought of as a short-hand @@ -1664,7 +1683,7 @@ Extension versioning Extensions with **registered names** SHOULD follow the compatibility and versioning v3 `stability policy`_. -For extensions with **namespaced names**, there are no guarantees in terms of +For extensions with **unregistered names**, there are no guarantees in terms of versioning or compatibility. However, preserving backwards-compatibility is strongly encouraged. From aa9d87ebe1a7f23d471f5d6fbdfd4daaa20cd7f8 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 26 Mar 2025 10:18:19 +0100 Subject: [PATCH 54/64] cleanup --- docs/v3/core/index.rst | 56 ++++++++++++++++++++---------------------- 1 file changed, 27 insertions(+), 29 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 288e78c6..4549c84d 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1092,7 +1092,7 @@ To allow for flexibility to define and implement new codecs, the list of codecs defined for an array MAY contain codecs which are defined in separate specifications. 
In order to refer to codecs in array metadata documents, each codec must have a conformant identifier as specified under -"`extension naming`_" below. +"`extension naming `_" below. For ease of discovery, it is recommended that codec specifications are contributed to the registry of extensions @@ -1602,19 +1602,12 @@ Unregistered names are intended for private extensions and for experimental and Unregistered names are not centrally managed and MAY be used by any extension without coordination. -Unregistered names consist of URIs, which -are prefixed with a scheme beginning with a letter and followed by -any number of letters, numbers, plus symbols, dashes or dots and then followed by a colon. +Unregistered names consist of URIs, which are prefixed with a scheme beginning +with a letter and followed by any number of letters, numbers, plus symbols, +dashes or dots and then followed by a colon. - **Identifying regex:** ``^([a-z][a-z0-9-_]+\.)+[a-z][a-z0-9-_]+:`` - - TODO: The goal of using URI identifiers is to provide a large and flexible namespace which - balances the needs of developers building new extensions with a extensible mechanism - which the Zarr community can make use of in the years to come. We understand there may - be several reasons that someone would not want to register a name. - - URIs (`Uniform Resource Identifiers `_) are a well-known mechanism to identify resources on the internet and extension authors are encouraged to explore further documentation on which identifiers might best express their intent. @@ -1643,34 +1636,39 @@ Guidance for extension authors *This section is non-normative and provides assistance for the authors of extensions, especially those who are just getting started.* -TODO Below will find guidance how best to get started. +Recognizing that there are diverse considerations in choosing an extension +name, guidance is provided below based on generic scenarios. 
Extension authors +who are still unsure of how best to choose a name are welcome to open an issue +on the zarr-specs repository. * **Local development**: Authors looking to define a name for local development -purposes should prefix their extensions with ``urn:x-``. This prefix defines an -"experimental" name. As such an extension matures, authors might consider registering -a new name for it. Implementations should check both for the unregistered as well -as the registered named. + purposes should prefix their extensions with ``urn:x-``. This prefix defines + an "experimental" name. As such an extension matures, authors might consider + registering a new name for it. Implementations should check both for the + unregistered as well as the registered named. * **Proprietary extensions**: Authors looking to create proprietary extensions -which are only interpretable within their own institutions are encouraged to -take ownership of their "own" namespace, ``urn:x-company`` or ``urn:x-domain.name``. + which are only interpretable within their own institutions are encouraged to + take ownership of their "own" namespace, ``urn:x-company`` or + ``urn:x-domain.name``. -* **Complete opaquness**: Authors looking for a prefix which is communicates -*nothing* to implementations MAY use the prefix ``urn:uuid:...`` following -by following by a valid -UUID (`Universally Unique Identifier `_). - -* If you are implementing a well-known extension like a data type or codec that -is already referred to by name in the community, you may want to check the `zarr-extensions`_ -repository to see if someone has already implemented the extension. 
- -* Authors intending to create significant amounts of data or widely distributed data -should consider registering all extensions in the extension registry to TODO +* **Complete opaqueness**: Authors looking for a prefix which is communicates + *nothing* to implementations MAY use the prefix ``urn:uuid:...`` following by + following by a valid UUID (`Universally Unique Identifier + `_). The Zarr maintainers endeavor to make the registration of names as straight-forward as possible. We encourage all authors to make use of the extensions repository to prevent duplicate efforts across the community where possible. +* **Well-known extensions**: If you are implementing a well-known extension + like a data type or codec that is already referred to by name in the + community, you may want to check the `zarr-extensions`_ repository to see if + someone has already implemented the extension. + +* **Production extensions**: Authors intending to create significant amounts of + data or widely distributed data should consider registering all extensions in + the extension registry to increase the long-term maintainability of the data. .. note:: The simple form of the registered names can be thought of as a short-hand From fac16fb85ecea067c679c0c0e2ad78ae7920565d Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 26 Mar 2025 10:20:45 +0100 Subject: [PATCH 55/64] Fix objection typo --- docs/v3/core/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 4549c84d..6df9d647 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1517,7 +1517,7 @@ Objects In `metadata documents`_, extensions can be encoded either as objects or as short-hand names. -If using an objection definition, the member ``name`` +If using an object definition, the member ``name`` MUST be a plain string which conforms to :ref:`extension name `. 
Optionally, the member ``configuration`` MAY be present but if so MUST be an object. From b24440e8dd48566e5cb3d4d3ecb7799bb9829599 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Mon, 10 Mar 2025 13:13:50 +0100 Subject: [PATCH 56/64] Minor fix --- docs/v3/core/index.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 6df9d647..08f4731f 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -504,8 +504,8 @@ mandatory names: ``data_type`` is an :ref:`extension point` and MUST conform to the :ref:`extension-definition`. - If the data type is defined in - this specification, then the value must be the data type + If the data type is defined in :ref:`this specification `, + then the value must be the data type identifier provided as a string. For example, ``"float64"`` for little-endian 64-bit floating point number. From 2d684a4a20dd85d06462e1b19a73b0c9fb2181b2 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 29 Mar 2025 15:14:24 +0100 Subject: [PATCH 57/64] Apply suggestions from code review Co-authored-by: Ryan Abernathey Co-authored-by: Norman Rzepka --- docs/v3/core/index.rst | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index e3f79ab2..8186a3d8 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1581,7 +1581,8 @@ Registered names are unique and immutable. Registered names MUST start with one lower case letter a-z and then be followed by only lower case letters a-z, numerals 0-9, underscores, dots and dashes. -Registered names MUST be assigned within a central repository, `zarr-extensions`_ +Prior to release in any implementation, +registered names MUST be assigned within a central repository, `zarr-extensions`_ a Github repository, where extensions and their specification are listed. 
The Zarr Steering Council or by delegation a maintainer team reserves the right to refuse name assignment at its own @@ -1593,6 +1594,7 @@ discretion. - ``numcodecs.adler32`` - **Invalid examples:** - ``foo/bar`` + - ``foo:bar`` .. _extension-naming-unregistered-names: @@ -1603,14 +1605,15 @@ Unregistered names are intended for private extensions and for experimental and Unregistered names are not centrally managed and MAY be used by any extension without coordination. -Unregistered names consist of URIs, which are prefixed with a scheme beginning +Unregistered names MUST be URIs, which are prefixed with a scheme beginning with a letter and followed by any number of letters, numbers, plus symbols, dashes or dots and then followed by a colon. +The use of URI names ensures that unregistered extensions will never conflict with or override registered extensions. - **Identifying regex:** ``^([a-z][a-z0-9-_]+\.)+[a-z][a-z0-9-_]+:`` URIs (`Uniform Resource Identifiers `_) -are a well-known mechanism to identify resources on the internet and extension authors are +are a well-known mechanism to identify abstract or phyiscal resources and extension authors are encouraged to explore further documentation on which identifiers might best express their intent. Aware that not all extension developers will want to immediately register a name, From bf12d64f72e4c98ca7722052865b73e77a6d5ce2 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 29 Mar 2025 15:34:13 +0100 Subject: [PATCH 58/64] Introduction of tag: Co-authored-by: Ryan Abernathey --- docs/v3/core/index.rst | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 8186a3d8..be24e182 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1628,9 +1628,16 @@ be a URI that dereferences to a human-readable codec specification, i.e. a URL. 
That is now discouraged for new extensions, though, for backwards compatibility with existing extensions, URLs names are still permitted. -Instead, extension names SHOULD either be registered names as specified above or URNs. -URNs (`Uniform Resource Names `_) -are simpler persistent identifiers assigned within defined namespaces. +Instead, new unregistered extension names SHOULD use the [Tag URI scheme](https://datatracker.ietf.org/doc/html/rfc4151). +The Tag URI scheme has four principal requirements: +> - Identifiers are likely to be unique across space and time, and come from a practically inexhaustible supply. +> - Identifiers are relatively convenient for humans to mint (create), read, type, remember etc. +> - No central registration is necessary, at least for holders of domain names or email addresses; and there is negligible cost to mint each new identifier. +> - The identifiers are independent of any particular resolution scheme. + +These requirements are aligned well with the needs of Zarr extension developers. + +An example of a Tag URI for a Zarr extension is `tag:josh@openmicroscopy.org,2025-03:experimental-new-dtype`. .. _extension-guidance: From 97c483b3c28ccdb1dcb05276c178926d5c2868de Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Sat, 29 Mar 2025 16:06:39 +0100 Subject: [PATCH 59/64] Update docs/v3/core/index.rst Co-authored-by: Ryan Abernathey --- docs/v3/core/index.rst | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index be24e182..e8c29dc5 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1659,9 +1659,8 @@ on the zarr-specs repository. unregistered as well as the registered named. * **Proprietary extensions**: Authors looking to create proprietary extensions - which are only interpretable within their own institutions are encouraged to - take ownership of their "own" namespace, ``urn:x-company`` or - ``urn:x-domain.name``. 
+ for internal, non-public use are encouraged to use a Tag URI. For example + ``tag:mycompany.com,2025-03-27:top-secret``. * **Complete opaqueness**: Authors looking for a prefix which is communicates *nothing* to implementations MAY use the prefix ``urn:uuid:...`` following by From 6c1a027d0811deea13982c3cc9024b7605795ecc Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 2 Apr 2025 19:32:21 +0200 Subject: [PATCH 60/64] Reduce to minimal change --- docs/v3/core/index.rst | 110 +++++++---------------------------------- 1 file changed, 17 insertions(+), 93 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index e8c29dc5..5a3c3947 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1564,80 +1564,34 @@ of the explicit use of :ref:`extension-points`. Extension naming ---------------- -The `name` field of an extension is an identifier taking one of two forms: -**registered names** (as simple strings) and **unregistered names** (as URIs). - -Implementations SHOULD be able to resolve multiple names to the same -implementation to support unregistered names which are subsequently registered. +The `name` field of an extension is an identifier that has been registered +prior to release in any implementation within the `zarr-extensions`_ Github +repository, where extensions and their specification are listed. The Zarr +Steering Council or by delegation a maintainer team reserves the right to +refuse name assignment at its own discretion. .. _extension-naming-registered-names: -Registered names -^^^^^^^^^^^^^^^^ - -Registered names consist of a single string that is unique within the Zarr ecosystem, with no prefix. +Registered names consist of a single string that is unique within the Zarr ecosystem. Registered names are intended for well-known extensions aimed at broad adoption and maximum interoperability. Registered names are unique and immutable. 
+ Registered names MUST start with one lower case letter a-z and then be followed by only lower case letters a-z, numerals 0-9, underscores, dots and dashes. -Prior to release in any implementation, -registered names MUST be assigned within a central repository, `zarr-extensions`_ -a Github repository, where extensions and their specification are listed. -The Zarr Steering Council or by delegation a -maintainer team reserves the right to refuse name assignment at its own -discretion. - - **Accepted regex:** ``^[a-z][a-z0-9-_.]+$`` - **Valid examples:** - - ``zstd`` - - ``numcodecs.adler32`` + - ``zstd`` + - ``numcodecs.adler32`` - **Invalid examples:** - - ``foo/bar`` - - ``foo:bar`` - -.. _extension-naming-unregistered-names: - -Unregistered names -^^^^^^^^^^^^^^^^^^ - -Unregistered names are intended for private extensions and for experimental and development purposes. - -Unregistered names are not centrally managed and MAY be used by any extension without coordination. - -Unregistered names MUST be URIs, which are prefixed with a scheme beginning -with a letter and followed by any number of letters, numbers, plus symbols, -dashes or dots and then followed by a colon. -The use of URI names ensures that unregistered extensions will never conflict with or override registered extensions. - -- **Identifying regex:** ``^([a-z][a-z0-9-_]+\.)+[a-z][a-z0-9-_]+:`` - -URIs (`Uniform Resource Identifiers `_) -are a well-known mechanism to identify abstract or phyiscal resources and extension authors are -encouraged to explore further documentation on which identifiers might best express their intent. + - ``foo/bar`` + - ``foo:bar`` -Aware that not all extension developers will want to immediately register a name, -the goal of using URI identifiers is to provide a large and flexible namespace which -balances the needs of developers building new extensions with a extensible mechanism -which the Zarr community can make use of in the years to come. 
- -URLs (`Uniform Resource Locators `_) are one class of URIs which -provide a mechanism for resolving resources. -In previous versions of the v3 spec, the name of an extension was required to -be a URI that dereferences to a human-readable codec specification, i.e. a URL. -That is now discouraged for new extensions, though, for backwards compatibility -with existing extensions, URLs names are still permitted. - -Instead, new unregistered extension names SHOULD use the [Tag URI scheme](https://datatracker.ietf.org/doc/html/rfc4151). -The Tag URI scheme has four principal requirements: -> - Identifiers are likely to be unique across space and time, and come from a practically inexhaustible supply. -> - Identifiers are relatively convenient for humans to mint (create), read, type, remember etc. -> - No central registration is necessary, at least for holders of domain names or email addresses; and there is negligible cost to mint each new identifier. -> - The identifiers are independent of any particular resolution scheme. - -These requirements are aligned well with the needs of Zarr extension developers. - -An example of a Tag URI for a Zarr extension is `tag:josh@openmicroscopy.org,2025-03:experimental-new-dtype`. +.. note:: + In previous versions of the v3 spec, the name of an extension was required + to be a URI. That is now discouraged for new extensions, though, for + backwards compatibility with existing extensions, URIs names are still + permitted. .. _extension-guidance: @@ -1647,26 +1601,6 @@ Guidance for extension authors *This section is non-normative and provides assistance for the authors of extensions, especially those who are just getting started.* -Recognizing that there are diverse considerations in choosing an extension -name, guidance is provided below based on generic scenarios. Extension authors -who are still unsure of how best to choose a name are welcome to open an issue -on the zarr-specs repository. 
- -* **Local development**: Authors looking to define a name for local development - purposes should prefix their extensions with ``urn:x-``. This prefix defines - an "experimental" name. As such an extension matures, authors might consider - registering a new name for it. Implementations should check both for the - unregistered as well as the registered named. - -* **Proprietary extensions**: Authors looking to create proprietary extensions - for internal, non-public use are encouraged to use a Tag URI. For example - ``tag:mycompany.com,2025-03-27:top-secret``. - -* **Complete opaqueness**: Authors looking for a prefix which is communicates - *nothing* to implementations MAY use the prefix ``urn:uuid:...`` following by - following by a valid UUID (`Universally Unique Identifier - `_). - The Zarr maintainers endeavor to make the registration of names as straight-forward as possible. We encourage all authors to make use of the extensions repository to prevent duplicate efforts across the community where possible. @@ -1680,20 +1614,10 @@ repository to prevent duplicate efforts across the community where possible. data or widely distributed data should consider registering all extensions in the extension registry to increase the long-term maintainability of the data. -.. note:: - The simple form of the registered names can be thought of as a short-hand - for a URN prefixed with ``urn:zarr:``. Formal registration with - `IANA `_ will not change the validity of the simple form. - Extension versioning -------------------- -Extensions with **registered names** SHOULD follow the -compatibility and versioning v3 `stability policy`_. - -For extensions with **unregistered names**, there are no guarantees in terms of -versioning or compatibility. However, preserving backwards-compatibility is -strongly encouraged. +Registered extensions SHOULD follow the compatibility and versioning `stability policy`_. 
Extension example ----------------- From 8dc779628bf0974514fd8b56d5c907d967c61526 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 2 Apr 2025 20:39:23 +0200 Subject: [PATCH 61/64] Add feedback from ZSC --- docs/v3/core/index.rst | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 5a3c3947..efc3be6b 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1593,6 +1593,9 @@ by only lower case letters a-z, numerals 0-9, underscores, dots and dashes. backwards compatibility with existing extensions, URIs names are still permitted. + A proposal to additionally support multiple registration mechanisms is under + discussion in https://github.com/zarr-developers/zarr-specs/pull/330 . + .. _extension-guidance: Guidance for extension authors @@ -1605,9 +1608,14 @@ The Zarr maintainers endeavor to make the registration of names as straight-forward as possible. We encourage all authors to make use of the extensions repository to prevent duplicate efforts across the community where possible. -* **Well-known extensions**: If you are implementing a well-known extension +* **During development**: Authors should use whatever name makes sense + for their extension, provided it is not already reserved in the registry. + Once there is a working implementation of the extension (e.g. a PR to an + existing Zarr implementation), the extension should be submitted to the registry. + +* **Well-known extensions**: Authors implementing a well-known extension like a data type or codec that is already referred to by name in the - community, you may want to check the `zarr-extensions`_ repository to see if + community may want to check the `zarr-extensions`_ repository to see if someone has already implemented the extension. 
* **Production extensions**: Authors intending to create significant amounts of From 4ac01268f8d41454eabc2d375322dbce57182cbf Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 2 Apr 2025 21:04:13 +0200 Subject: [PATCH 62/64] Correct whitespace --- docs/v3/core/index.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index efc3be6b..132dae07 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -1588,10 +1588,10 @@ by only lower case letters a-z, numerals 0-9, underscores, dots and dashes. - ``foo:bar`` .. note:: - In previous versions of the v3 spec, the name of an extension was required - to be a URI. That is now discouraged for new extensions, though, for - backwards compatibility with existing extensions, URIs names are still - permitted. + In previous versions of the v3 spec, the name of an extension was required + to be a URI. That is now discouraged for new extensions, though, for + backwards compatibility with existing extensions, URIs names are still + permitted. A proposal to additionally support multiple registration mechanisms is under discussion in https://github.com/zarr-developers/zarr-specs/pull/330 . From dd90a3fc9250ecc7d88a4d091a8d004407855363 Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Wed, 2 Apr 2025 21:13:21 +0200 Subject: [PATCH 63/64] Remove confusingly redundant must_understand block from groups --- docs/v3/core/index.rst | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/docs/v3/core/index.rst b/docs/v3/core/index.rst index 132dae07..ea261022 100644 --- a/docs/v3/core/index.rst +++ b/docs/v3/core/index.rst @@ -765,6 +765,8 @@ Optional keys: pairs, where the key must be a string and the value can be an arbitrary JSON literal. Intended to allow storage of arbitrary user metadata. +.. 
_group-metadata-extensions: + Unknown ^^^^^^^ @@ -785,13 +787,6 @@ For example, the JSON document below defines a group:: } } -The group metadata object must not contain any other names. Those are reserved -for future versions of this specification. An implementation must fail to open -zarr hierarchies or groups with unknown metadata fields, with the exception of -objects with a ``"must_understand": false`` key-value pair. -See :ref:`extension-definition-must-understand` for more information. - - Node names ========== From 7ee6d317bc240ee343b490eca0b9477bde595dec Mon Sep 17 00:00:00 2001 From: Josh Moore Date: Thu, 17 Apr 2025 13:57:13 +0200 Subject: [PATCH 64/64] Make chunk-key-encoding info a warning Co-authored-by: jakirkham --- docs/v3/chunk-key-encodings/v2/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/v3/chunk-key-encodings/v2/index.rst b/docs/v3/chunk-key-encodings/v2/index.rst index 5b7abf9a..2b92a5d6 100644 --- a/docs/v3/chunk-key-encodings/v2/index.rst +++ b/docs/v3/chunk-key-encodings/v2/index.rst @@ -43,7 +43,7 @@ separator of ``/``, the identifier is the string ``"1/23/45"``. For chunk grids with 0 dimensions, the single chunk has the key ``"0"``. -.. note:: +.. warning:: This encoding is intended only to allow existing v2 arrays to be converted to v3 without having to rename chunks. It is not recommended