 Sharding codec (version 1.0)
 ==========================================
 -----------------------------
- Editor's draft 18 02 2022
+ Editor's draft 23 03 2023
 -----------------------------

 Specification URI:
@@ -39,15 +39,15 @@ Motivation
 ==========

 In many cases, it becomes inefficient or impractical to store a large number of
-chunks in single files or objects due to the design constraints of the
+chunks as separate files or objects due to the design constraints of the
 underlying storage. For example, the file block size and maximum inode number
 restrict the usage of numerous small files for typical file systems, also cloud
 storage such as S3, GCS, and various distributed filesystems do not efficiently
 handle large numbers of small files or objects.

 Increasing the chunk size works only up to a certain point, as chunk sizes need
-to be small for read and write efficiency requirements, for example to stream
-data in browser-based visualization software.
+to be small for read efficiency requirements, for example to stream data in
+browser-based visualization software.

 Therefore, chunks may need to be smaller than the minimum size of one storage
 key. In those cases, it is efficient to store objects at a more coarse
@@ -85,10 +85,12 @@ Sharding can be configured per array in the :ref:`array-metadata` as follows:
     {
         "name": "sharding_indexed",
         "configuration": {
-            "chunk_shape": [
-                32,
-                32
-            ],
+            "chunk_grid": {
+                "name": "regular",
+                "configuration": {
+                    "chunk_shape": [32, 32]
+                }
+            },
             "codecs": [
                 {
                     "name": "gzip",
@@ -102,12 +104,13 @@ Sharding can be configured per array in the :ref:`array-metadata` as follows:
     ]
     }

-``chunk_shape``
+``chunk_grid``

-    An array of integers providing the shape of inner chunks in a shard for each
-    dimension of the Zarr array. The length of the array must match the length
-    of the array metadata ``shape`` entry. The each integer must by divisible by
-    the ``chunk_shape`` of the array as defined in the ``chunk_grid`` metadata.
+    Specifies the chunk grid of the inner chunks. The value must be an object,
+    as specified in the :ref:`array-metadata`.
+    Currently, only the ``"regular"`` chunk grid is supported for inner chunks.
+    The shard shape (i.e., ``chunk_shape`` of the array for regular chunk grids)
+    must be divisible by the inner ``chunk_shape`` along each dimension.
     For example, an inner chunk shape of ``[32, 2]`` with an outer chunk shape
     ``[64, 64]`` indicates that 64 chunks are combined in one shard, 2 along the
     first dimension, and for each of those 32 along the second dimension.
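
The arithmetic in the ``[32, 2]`` / ``[64, 64]`` example can be checked with a
short sketch. This is an illustration only, not part of the specification: the
function name ``chunks_per_shard`` is hypothetical, and the sketch assumes that
the shard shape must be an integer multiple of the inner chunk shape along each
dimension, which is what the example implies.

```python
from math import prod


def chunks_per_shard(shard_shape, inner_chunk_shape):
    """Return how many inner chunks fit along each dimension of one shard.

    Hypothetical helper for illustration; assumes a regular chunk grid.
    """
    if len(shard_shape) != len(inner_chunk_shape):
        raise ValueError("shapes must have the same number of dimensions")
    counts = []
    for shard_len, chunk_len in zip(shard_shape, inner_chunk_shape):
        # Assumption: each shard dimension is a whole multiple of the
        # corresponding inner chunk dimension.
        if chunk_len <= 0 or shard_len % chunk_len != 0:
            raise ValueError(
                f"shard length {shard_len} is not a multiple of "
                f"inner chunk length {chunk_len}"
            )
        counts.append(shard_len // chunk_len)
    return tuple(counts)


counts = chunks_per_shard((64, 64), (32, 2))
print(counts)        # (2, 32)
print(prod(counts))  # 64
```

An inner chunk shape such as ``(48, 2)`` would be rejected by this sketch,
since 64 is not a multiple of 48.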
@@ -165,8 +168,6 @@ Any configuration parameters for the write strategy must not be part of the
 metadata document; instead they need to be configured at runtime, as this is
 implementation specific.

-Currently, only the ``regular`` chunk grid is supported.
-

 Implementation notes
 ====================