@@ -144,11 +144,11 @@ Questions that still need to be resolved
144144We solicit feedback on the following area during the RFC period of this first
145145draft.
146146
147- - Should core metadata and user attributes be stored together or separate documents? ([GH72](https://github.com/zarr-developers/zarr-specs/issues/72))
148- large metadata documents.
147+ - Should core metadata and user attributes be stored together or separate documents?
148+ (See https://github.com/zarr-developers/zarr-specs/issues/72)
149149 - extensions and ``must_understand = True `` might be too restrictive. Work a
150- draft implementation with extensions and
151- see how far we can go. Possible list of extensions to implement :
150+ We propose to develop a draft implementation with extensions and
151+ see how far we can go. A possible list of extensions to include :
152152
153153 - Boolean
154154 - Complex
@@ -159,17 +159,18 @@ draft.
159159 See https://github.com/zarr-developers/zarr-specs/issues/89 for discussion on
160160 the topic.
161161
162- - Node name case sensitivity: The node name is now case sensitive, this may
162+ - Node name case sensitivity: The node name is now case sensitive. This may
163163 make store implementation more complicated as some backends might not be
164164 case sensitive (such as certain filesystems or object stores), and we may want to
165165 recommend a standard escaping mechanism in those cases.
166166 https://github.com/zarr-developers/zarr-specs/issues/57
167167
168- - Node name character set: Same as above but unlike the previous point where we
168+ - Node name character set: We
169169 solicit feedback on whether store implementations should support full Unicode.
170170 https://github.com/zarr-developers/zarr-specs/issues/56
171171
172- - Should named dimensions be part of the core metadata spec? https://github.com/zarr-developers/zarr-specs/issues/73
172+ - Should named dimensions be part of the core metadata spec?
173+ https://github.com/zarr-developers/zarr-specs/issues/73
173174
174175
175176Document conventions
@@ -394,7 +395,7 @@ node names:
394395
395396* must not be the empty string ("")
396397
397- * must consist only of characters in the sets ``a-z ``, ``A-Z ``, ``0-9 ``,
398+ * must use only characters in the sets ``a-z ``, ``A-Z ``, ``0-9 ``,
398399 ``-_. ``
399400
400401* must not be a string composed only of period characters, e.g. "." or
@@ -563,7 +564,7 @@ other type sizes in later versions of this specification.
563564 ways to encode variable length and we want to keep flexibility. While we seem
564565 to agree that for random access the most likely contender is to have two
565566 arrays, one with the actual variable length data and one with fixed size
566- (pointer + length) to the variable size data we do not want to commit to such
567+ (pointer + length) to the variable size data, we do not want to commit to such
567568 a structure.
568569
569570
@@ -594,21 +595,21 @@ A regular grid is a type of grid where an array is divided into chunks
594595such that each chunk is a hyperrectangle of the same shape. The
595596dimensionality of the grid is the same as the dimensionality of the
596597array. Each chunk in the grid can be addressed by a tuple of non-negative
597- integers (`i `, `j `, `k `, ...) corresponding to the indices of the
598+ integers (`k `, `j `, `i `, ...) corresponding to the indices of the
598599chunk along each dimension.
599600
600- The origin vertex of a chunk has coordinates in the array space (`i ` *
601- `dx `, `j ` * `dy `, `k ` * `dz `, ...) where (`dx `, `dy `, `dz `, ...) are
602- the grid spacings along each dimension, also known as the chunk
603- shape. Thus the origin vertex of the chunk at grid index (0, 0, 0,
601+ The origin vertex of a chunk has coordinates in the array space (`k ` *
602+ `dz `, `j ` * `dy `, `i ` * `dx `, ...) where (`dz `, `dy `, `dx `, ...) are
603+ the chunk sizes along each dimension.
604+ Thus the origin vertex of the chunk at grid index (0, 0, 0,
604605...) is at coordinate (0, 0, 0, ...) in the array space, i.e., the
605606grid is aligned with the origin of the array. If the length of any
606607array dimension is not perfectly divisible by the chunk length along
607608the same dimension, then the grid will overhang the edge of the array
608609space.
609610
610- The shape of the chunk grid will be (ceil(`x ` / `dx `), ceil(`y ` /
611- `dy `), ceil(`z ` / `dz `), ...) where (`x `, `y `, `z `, ...) is the array
611+ The shape of the chunk grid will be (ceil(`z ` / `dz `), ceil(`y ` /
612+ `dy `), ceil(`x ` / `dx `), ...) where (`z `, `y `, `x `, ...) is the array
612613shape, "/" is the division operator and "ceil" is the ceiling
613614function. For example, if a 3 dimensional array has shape (10, 200,
6146153000), and has chunk shape (5, 20, 400), then the shape of the chunk
@@ -628,18 +629,18 @@ dimension.
628629 - (2, 10, 8)
629630 - The grid does overhang the edge of the array on the 3rd dimension.
630631
631- An element of an array with coordinates (`a `, `b `, `c `, ...) will
632- occur within the chunk at grid index (`a ` // `dx `, `b ` // `dy `, `c ` //
633- `dz `, ...), where "//" is the floor division operator. The element
634- will have coordinates (`a ` % `dx `, `b ` % `dy `, `c ` % `dz `, ...) within
632+ An element of an array with coordinates (`c `, `b `, `a `, ...) will
633+ occur within the chunk at grid index (`c ` // `dz `, `b ` // `dy `, `a ` //
634+ `dx `, ...), where "//" is the floor division operator. The element
635+ will have coordinates (`c ` % `dz `, `b ` % `dy `, `a ` % `dx `, ...) within
635636that chunk, where "%" is the modulo operator. For example, if a
6366373 dimensional array has shape (10, 200, 3000), and has chunk shape
637638(5, 20, 400), then the element of the array with coordinates (7, 150, 900)
638639is contained within the chunk at grid index (1, 7, 2) and has coordinates
639640(2, 10, 100) within that chunk.
640641
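The grid arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function names ``chunk_grid_shape`` and ``locate_element`` are hypothetical, and the sketch assumes a regular grid with non-negative integer coordinates.

```python
def chunk_grid_shape(array_shape, chunk_shape):
    """Number of chunks along each dimension, i.e. ceil(length / chunk length)."""
    # -(-n // c) is integer ceiling division, which avoids floating point.
    return tuple(-(-n // c) for n, c in zip(array_shape, chunk_shape))


def locate_element(coords, chunk_shape):
    """Return (grid index of the containing chunk, coordinates within that chunk)."""
    grid_index = tuple(a // c for a, c in zip(coords, chunk_shape))
    within_chunk = tuple(a % c for a, c in zip(coords, chunk_shape))
    return grid_index, within_chunk


# Values taken from the examples in the text.
array_shape = (10, 200, 3000)
chunk_shape = (5, 20, 400)
print(chunk_grid_shape(array_shape, chunk_shape))   # (2, 10, 8)
print(locate_element((7, 150, 900), chunk_shape))   # ((1, 7, 2), (2, 10, 100))
```

Because every dimension is handled by the same per-axis expression, the sketch does not depend on whether dimensions are written as (`i`, `j`, `k`) or (`k`, `j`, `i`).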
641642
642- The identifier for chunk with grid index (``i ``, ``j ``, ``k ``, ...) is
643+ The identifier for chunk with grid index (``k ``, ``j ``, ``i ``, ...) is
643644formed by joining together ASCII string representations of each index
644645using a separator and prefixed with the character ``c ``. The default value for
645646the separator is the slash character, ``/ ``, but this may be configured by
@@ -693,10 +694,10 @@ organised into a sequence such that the last dimension of the array is
693694the fastest changing dimension, also known as "row-major" order. This
694695layout is only applicable to arrays with fixed size data types.
695696
696- For example, for a two-dimensional array with chunk shape (`dx `, `dy `),
697+ For example, for a two-dimensional array with chunk shape (`dy `, `dx `),
697698the binary values for a given chunk are taken from chunk elements in
698- the order (0, 0), (0, 1), (0, 2), ..., (`dx ` - 1, `dy ` - 3), (`dx ` - 1, `dy ` -
699- 2), (`dx ` - 1, `dy ` - 1).
699+ the order (0, 0), (0, 1), (0, 2), ..., (`dy ` - 1, `dx ` - 3), (`dy ` - 1, `dx ` -
700+ 2), (`dy ` - 1, `dx ` - 1).
700701
701702F contiguous memory layout
702703--------------------------
@@ -707,10 +708,10 @@ is the fastest changing dimension, also known as "column-major"
707708order. This layout is only applicable to arrays with fixed size data
708709types.
709710
710- For example, for a two-dimensional array with chunk shape (`dx `,
711- `dy `), the binary values for a given chunk are taken from chunk
712- elements in the order (0, 0), (1, 0), (2, 0), ..., (`dx ` - 3, `dy ` -
713- 1), (`dx ` - 2, `dy ` - 1), (`dx ` - 1, `dy ` - 1).
711+ For example, for a two-dimensional array with chunk shape (`dy `,
712+ `dx `), the binary values for a given chunk are taken from chunk
713+ elements in the order (0, 0), (1, 0), (2, 0), ..., (`dy ` - 3, `dx ` -
714+ 1), (`dy ` - 2, `dx ` - 1), (`dy ` - 1, `dx ` - 1).
714715
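To make the difference between the two layouts concrete, the following sketch enumerates the order in which chunk elements are visited under each one. It is an illustration only: ``element_order`` is a hypothetical helper, and it lists element coordinates rather than serialising any binary values.

```python
from itertools import product


def element_order(chunk_shape, layout="C"):
    """List element coordinates of a chunk in the order they would be serialised."""
    ranges = [range(n) for n in chunk_shape]
    if layout == "C":
        # Row-major: the last dimension changes fastest.
        return list(product(*ranges))
    if layout == "F":
        # Column-major: the first dimension changes fastest. Iterate over the
        # reversed dimensions, then flip each coordinate tuple back.
        return [idx[::-1] for idx in product(*reversed(ranges))]
    raise ValueError("layout must be 'C' or 'F'")


print(element_order((2, 3), "C"))
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
print(element_order((2, 3), "F"))
# [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]
```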
715716
716717Chunk encoding
@@ -988,7 +989,7 @@ following mandatory names:
988989 then said extension is responsible for interpreting the value of
989990 ``fill_value `` and return a suitable type that can be used.
990991
991- For core `` data_type `` which `` fill_value `` are not permitted in JSON or
992+ For core data types for which fill values are not permitted in JSON or
992993 for which decimal representation could be lossy, a string representation of
993994 the binary (starting with ``0b ``) or hexadecimal value (starting with
994995 ``0x ``) is accepted. This string must include all leading or trailing
@@ -1004,7 +1005,13 @@ following mandatory names:
10041005``attributes ``
10051006
10061007 The value must be an object. The object may contain any name/value
1007- pairs.
1008+ pairs. It is intended to allow storage of arbitrary user metadata.
1009+
1010+
1011+ .. note :: The question of whether core metadata and user attributes should be
1012+ stored together or in separate documents is a topic of ongoing discussion.
1013+ (See https://github.com/zarr-developers/zarr-specs/issues/72.)
1014+
10081015
10091016The following names are optional:
10101017
@@ -1084,11 +1091,11 @@ chunking as above, but using an extension data type::
10841091
10851092.. note ::
10861093 comparison with spec v2,
1087- ``dtype `` have been renamed to ``data_type ``,
1088- ``chunks `` have been renamed to ``chunk_grid ``,
1089- ``order `` have been renamed to ``chunk_memory_layout ``,
1090- ``filters `` have been removed,
1091- ``zarr_format `` have been removed,
1094+ ``dtype `` has been renamed to ``data_type ``,
1095+ ``chunks `` has been renamed to ``chunk_grid ``,
1096+ ``order `` has been renamed to ``chunk_memory_layout ``,
1097+ ``filters `` has been removed,
1098+ ``zarr_format `` has been removed,
10921099
10931100
10941101Group metadata
@@ -1110,7 +1117,7 @@ For example, the JSON document below defines an explicit group::
11101117
11111118.. note ::
11121119
1113- Groups cannot have extensions attached to them as of spec v3.0 Allowing
1120+ Groups cannot have extensions attached to them as of spec v3.0. Allowing
11141121 groups to have extensions would force any implementation to sequentially
11151122 traverse the store hierarchy in order to check for extensions, which would
11161123 defeat the purpose of a flat namespace and concurrent access.
@@ -1119,7 +1126,7 @@ For example, the JSON document below defines an explicit group::
11191126
11201127.. note ::
11211128
1122- A group does not need a metadata document to exists, see implicit groups.
1129+ A group does not need a metadata document to exist. (See implicit groups.)
11231130
11241131
11251132
@@ -1376,8 +1383,8 @@ concatenating "data/root/" and the chunk identifier.
13761383 - Chunk grid indices
13771384 - Data key
13781385 * - `/foo/baz `
1379- - `(0 , 0) `
1380- - `data/root/foo/baz/c0 /0 `
1386+ - `(1 , 0) `
1387+ - `data/root/foo/baz/c1 /0 `
13811388
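The table row above can be reproduced with a short sketch. The helper names ``chunk_identifier`` and ``data_key`` are hypothetical, the sketch assumes a non-root array path that begins with ``/``, and the ``"."`` separator in the last line is only an illustration of a configured non-default separator.

```python
def chunk_identifier(grid_index, separator="/"):
    """Prefix 'c' followed by the ASCII indices joined with the separator."""
    return "c" + separator.join(str(i) for i in grid_index)


def data_key(path, grid_index, separator="/"):
    """Data key for a chunk of the array at `path` (assumed to be a non-root path)."""
    return "data/root" + path + "/" + chunk_identifier(grid_index, separator)


print(data_key("/foo/baz", (1, 0)))       # data/root/foo/baz/c1/0
print(chunk_identifier((1, 0), "."))      # c1.0
```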
13821389
13831390
@@ -1389,8 +1396,8 @@ Let `P` be an arbitrary hierarchy path.
13891396Let ``array_meta_key(P) `` be the array metadata key for `P `. Let
13901397``group_meta_key(P) `` be the group metadata key for `P `.
13911398
1392- Let ``data_key(P, i, j, ...) `` be the data key for `P ` for the chunk
1393- with grid coordinates (`i `, `j `, ...).
1399+ Let ``data_key(P, j, i, ...) `` be the data key for `P ` for the chunk
1400+ with grid coordinates (`j `, `i `, ...).
13941401
13951402Let "+" be the string concatenation operator.
13961403
@@ -1424,16 +1431,16 @@ Let "+" be the string concatenation operator.
14241431
14251432**Store element values in an array **
14261433
1427- To store element in an array at path `P ` and coordinate (`i `, `j `,
1428- ...), perform ``set(data_key(P, i, j , ...), value) ``, where
1434+ To store an element in an array at path `P ` and coordinate (`j `, `i `,
1435+ ...), perform ``set(data_key(P, j, i , ...), value) ``, where
14291436 `value ` is the serialisation of the corresponding chunk, encoded
14301437 according to the information in the array metadata stored under
14311438 the key ``array_meta_key(P) ``.
14321439
14331440**Retrieve element values in an array **
14341441
14351442 To retrieve element in an array at path `P ` and coordinate (`i `,
1436- `j `, ...), perform ``get(data_key(P, i, j , ...), value) ``. The returned
1443+ `j `, ...), perform ``get(data_key(P, j, i , ...), value) ``. The returned
14371444 value is the serialisation of the corresponding chunk, encoded
14381445 according to the array metadata stored at ``array_meta_key(P) ``.
14391446
@@ -1507,7 +1514,7 @@ in mostly two categories:
15071514 - Core data type extensions – for example adding the ability to store fixed size
15081515 types such as complex or datetime in chunks. These are directly declared in the
15091516 array metadata ``data_type `` key.
1510- - Arrays extensions – non rectilinear grids, and variable length types.
1517+ - Array extensions – non rectilinear grids, and variable length types.
15111518
15121519There are no group extensions in Zarr v3.0.
15131520