Commit fc26ec7

committed
docs review
1 parent d2b68a2 commit fc26ec7

2 files changed

+21
-14
lines changed

docs/spec/v2.rst

Lines changed: 18 additions & 11 deletions
@@ -64,22 +64,24 @@ compression_opts
     compression library.
 fill_value
     A scalar value providing the default value to use for uninitialized
-    portions of the array.
+    portions of the array, or ``null`` if no fill_value is to be used.
 order
     Either "C" or "F", defining the layout of bytes within each chunk of the
     array. "C" means row-major order, i.e., the last dimension varies fastest;
     "F" means column-major order, i.e., the first dimension varies fastest.
 filters
-    TODO
+    A list of JSON objects providing filter configurations, or ``null`` if no
+    filters are to be applied. Each filter configuration object MUST contain a
+    ``"name"`` key identifying the filter to be used.
 
 Other keys MUST NOT be present within the metadata object.
 
 For example, the JSON object below defines a 2-dimensional array of 64-bit
 little-endian floating point numbers with 10000 rows and 10000 columns, divided
 into chunks of 1000 rows and 1000 columns (so there will be 100 chunks in total
 arranged in a 10 by 10 grid). Within each chunk the data are laid out in C
-contiguous order, and each chunk is compressed using the Blosc compression
-library::
+contiguous order. Each chunk is encoded using a delta filter and compressed
+using the Blosc compression library prior to storage::
 
     {
         "chunks": [
@@ -93,9 +95,9 @@ library::
             "shuffle": 1
         },
         "dtype": "<f8",
-        "fill_value": null,
+        "fill_value": "NaN",
         "filters": [
-            {"name": "delta", "enc_dtype": "<f4", "dec_dtype": "<f8"}
+            {"name": "delta", "dtype": "<f8", "astype": "<f4"}
         ],
         "order": "C",
         "shape": [
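
Note on the ``filters`` field: per the text added in this commit, the field is either ``null`` or a list of objects that each carry a ``"name"`` key. A minimal sketch of that check follows. The helper ``validate_filters`` is hypothetical and not part of zarr, and treating a missing key like ``null`` is an assumption of this sketch, not something the spec text states.

```python
import json


def validate_filters(meta):
    """Return the list of filter configs in a metadata dict (hypothetical helper).

    Assumption of this sketch: a missing "filters" key is treated like null.
    """
    filters = meta.get("filters")
    if filters is None:
        # null means no filters are applied
        return []
    if not isinstance(filters, list):
        raise ValueError("filters must be a list of objects or null")
    for config in filters:
        if not isinstance(config, dict) or "name" not in config:
            raise ValueError("each filter configuration must contain a 'name' key")
    return filters


# The filter entry below is copied from the example in this diff.
meta = json.loads('{"filters": [{"name": "delta", "dtype": "<f8", "astype": "<f4"}]}')
names = [config["name"] for config in validate_filters(meta)]
```

Any keys other than ``"name"`` (here ``dtype`` and ``astype``) are left to the individual filter to interpret.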
@@ -147,7 +149,6 @@ Positive Infinity ``"Infinity"``
 Negative Infinity ``"-Infinity"``
 ================= ===============
 
-
 Chunks
 ~~~~~~
 
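
Note on ``fill_value`` encoding: JSON has no literals for NaN or the infinities, which is presumably why the spec maps them to strings. The sketch below round-trips a floating point fill value through that string form. The ``"Infinity"`` and ``"-Infinity"`` strings come from the table rows visible in this hunk, and ``"NaN"`` is taken from the example metadata in this commit. The function names are hypothetical, and the sketch only makes sense for floating point dtypes.

```python
import json
import math


def encode_fill_value(value):
    """Map a float fill value to its JSON form (hypothetical helper)."""
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
    # None (JSON null) and ordinary numbers pass through unchanged
    return value


def decode_fill_value(encoded):
    """Inverse of encode_fill_value, assuming a floating point dtype."""
    if isinstance(encoded, str) and encoded in ("NaN", "Infinity", "-Infinity"):
        # Python's float() accepts all three spellings
        return float(encoded)
    return encoded


text = json.dumps({"fill_value": encode_fill_value(float("nan"))})
```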
@@ -184,7 +185,12 @@ contents of any chunk region falling outside the array are undefined.
 Filters
 ~~~~~~~
 
-TODO
+Optionally a sequence of one or more filters can be used to transform chunk
+data prior to compression. When storing data, filters are applied in the order
+specified in array metadata to encode data, then the encoded data are passed to
+the primary compressor. When retrieving data, stored chunk data are
+decompressed by the primary compressor then decoded using filters in the
+reverse order.
 
 Hierarchies
 -----------
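
Note on filter ordering: the new Filters paragraph describes encoding in metadata order, then compression, and on read decompression followed by decoding in reverse order. The following is a minimal sketch of that pipeline, not zarr's implementation. ``Delta`` and ``Offset`` are toy filters written for this note, ``store_chunk`` and ``load_chunk`` are hypothetical names, and ``zlib`` stands in for the Blosc compressor named in the spec example.

```python
import zlib
from array import array


class Delta:
    """Toy filter: keep the first value, then store successive differences."""

    def encode(self, values):
        return values[:1] + [b - a for a, b in zip(values, values[1:])]

    def decode(self, values):
        out, total = [], 0
        for v in values:
            total += v
            out.append(total)
        return out


class Offset:
    """Toy filter: subtract a fixed offset on encode, add it back on decode."""

    def __init__(self, offset):
        self.offset = offset

    def encode(self, values):
        return [v - self.offset for v in values]

    def decode(self, values):
        return [v + self.offset for v in values]


def store_chunk(values, filters):
    # Apply filters in the order listed, then hand off to the compressor.
    for f in filters:
        values = f.encode(values)
    return zlib.compress(array("q", values).tobytes())


def load_chunk(data, filters):
    # Decompress first, then undo the filters in reverse order.
    values = array("q")
    values.frombytes(zlib.decompress(data))
    values = values.tolist()
    for f in reversed(filters):
        values = f.decode(values)
    return values


filters = [Delta(), Offset(2)]
chunk = list(range(100, 120, 2))
stored = store_chunk(chunk, filters)
restored = load_chunk(stored, filters)
```

Decoding in forward order instead would run ``Delta.decode`` on data that still carries the offset and would not recover the original values, which is why the reverse order matters.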
@@ -463,7 +469,8 @@ Changes in version 2
 * Added support for storing multiple arrays in the same store and organising
   arrays into hierarchies using groups.
 * Array metadata is now stored under the ".zarray" key instead of the "meta"
-  key
+  key.
 * Custom attributes are now stored under the ".zattrs" key instead of the
-  "attrs" key
-* TODO filters
+  "attrs" key.
+* Added support for filters.
+* Changed encoding of "fill_value" field within array metadata.

zarr/filters.py

Lines changed: 3 additions & 3 deletions
@@ -42,14 +42,14 @@ class DeltaFilter(object):
     --------
     >>> import zarr
     >>> import numpy as np
-    >>> x = np.arange(100, 120, 2, dtype='f8')
-    >>> f = zarr.DeltaFilter(dtype='f8', astype='i1')
+    >>> x = np.arange(100, 120, 2, dtype='i8')
+    >>> f = zarr.DeltaFilter(dtype='i8', astype='i1')
     >>> y = f.encode(x)
     >>> y
     array([100, 2, 2, 2, 2, 2, 2, 2, 2, 2], dtype=int8)
     >>> z = f.decode(y)
     >>> z
-    array([ 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.])
+    array([100, 102, 104, 106, 108, 110, 112, 114, 116, 118])
 
     """ # flake8: noqa
 
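
Note on the docstring example: the revised doctest encodes ``i8`` input into ``i1`` deltas. A standalone sketch of that arithmetic, using only NumPy, is given below. The functions ``delta_encode`` and ``delta_decode`` are hypothetical stand-ins and are not claimed to match the body of ``zarr.DeltaFilter``, which this diff does not show.

```python
import numpy as np


def delta_encode(x, dtype, astype):
    """Store the first value and successive differences, cast to `astype`."""
    x = np.asarray(x, dtype=dtype)
    out = np.empty(x.shape, dtype=astype)
    out[0] = x[0]
    out[1:] = np.diff(x)
    return out


def delta_decode(y, dtype):
    """Rebuild the original values with a cumulative sum in `dtype`."""
    return np.cumsum(y, dtype=dtype)


x = np.arange(100, 120, 2, dtype="i8")
y = delta_encode(x, dtype="i8", astype="i1")
z = delta_decode(y, dtype="i8")
```

The narrow ``i1`` type only works here because the first value (100) and every difference (2) fit in a signed byte. Values outside that range would not survive the cast.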