Commit f3edf12

Merge branch 'main' into speed-up-tests
2 parents f98bc02 + 22634ea commit f3edf12

28 files changed, +557 -437 lines changed

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ coverage.xml
 
 # Sphinx documentation
 docs/_build/
-docs/_autoapi
+docs/api
 docs/data
 data
 data.zip

docs/Makefile

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ help:
 .PHONY: clean
 clean:
 rm -rf $(BUILDDIR)/*
-rm -rf $(BUILDDIR)/../_autoapi
+rm -rf $(BUILDDIR)/../api
 
 .PHONY: html
 html:

docs/api/index.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/conf.py

Lines changed: 2 additions & 1 deletion
@@ -57,7 +57,7 @@
 autoapi_add_toctree_entry = False
 autoapi_generate_api_docs = True
 autoapi_member_order = "groupwise"
-autoapi_root = "_autoapi"
+autoapi_root = "api"
 autoapi_keep_files = True
 autoapi_options = [ 'members', 'undoc-members', 'show-inheritance', 'show-module-summary', 'imported-members', ]
 
@@ -108,6 +108,7 @@ def skip_submodules(
 "release": "developers/release.html",
 "roadmap": "developers/roadmap.html",
 "installation": "user-guide/installation.html",
+"api": "api/zarr/index"
 }
 
 # The language for content autogenerated by Sphinx. Refer to documentation

docs/index.rst

Lines changed: 4 additions & 4 deletions
@@ -10,7 +10,7 @@ Zarr-Python
 
 quickstart
 user-guide/index
-api/index
+API reference <api/zarr/index>
 developers/index
 developers/release
 about
@@ -25,7 +25,7 @@ Zarr-Python
 
 Zarr-Python is a Python library for reading and writing Zarr groups and arrays. Highlights include:
 
-* Specification support for both Zarr v2 and v3.
+* Specification support for both Zarr format 2 and 3.
 * Create and read from N-dimensional arrays using NumPy-like semantics.
 * Flexible storage enables reading and writing from local, cloud and in-memory stores.
 * High performance: Enables fast I/O with support for asynchronous I/O and multi-threading.
@@ -81,12 +81,12 @@ Zarr-Python is a Python library for reading and writing Zarr groups and arrays.
 
 +++
 
-.. button-ref:: api/index
+.. button-ref:: api/zarr/index
 :expand:
 :color: dark
 :click-parent:
 
-To the API reference guide
+To the API reference
 
 .. grid-item-card::
 :img-top: _static/index_contribute.svg

docs/user-guide/arrays.rst

Lines changed: 25 additions & 16 deletions
@@ -168,8 +168,8 @@ argument accepted by all array creation functions. For example::
 >>> data = np.arange(100000000, dtype='int32').reshape(10000, 10000)
 >>> z = zarr.create_array(store='data/example-5.zarr', shape=data.shape, dtype=data.dtype, chunks=(1000, 1000), compressors=compressors)
 >>> z[:] = data
->>> z.metadata.codecs
-[BytesCodec(endian=<Endian.little: 'little'>), BloscCodec(typesize=4, cname=<BloscCname.zstd: 'zstd'>, clevel=3, shuffle=<BloscShuffle.bitshuffle: 'bitshuffle'>, blocksize=0)]
+>>> z.compressors
+(BloscCodec(typesize=4, cname=<BloscCname.zstd: 'zstd'>, clevel=3, shuffle=<BloscShuffle.bitshuffle: 'bitshuffle'>, blocksize=0),)
 
 This array above will use Blosc as the primary compressor, using the Zstandard
 algorithm (compression level 3) internally within Blosc, and with the
@@ -188,7 +188,9 @@ which can be used to print useful diagnostics, e.g.::
 Order : C
 Read-only : False
 Store type : LocalStore
-Codecs : [{'endian': <Endian.little: 'little'>}, {'typesize': 4, 'cname': <BloscCname.zstd: 'zstd'>, 'clevel': 3, 'shuffle': <BloscShuffle.bitshuffle: 'bitshuffle'>, 'blocksize': 0}]
+Filters : ()
+Serializer : BytesCodec(endian=<Endian.little: 'little'>)
+Compressors : (BloscCodec(typesize=4, cname=<BloscCname.zstd: 'zstd'>, clevel=3, shuffle=<BloscShuffle.bitshuffle: 'bitshuffle'>, blocksize=0),)
 No. bytes : 400000000 (381.5M)
 
 The :func:`zarr.Array.info_complete` method inspects the underlying store and
@@ -203,7 +205,9 @@ prints additional diagnostics, e.g.::
 Order : C
 Read-only : False
 Store type : LocalStore
-Codecs : [{'endian': <Endian.little: 'little'>}, {'typesize': 4, 'cname': <BloscCname.zstd: 'zstd'>, 'clevel': 3, 'shuffle': <BloscShuffle.bitshuffle: 'bitshuffle'>, 'blocksize': 0}]
+Filters : ()
+Serializer : BytesCodec(endian=<Endian.little: 'little'>)
+Compressors : (BloscCodec(typesize=4, cname=<BloscCname.zstd: 'zstd'>, clevel=3, shuffle=<BloscShuffle.bitshuffle: 'bitshuffle'>, blocksize=0),)
 No. bytes : 400000000 (381.5M)
 No. bytes stored : 9696302
 Storage ratio : 41.3
@@ -223,8 +227,8 @@ here is an array using Gzip compression, level 1::
 >>> data = np.arange(100000000, dtype='int32').reshape(10000, 10000)
 >>> z = zarr.create_array(store='data/example-6.zarr', shape=data.shape, dtype=data.dtype, chunks=(1000, 1000), compressors=zarr.codecs.GzipCodec(level=1))
 >>> z[:] = data
->>> z.metadata.codecs
-[BytesCodec(endian=<Endian.little: 'little'>), GzipCodec(level=1)]
+>>> z.compressors
+(GzipCodec(level=1),)
 
 Here is an example using LZMA from NumCodecs_ with a custom filter pipeline including LZMA's
 built-in delta filter::
@@ -236,23 +240,24 @@ built-in delta filter::
 >>> compressors = LZMA(filters=lzma_filters)
 >>> data = np.arange(100000000, dtype='int32').reshape(10000, 10000)
 >>> z = zarr.create_array(store='data/example-7.zarr', shape=data.shape, dtype=data.dtype, chunks=(1000, 1000), compressors=compressors)
->>> z.metadata.codecs
-[BytesCodec(endian=<Endian.little: 'little'>), _make_bytes_bytes_codec.<locals>._Codec(codec_name='numcodecs.lzma', codec_config={'id': 'lzma', 'filters': [{'id': 3, 'dist': 4}, {'id': 33, 'preset': 1}]})]
+>>> z.compressors
+(_make_bytes_bytes_codec.<locals>._Codec(codec_name='numcodecs.lzma', codec_config={'id': 'lzma', 'filters': [{'id': 3, 'dist': 4}, {'id': 33, 'preset': 1}]}),)
 
 The default compressor can be changed by setting the value of the using Zarr's
 :ref:`user-guide-config`, e.g.::
 
 >>> with zarr.config.set({'array.v2_default_compressor.numeric': {'id': 'blosc'}}):
 ... z = zarr.create_array(store={}, shape=(100000000,), chunks=(1000000,), dtype='int32', zarr_format=2)
->>> z.metadata.filters
->>> z.metadata.compressor
-Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)
+>>> z.filters
+()
+>>> z.compressors
+(Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0),)
 
 To disable compression, set ``compressors=None`` when creating an array, e.g.::
 
 >>> z = zarr.create_array(store='data/example-8.zarr', shape=(100000000,), chunks=(1000000,), dtype='int32', compressors=None)
->>> z.metadata.codecs
-[BytesCodec(endian=<Endian.little: 'little'>)]
+>>> z.compressors
+()
 
 .. _user-guide-filters:
 
@@ -287,7 +292,9 @@ Here is an example using a delta filter with the Blosc compressor::
 Order : C
 Read-only : False
 Store type : LocalStore
-Codecs : [{'codec_name': 'numcodecs.delta', 'codec_config': {'id': 'delta', 'dtype': 'int32'}}, {'endian': <Endian.little: 'little'>}, {'typesize': 4, 'cname': <BloscCname.zstd: 'zstd'>, 'clevel': 1, 'shuffle': <BloscShuffle.shuffle: 'shuffle'>, 'blocksize': 0}]
+Filters : (_make_array_array_codec.<locals>._Codec(codec_name='numcodecs.delta', codec_config={'id': 'delta', 'dtype': 'int32'}),)
+Serializer : BytesCodec(endian=<Endian.little: 'little'>)
+Compressors : (BloscCodec(typesize=4, cname=<BloscCname.zstd: 'zstd'>, clevel=1, shuffle=<BloscShuffle.shuffle: 'shuffle'>, blocksize=0),)
 No. bytes : 400000000 (381.5M)
 
 For more information about available filter codecs, see the `Numcodecs
@@ -600,11 +607,13 @@ Sharded arrays can be created by providing the ``shards`` parameter to :func:`za
 Order : C
 Read-only : False
 Store type : LocalStore
-Codecs : [{'chunk_shape': (100, 100), 'codecs': ({'endian': <Endian.little: 'little'>}, {'level': 0, 'checksum': False}), 'index_codecs': ({'endian': <Endian.little: 'little'>}, {}), 'index_location': <ShardingCodecIndexLocation.end: 'end'>}]
+Filters : ()
+Serializer : BytesCodec(endian=<Endian.little: 'little'>)
+Compressors : (ZstdCodec(level=0, checksum=False),)
 No. bytes : 100000000 (95.4M)
 No. bytes stored : 3981060
 Storage ratio : 25.1
-Chunks Initialized : 100
+Shards Initialized : 100
 
 In this example a shard shape of (1000, 1000) and a chunk shape of (100, 100) is used.
 This means that 10*10 chunks are stored in each shard, and there are 10*10 shards in total.
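
The shard and chunk counts in the last hunk of docs/user-guide/arrays.rst can be checked with a little arithmetic. The following is a minimal sketch in plain Python, not Zarr-Python code. It assumes an array shape of (10000, 10000), which is not visible in the hunk but is consistent with 10*10 shards of shape (1000, 1000). `grid_counts` is a hypothetical helper introduced only for illustration.

```python
from math import prod


def grid_counts(array_shape, shard_shape, chunk_shape):
    """Return (chunks per shard, total shards) for shapes that divide evenly."""
    # Hypothetical helper for illustration only; not part of Zarr-Python.
    chunks_per_shard = prod(s // c for s, c in zip(shard_shape, chunk_shape))
    total_shards = prod(a // s for a, s in zip(array_shape, shard_shape))
    return chunks_per_shard, total_shards


# Assumed array shape; the shard and chunk shapes come from the documentation text.
chunks_per_shard, total_shards = grid_counts((10000, 10000), (1000, 1000), (100, 100))
print(chunks_per_shard, total_shards)
```

Under that assumption each shard holds 100 chunks and the array has 100 shards, which agrees with the renamed "Shards Initialized : 100" field.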

docs/user-guide/config.rst

Lines changed: 15 additions & 14 deletions
@@ -28,7 +28,7 @@ Configuration options include the following:
 
 - Default Zarr format ``default_zarr_version``
 - Default array order in memory ``array.order``
-- Default codecs ``array.v3_default_codecs`` and ``array.v2_default_compressor``
+- Default filters, serializers and compressors, e.g. ``array.v3_default_filters``, ``array.v3_default_serializer``, ``array.v3_default_compressors``, ``array.v2_default_filters`` and ``array.v2_default_compressor``
 - Whether empty chunks are written to storage ``array.write_empty_chunks``
 - Async and threading options, e.g. ``async.concurrency`` and ``threading.max_workers``
 - Selections of implementations of codecs, codec pipelines and buffers
@@ -54,19 +54,20 @@ This is the current default configuration::
 'v2_default_filters': {'bytes': [{'id': 'vlen-bytes'}],
 'numeric': None,
 'string': [{'id': 'vlen-utf8'}]},
-'v3_default_codecs': {'bytes': [{'name': 'vlen-bytes'},
-{'configuration': {'checksum': False,
-'level': 0},
-'name': 'zstd'}],
-'numeric': [{'configuration': {'endian': 'little'},
-'name': 'bytes'},
-{'configuration': {'checksum': False,
-'level': 0},
-'name': 'zstd'}],
-'string': [{'name': 'vlen-utf8'},
-{'configuration': {'checksum': False,
-'level': 0},
-'name': 'zstd'}]},
+'v3_default_compressors': {'bytes': [{'configuration': {'checksum': False,
+'level': 0},
+'name': 'zstd'}],
+'numeric': [{'configuration': {'checksum': False,
+'level': 0},
+'name': 'zstd'}],
+'string': [{'configuration': {'checksum': False,
+'level': 0},
+'name': 'zstd'}]},
+'v3_default_filters': {'bytes': [], 'numeric': [], 'string': []},
+'v3_default_serializer': {'bytes': {'name': 'vlen-bytes'},
+'numeric': {'configuration': {'endian': 'little'},
+'name': 'bytes'},
+'string': {'name': 'vlen-utf8'}},
 'write_empty_chunks': False},
 'async': {'concurrency': 10, 'timeout': None},
 'buffer': 'zarr.core.buffer.cpu.Buffer',
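
The config change above replaces the single `v3_default_codecs` mapping with three separate keys. The sketch below shows, in plain Python, how the removed default value maps onto the added ones. `split_default_codecs` is a hypothetical helper, not a Zarr-Python function. It assumes that every old per-dtype list starts with its serializer and that all remaining entries are compressors. That holds for the defaults in this diff, but would not hold for a pipeline that includes filters.

```python
def split_default_codecs(old):
    """Split an old-style 'v3_default_codecs' mapping into the three new keys."""
    # Hypothetical helper for illustration only; not part of Zarr-Python.
    filters, serializer, compressors = {}, {}, {}
    for kind, codecs in old.items():
        # Assumption: the first entry is the serializer, the rest are compressors.
        first, *rest = codecs
        filters[kind] = []
        serializer[kind] = first
        compressors[kind] = rest
    return {
        "v3_default_filters": filters,
        "v3_default_serializer": serializer,
        "v3_default_compressors": compressors,
    }


zstd = {"name": "zstd", "configuration": {"level": 0, "checksum": False}}
old = {
    "bytes": [{"name": "vlen-bytes"}, zstd],
    "numeric": [{"name": "bytes", "configuration": {"endian": "little"}}, zstd],
    "string": [{"name": "vlen-utf8"}, zstd],
}
new = split_default_codecs(old)
print(new["v3_default_serializer"]["numeric"])
```

Note that the serializer becomes a single dict per dtype kind, while filters and compressors stay lists.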

docs/user-guide/consolidated_metadata.rst

Lines changed: 6 additions & 6 deletions
@@ -52,8 +52,8 @@ that can be used.:
 chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
 separator='/'),
 fill_value=np.float64(0.0),
-codecs=[BytesCodec(endian=<Endian.little: 'little'>),
-ZstdCodec(level=0, checksum=False)],
+codecs=(BytesCodec(endian=<Endian.little: 'little'>),
+ZstdCodec(level=0, checksum=False)),
 attributes={},
 dimension_names=None,
 zarr_format=3,
@@ -65,8 +65,8 @@ that can be used.:
 chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
 separator='/'),
 fill_value=np.float64(0.0),
-codecs=[BytesCodec(endian=<Endian.little: 'little'>),
-ZstdCodec(level=0, checksum=False)],
+codecs=(BytesCodec(endian=<Endian.little: 'little'>),
+ZstdCodec(level=0, checksum=False)),
 attributes={},
 dimension_names=None,
 zarr_format=3,
@@ -78,8 +78,8 @@ that can be used.:
 chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
 separator='/'),
 fill_value=np.float64(0.0),
-codecs=[BytesCodec(endian=<Endian.little: 'little'>),
-ZstdCodec(level=0, checksum=False)],
+codecs=(BytesCodec(endian=<Endian.little: 'little'>),
+ZstdCodec(level=0, checksum=False)),
 attributes={},
 dimension_names=None,
 zarr_format=3,

docs/user-guide/extending.rst

Lines changed: 3 additions & 3 deletions
@@ -10,8 +10,8 @@ Custom codecs
 -------------
 
 .. note::
-This section explains how custom codecs can be created for Zarr version 3 data. For Zarr
-version 2, codecs should subclass the
+This section explains how custom codecs can be created for Zarr format 3 arrays. For Zarr
+format 2, codecs should subclass the
 `numcodecs.abc.Codec <https://numcodecs.readthedocs.io/en/stable/abc.html#numcodecs.abc.Codec>`_
 base class and register through
 `numcodecs.registry.register_codec <https://numcodecs.readthedocs.io/en/stable/registry.html#numcodecs.registry.register_codec>`_.
@@ -66,7 +66,7 @@ strongly recommended to prefix the codec identifier with a unique name. For exam
 the codecs from ``numcodecs`` are prefixed with ``numcodecs.``, e.g. ``numcodecs.delta``.
 
 .. note::
-Note that the extension mechanism for the Zarr version 3 is still under development.
+Note that the extension mechanism for the Zarr format 3 is still under development.
 Requirements for custom codecs including the choice of codec identifiers might
 change in the future.
 

docs/user-guide/groups.rst

Lines changed: 6 additions & 2 deletions
@@ -109,7 +109,9 @@ property. E.g.::
 Order : C
 Read-only : False
 Store type : MemoryStore
-Codecs : [{'endian': <Endian.little: 'little'>}, {'level': 0, 'checksum': False}]
+Filters : ()
+Serializer : BytesCodec(endian=<Endian.little: 'little'>)
+Compressors : (ZstdCodec(level=0, checksum=False),)
 No. bytes : 8000000 (7.6M)
 No. bytes stored : 1432
 Storage ratio : 5586.6
@@ -123,7 +125,9 @@ property. E.g.::
 Order : C
 Read-only : False
 Store type : MemoryStore
-Codecs : [{'endian': <Endian.little: 'little'>}, {'level': 0, 'checksum': False}]
+Filters : ()
+Serializer : BytesCodec(endian=<Endian.little: 'little'>)
+Compressors : (ZstdCodec(level=0, checksum=False),)
 No. bytes : 4000000 (3.8M)
 
 Groups also have the :func:`zarr.Group.tree` method, e.g.::
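
Throughout this commit the single "Codecs" line in the info output is replaced by separate "Filters", "Serializer" and "Compressors" lines. A rough way to picture that grouping, sketched in plain Python: array-to-array codecs are filters, the one array-to-bytes codec is the serializer, and bytes-to-bytes codecs are compressors. `partition_pipeline` and the string kind labels are hypothetical and chosen for illustration. They are not Zarr-Python's API, and the library's own implementation may differ.

```python
def partition_pipeline(codecs):
    """Group (name, kind) pairs into (filters, serializer, compressors)."""
    # Hypothetical helper; the kind labels are illustrative strings only.
    filters = tuple(name for name, kind in codecs if kind == "array->array")
    serializers = [name for name, kind in codecs if kind == "array->bytes"]
    compressors = tuple(name for name, kind in codecs if kind == "bytes->bytes")
    if len(serializers) != 1:
        raise ValueError("expected exactly one array->bytes codec")
    return filters, serializers[0], compressors


# Mirrors the delta filter + Blosc example from docs/user-guide/arrays.rst.
pipeline = [
    ("numcodecs.delta", "array->array"),
    ("bytes", "array->bytes"),
    ("blosc", "bytes->bytes"),
]
filters, serializer, compressors = partition_pipeline(pipeline)
print(filters, serializer, compressors)
```

With no filters configured, as in the group examples above, the first element is an empty tuple, matching the "Filters : ()" lines.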
