Commit d4c8d79

guide for config
1 parent 71ff5e7 commit d4c8d79

3 files changed: +84 -69 lines changed

docs/user-guide/config.rst

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+Runtime configuration
+=====================
+
+The :mod:`zarr.core.config` module is responsible for managing the configuration of zarr
+and is based on the `donfig <https://github.com/pytroll/donfig>`_ Python library.
+
+Configuration values can be set using code like the following:
+
+.. code-block:: python
+
+    import zarr
+    zarr.config.set({"array.order": "F"})
+
+Alternatively, configuration values can be set using environment variables, e.g.
+``ZARR_ARRAY__ORDER=F``.
+
+The configuration can also be read from a YAML file in standard locations.
+For more information, see the
+`donfig documentation <https://donfig.readthedocs.io/en/latest/>`_.
+
+Configuration options include the following:
+
+- Default Zarr format ``default_zarr_version``
+- Default array order in memory ``array.order``
+- Default codecs ``array.v3_default_codecs`` and ``array.v2_default_compressor``
+- Whether empty chunks are written to storage ``array.write_empty_chunks``
+- Async and threading options, e.g. ``async.concurrency`` and ``threading.max_workers``
+- Selections of implementations of codecs, codec pipelines and buffers
+
+For selecting custom implementations of codecs, pipelines, buffers and ndbuffers,
+first register the implementations in the registry and then select them in the config.
+For example, an implementation of the bytes codec in a class "custompackage.NewBytesCodec",
+requires the value of ``codecs.bytes.name`` to be "custompackage.NewBytesCodec".
+
+This is the current default configuration:
+
+.. ipython:: python
+
+    import zarr
+
+    zarr.config.pprint()
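
The option names listed in this guide (``array.order``, ``async.concurrency``, and so on) are dotted keys into a nested dictionary of defaults. The following is a minimal sketch of that addressing scheme, not donfig's own lookup code: the helper name `get_dotted` is hypothetical, and `DEFAULTS` is only an excerpt of the defaults from `src/zarr/core/config.py` in this commit.

```python
from functools import reduce
from typing import Any

# Excerpt of the nested defaults from this commit (not the full configuration).
DEFAULTS: dict[str, Any] = {
    "default_zarr_version": 3,
    "array": {
        "order": "C",
        "write_empty_chunks": False,
        "v3_default_codecs": {
            "numeric": ["bytes", "zstd"],
            "string": ["vlen-utf8"],
            "bytes": ["vlen-bytes"],
        },
    },
    "async": {"concurrency": 10, "timeout": None},
}


def get_dotted(config: dict[str, Any], key: str) -> Any:
    """Hypothetical helper: walk a nested dict with a dotted key like "array.order"."""
    # Each dot-separated part descends one level into the nested dictionary.
    return reduce(lambda node, part: node[part], key.split("."), config)


order = get_dotted(DEFAULTS, "array.order")
numeric_codecs = get_dotted(DEFAULTS, "array.v3_default_codecs.numeric")
```

With the excerpt above, `order` is the string `"C"` and `numeric_codecs` is the list `["bytes", "zstd"]`. A missing key raises `KeyError` in this sketch.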

docs/user-guide/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ User Guide
     arrays
     groups
     storage
+    config
     v3_migration
     todo
 

src/zarr/core/config.py

Lines changed: 42 additions & 69 deletions
@@ -1,62 +1,30 @@
 """
-The config module is responsible for managing the configuration of zarr
-and is based on the `donfig <https://github.com/pytroll/donfig>`_ Python library.
-
-Configuration values can be set using code like the following:
-
-.. code-block:: python
-
-    import zarr
-    zarr.config.set({"array.order": "F"})
-
-Alternatively, configuration values can be set using environment variables, e.g.
-``ZARR_ARRAY__ORDER=F``.
-
-The configuration can also be read from a YAML file in standard locations.
-For more information, see the
-`donfig documentation <https://donfig.readthedocs.io/en/latest/>`_.
-
-Configuration options include the following:
-
-- Default Zarr format ``default_zarr_version``
-- Default array order in memory ``array.order``
-- Async and threading options, e.g. ``async.concurrency`` and ``threading.max_workers``
-- Selections of implementations of codecs, codec pipelines and buffers
-
-For selecting custom implementations of codecs, pipelines, buffers and ndbuffers,
-first register the implementations in the registry and then select them in the config.
-For example, an implementation of the bytes codec in a class "custompackage.NewBytesCodec",
-requires the value of ``codecs.bytes.name`` to be "custompackage.NewBytesCodec".
-
-This is the current default configuration:
-
-.. code-block:: python
-
-    {
-        "default_zarr_version": 3,
-        "array": {"order": "C"},
-        "async": {"concurrency": 10, "timeout": None},
-        "threading": {"max_workers": None},
-        "json_indent": 2,
-        "codec_pipeline": {
-            "path": "zarr.core.codec_pipeline.BatchedCodecPipeline",
-            "batch_size": 1,
-        },
-        "codecs": {
-            "blosc": "zarr.codecs.blosc.BloscCodec",
-            "gzip": "zarr.codecs.gzip.GzipCodec",
-            "zstd": "zarr.codecs.zstd.ZstdCodec",
-            "bytes": "zarr.codecs.bytes.BytesCodec",
-            "endian": "zarr.codecs.bytes.BytesCodec",
-            "crc32c": "zarr.codecs.crc32c_.Crc32cCodec",
-            "sharding_indexed": "zarr.codecs.sharding.ShardingCodec",
-            "transpose": "zarr.codecs.transpose.TransposeCodec",
-            "vlen-utf8": "zarr.codecs.vlen_utf8.VLenUTF8Codec",
-            "vlen-bytes": "zarr.codecs.vlen_utf8.VLenBytesCodec",
-        },
-        "buffer": "zarr.core.buffer.cpu.Buffer",
-        "ndbuffer": "zarr.core.buffer.cpu.NDBuffer",
-    }
+The config module is responsible for managing the configuration of zarr and is based on the Donfig python library.
+For selecting custom implementations of codecs, pipelines, buffers and ndbuffers, first register the implementations
+in the registry and then select them in the config.
+
+Example:
+    An implementation of the bytes codec in a class ``your.module.NewBytesCodec`` requires the value of ``codecs.bytes``
+    to be ``your.module.NewBytesCodec``. Donfig can be configured programmatically, by environment variables, or from
+    YAML files in standard locations.
+
+    .. code-block:: python
+
+        from your.module import NewBytesCodec
+        from zarr.core.config import register_codec, config
+
+        register_codec("bytes", NewBytesCodec)
+        config.set({"codecs.bytes": "your.module.NewBytesCodec"})
+
+    Instead of setting the value programmatically with ``config.set``, you can also set the value with an environment
+    variable. The environment variable ``ZARR_CODECS__BYTES`` can be set to ``your.module.NewBytesCodec``. The double
+    underscore ``__`` is used to indicate nested access.
+
+    .. code-block:: bash
+
+        export ZARR_CODECS__BYTES="your.module.NewBytesCodec"
+
+For more information, see the Donfig documentation at https://github.com/pytroll/donfig.
 """
 
 from __future__ import annotations
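
The docstring in this hunk says that a double underscore in an environment variable name indicates nested access. As a rough illustration of that naming convention, here is a stand-alone sketch using only the standard library. It is not donfig's implementation: the helper name `env_to_config` is hypothetical, and the parsing rules (strip the `ZARR_` prefix, lowercase the remainder, split on `__`, and try to read the value as a Python literal) are assumptions based on the examples `ZARR_CODECS__BYTES` and `ZARR_FOO__BAR_BAZ=123` that appear in this diff.

```python
import ast
from typing import Any, Mapping


def env_to_config(env: Mapping[str, str], prefix: str = "ZARR_") -> dict[str, Any]:
    """Hypothetical helper: turn prefixed environment variables into a nested dict."""
    result: dict[str, Any] = {}
    for name, raw in env.items():
        if not name.startswith(prefix):
            continue  # unrelated variables are ignored
        # "ARRAY__ORDER" -> ["array", "order"]; single underscores are kept as-is
        keys = name[len(prefix):].lower().split("__")
        try:
            # Assumption: values that look like Python literals ("123") are parsed
            value: Any = ast.literal_eval(raw)
        except (SyntaxError, ValueError):
            value = raw  # anything else stays a plain string
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return result


example = env_to_config(
    {
        "ZARR_CODECS__BYTES": "your.module.NewBytesCodec",
        "ZARR_FOO__BAR_BAZ": "123",
        "HOME": "/home/user",
    }
)
```

In this sketch, `example` contains a `codecs` entry with the string `your.module.NewBytesCodec` under `bytes`, and a `foo` entry with the integer `123` under `bar_baz`. The `HOME` variable is skipped because it lacks the prefix.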
@@ -71,7 +39,7 @@ class BadConfigError(ValueError):
 
 
 class Config(DConfig):  # type: ignore[misc]
-    """Will collect configuration from config files and environment variables
+    """The Config will collect configuration from config files and environment variables
 
     Example environment variables:
     Grabs environment variables of the form "ZARR_FOO__BAR_BAZ=123" and
@@ -89,21 +57,26 @@ def reset(self) -> None:
         self.refresh()
 
 
-# The config module is responsible for managing the configuration of zarr and is based on the Donfig python library.
-# For selecting custom implementations of codecs, pipelines, buffers and ndbuffers, first register the implementations
-# in the registry and then select them in the config.
-# e.g. an implementation of the bytes codec in a class "NewBytesCodec", requires the value of codecs.bytes.name to be
-# "NewBytesCodec".
-# Donfig can be configured programmatically, by environment variables, or from YAML files in standard locations
-# e.g. export ZARR_CODECS__BYTES__NAME="NewBytesCodec"
-# (for more information see github.com/pytroll/donfig)
-# Default values below point to the standard implementations of zarr-python
+# The default configuration for zarr
 config = Config(
     "zarr",
     defaults=[
         {
             "default_zarr_version": 3,
-            "array": {"order": "C"},
+            "array": {
+                "order": "C",
+                "write_empty_chunks": False,
+                "v2_default_compressor": {
+                    "numeric": "zstd",
+                    "string": "vlen-utf8",
+                    "bytes": "vlen-bytes",
+                },
+                "v3_default_codecs": {
+                    "numeric": ["bytes", "zstd"],
+                    "string": ["vlen-utf8"],
+                    "bytes": ["vlen-bytes"],
+                },
+            },
             "async": {"concurrency": 10, "timeout": None},
             "threading": {"max_workers": None},
             "json_indent": 2,
