Commit 639eb54

docs: add migration page to user guide
1 parent 01bc352 commit 639eb54

2 files changed

+205
-1
lines changed

docs/user-guide/index.rst

Lines changed: 1 addition & 1 deletion
@@ -11,10 +11,10 @@ User Guide
 attributes
 storage
 config
+v3_migration

 .. Coming soon
 installation
-v3_migration

 Advanced Topics
 ---------------

docs/user-guide/v3_migration.rst

Lines changed: 204 additions & 0 deletions
@@ -0,0 +1,204 @@
3.0 Migration Guide
===================

Zarr-Python 3.0 introduces many changes to the API, including several
significant breaking changes and pending deprecations.

This page highlights the most notable changes to help you migrate your code
from version 2 to version 3.

Zarr-Python 3 represents a major refactor of the Zarr-Python codebase. Some of the
goals motivating this refactor included:

* adding support for the Zarr V3 specification (alongside the Zarr V2 specification)
* cleaning up internal and user-facing APIs
* improving performance (particularly in high-latency storage environments like
  cloud object stores)

Compatibility target
--------------------

The goals described above necessitated some breaking changes to the API (hence the
major version update), but we have attempted to maintain ~95% backwards compatibility
in the most widely used parts of the API. This includes the :class:`zarr.Array` and
:class:`zarr.Group` classes and the "top-level API" (e.g. :func:`zarr.open_array` and
:func:`zarr.open_group`).

Getting ready for 3.0
---------------------

Ahead of the 3.0 release, we suggest projects that depend on Zarr-Python take the
following actions:

1. Pin the supported Zarr-Python version to ``zarr>=2,<3``. This is a best practice
   and will protect your users from any incompatibilities that may arise during the
   release of Zarr-Python 3.0.
2. Limit your imports from the Zarr-Python package. Most of the primary API ``zarr.*``
   will be compatible in 3.0. However, the following breaking API changes are planned:

   - ``numcodecs.*`` will no longer be available in ``zarr.*``. To migrate, import codecs
     directly from ``numcodecs``:

     .. code-block:: python

        from numcodecs import Blosc
        # instead of:
        # from zarr import Blosc

   - The ``zarr.v3_api_available`` feature flag is being removed. In Zarr-Python 3.0
     the v3 API is always available, so you shouldn't need to use this flag.
   - The following internal modules are being removed or significantly changed. If
     your application relies on imports from any of the modules below, you will need
     to either a) modify your application to no longer rely on these imports or b)
     vendor the parts of the specific modules that you need.

     * ``zarr.attrs``
     * ``zarr.codecs``
     * ``zarr.context``
     * ``zarr.core``
     * ``zarr.hierarchy``
     * ``zarr.indexing``
     * ``zarr.meta``
     * ``zarr.meta_v1``
     * ``zarr.storage``
     * ``zarr.sync``
     * ``zarr.types``
     * ``zarr.util``
     * ``zarr.n5``

3. Test that your package works with v3. You can start testing against version 3 now
   (pre-releases are being published to PyPI weekly).
4. Update the pin to ``zarr>=3``.
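
To find out whether a codebase imports any of the internal modules listed in step 2, a small scanner built on the standard-library ``ast`` module can help. The following is a minimal sketch rather than an official tool: the helper names (``find_affected_imports``, ``_is_affected``) are hypothetical, and the module set is simply copied from the list above.

```python
import ast

# Module names copied from the list of removed or changed internal modules.
REMOVED_MODULES = {
    "zarr.attrs", "zarr.codecs", "zarr.context", "zarr.core",
    "zarr.hierarchy", "zarr.indexing", "zarr.meta", "zarr.meta_v1",
    "zarr.storage", "zarr.sync", "zarr.types", "zarr.util", "zarr.n5",
}


def _is_affected(module_name):
    """Return True for a listed module or any of its submodules."""
    return any(
        module_name == removed or module_name.startswith(removed + ".")
        for removed in REMOVED_MODULES
    )


def find_affected_imports(source):
    """Return sorted (line_number, module_name) pairs for affected imports."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import zarr.storage as zs
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if _is_affected(node.module):
                # from zarr.util import something
                names = [node.module]
            else:
                # from zarr import util
                names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        hits.extend((node.lineno, name) for name in names if _is_affected(name))
    return sorted(hits)


example = (
    "import zarr\n"
    "from zarr.util import normalize_shape\n"
    "import zarr.storage as zs\n"
)
print(find_affected_imports(example))
```

The scanner only sees static ``import`` statements; attribute access such as ``zarr.util.something`` after a plain ``import zarr`` and dynamic imports are not detected.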

Continue using Zarr-Python 2
----------------------------

Zarr-Python 2.x is still available, though we recommend migrating to Zarr-Python 3 for
its improvements and new features. Security and bug fixes will be made to the 2.x series
for at least 6 months following the first Zarr-Python 3 release.
If you need to use the latest Zarr-Python 2 release, you can install it with:

.. code-block:: console

    $ pip install "zarr==2.*"

Migration Guide
---------------

The following sections provide details on the most important changes in Zarr-Python 3.

The Array class
~~~~~~~~~~~~~~~

1. Direct construction is disallowed: use :func:`zarr.open_array` or :func:`zarr.create_array`
   instead of directly constructing the :class:`zarr.Array` class.

2. Arrays default to ``zarr_format=3``: newly created arrays use version 3 of the
   Zarr specification. To continue using version 2, set ``zarr_format=2`` when creating arrays
   or set ``default_zarr_version=2`` in :ref:`config`.
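
As an illustration of the second point, the sketch below creates an array through :func:`zarr.create_array` and asks for the version 2 format explicitly. It assumes Zarr-Python 3 is installed and uses an in-memory store purely for demonstration; the function name ``create_v2_array`` and the shape, chunk and dtype values are arbitrary choices, not part of the library.

```python
def create_v2_array():
    """Create a small in-memory array stored in Zarr format 2."""
    # Imported here so the sketch can be loaded even where zarr is absent.
    import zarr
    from zarr.storage import MemoryStore

    store = MemoryStore()  # nothing is written to disk
    return zarr.create_array(
        store=store,
        shape=(100,),
        chunks=(10,),
        dtype="int32",
        zarr_format=2,  # omit this argument to get the new default, format 3
    )
```

Calling ``create_v2_array()`` returns a :class:`zarr.Array`; at no point is the class constructed directly.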

The Group class
~~~~~~~~~~~~~~~

1. Direct construction is disallowed: use :func:`zarr.open_group` or :func:`zarr.create_group`
   instead of directly constructing the :class:`zarr.Group` class.
2. Most of the h5py compatibility methods are deprecated. The following migration is suggested:

   - Use :func:`zarr.Group.create_array` in place of :func:`zarr.Group.create_dataset`
   - Use :func:`zarr.Group.require_array` in place of :func:`zarr.Group.require_dataset`
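
A short sketch of the suggested migration follows. It assumes Zarr-Python 3 is installed and uses an in-memory store for demonstration; the function name ``build_group``, the array name ``temperature`` and its shape and dtype are illustrative only.

```python
def build_group():
    """Create a group and add one array using the v3 method names."""
    # Imported here so the sketch can be loaded even where zarr is absent.
    import zarr
    from zarr.storage import MemoryStore

    root = zarr.create_group(store=MemoryStore())
    # Zarr-Python 2 style, now deprecated:
    #     root.create_dataset("temperature", shape=(100,), dtype="float64")
    arr = root.create_array("temperature", shape=(100,), dtype="float64")
    return root, arr
```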

The Store class
~~~~~~~~~~~~~~~

Some of the biggest changes in Zarr-Python 3 are found in the ``Store`` class. The most notable changes to the Store API are:

1. Replaced the ``MutableMapping`` base class with a custom abstract base class (:class:`zarr.abc.store.Store`).
2. Switched to a primarily async interface.
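
To give a feel for what these two changes mean for code that implements or calls a store, here is a deliberately simplified sketch using only the standard library. ``ToyAsyncStore`` and ``ToyMemoryStore`` are hypothetical classes invented for this illustration; they are not :class:`zarr.abc.store.Store`, which defines more methods and different signatures.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, Optional


class ToyAsyncStore(ABC):
    """Hypothetical, simplified async store interface (not zarr's ABC)."""

    @abstractmethod
    async def get(self, key: str) -> Optional[bytes]:
        """Return the stored bytes, or None if the key is missing."""

    @abstractmethod
    async def set(self, key: str, value: bytes) -> None:
        """Store value under key."""


class ToyMemoryStore(ToyAsyncStore):
    """Dictionary-backed implementation of the toy interface."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    async def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    async def set(self, key: str, value: bytes) -> None:
        self._data[key] = value


async def roundtrip():
    store = ToyMemoryStore()
    # A MutableMapping-style store would use store["a"] = b"1";
    # with an async interface, independent requests can be awaited together.
    await asyncio.gather(store.set("a", b"1"), store.set("b", b"2"))
    return await asyncio.gather(
        store.get("a"), store.get("b"), store.get("missing")
    )


result = asyncio.run(roundtrip())
print(result)
```

Issuing several requests concurrently, as ``asyncio.gather`` does here, is the kind of pattern that benefits high-latency backends such as cloud object storage.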

Beyond the changes to the store interface, a number of deprecated stores were also removed in Zarr-Python 3:

- ``N5Store``
- ``DBMStore``
- ``LMDBStore``
- ``SQLiteStore``
- ``MongoDBStore``
- ``RedisStore``
- ``ABSStore``

Dependency changes
~~~~~~~~~~~~~~~~~~

- The new ``remote`` dependency group can be used to install a supported version of
  ``fsspec``, required for remote data access.
- The new ``gpu`` dependency group can be used to install a supported version of
  ``cuda``, required for GPU functionality.
- The ``jupyter`` optional dependency group has been removed, since v3 contains no
  Jupyter-specific functionality.

Configuration
~~~~~~~~~~~~~

There is a new configuration system based on `donfig <https://github.com/pytroll/donfig>`_,
which can be accessed via :mod:`zarr.core.config`.
Configuration values can be set using code like the following:

.. code-block:: python

    import zarr
    zarr.config.set({"array.order": "F"})

Alternatively, configuration values can be set using environment variables,
e.g. ``ZARR_ARRAY__ORDER=F``.
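
The example above suggests a naming convention: a ``ZARR_`` prefix, upper-case letters, and a double underscore where the configuration key has a dot. The helpers below sketch that mapping. They are hypothetical and inferred from this single example; they are not part of Zarr-Python or donfig, whose own parsing may differ in details.

```python
def config_key_to_env_var(key, prefix="ZARR_"):
    """Translate a dotted key such as 'array.order' to an environment variable name."""
    return prefix + key.upper().replace(".", "__")


def env_var_to_config_key(name, prefix="ZARR_"):
    """Translate a name such as 'ZARR_ARRAY__ORDER' back to a dotted key."""
    if not name.startswith(prefix):
        raise ValueError(f"{name!r} does not start with {prefix!r}")
    return name[len(prefix):].lower().replace("__", ".")


print(config_key_to_env_var("array.order"))
print(env_var_to_config_key("ZARR_ARRAY__ORDER"))
```

Single underscores inside a key segment are left alone, so a key such as ``array.write_empty_chunks`` would map to ``ZARR_ARRAY__WRITE_EMPTY_CHUNKS`` under this assumed convention.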

Configuration options include the following:

- Default Zarr format ``default_zarr_version``
- Default array order in memory ``array.order``
- Default codecs ``array.v3_default_codecs`` and ``array.v2_default_compressor``
- Whether empty chunks are written to storage ``array.write_empty_chunks``
- Async and threading options, e.g. ``async.concurrency`` and ``threading.max_workers``
- Selections of implementations of codecs, codec pipelines and buffers

Miscellaneous
~~~~~~~~~~~~~

- The keyword argument ``zarr_version`` has been deprecated in favor of ``zarr_format``.
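
Libraries that wrap Zarr-Python and expose the same keyword may want to accept both spellings for a transition period. The following is one possible sketch of such a shim; ``resolve_zarr_format`` is a hypothetical helper for downstream code, not a Zarr-Python function, and the default of ``3`` mirrors the new array default described earlier.

```python
import warnings


def resolve_zarr_format(zarr_format=None, zarr_version=None, default=3):
    """Return the format to use, accepting the deprecated ``zarr_version``."""
    if zarr_version is not None:
        warnings.warn(
            "zarr_version is deprecated, use zarr_format instead",
            DeprecationWarning,
            stacklevel=2,
        )
        if zarr_format is not None and zarr_format != zarr_version:
            raise ValueError("zarr_format and zarr_version disagree")
        return zarr_version
    return default if zarr_format is None else zarr_format


print(resolve_zarr_format(zarr_format=2))
```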

🚧 Work in Progress 🚧
~~~~~~~~~~~~~~~~~~~~~~

Zarr-Python 3 is still under active development and is not yet complete.
The following list summarizes areas of the codebase that we expect to build out
after the 3.0 release:

- The following functions / methods have not been ported to Zarr-Python 3 yet:

  * :func:`zarr.copy`
  * :func:`zarr.copy_all`
  * :func:`zarr.copy_store`
  * :func:`zarr.Group.move`

- The following options in the top-level API have not been ported to Zarr-Python 3 yet.
  If these options are important to you, please open a
  `GitHub issue <https://github.com/zarr-developers/zarr-python/issues/new>`_ describing
  your use case.

  * ``cache_attrs``
  * ``cache_metadata``
  * ``chunk_store``
  * ``meta_array``
  * ``object_codec``
  * ``synchronizer``
  * ``dimension_separator``

- The following features have not been ported to Zarr-Python 3 yet:

  * Structured arrays / dtypes
  * Fixed-length strings
  * Object arrays
  * Ragged arrays
  * Datetimes and timedeltas
  * Context manager support for Groups and Arrays (the ``__enter__`` and ``__exit__`` protocols)

The documentation for some of these features, which is currently outdated, has been
carried over from Zarr-Python 2 in `V3 TO DOs <v3_todos.html>`_.
