|
| 1 | +3.0 Migration Guide |
| 2 | +=================== |
| 3 | + |
| 4 | +Zarr-Python 3.0 introduces a number of changes to the API, including a number |
| 5 | +of significant breaking changes and pending deprecations. |
| 6 | + |
| 7 | +This page provides a guide highlighting the most notable changes to help you |
| 8 | +migrate your code from version 2 to version 3. |
| 9 | + |
| 10 | +Zarr-Python 3 represents a major refactor of the Zarr-Python codebase. Some of the |
| 11 | +goals motivating this refactor included: |
| 12 | + |
| 13 | +* adding support for the Zarr V3 specification (alongside the Zarr V2 specification) |
| 14 | +* cleaning up internal and user-facing APIs |
| 15 | +* improving performance (particularly in high-latency storage environments such as |
| 16 | + cloud object stores) |
| 17 | + |
| 18 | +Compatibility target |
| 19 | +-------------------- |
| 20 | + |
| 21 | +The goals described above necessitated some breaking changes to the API (hence the |
| 22 | +major version update), but we have attempted to maintain ~95% backwards compatibility |
| 23 | +in the most widely used parts of the API. This includes the :class:`zarr.Array` and |
| 24 | +:class:`zarr.Group` classes and the "top-level API" (e.g. :func:`zarr.open_array` and |
| 25 | +:func:`zarr.open_group`). |
| 26 | + |
| 27 | +Getting ready for 3.0 |
| 28 | +--------------------- |
| 29 | + |
| 30 | +Ahead of the 3.0 release, we suggest projects that depend on Zarr-Python take the |
| 31 | +following actions: |
| 32 | + |
| 33 | +1. Pin the supported Zarr-Python version to ``zarr>=2,<3``. This is a best practice |
| 34 | + and will protect your users from any incompatibilities that may arise during the |
| 35 | + release of Zarr-Python 3.0. |
| 36 | +2. Limit your imports from the Zarr-Python package. Most of the primary API ``zarr.*`` |
| 37 | + will be compatible in 3.0. However, the following breaking API changes are planned: |
| 38 | + |
| 39 | + - ``numcodecs.*`` will no longer be available in ``zarr.*``. To migrate, import codecs |
| 40 | + directly from ``numcodecs``: |
| 41 | + |
| 42 | + .. code-block:: python |
| 43 | +
|
| 44 | + from numcodecs import Blosc |
| 45 | + # instead of: |
| 46 | + # from zarr import Blosc |
| 47 | +
|
| 48 | + - The ``zarr.v3_api_available`` feature flag is being removed. In Zarr-Python 3.0 |
| 49 | + the v3 API is always available, so you shouldn't need to use this flag. |
| 50 | + - The following internal modules are being removed or significantly changed. If |
| 51 | + your application relies on imports from any of the below modules, you will need |
| 52 | + to either a) modify your application to no longer rely on these imports or b) |
| 53 | + vendor the parts of the specific modules that you need. |
| 54 | + |
| 55 | + * ``zarr.attrs`` |
| 56 | + * ``zarr.codecs`` |
| 57 | + * ``zarr.context`` |
| 58 | + * ``zarr.core`` |
| 59 | + * ``zarr.hierarchy`` |
| 60 | + * ``zarr.indexing`` |
| 61 | + * ``zarr.meta`` |
| 62 | + * ``zarr.meta_v1`` |
| 63 | + * ``zarr.storage`` |
| 64 | + * ``zarr.sync`` |
| 65 | + * ``zarr.types`` |
| 66 | + * ``zarr.util`` |
| 67 | + * ``zarr.n5`` |
| 68 | + |
| 69 | +3. Test that your package works with v3. You can start testing against version 3 now |
| 70 | + (pre-releases are being published to PyPI weekly). |
| 71 | +4. Update the pin to ``zarr>=3``. |
| 72 | + |
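While both major versions are in circulation, a downstream package may want to branch on the installed Zarr-Python major version. The following is a minimal sketch using only the standard library. The helper name ``installed_major`` is hypothetical and not part of Zarr-Python.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_major(dist: str = "zarr") -> Optional[int]:
    """Return the major version of an installed distribution, or None if absent.

    Hypothetical helper for illustration; not provided by Zarr-Python.
    """
    try:
        # "3.0.0" -> "3"; pre-releases such as "3.0.0a1" also start with the major number
        return int(version(dist).split(".")[0])
    except PackageNotFoundError:
        return None


major = installed_major()
if major is None:
    print("zarr is not installed")
elif major >= 3:
    print("zarr 3 detected: import codecs from numcodecs, not from zarr")
else:
    print("zarr 2 detected")
```

A check like this is best kept in one compatibility module so that the rest of the package does not need to know which major version is installed.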
| 73 | +Continue using Zarr-Python 2 |
| 74 | +---------------------------- |
| 75 | + |
| 76 | +Zarr-Python 2.x is still available, though we recommend migrating to Zarr-Python 3 for |
| 77 | +its improvements and new features. Security and bug fixes will be made to the 2.x series |
| 78 | +for at least 6 months following the first Zarr-Python 3 release. |
| 79 | +If you need to use the latest Zarr-Python 2 release, you can install it with: |
| 80 | + |
| 81 | +.. code-block:: console |
| 82 | +
|
| 83 | + $ pip install "zarr==2.*" |
| 84 | +
|
| 85 | +Migration Guide |
| 86 | +--------------- |
| 87 | + |
| 88 | +The following sections provide details on the most important changes in Zarr-Python 3. |
| 89 | + |
| 90 | +The Array class |
| 91 | +~~~~~~~~~~~~~~~ |
| 92 | + |
| 93 | +1. Disallow direct construction - use :func:`zarr.open_array` or :func:`zarr.create_array` |
| 94 | + instead of directly constructing the :class:`zarr.Array` class. |
| 95 | + |
| 96 | +2. Defaulting to ``zarr_format=3`` - newly created arrays will use version 3 of the |
| 97 | + Zarr specification. To continue using version 2, set ``zarr_format=2`` when creating arrays |
| 98 | + or set ``default_zarr_version=2`` in :ref:`config`. |
| 99 | + |
| 100 | +The Group class |
| 101 | +~~~~~~~~~~~~~~~ |
| 102 | + |
| 103 | +1. Disallow direct construction - use :func:`zarr.open_group` or :func:`zarr.create_group` |
| 104 | + instead of directly constructing the :class:`zarr.Group` class. |
| 105 | +2. Deprecated most of the h5py compatibility methods. The following migration is suggested: |
| 106 | + |
| 107 | + - Use :func:`zarr.Group.create_array` in place of :func:`zarr.Group.create_dataset` |
| 108 | + - Use :func:`zarr.Group.require_array` in place of :func:`zarr.Group.require_dataset` |
| 109 | + |
| 110 | +The Store class |
| 111 | +~~~~~~~~~~~~~~~ |
| 112 | + |
| 113 | +Some of the biggest changes in Zarr-Python 3 are found in the ``Store`` class. The most notable changes to the Store API are: |
| 114 | + |
| 115 | +1. Replaced the ``MutableMapping`` base class with a custom abstract base class (:class:`zarr.abc.store.Store`). |
| 116 | +2. Switched to a primarily async interface. |
| 117 | + |
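To illustrate the general shape of this change, the sketch below contrasts mapping-style access with an awaited, coroutine-based interface. ``ToyAsyncStore`` and ``ToyMemoryStore`` are hypothetical classes written for this example. They are not the methods or signatures of :class:`zarr.abc.store.Store`, which defines its own, larger interface.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, Optional


class ToyAsyncStore(ABC):
    """Hypothetical stand-in for an async store base class (not zarr's API)."""

    @abstractmethod
    async def get(self, key: str) -> Optional[bytes]:
        """Return the bytes stored under ``key``, or None if missing."""

    @abstractmethod
    async def set(self, key: str, value: bytes) -> None:
        """Store ``value`` under ``key``."""


class ToyMemoryStore(ToyAsyncStore):
    """In-memory implementation backed by a plain dict."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    async def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    async def set(self, key: str, value: bytes) -> None:
        self._data[key] = value


async def roundtrip() -> Optional[bytes]:
    store = ToyMemoryStore()
    # A MutableMapping-style store would use ``store["a/0"] = b"chunk"``;
    # with an async interface every operation is awaited instead.
    await store.set("a/0", b"chunk")
    return await store.get("a/0")


result = asyncio.run(roundtrip())
print(result)
```

Code that previously treated a store as a dictionary will generally need to go through the higher-level array and group APIs, or await the store's methods directly.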
| 118 | +Beyond the changes to the store interface, a number of deprecated stores were also removed in Zarr-Python 3: |
| 119 | + |
| 120 | +- ``N5Store`` |
| 121 | +- ``DBMStore`` |
| 122 | +- ``LMDBStore`` |
| 123 | +- ``SQLiteStore`` |
| 124 | +- ``MongoDBStore`` |
| 125 | +- ``RedisStore`` |
| 126 | +- ``ABSStore`` |
| 127 | + |
| 128 | +Dependency Changes |
| 129 | +~~~~~~~~~~~~~~~~~~ |
| 130 | + |
| 131 | +- The new ``remote`` dependency group can be used to install a supported version of |
| 132 | + ``fsspec``, required for remote data access. |
| 133 | +- The new ``gpu`` dependency group can be used to install a supported version of |
| 134 | + ``cuda``, required for GPU functionality. |
| 135 | +- The ``jupyter`` optional dependency group has been removed, since v3 contains no |
| 136 | + Jupyter-specific functionality. |
| 137 | + |
| 138 | +Configuration |
| 139 | +~~~~~~~~~~~~~ |
| 140 | + |
| 141 | +There is a new configuration system based on `donfig <https://github.com/pytroll/donfig>`_, |
| 142 | +which can be accessed via :mod:`zarr.core.config`. |
| 143 | +Configuration values can be set using code like the following: |
| 144 | + |
| 145 | +.. code-block:: python |
| 146 | +
|
| 147 | + import zarr |
| 148 | + zarr.config.set({"array.order": "F"}) |
| 149 | +
|
| 150 | +Alternatively, configuration values can be set using environment variables, |
| 151 | +e.g. ``ZARR_ARRAY__ORDER=F``. |
| 152 | + |
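The environment variable in the example above appears to be built from the configuration key by adding a ``ZARR_`` prefix, upper-casing the key, and replacing each ``.`` with a double underscore. The sketch below encodes that pattern. The helper ``env_var_for`` is hypothetical, and the naming rule is an assumption inferred from the single example given here, so check it against the donfig documentation before relying on it.

```python
def env_var_for(key: str, prefix: str = "ZARR") -> str:
    """Map a dotted config key to an environment variable name.

    Hypothetical helper; assumes the "PREFIX_" + upper-case + "__" nesting
    convention suggested by the ``ZARR_ARRAY__ORDER`` example.
    """
    return prefix + "_" + key.upper().replace(".", "__")


print(env_var_for("array.order"))
print(env_var_for("async.concurrency"))
```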
| 153 | +Configuration options include the following: |
| 154 | + |
| 155 | +- Default Zarr format ``default_zarr_version`` |
| 156 | +- Default array order in memory ``array.order`` |
| 157 | +- Default codecs ``array.v3_default_codecs`` and ``array.v2_default_compressor`` |
| 158 | +- Whether empty chunks are written to storage ``array.write_empty_chunks`` |
| 159 | +- Async and threading options, e.g. ``async.concurrency`` and ``threading.max_workers`` |
| 160 | +- Selections of implementations of codecs, codec pipelines and buffers |
| 161 | + |
| 162 | +Miscellaneous |
| 163 | +~~~~~~~~~~~~~ |
| 164 | + |
| 165 | +- The keyword argument ``zarr_version`` has been deprecated in favor of ``zarr_format``. |
| 166 | + |
| 167 | +🚧 Work in Progress 🚧 |
| 168 | +~~~~~~~~~~~~~~~~~~~~~~ |
| 169 | + |
| 170 | +Zarr-Python 3 is still under active development, and is not yet fully complete. |
| 171 | +The following list summarizes areas of the codebase that we expect to build out |
| 172 | +after the 3.0 release: |
| 173 | + |
| 174 | +- The following functions / methods have not been ported to Zarr-Python 3 yet: |
| 175 | + |
| 176 | + * :func:`zarr.copy` |
| 177 | + * :func:`zarr.copy_all` |
| 178 | + * :func:`zarr.copy_store` |
| 179 | + * :func:`zarr.Group.move` |
| 180 | + |
| 181 | +- The following options in the top-level API have not been ported to Zarr-Python 3 yet. |
| 182 | + If these options are important to you, please open a |
| 183 | + `GitHub issue <https://github.com/zarr-developers/zarr-python/issues/new>`_ describing |
| 184 | + your use case. |
| 185 | + |
| 186 | + * ``cache_attrs`` |
| 187 | + * ``cache_metadata`` |
| 188 | + * ``chunk_store`` |
| 189 | + * ``meta_array`` |
| 190 | + * ``object_codec`` |
| 191 | + * ``synchronizer`` |
| 192 | + * ``dimension_separator`` |
| 193 | + |
| 194 | +- The following features have not been ported to Zarr-Python 3 yet: |
| 195 | + |
| 196 | + * Structured arrays / dtypes |
| 197 | + * Fixed-length strings |
| 198 | + * Object arrays |
| 199 | + * Ragged arrays |
| 200 | + * Datetimes and timedeltas |
| 201 | + * Groups and Arrays do not implement ``__enter__`` and ``__exit__`` protocols |
| 202 | + |
| 203 | +The documentation for some of these features, which is currently outdated, has been |
| 204 | +carried over from Zarr-Python 2 in `V3 TO DOs <v3_todos.html>`_. |