Commit b96d607

clarify lazy behaviour and eager loading chunks=None in open_*-functions (#10627)
* clarify lazy behaviour and eager loading chunks=None in open_*-functions
* add whats-new.rst entry
* Update xarray/backends/api.py

Co-authored-by: Deepak Cherian <[email protected]>
1 parent e8b41fe commit b96d607

3 files changed: 24 additions, 10 deletions

doc/whats-new.rst

Lines changed: 4 additions & 0 deletions
@@ -57,6 +57,10 @@ Bug fixes
 Documentation
 ~~~~~~~~~~~~~
 
+- Clarify lazy behaviour and eager loading for ``chunks=None`` in :py:func:`~xarray.open_dataset`, :py:func:`~xarray.open_dataarray`, :py:func:`~xarray.open_datatree`, :py:func:`~xarray.open_groups` and :py:func:`~xarray.open_zarr` (:issue:`10612`, :pull:`10627`).
+  By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+
+
 
 Internal Changes
 ~~~~~~~~~~~~~~~~

xarray/backends/api.py

Lines changed: 16 additions & 8 deletions
@@ -578,8 +578,10 @@ def open_dataset(
 
 - ``chunks="auto"`` will use dask ``auto`` chunking taking into account the
   engine preferred chunks.
-- ``chunks=None`` skips using dask, which is generally faster for
-  small arrays.
+- ``chunks=None`` skips using dask. This uses xarray's internally private
+  :ref:`lazy indexing classes <internal design.lazy indexing>`,
+  but data is eagerly loaded into memory as numpy arrays when accessed.
+  This can be more efficient for smaller arrays or when large arrays are sliced before computation.
 - ``chunks=-1`` loads the data with dask using a single chunk for all arrays.
 - ``chunks={}`` loads the data with dask using the engine's preferred chunk
   size, generally identical to the format's chunk size. If not available, a
@@ -819,8 +821,10 @@ def open_dataarray(
 
 - ``chunks='auto'`` will use dask ``auto`` chunking taking into account the
   engine preferred chunks.
-- ``chunks=None`` skips using dask, which is generally faster for
-  small arrays.
+- ``chunks=None`` skips using dask. This uses xarray's internally private
+  :ref:`lazy indexing classes <internal design.lazy indexing>`,
+  but data is eagerly loaded into memory as numpy arrays when accessed.
+  This can be more efficient for smaller arrays, though results may vary.
 - ``chunks=-1`` loads the data with dask using a single chunk for all arrays.
 - ``chunks={}`` loads the data with dask using engine preferred chunks if
   exposed by the backend, otherwise with a single chunk for all arrays.
@@ -1044,8 +1048,10 @@ def open_datatree(
 
 - ``chunks="auto"`` will use dask ``auto`` chunking taking into account the
   engine preferred chunks.
-- ``chunks=None`` skips using dask, which is generally faster for
-  small arrays.
+- ``chunks=None`` skips using dask. This uses xarray's internally private
+  :ref:`lazy indexing classes <internal design.lazy indexing>`,
+  but data is eagerly loaded into memory as numpy arrays when accessed.
+  This can be more efficient for smaller arrays, though results may vary.
 - ``chunks=-1`` loads the data with dask using a single chunk for all arrays.
 - ``chunks={}`` loads the data with dask using the engine's preferred chunk
   size, generally identical to the format's chunk size. If not available, a
@@ -1288,8 +1294,10 @@ def open_groups(
 
 - ``chunks="auto"`` will use dask ``auto`` chunking taking into account the
   engine preferred chunks.
-- ``chunks=None`` skips using dask, which is generally faster for
-  small arrays.
+- ``chunks=None`` skips using dask. This uses xarray's internally private
+  :ref:`lazy indexing classes <internal design.lazy indexing>`,
+  but data is eagerly loaded into memory as numpy arrays when accessed.
+  This can be more efficient for smaller arrays, though results may vary.
 - ``chunks=-1`` loads the data with dask using a single chunk for all arrays.
 - ``chunks={}`` loads the data with dask using the engine's preferred chunk
   size, generally identical to the format's chunk size. If not available, a

xarray/backends/zarr.py

Lines changed: 4 additions & 2 deletions
@@ -1370,8 +1370,10 @@ def open_zarr(
 
 - ``chunks='auto'`` will use dask ``auto`` chunking taking into account the
   engine preferred chunks.
-- ``chunks=None`` skips using dask, which is generally faster for
-  small arrays.
+- ``chunks=None`` skips using dask. This uses xarray's internally private
+  :ref:`lazy indexing classes <internal design.lazy indexing>`,
+  but data is eagerly loaded into memory as numpy arrays when accessed.
+  This can be more efficient for smaller arrays, though results may vary.
 - ``chunks=-1`` loads the data with dask using a single chunk for all arrays.
 - ``chunks={}`` loads the data with dask using engine preferred chunks if
   exposed by the backend, otherwise with a single chunk for all arrays.
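
The reworded ``chunks=None`` entry combines two ideas: indexing stays lazy, and values are read into memory only when they are accessed. The sketch below is a toy, standard-library-only model of that idea. It is not xarray's lazy indexing implementation and uses no xarray or numpy calls. `CountingStore`, `LazyArray`, and `load` are hypothetical names invented for illustration.

```python
class CountingStore:
    """Hypothetical stand-in for an on-disk variable; counts values read."""

    def __init__(self, values):
        self._values = list(values)
        self.values_read = 0

    def __len__(self):
        return len(self._values)

    def read(self, indices):
        # Simulate an expensive read of only the requested positions.
        out = [self._values[i] for i in indices]
        self.values_read += len(out)
        return out


class LazyArray:
    """Toy lazily indexed 1-D array (assumption: slices only, no dask)."""

    def __init__(self, store, selection=None):
        self._store = store
        # The pending selection is a range of positions into the store.
        self._selection = range(len(store)) if selection is None else selection

    def __len__(self):
        return len(self._selection)

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("this sketch only supports slices")
        # Slicing a range yields another range, so no data is read here.
        return LazyArray(self._store, self._selection[key])

    def load(self):
        # Eager step: read exactly the selected values into memory.
        return self._store.read(self._selection)


store = CountingStore(range(1000))
lazy = LazyArray(store)

subset = lazy[10:20][::2]   # two slices composed, still nothing read
reads_before_load = store.values_read

data = subset.load()        # only the selected values are read
print(data, reads_before_load, store.values_read)
```

In this toy model, slicing before loading means only five of the 1000 stored values are ever read, which mirrors the docstring's remark that ``chunks=None`` can be efficient "when large arrays are sliced before computation". How much real backends benefit depends on the file format and access pattern.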
