Commit 89f7de8

hmaarrfk and dcherian authored
Lazy import dask.distributed to reduce import time of xarray (#7172)
* Lazy import testing and tutorial
* Lazy import distributed to avoid a costly import
* Revert changes to __init__
* Explain why we lazy import
* Add release note
* dask.distributed.Lock now supports blocking argument

Co-authored-by: Deepak Cherian <[email protected]>
1 parent 7f1f911 commit 89f7de8
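
The lazy-import idea described in the commit message can be sketched as follows. This is a simplified illustration, not the code from the commit: `get_lock_maker` is a hypothetical stand-in that returns `threading.Lock` directly, whereas the real dispatcher returns xarray's own helper functions. The point it shows is that the `dask.distributed` import is attempted only on the branch that needs it.

```python
import sys
import threading


def get_lock_maker(scheduler=None):
    """Hypothetical, simplified stand-in for the dispatcher in this commit."""
    if scheduler is None or scheduler == "threaded":
        return threading.Lock
    if scheduler == "distributed":
        # Deferred import: the cost is paid only when this branch runs.
        try:
            from dask.distributed import Lock
        except ImportError:
            # Mirrors the commit: no lock maker if dask.distributed is absent.
            return None
        return Lock
    # Same failure mode as the old dictionary lookup.
    raise KeyError(scheduler)


# Asking for a threaded lock never touches dask.distributed.
loaded_before = "distributed" in sys.modules
maker = get_lock_maker("threaded")
loaded_after = "distributed" in sys.modules
```

Raising `KeyError` in the final branch keeps the behavior callers saw when an unknown scheduler name was looked up in a dictionary.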

2 files changed: 19 additions and 20 deletions

doc/whats-new.rst

Lines changed: 1 addition & 2 deletions
@@ -61,8 +61,7 @@ Internal Changes
 ~~~~~~~~~~~~~~~~
 - Doctests fail on any warnings (:pull:`7166`)
   By `Maximilian Roos <https://github.com/max-sixty>`_.
-
-
+- Improve import time by lazy loading ``dask.distributed`` (:pull:`7172`).
 - Explicitly specify ``longdouble=False`` in :py:func:`cftime.date2num` when
   encoding times to preserve existing behavior and prevent future errors when it
   is eventually set to ``True`` by default in cftime (:pull:`7171`). By
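
One way to check a release note like this is to ask a fresh interpreter which modules it imports and how long each takes. The sketch below uses CPython's `-X importtime` option. The helper name `cumulative_import_us` is hypothetical, and `json` is used only so the example runs without xarray or dask installed.

```python
import subprocess
import sys


def cumulative_import_us(module):
    """Return {imported name: cumulative microseconds} for a fresh interpreter."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    timings = {}
    # The report goes to stderr, one "import time: self | cumulative | name" row per module.
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3:
            continue
        try:
            timings[parts[2].strip()] = int(parts[1].strip())
        except ValueError:
            continue  # header row, which has no numeric columns
    return timings


timings = cumulative_import_us("json")
```

With xarray installed, calling `cumulative_import_us("xarray")` and checking whether `"distributed"` appears among the keys would be one way to see if the import is still triggered eagerly. That usage is an assumption about the reader's environment and is not part of the commit.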

xarray/backends/locks.py

Lines changed: 18 additions & 18 deletions
@@ -11,11 +11,6 @@
     # no need to worry about serializing the lock
     SerializableLock = threading.Lock  # type: ignore

-try:
-    from dask.distributed import Lock as DistributedLock
-except ImportError:
-    DistributedLock = None  # type: ignore
-

 # Locks used by multiple backends.
 # Neither HDF5 nor the netCDF-C library are thread-safe.
@@ -41,14 +36,6 @@ def _get_multiprocessing_lock(key):
     return multiprocessing.Lock()


-_LOCK_MAKERS = {
-    None: _get_threaded_lock,
-    "threaded": _get_threaded_lock,
-    "multiprocessing": _get_multiprocessing_lock,
-    "distributed": DistributedLock,
-}
-
-
 def _get_lock_maker(scheduler=None):
     """Returns an appropriate function for creating resource locks.

@@ -61,7 +48,23 @@ def _get_lock_maker(scheduler=None):
     --------
     dask.utils.get_scheduler_lock
     """
-    return _LOCK_MAKERS[scheduler]
+
+    if scheduler is None:
+        return _get_threaded_lock
+    elif scheduler == "threaded":
+        return _get_threaded_lock
+    elif scheduler == "multiprocessing":
+        return _get_multiprocessing_lock
+    elif scheduler == "distributed":
+        # Lazy import distributed since it can add a significant
+        # amount of time to import
+        try:
+            from dask.distributed import Lock as DistributedLock
+        except ImportError:
+            DistributedLock = None  # type: ignore
+        return DistributedLock
+    else:
+        raise KeyError(scheduler)


 def _get_scheduler(get=None, collection=None) -> str | None:
@@ -128,15 +131,12 @@ def acquire(lock, blocking=True):
     if blocking:
         # no arguments needed
         return lock.acquire()
-    elif DistributedLock is not None and isinstance(lock, DistributedLock):
-        # distributed.Lock doesn't support the blocking argument yet:
-        # https://github.com/dask/distributed/pull/2412
-        return lock.acquire(timeout=0)
     else:
         # "blocking" keyword argument not supported for:
         # - threading.Lock on Python 2.
         # - dask.SerializableLock with dask v1.0.0 or earlier.
         # - multiprocessing.Lock calls the argument "block" instead.
+        # - dask.distributed.Lock uses the blocking argument as the first one
         return lock.acquire(blocking)


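
The reason `acquire` passes `blocking` positionally can be illustrated with a small standalone sketch. It re-creates the helper's two branches as shown in the hunk above and exercises them with `threading.Lock` only. The same positional call is what lets lock types that name the parameter differently (such as `block` in `multiprocessing.Lock`) share one code path, though those types are not exercised here.

```python
import threading


def acquire(lock, blocking=True):
    """Standalone copy of the helper's logic, for illustration only."""
    if blocking:
        # No arguments needed: every lock type blocks by default.
        return lock.acquire()
    # Positional, so the parameter's name ("blocking" or "block") does not matter.
    return lock.acquire(blocking)


lock = threading.Lock()
first = acquire(lock)                   # lock is free, so this succeeds
second = acquire(lock, blocking=False)  # already held: returns immediately
lock.release()
third = acquire(lock, blocking=False)   # free again, so this succeeds
lock.release()
```

Here the non-blocking call on a held lock returns `False` instead of waiting, which is the behavior the removed `timeout=0` branch emulated for `dask.distributed.Lock`.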
