Using Intake (intake-xarray) on a large number of files, e.g. daily data for one or two decades, fails with a "Too many open files" error.
Loading the same set of data with xarray.open_mfdataset works just fine.
Versions:
python 3.8.2
xarray 0.16.1
intake 0.6.0
intake-xarray 0.4.0
For me, the total number of files was: 9464
The Intake catalog I have looks something like this:
metadata:
  version: 1
plugins:
  source:
    - module: intake_xarray
sources:
  daily_mean:
    driver: netcdf
    args:
      urlpath: "{{ env(HOME) }}/path/to/data*.nc"
      xarray_kwargs:
        combine: by_coords
        parallel: True
Then, using
intake.open_catalog(path_to_catalog)["daily_mean"].to_dask().chunk({"time": -1, "longitude": 10, "latitude": 10})
raises an error of the form
OSError: [Errno 24] Too many open files: 'path/to/catalog.yml'
Loading the same data with
xr.open_mfdataset("~/path/to/data*.nc", combine="by_coords", parallel=True).chunk({"time": -1, "longitude": 10, "latitude": 10})
works just fine.
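For anyone trying to reproduce this: Errno 24 means the process hit its limit on open file descriptors, so it may help to compare the number of matching files against that limit. The following is only a diagnostic sketch using the Unix-only standard-library `resource` module. The helper names `fd_headroom` and `raise_soft_limit` are made up for illustration, and it does not explain why intake-xarray and xr.open_mfdataset behave differently here.

```python
import glob
import os
import resource  # standard library, Unix-only


def fd_headroom(pattern):
    """Return (number of files matching pattern, soft limit, hard limit)."""
    n_files = len(glob.glob(os.path.expanduser(pattern)))
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return n_files, soft, hard


def raise_soft_limit(target):
    """Try to raise the soft open-files limit to `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards (unchanged if raising failed).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if target > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            pass  # the OS refused the new limit; keep the old one
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    # Same glob as in the catalog above; the path is a placeholder.
    n_files, soft, hard = fd_headroom("~/path/to/data*.nc")
    print(f"{n_files} files, soft limit {soft}, hard limit {hard}")
```

If the file count (9464 in my case) is above the soft limit, calling `raise_soft_limit` before opening the catalog might serve as a stopgap, but that is an assumption and not a confirmed fix.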