Description
I pointed out this possible bug in another, more complex issue yesterday, but wanted to isolate it in a smaller reproducer.
The basic issue is that, with all defaults, the `parallel='dask'` option does not actually parallelize the read at all! But when I manually initialize a distributed client beforehand, it works as expected.
This example uses public CMIP6 netCDF data, so it should be reproducible anywhere. The machine I used had 16 cores, so if parallelization worked perfectly, opening the 4 files used here should theoretically take no longer than opening a single file.
```python
from obstore.store import S3Store
from virtualizarr.parsers import HDFParser
from virtualizarr import open_virtual_dataset, open_virtual_mfdataset

bucket = "s3://esgf-world/"
store = S3Store.from_url(url=bucket, skip_signature=True, region="us-east-2")
files = [
    "uo_Omon_CESM2_historical_r10i1p1f1_gn_185001-189912.nc",
    "uo_Omon_CESM2_historical_r10i1p1f1_gn_190001-194912.nc",
    "uo_Omon_CESM2_historical_r10i1p1f1_gn_195001-199912.nc",
    "uo_Omon_CESM2_historical_r10i1p1f1_gn_200001-201412.nc",
]
urls = [f"{bucket}CMIP6/CMIP/NCAR/CESM2/historical/r10i1p1f1/Omon/uo/gn/v20190313/{file}" for file in files]
```

When I time a single file read
```python
# baseline timing, single url
vds_single = open_virtual_dataset(urls[0], object_store=store, parser=HDFParser())
```

it takes

```
Wall time: 24.5 s
```
But for all 4 files
```python
# Using the default dask option for open_virtual_mfdataset
vds_dask_default_distributed = open_virtual_mfdataset(
    urls,
    object_store=store,
    parser=HDFParser(),
    parallel='dask',
    concat_dim="time",
    combine="nested",
    data_vars="minimal",
    coords="minimal",
    compat="override",
    join="exact",
)
vds_dask_default_distributed
```

I get

```
Wall time: 1min 16s
```
This tells me that the files are opened serially.
Interestingly, if I modify the first code cell to set up a distributed client (and with that a LocalCluster):
```python
from obstore.store import S3Store
from virtualizarr.parsers import HDFParser
from virtualizarr import open_virtual_dataset, open_virtual_mfdataset
from dask.distributed import Client

# ✨ the only line changed ✨
client = Client()

bucket = "s3://esgf-world/"
store = S3Store.from_url(url=bucket, skip_signature=True, region="us-east-2")
files = [
    "uo_Omon_CESM2_historical_r10i1p1f1_gn_185001-189912.nc",
    "uo_Omon_CESM2_historical_r10i1p1f1_gn_190001-194912.nc",
    "uo_Omon_CESM2_historical_r10i1p1f1_gn_195001-199912.nc",
    "uo_Omon_CESM2_historical_r10i1p1f1_gn_200001-201412.nc",
]
urls = [f"{bucket}CMIP6/CMIP/NCAR/CESM2/historical/r10i1p1f1/Omon/uo/gn/v20190313/{file}" for file in files]
```

I get 22 s and 26 s respectively, which I interpret as nicely parallel opening plus some minor overhead, e.g. for the concat.
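My understanding of why the single extra line changes things (an assumption, not checked against the virtualizarr code) is that creating a `Client` registers it as dask's global default scheduler, so every subsequent compute call routes to the LocalCluster. A minimal sketch:

```python
import dask
from dask.base import get_scheduler
from dask.distributed import Client

d = dask.delayed(lambda: 41 + 1)()

client = Client(processes=False)  # in-process LocalCluster, cheap to start
# With a Client active, dask resolves the default scheduler to the client,
# so compute() below runs on the cluster without any explicit scheduler arg.
print(get_scheduler())
result = d.compute()
client.close()
print(result)
```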
My hunch is that somewhere in the `DaskDelayedExecutor`, something is picking up a default from the environment, and by accident that default happens to be a serial executor. Just an idea.
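If that hunch is right, it should be visible by probing which scheduler dask resolves for a delayed graph when no client is active. This is my reading of dask's scheduler resolution (worth double-checking against the installed version):

```python
import dask
import dask.threaded
from dask.base import get_scheduler

# With no distributed Client and no dask.config override, get_scheduler
# falls back to the collection's own __dask_scheduler__; for Delayed
# objects that is the threaded scheduler rather than a serial one.
d = dask.delayed(lambda: 1)()
sched = get_scheduler(collections=[d])
print(sched)
```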
This is not my main priority to solve right now (since lithops works great out of the box), but I wanted to record it here, and I can probably look into it further in the coming days if I find some time.
Versions

```
virtualizarr==1.3.3.dev73+g7143f40 (from git+https://github.com/zarr-developers/VirtualiZarr.git@7143f40cbc5dc32cdd8958bec130933f912cf092)
obstore==0.7.0
dask==2025.5.1
distributed==2025.5.1
```