Description
No longer a bug report, instead a suggested improvement
What happened?
I am trying to download some NetCDF data from the CDS using Earthkit. When downloading the CMIP6 data I need, I get a folder with 3 files:
- provenance.json
- provenance.png
- tasmax_day_CMCC-ESM2_ssp585_r1i1p1f1_gn_20800101-20801231.nc
Earthkit then tries to read all these files, resulting in a warning message "Unknown file type, no reader available." followed by a file path to a .png file. It does correctly load the NetCDF file from the ZIP into memory and I can use it, but it seems like undesired behavior for it to try to load a PNG file (and potentially others) when I am specifying NetCDF.
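To illustrate the suggested behaviour, here is a minimal sketch of how non-NetCDF files could be skipped by looking at their leading bytes. This is not how earthkit-data works internally; the helper names `looks_like_netcdf` and `select_netcdf_files` are hypothetical and exist only for this example. It relies on the file signatures alone: classic NetCDF files start with `CDF`, and NetCDF-4 files are HDF5 containers starting with `\x89HDF\r\n\x1a\n`.

```python
from pathlib import Path

# File signatures (format facts, not earthkit internals).
NETCDF_CLASSIC_MAGIC = b"CDF"        # NetCDF classic, 64-bit offset and CDF-5
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"    # NetCDF-4 files are HDF5 containers
# For comparison, the PNG in the warning starts with b"\x89PNG\r\n\x1a\n".


def looks_like_netcdf(path):
    """Hypothetical helper: True if the file starts with a NetCDF or HDF5 signature."""
    with open(path, "rb") as f:
        head = f.read(8)
    return head.startswith(NETCDF_CLASSIC_MAGIC) or head.startswith(HDF5_MAGIC)


def select_netcdf_files(directory):
    """Hypothetical helper: sorted paths in `directory` that look like NetCDF."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and looks_like_netcdf(p)
    )
```

Applied to the extracted download folder, a filter like this would keep the `.nc` file and ignore `provenance.json` and `provenance.png` without emitting a warning. Note that the HDF5 check also matches plain HDF5 files that are not NetCDF-4.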
I have tested this for multiple cases. Below are the example cases I have tried:
These examples threw the same warning
import earthkit.data
# Define request
dataset = "projections-cmip6"
request = {
'format': 'netcdf',
'temporal_resolution': 'daily',
'variable': "daily_maximum_near_surface_air_temperature",
'experiment': "ssp5_8_5",
'model': 'cmcc_esm2',
'year': "2080",
'month': "01",
'day': "01",
}
# Download data
ds = earthkit.data.from_source("cds", dataset, request)
Warning: this is 700 MB
import earthkit.data
# Define request
dataset = "multi-origin-c3s-atlas"
request = {
"origin": "cmip5",
"experiment": "historical",
"domain": "global",
"period": "1850-2005",
"variable": "monthly_heavy_precipitation_days",
"bias_adjustment": "no_bias_adjustment"
}
# Download data
ds = earthkit.data.from_source("cds", dataset, request)
This did work without the warning
import earthkit.data
# Define request
dataset = "projections-cmip5-daily-pressure-levels"
request = {
"experiment": "historical",
"variable": ["geopotential_height"],
"model": "access1_0",
"ensemble_member": "r1i1p1",
"period": ["19900101-19941231"]
}
# Download data
ds = earthkit.data.from_source("cds", dataset, request)
Note: The example in this link gives the same warning: https://github.com/ecmwf/earthkit-plots/blob/develop/docs/examples/gallery/time-series/cmip6.ipynb
Note that the following block of code requests the same data using cdsapi, but reads only the NetCDF file from the ZIP.
# This code works
from pathlib import Path
import cdsapi
import os
import xarray as xr
import zipfile
c = cdsapi.Client() # Key set up in .cdsapirc file
# Define request
dataset = "projections-cmip6"
request = {
'format': 'zip',
'temporal_resolution': 'daily',
'variable': "daily_maximum_near_surface_air_temperature",
'experiment': "ssp5_8_5",
'model': 'cmcc_esm2',
'year': "2080",
'month': "01",
'day': "01",
}
# Define filenames
dest = Path('./data')
os.makedirs(dest, exist_ok=True)
zip_path = dest / f"{dataset}_example.zip"
extract_path = zip_path.with_suffix("")
# Download data
c.retrieve(dataset, request, zip_path)
# Manually extract NetCDF file from ZIP
# Based on https://github.com/ecmwf-projects/c3s-atlas/blob/main/c3s_atlas/utils.py
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    # Get filename inside ZIP
    names = zip_ref.namelist()  # In this example we know we're only downloading one
    name_nc = [filename for filename in names if filename.endswith(".nc")][0]
    # Extract
    zip_ref.extract(name_nc, extract_path)
# Open dataset
ds = xr.open_dataset(extract_path / name_nc)
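The manual extraction above assumes a single NetCDF file in the archive. As a rough sketch, the same idea can be generalized to any number of `.nc` members using only the standard library. The function name `extract_netcdf_members` is hypothetical and not part of cdsapi or earthkit.

```python
import zipfile
from pathlib import Path


def extract_netcdf_members(zip_path, extract_dir):
    """Hypothetical helper: extract only the .nc members of a ZIP archive.

    Returns the paths of the extracted files, in archive order.
    """
    extract_dir = Path(extract_dir)
    extracted = []
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        for name in zip_ref.namelist():
            # Skip provenance files and anything else that is not NetCDF
            if name.lower().endswith(".nc"):
                extracted.append(Path(zip_ref.extract(name, extract_dir)))
    return extracted
```

Each returned path can then be passed to `xr.open_dataset` as in the block above.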
What are the steps to reproduce the bug?
import earthkit.data
# Define request
dataset = "projections-cmip6"
request = {
'format': 'netcdf',
'temporal_resolution': 'daily',
'variable': "daily_maximum_near_surface_air_temperature",
'experiment': "ssp5_8_5",
'model': 'cmcc_esm2',
'year': "2080",
'month': "01",
'day': "01",
}
# Download data
ds = earthkit.data.from_source("cds", dataset, request)
Version
v0.11.2
Platform (OS and architecture)
Microsoft Windows 11 Enterprise 64-bit operating system, x64-based processor
Relevant log output
2025-08-12 13:36:44,084 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-08-12 13:36:44,356 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-08-12 13:36:45,168 INFO Request ID is 148d75e9-1a82-405b-89b5-caa1b1cd1208
2025-08-12 13:36:45,324 INFO status has been updated to accepted
2025-08-12 13:37:07,031 INFO status has been updated to successful
Unknown file type, no reader available. path=C:\Users\nr2\AppData\Local\Temp\tmpmeer6lxa\cds-d1477e4d87135c5a2a0dd362385cfa389fcf1e9d27c7d2c11c7d82684b59b703.d\provenance.png magic=b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\n\xd3\x00\x00\x03"\x08\x02\x00\x00\x00\x99\xec9+\x00\x00\x00\x06bKGD\x00\xff\x00\xff\x00\xff\xa0\xbd\xa7\x93\x00\x00 \x00IDATx\x9c\xec\xddw' content_type=None
Accompanying data
No response
Organisation
No response