I was using the notebook download_merra2.ipynb. In the section "Setting up the DataFrame", the following call raised an error:
xr.open_mfdataset(file_path, concat_dim='date', preprocess=extract_date)
The error was:
xarray ValueError: Could not find any dimension coordinates to use to order the datasets for concatenation
I solved it (for Germany) by changing the code in the following way:
def extract_date(data_set):
    """
    Extracts the date from the filename before merging the datasets.
    """
    try:
        # The attribute name changed during the development of this script
        # from HDF5_Global.Filename to Filename.
        if 'HDF5_GLOBAL.Filename' in data_set.attrs:
            f_name = data_set.attrs['HDF5_GLOBAL.Filename']
        elif 'Filename' in data_set.attrs:
            f_name = data_set.attrs['Filename']
        else:
            raise AttributeError('The attribute name has changed again!')
        # find a match between "." and ".nc4" that does not have "." .
        exp = r'(?<=\.)[^\.]*(?=\.nc4)'
        res = re.search(exp, f_name).group(0)
        # Extract the date.
        y, m, d = res[0:4], res[4:6], res[6:8]
        date_str = ('%s-%s-%s' % (y, m, d))
        data_set = data_set.assign(date=date_str)
        data_set = data_set.expand_dims("date")
        data_set.coords["lat"] = [47.5, 48.0, 48.5, 49.0, 49.5, 50.0, 50.5, 51.0, 51.5, 52.0, 52.5, 53.0, 53.5, 54.0, 54.5, 55.0]
        data_set.coords["lon"] = [5.625, 6.25, 6.875, 7.5, 8.125, 8.75, 9.375, 10.0, 10.625, 11.25, 11.875, 12.5, 13.125, 13.75, 14.375, 15.0]
        data_set.coords["time"] = list(range(24))
        return data_set
    except KeyError:
        # The last dataset is the one all the other sets will be merged into.
        # Therefore, no date can be extracted.
        data_set.coords["lat"] = [47.5, 48.0, 48.5, 49.0, 49.5, 50.0, 50.5, 51.0, 51.5, 52.0, 52.5, 53.0, 53.5, 54.0, 54.5, 55.0]
        data_set.coords["lon"] = [5.625, 6.25, 6.875, 7.5, 8.125, 8.75, 9.375, 10.0, 10.625, 11.25, 11.875, 12.5, 13.125, 13.75, 14.375, 15.0]
        data_set.coords["time"] = list(range(24))
        return data_set
and by commenting out the last two lines of the following cell:
df.drop('DISPH', axis=1, inplace=True)
df.drop(['time', 'date'], axis=1, inplace=True)
df.drop(['U2M', 'U10M', 'U50M', 'V2M', 'V10M', 'V50M'], axis=1, inplace=True)
# df['lat'] = df['lat'].apply(lambda x: lat_array[int(x)])
# df['lon'] = df['lon'].apply(lambda x: lon_array[int(x)])
I could not check whether the same error occurred on another machine.
Apart from that, thank you for writing this awesome notebook!
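In case it helps anyone adapting this workaround to another region, here is a minimal sketch of the two parts that can be checked without the data files: the date regex from extract_date, and the hard-coded lat/lon lists rebuilt from a start value, a step, and a count. The helper names date_from_filename and regular_axis and the example file name are hypothetical and only for illustration. The start values, steps, and counts are simply read off the Germany lists above, so they would need to be adjusted for any other bounding box.

```python
import re

# Same pattern as in extract_date: the dot-free run between a "." and ".nc4".
DATE_PATTERN = r'(?<=\.)[^\.]*(?=\.nc4)'


def date_from_filename(f_name):
    """Return 'YYYY-MM-DD' parsed from a MERRA-2 style file name (hypothetical helper)."""
    res = re.search(DATE_PATTERN, f_name).group(0)
    return '%s-%s-%s' % (res[0:4], res[4:6], res[6:8])


def regular_axis(start, step, count):
    """Return `count` evenly spaced coordinate values beginning at `start` (hypothetical helper)."""
    return [start + step * i for i in range(count)]


# Hypothetical file name, used only for illustration.
example_name = 'MERRA2_400.tavg1_2d_slv_Nx.20160101.nc4'
example_date = date_from_filename(example_name)

# Start, step, and count are assumptions read off the hard-coded lists
# for Germany: 16 latitudes spaced 0.5 apart, 16 longitudes spaced 0.625 apart.
lat = regular_axis(47.5, 0.5, 16)
lon = regular_axis(5.625, 0.625, 16)
```

With these values, lat and lon reproduce the lists assigned to data_set.coords in the function above, so they could be passed in instead of being repeated in both branches.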