Description
machine
Lumi
catalog
climatedt-phase1
version
main
version
main
What happened?
While working on #162, I ran the aqua analysis several times with only a few months of data, to check whether NotEnoughdataError is raised properly when the time span is shorter than the required threshold in months.
During these runs, I noticed that ECmean (which in theory requires at least 12 months of data) never fails in this situation, even when given only 1 month of data: plots are generated without any major issue.
Looking in more detail, it seems that reader_data, which basically retrieves data via the Reader, does not take startdate and enddate into account (lines 79-87 of the ECmean CLI):
# Try to read the data, if dataset is not available return None
try:
    reader = Reader(
        model=model, exp=exp, source=source, catalog=catalog,
        regrid=regrid, **reader_kwargs
    )
    xfield = reader.retrieve()
    if regrid is not None:
        xfield = reader.regrid(xfield)
Meanwhile, startdate and enddate are indeed extracted from the config:
startdate = get_arg(args, 'startdate', dataset.get('startdate'))
enddate = get_arg(args, 'enddate', dataset.get('enddate'))
But they are never passed on to the Reader when the data are retrieved:
data_atm = reader_data(model=model, exp=exp, source=source_atm,
                       catalog=catalog, keep_vars=atm_vars, regrid=regrid,
                       reader_kwargs=reader_kwargs)
As a result, the Reader retrieves ALL data from the source, and only later does time_check look at startdate and enddate and check whether the data are correct.
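To make the suspected behaviour concrete, here is a minimal standalone sketch using only the standard library. It does not use the real Reader or time_check: `check_months`, `month_starts`, the local `NotEnoughDataError` class and the dates are all hypothetical stand-ins. It only shows that a minimum-months check passes when applied to the whole source and fails when applied to the requested window.

```python
from datetime import date


class NotEnoughDataError(Exception):
    """Hypothetical stand-in for the error the analysis is expected to raise."""


def month_starts(start, end):
    """Return the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


def check_months(timestamps, min_months=12):
    """Raise if the time axis covers fewer distinct months than required."""
    n_months = len({(t.year, t.month) for t in timestamps})
    if n_months < min_months:
        raise NotEnoughDataError(
            f"{n_months} months available, {min_months} required"
        )
    return n_months


# Whole source: three years of monthly data (made-up range).
full_axis = month_starts(date(1990, 1, 1), date(1992, 12, 1))

# Requested window: two months only.
startdate, enddate = date(1991, 3, 1), date(1991, 4, 30)
window = [t for t in full_axis if startdate <= t <= enddate]

# The check passes on the full source, regardless of the requested window.
print(check_months(full_axis))

# The same check fails once the data are restricted to the window.
try:
    check_months(window)
except NotEnoughDataError as err:
    print("window rejected:", err)
```

If the real check runs on data that were never restricted to the requested window, that would be consistent with ECmean not failing on a one-month request.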
So:
- if you give startdate and enddate spanning less than 12 months, ECmean doesn't break, but generates the plots for the year that the selected startdate and enddate belong to; I don't know if this is intentional, but it seems like a bug, or at least misleading, since I ask for a couple of months and obtain plots covering months that I perhaps didn't mean to ask for
- I think that retrieving every month from the source is quite inefficient if you need a subset smaller than the whole source timeframe. Am I missing something?
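One possible direction, sketched with a stub instead of the real Reader, is to forward the two dates from the helper to the reader so that only the requested window is retrieved. Everything here is hypothetical: `StubReader`, `reader_data_sketch` and `SOURCE_MONTHS` are illustrative names, and whether the real Reader accepts `startdate` and `enddate` arguments is an assumption to verify against its documentation.

```python
# Made-up source content: two years of monthly records as "YYYY-MM" strings.
SOURCE_MONTHS = [f"{y}-{m:02d}" for y in (1990, 1991) for m in range(1, 13)]


class StubReader:
    """Hypothetical stand-in for the Reader; it only filters a list of months."""

    def __init__(self, startdate=None, enddate=None, **kwargs):
        self.startdate = startdate
        self.enddate = enddate
        self.kwargs = kwargs

    def retrieve(self):
        # "YYYY-MM" strings sort chronologically, so plain comparison works.
        lo = self.startdate[:7] if self.startdate else SOURCE_MONTHS[0]
        hi = self.enddate[:7] if self.enddate else SOURCE_MONTHS[-1]
        return [m for m in SOURCE_MONTHS if lo <= m <= hi]


def reader_data_sketch(reader_cls, startdate=None, enddate=None, **reader_kwargs):
    """Forward the dates to the reader instead of dropping them."""
    reader = reader_cls(startdate=startdate, enddate=enddate, **reader_kwargs)
    return reader.retrieve()


# Without dates the whole source is retrieved.
everything = reader_data_sketch(StubReader)

# With dates only the requested months are retrieved.
subset = reader_data_sketch(
    StubReader, startdate="1991-03-01", enddate="1991-04-30"
)
print(len(everything), subset)
```

With the dates forwarded, a later minimum-months check would see only the requested window, and the retrieval itself would not need to load the full source.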
Are you interested in making a pull request?
Maybe