The problem
PMP requires some post-processed datasets for its annual cycle metric (#224). These datasets are not available on obs4MIPs directly, but are instead calculated and served from a PMP dataserver.
The raw obs4MIPs datasets have already been cleared for use by the metrics providers.
There are a few different ways we can include these data in the REF.
We could do something similar to ILAMB, where these data are fetched and cached locally rather than ingested. This would be simpler and is pretty much ready as part of #227. The long-term plan would likely be to track these in the database so that they can be used with our provenance generation scheme.
@lee1043 can then use the data registry directly to get the local filename of the downloaded file, but then he needs to track the full key of the dataset.
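A minimal sketch of the fetch-and-cache pattern described above, to make the trade-off concrete. It uses only the standard library and is not the implementation from #227: `REGISTRY`, `fetch`, the `download` callable, and the example key are all hypothetical names chosen for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable

# Hypothetical registry: full dataset key -> expected SHA-256 of the file.
# The key and checksum are placeholders, not real PMP data.
REGISTRY: dict[str, str] = {
    "pmp_annual_cycle/example_dataset.nc": hashlib.sha256(b"example bytes").hexdigest(),
}


def fetch(key: str, cache_dir: Path, download: Callable[[str], bytes]) -> Path:
    """Return the local path for `key`, downloading it only if it is not cached."""
    if key not in REGISTRY:
        raise KeyError(f"unknown dataset key: {key}")

    local_path = cache_dir / key
    if not local_path.exists():
        data = download(key)
        # Verify the download against the registry before caching it.
        if hashlib.sha256(data).hexdigest() != REGISTRY[key]:
            raise ValueError(f"checksum mismatch for {key}")
        local_path.parent.mkdir(parents=True, exist_ok=True)
        local_path.write_bytes(data)
    return local_path
```

The caller has to pass the full key to get a filename back, which is the bookkeeping burden mentioned above: nothing in this pattern lets a metric ask for "the annual cycle data for variable X" without knowing the exact key.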
The complete approach would be to ingest these datasets in a similar manner to other ESGF-based datasets.
The post-processed datasets are obs4MIPs-compatible and can be ingested as they stand using the obs4MIPs ingestion process. They could then be used like other obs4MIPs datasets. To avoid clashing with the full obs4MIPs datasets, we will need another data source (perhaps pmp_annual_cycle) and some functionality to fetch these datasets locally (also already implemented in #227, but we will need some scripts to set up all the obs data).
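To illustrate why a separate data source avoids clashes, here is a rough sketch in which the source type is part of the dataset key. All names (`SourceType`, `DatasetKey`, `register`, `catalog`) are hypothetical and do not reflect the actual REF database schema.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    # Hypothetical source types; the second name follows the suggestion above.
    OBS4MIPS = "obs4mips"
    PMP_ANNUAL_CYCLE = "pmp_annual_cycle"


@dataclass(frozen=True)
class DatasetKey:
    """Identifies a dataset by its source type plus an identifier within that source."""

    source_type: SourceType
    instance_id: str


# In-memory stand-in for the database table of ingested datasets.
catalog: dict[DatasetKey, str] = {}


def register(source_type: SourceType, instance_id: str, path: str) -> DatasetKey:
    """Record a dataset, rejecting duplicates within the same source type."""
    key = DatasetKey(source_type, instance_id)
    if key in catalog:
        raise ValueError(f"dataset already registered: {key}")
    catalog[key] = path
    return key
```

With this layout, a raw dataset and its post-processed counterpart can share an identifier without colliding, because they are registered under different source types.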
ESMValTool might need a similar system. @bouweandela Do you have any additional requirements or thoughts about how best to implement this?
Definition of "done"
Additional context