Support wildcards in timerange facet for all data sources #2908

@bouweandela

Description

Not all data sources return a start and end time for the data they find. In some cases, obtaining this information requires opening and reading the file. Reading files at the data finding stage can severely slow down data searches; therefore, we prefer not to do it.

Currently:

  • esmvalcore.local.LocalDataSource returns search results with a timerange facet for time-dependent variables. For files whose filename does not contain the start and end date (e.g. CMIP3 data), the file is read to retrieve this information.
  • esmvalcore.io.intake_esgf.IntakeESGFDataSource returns search results without a timerange facet.
  • esmvalcore.esgf.ESGFDataSource returns search results with a timerange facet when it can be extracted from the filename.

To better support data sources that do not provide information on start and end time, we may want to

  1. add a check that the timerange is complete in the preprocessing chain after the load step
  2. add support for wildcards (e.g. timerange="*", timerange="*/P2Y", timerange="1980/*") to the clip_timerange preprocessor function
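
As a rough illustration of the two steps above, here is a minimal sketch that works on whole calendar years only. The function names `resolve_timerange` and `check_timerange_complete` are hypothetical and not part of ESMValCore, and the sketch assumes that `*/P2Y` means the last two years of the available data and `P2Y/*` the first two years; only `PnY` periods are handled.

```python
import re


def resolve_timerange(timerange, data_start, data_end):
    """Resolve wildcards in a timerange against the years found in the data.

    Hypothetical helper: years are inclusive whole calendar years.
    """
    if timerange == "*":
        return data_start, data_end
    start, end = timerange.split("/")

    def n_years(period):
        # Only simple ISO 8601 periods of the form "PnY" are recognised.
        match = re.fullmatch(r"P(\d+)Y", period)
        return int(match.group(1)) if match else None

    if start == "*":
        years = n_years(end)
        if years is not None:
            # Assumption: "*/P2Y" selects the last two years of data.
            return data_end - years + 1, data_end
        return data_start, int(end)
    if end == "*":
        years = n_years(start)
        if years is not None:
            # Assumption: "P2Y/*" selects the first two years of data.
            return data_start, data_start + years - 1
        return int(start), data_end
    return int(start), int(end)


def check_timerange_complete(requested, available):
    """Raise if the loaded data do not cover the requested years."""
    if available[0] > requested[0] or available[1] < requested[1]:
        raise ValueError(
            f"Requested {requested[0]}-{requested[1]}, "
            f"but data only cover {available[0]}-{available[1]}"
        )
```

Because the wildcards are resolved from the loaded data rather than from search results, a sketch like this would not rely on the data source providing a timerange facet.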

Alternatively, we could add support for getting the timerange through intake-esgf as well, but that will only work as long as the ESGF search interface provides it. This solution may not be possible for other, planned data sources such as xcube and intake-esm.
