@rbeucher

This pull request adds temporal frequency validation and automatic resampling to the ACCESS-MOPPy workflow so that input datasets conform to CMIP6 requirements. It adds configuration options for frequency validation and resampling, integrates them into the main classes, and raises clear errors when validation fails. A suite of unit tests verifies the new functionality.

Temporal frequency validation and resampling (core logic):

  • Added validate_frequency, enable_resampling, and resampling_method options to CMIP6_CMORiser, CMIP6_Ocean_CMORiser, and driver classes, allowing users to control frequency validation and resampling behavior. These options propagate through the workflow for consistent handling.
  • Integrated pre-concatenation frequency validation (including CMIP6 compatibility checks) and post-load automatic resampling if needed, with clear user messaging and error handling.

Integration and configuration propagation:

  • Ensured new validation and resampling options are correctly passed from driver and ocean CMORiser classes to the base CMORiser, maintaining consistent configuration throughout.
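
The propagation described above can be pictured with a minimal sketch. The class names below are hypothetical stand-ins, not the actual ACCESS-MOPPy classes; only the three option names come from this PR. The default `validate_frequency=True` matches the usage notes in the commit message, while the defaults shown for `enable_resampling` and `resampling_method` are assumptions made for illustration.

```python
class BaseCMORiserSketch:
    """Hypothetical stand-in for the base CMORiser (not the real class)."""

    def __init__(self, validate_frequency=True, enable_resampling=False,
                 resampling_method="auto"):
        # Defaults for the last two options are assumptions for this sketch.
        self.validate_frequency = validate_frequency
        self.enable_resampling = enable_resampling
        self.resampling_method = resampling_method


class OceanCMORiserSketch(BaseCMORiserSketch):
    """Hypothetical ocean subclass that forwards the options unchanged."""

    def __init__(self, validate_frequency=True, enable_resampling=False,
                 resampling_method="auto", **ocean_options):
        super().__init__(
            validate_frequency=validate_frequency,
            enable_resampling=enable_resampling,
            resampling_method=resampling_method,
        )
        # Anything ocean-specific stays in the subclass.
        self.ocean_options = ocean_options


class DriverSketch:
    """Hypothetical user-facing driver that builds the right CMORiser."""

    def __init__(self, ocean=False, validate_frequency=True,
                 enable_resampling=False, resampling_method="auto"):
        cmoriser_cls = OceanCMORiserSketch if ocean else BaseCMORiserSketch
        # The same three options are handed down, so every layer agrees.
        self.cmoriser = cmoriser_cls(
            validate_frequency=validate_frequency,
            enable_resampling=enable_resampling,
            resampling_method=resampling_method,
        )
```

Naming the options explicitly at each layer, rather than relying on a catch-all `**kwargs`, makes a misspelled option fail at the driver instead of being silently ignored further down.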

Testing and robustness:

  • Added a comprehensive unit test suite (tests/unit/test_temporal_resampling.py) covering detection of aggregation methods, frequency string conversion, actual resampling operations, integrated validation/resampling workflow, and error handling for edge cases.

These changes significantly improve the reliability and flexibility of temporal frequency handling in the ACCESS-MOPPy pipeline, making it easier to ensure CMIP6 compliance and diagnose issues with input data.

Features:
- Add lazy frequency detection from temporal coordinates using xarray/Dask
- Validate frequency consistency across concatenated input files
- Configurable validation with tolerance settings for slight differences
- Custom FrequencyMismatchError for clear error reporting
- Integrate validation into all CMORiser classes with user control
- Comprehensive unit tests covering various frequency scenarios

This prevents problems when concatenating files with different temporal
frequencies and detects data inconsistencies early, without loading full
datasets into memory. For performance, the implementation uses lazy
evaluation and checks only the first few time points of each file.
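
The idea of checking only the first few time points can be sketched as follows. This is not the code in utilities.py: it uses plain `datetime` iterables in place of xarray/Dask time coordinates so that it runs without dependencies, the function names are hypothetical, and expressing the tolerance in seconds is an assumption. Only the `FrequencyMismatchError` name is taken from the commit message.

```python
from datetime import datetime, timedelta
from itertools import islice
from statistics import median


class FrequencyMismatchError(ValueError):
    """Stand-in for the custom error named in this commit."""


def detect_step_seconds(times, n_points=3):
    """Estimate the time step from only the first n_points values."""
    head = list(islice(times, n_points))  # never reads past n_points
    if len(head) < 2:
        return None  # a single time point carries no frequency
    steps = [(b - a).total_seconds() for a, b in zip(head, head[1:])]
    return median(steps)


def check_consistent_frequency(files, tolerance_seconds=0.0):
    """files maps a file name to an iterable of datetimes (hypothetical)."""
    detected = {name: detect_step_seconds(t) for name, t in files.items()}
    known = {n: s for n, s in detected.items() if s is not None}
    if not known:
        return None
    reference = next(iter(known.values()))
    bad = {n: s for n, s in known.items()
           if abs(s - reference) > tolerance_seconds}
    if bad:
        raise FrequencyMismatchError(
            f"expected a step of {reference} s, but found {bad}"
        )
    return reference
```

Because only the head of each time axis is read, the cost of the check grows with the number of files rather than with their length.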

Usage:
- Validation enabled by default: ACCESS_ESM_CMORiser(...)
- Disable validation: ACCESS_ESM_CMORiser(..., validate_frequency=False)
- Tolerance configurable via validate_consistent_frequency() function

Files modified:
- utilities.py: Core frequency detection functions
- base.py: Integration with load_dataset method
- driver.py: User-facing validate_frequency parameter
- ocean.py: Updated constructor to pass through parameter
- tests/unit/test_frequency_detection.py: Comprehensive test suite
… new utility functions for frequency parsing and compatibility checks
- Added support for detecting temporal frequency from CF-compliant time bounds in `detect_time_frequency_lazy`.
- Implemented a new utility function `_detect_frequency_from_bounds` for improved frequency detection.
- Introduced `validate_and_resample_if_needed` to validate dataset frequency and resample if necessary for CMIP6 compatibility.
- Enhanced `resample_dataset_temporal` to handle various resampling methods based on variable metadata.
- Updated tests to cover new frequency detection and resampling features, including edge cases for single time points and bounds detection.
- Created a new test suite for temporal resampling functionality, ensuring robust validation and error handling.
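
One way to choose a resampling method from variable metadata is to read the CF `cell_methods` attribute. The sketch below is an illustration under that assumption, not the logic of `resample_dataset_temporal`: the helper name, the mapping table, and the fallback to `"mean"` are all hypothetical.

```python
import re

# CF cell_methods keywords mapped to aggregation names (assumed mapping).
_CF_TO_AGGREGATION = {
    "mean": "mean",
    "sum": "sum",
    "maximum": "max",
    "minimum": "min",
    "point": "first",  # instantaneous values: keep one sample per period
}

# Matches "time: <method>", also when time shares a method with other
# dimensions, as in "time: area: mean".
_TIME_METHOD = re.compile(r"\btime\s*:\s*(?:\w+\s*:\s*)*(\w+)")


def aggregation_from_cell_methods(cell_methods, default="mean"):
    """Return the aggregation implied by the time entry of cell_methods."""
    match = _TIME_METHOD.search(cell_methods or "")
    if match is None:
        return default  # no time entry: fall back to the assumed default
    return _CF_TO_AGGREGATION.get(match.group(1).lower(), default)
```

For example, a variable carrying `cell_methods = "area: mean time: maximum"` would be resampled with a maximum, while a variable with no `cell_methods` attribute falls back to the default.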
… for calendar month variations. Implement helper functions to validate monthly files and adjust compatibility checks in CMIP6 frequency validation.
…hout resampling. Special handling for calendar month variations added to improve accuracy in CMIP6 frequency checks.
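
Calendar months need special handling because the step between monthly time points is not constant. A minimal sketch of such a check, assuming a 28 to 31 day window (the window and both helper names are hypothetical, not taken from the commit):

```python
from datetime import date


def month_start_steps(year):
    """Day counts between consecutive month starts in the given year."""
    starts = [date(year, m, 1) for m in range(1, 13)] + [date(year + 1, 1, 1)]
    return [(b - a).days for a, b in zip(starts, starts[1:])]


def steps_look_monthly(steps_days, min_days=28, max_days=31):
    """True if every step fits inside the assumed calendar-month window."""
    steps = list(steps_days)
    return bool(steps) and all(min_days <= s <= max_days for s in steps)
```

A check that compares every step against one fixed length would flag February as a mismatch; accepting a range avoids that while still rejecting daily or yearly data.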
…tenation; add fallback to individual file analysis. Remove outdated test scripts for exact error simulation and monthly calendar variations.
This reverts commit 936f1c6.
@rbeucher rbeucher merged commit bd19269 into main Nov 12, 2025
2 checks passed