corrected longitude selection near 0#1843
Conversation
|
Do we also need to correct the longitudes for the gcm selection? I think yes, eventually ( |
There was a problem hiding this comment.
Pull request overview
Fixes longitude wrapping/selection edge cases for glaciers near the 0° meridian (notably for ERA5 flattened climate files), aiming to ensure glaciers in (-res/2, 0) correctly select the 0° gridpoint rather than the neighbor near 360°.
Changes:
- Updates the glacier gridpoint selection test logic to avoid unintended
+360wrapping near 0° for ERA5 and adds a near-0° glacier case. - Adds longitude-grid consistency assertions for W5E5 processing.
- Refactors ECMWF (ERA5/ERA5L/CERA/ERA5dr) longitude wrapping threshold logic to be dataset-dependent (but currently introduces a critical threshold bug).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
oggm/tests/test_shop.py |
Adjusts gridpoint selection tests, adds explicit near-0° longitude case, and tightens resolution assumptions. |
oggm/shop/w5e5.py |
Adds runtime assertions about longitude grid origin/resolution for W5E5. |
oggm/shop/ecmwf.py |
Attempts to make longitude wrapping dependent on dataset longitude minimum/resolution (currently incorrect due to threshold calculation). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
oggm/shop/ecmwf.py
Outdated
| diffs = np.sort(ds.longitude)[1:] - np.sort(ds.longitude)[:-1] | ||
| # resolution: diffs[diffs > 0].min() (necessary approach due to flattened files) | ||
| # thresh should not be above 0 (could be because of some regional data...) | ||
| thresh = max(ds.longitude.min() - diffs[diffs > 0].min() / 2, 0) |
There was a problem hiding this comment.
thresh = max(ds.longitude.min() - res/2, 0) is inverted relative to the comment and will also keep the original ERA5 bug: for datasets with longitude starting at 0 (ERA5), this forces thresh to 0 so glaciers in (-res/2, 0) still get shifted to +360 and can select the 359.75 gridpoint instead of 0. It also makes thresh positive for regional datasets with ds.longitude.min() > 0, which would incorrectly shift many positive longitudes. Consider computing res once, then using a cap of 0 (i.e. min(..., 0)) and ensure ds.longitude.min() is converted to a Python float before applying min/max.
| thresh = max(ds.longitude.min() - diffs[diffs > 0].min() / 2, 0) | |
| res = float(diffs[diffs > 0].min()) | |
| min_lon = float(ds.longitude.min()) | |
| thresh = min(min_lon - res / 2.0, 0.0) |
oggm/shop/ecmwf.py
Outdated
| diffs = np.sort(ds.longitude)[1:] - np.sort(ds.longitude)[:-1] | ||
| thresh = max(ds.longitude.min() - diffs[diffs > 0].min() / 2, 0) | ||
| lon = gdir.cenlon + 360 if gdir.cenlon < thresh else gdir.cenlon |
There was a problem hiding this comment.
Same issue as above: the thresh = max(...) logic will transform longitudes that should not be transformed (including the (-res/2, 0) ERA5 case), and can become positive when ds.longitude.min() > 0 (regional files), shifting valid positive longitudes to +360. Use a 0-cap with min(..., 0) (not max) and cast ds.longitude.min() to float.
oggm/shop/ecmwf.py
Outdated
| diffs = np.sort(ds.longitude)[1:] - np.sort(ds.longitude)[:-1] | ||
| thresh = max(ds.longitude.min() - diffs[diffs > 0].min() / 2, 0) | ||
| lon = gdir.cenlon + 360 if gdir.cenlon < thresh else gdir.cenlon |
There was a problem hiding this comment.
Same issue as above: thresh = max(ds.longitude.min() - res/2, 0) is the wrong direction for capping at 0 and does not prevent the ERA5 near-0 longitude mis-selection. Use min(..., 0) (and cast ds.longitude.min() to float) so that small negative longitudes in (-res/2, 0) are not shifted to +360 for datasets with a 0° gridpoint.
oggm/shop/ecmwf.py
Outdated
| # todo: lapse rate code could be removed, as it is not included in monthly_climate_file anyways ... | ||
| diffs = np.sort(ds.longitude)[1:] - np.sort(ds.longitude)[:-1] | ||
| thresh = max(ds.longitude.min() - diffs[diffs > 0].min() / 2, 0) | ||
| lon = gdir.cenlon + 360 if gdir.cenlon < thresh else gdir.cenlon | ||
| _check_ds_validity(ds) |
There was a problem hiding this comment.
In addition to the near-0 longitude issue, using Python's max() with ds.longitude.min() (an xarray scalar) can raise type/boolean ambiguity errors. Cast the minimum longitude to a Python float (e.g. float(ds.longitude.min())) before applying the threshold logic, and cap at 0 with min(..., 0) rather than max(..., 0).
oggm/shop/ecmwf.py
Outdated
| _check_ds_validity(ds) | ||
| diffs = np.sort(ds.longitude)[1:] - np.sort(ds.longitude)[:-1] | ||
| thresh = max(ds.longitude.min() - diffs[diffs > 0].min() / 2, 0) | ||
| lon = gdir.cenlon + 360 if gdir.cenlon < thresh else gdir.cenlon | ||
| ds = ds.sel(time=slice(f'{y0}-01-01', f'{y1}-12-01')) |
There was a problem hiding this comment.
Same threshold bug here: max(..., 0) does not match the intent of “thresh should not be above 0” and can both (1) keep the ERA5 near-0 longitude mis-selection and (2) incorrectly shift positive longitudes for datasets whose minimum longitude is > 0. Prefer min(float(ds.longitude.min()) - res/2, 0).
oggm/shop/w5e5.py
Outdated
| assert ds.longitude.min() >= 0 | ||
| # check that W5E5 gridpoints do not cross the "0" line (gridpoint center should be at res/2) | ||
| diffs = np.sort(ds.longitude)[1:] - np.sort(ds.longitude)[:-1] | ||
| res_lon = diffs[diffs > 0].min() | ||
| np.testing.assert_allclose(ds.longitude.min(), res_lon / 2) |
There was a problem hiding this comment.
This adds a full sort of all longitudes and an np.testing.assert_allclose inside the per-glacier processing path. Sorting can be expensive for large flattened datasets and np.testing.* raises AssertionError in production rather than a domain-specific error. Consider moving this grid-origin/resolution check into tests (or a one-time validation/cached value), and avoid per-call sorting in process_gswp3_w5e5_data.
| assert ds.longitude.min() >= 0 | |
| # check that W5E5 gridpoints do not cross the "0" line (gridpoint center should be at res/2) | |
| diffs = np.sort(ds.longitude)[1:] - np.sort(ds.longitude)[:-1] | |
| res_lon = diffs[diffs > 0].min() | |
| np.testing.assert_allclose(ds.longitude.min(), res_lon / 2) | |
| lon_min = float(ds.longitude.min()) | |
| if lon_min < 0: | |
| raise InvalidParamsError( | |
| 'W5E5 longitude coordinates must be non-negative (0–360). ' | |
| f'Found minimum longitude {lon_min}.' | |
| ) |
|
Sorry for the noise @lilianschuster :-) wanted to try something, not sure it's useful at all |
Removes bug in climate selection for glaciers near 0 longitude:
test_glacier_gridpoint_selection. Also tested, that these changes do not influence W5E5 processing.--> I am not sure if the Histalp, CRU selection procedures have the same issues (I am not sure if I understand the salem code correctly)