-
Notifications
You must be signed in to change notification settings - Fork 117
Description
This issue is related to #1838 and was uncovered while applying the workaround suggested there (upgrading NumPy from 1.24.x to ≥1.26 / 2.0+).
After upgrading NumPy, numerical instabilities appear in MixedBedFlowline, particularly for trapezoidal bed calculations. These issues were not observed with NumPy 1.24.x and appear to be exposed by stricter numerical behavior in newer NumPy versions (≥1.26 and 2.0).
Observed Behavior
ValueError: Trapezoid beds need to have origin widths > 0
This error has been mentioned previously (e.g. OGGM Slack, 2022), but it is now consistently reproducible with newer NumPy versions.
The issue is not a new physical bug, but a numerical precision problem. Calculations produce values on the order of -1e-12 due to floating-point cancellation. NumPy 1.24 tolerated these near-zero negatives, while NumPy ≥1.26 does not, causing validation checks to fail.
Environment
Python 3.10 + NumPy 1.26.x
Python 3.11 + NumPy 2.0+
OGGM: 1.6.3.dev30+gbba6e5f24
Machines: local Ubuntu PC + two independent servers
The error is reproducible across all environments and machines.
This error is reproduced in the two env and with different machines.
Error Log :
2026-02-03 10:33:39: oggm.core.flowline: (RGI60-07.00026) init_present_time_glacier
2026-02-03 10:33:39: oggm.core.flowline: ValueError occurred during task init_present_time_glacier on RGI60-07.00026: Trapezoid beds need to have origin widths > 0.
Root Cause Analysis
The problem originates in two locations:
- In oggm.core.flowline.MixedBedFlowline.init:
The origin width for trapezoid beds is computed as:
self._w0_m = section / thick - lambdas * thick / 2
For some geometries where:
section / thick ≈ lambdas * thick / 2
this subtraction leads to catastrophic cancellation, producing values such as -9.9e-13.
if np.any(self._w0_m[self._ptrap] <= 0):
raise ValueError("Trapezoid beds need to have origin widths > 0.")
- In
init_present_time_glacier:
The following logic relies on exact floating-point equality:
lambdas[bed_shape == 0] = def_lambda
However, bed_shape = 4 * thick / widths_m**2 can yield very small positive values instead of exact zeros, leading to inconsistent bed-shape classification.
Current Solution
The fix requires small, non-breaking changes to improve numerical stability and use tolerance-based checks.
A. For MixedBedFlowline.init:
Use a More Stable Formula: Rewrite the _w0_m calculation to avoid the subtraction of similar numbers:
# Mathematically equivalent but numerically stable , introduce the epsilon = 1e-12
self._w0_m = (section - lambdas * thick**2 / 2) / np.maximum(thick, epsilon)
self._w0_m = np.maximum(self._w0_m, 0) # Physical enforcement
Update Validation Check: Change the check to use a small epsilon tolerance:
_if np.any(self._w0_m[self.ptrap] < -epsilon): # Allow for numerical noise
raise ValueError('Trapezoid beds need to have origin widths > 0.')
B. For init_present_time_glacier:
Use Epsilon for Comparisons: Replace exact equality with a tolerance.
epsilon = 1e-12
is_effectively_zero = bed_shape <= epsilon
lambdas[is_effectively_zero] = def_lambda
Guard Against Zero Division: When calculating bed_shape, protect against division by zero.
bed_shape = 4 * thick / (widths_m ** 2 + epsilon)
Questions
Is this the correct and preferred way to address the numerical instability?
Is there an existing OGGM guideline for tolerance-based floating-point comparisons?
Note: Separately, NumPy 2.0 introduces additional issues (e.g. np.NAN vs np.nan).
Thank you very much for your time and help.
Cheers
Ruitang