MESMER-MTP: precipitation#837

Open
sarasita wants to merge 39 commits into MESMER-group:main from sarasita:mesmermtp_precipitation

Conversation

@sarasita sarasita commented Jan 9, 2026

  • Closes #xxx
  • Tests added
  • Fully documented, including CHANGELOG.rst

sarasita and others added 10 commits January 13, 2025 10:06
…using cdo remapbil,r20x20 /net/atmos/data/cmip6-ng/pr/mon/g025/pr_mon_IPSL-CM6A-LR_historical_r1i1p1f1_g025.nc /home/scsarah/mesmer/tests/test-data/calibrate-coarse-grid/cmip6-ng/pr/mon/g025/pr_mon_IPSL-CM6A-LR_historical_r1i1p1f1_g025.nc
@mathause mathause changed the title Mesmermtp precipitation MESMER-MTP: precipitation Jan 12, 2026
codecov bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 24.42748% with 198 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.78%. Comparing base (0162c58) to head (8468527).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| mesmer/stats/_regularized_glm.py | 19.35% | 50 Missing ⚠️ |
| mesmer/stats/_xarray_pipelines.py | 21.31% | 48 Missing ⚠️ |
| mesmer/stats/_principal_component_decomposition.py | 16.98% | 44 Missing ⚠️ |
| mesmer/stats/_xarray_kde.py | 27.90% | 31 Missing ⚠️ |
| mesmer/stats/_xarray_transformers.py | 34.21% | 25 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #837      +/-   ##
==========================================
- Coverage   99.49%   90.78%   -8.72%     
==========================================
  Files          30       35       +5     
  Lines        1995     2257     +262     
==========================================
+ Hits         1985     2049      +64     
- Misses         10      208     +198     
Flag Coverage Δ
unittests 90.78% <24.42%> (-8.72%) ⬇️

Member

@mathause mathause left a comment

You were not in the office, so I am asking here:

  • What is the difference between the example_mesmer_mtp and example_mesmer_mtp_precipitation notebooks? I see that the first one does not load pr, but doesn't the "p" in "mtp" stand for precipitation?

  • There seem to be two ways to calculate a PCA:

    1. the fit_principal_components function (which applies a StandardScaler and then the PCA); used in the example_mesmer_mtp notebook.
    2. the XarrayPipeline, where you explicitly pass a StandardScaler and PCA as steps; used in the example_mesmer_mtp_precipitation notebook

    Are these two ways equal? Do we need to keep both? Or which one can we remove?
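
Whether the two routes agree can be checked directly. The sketch below is a minimal illustration that assumes plain scikit-learn objects; it does not call `fit_principal_components` or `XarrayPipeline`, whose exact arguments are not shown in this thread. The random data, `n_components = 3`, and `svd_solver="full"` are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic (sample, feature) data with different variances per feature
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6)) * np.arange(1, 7)

n_components = 3  # assumed value, for illustration only

# Way 1: standardize, then decompose, as two explicit steps
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)
pca = PCA(n_components=n_components, svd_solver="full").fit(X_scaled)
scores_two_step = pca.transform(X_scaled)

# Way 2: the same two steps chained in a Pipeline
pipe = Pipeline(
    [
        ("std", StandardScaler()),
        ("pca", PCA(n_components=n_components, svd_solver="full")),
    ]
)
scores_pipeline = pipe.fit_transform(X)

# Both routes should give the same principal-component scores
same = np.allclose(scores_two_step, scores_pipeline)
```

If the MESMER wrappers pass the same settings through to scikit-learn, the same check should apply to them, but that is an assumption to verify against the PR code.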

Comment on lines +62 to +64
"std_scale": (X.dims[1], std.scale_),
"std_mean": (X.dims[1], std.mean_),
"std_var": (X.dims[1], std.var_),
Member

I find it not very obvious that these belong to the StandardScaler.

Comment on lines +90 to +95
T = _transform_principal_component_decomposition_xr(X=X, params=params)

return T


def _transform_principal_component_decomposition_xr(
Member

(there is not much point in having a wrapper function here)

from mesmer._core.utils import _ignore_warnings


# haven't properly commented this yet - WIP
Member

Suggested change
# haven't properly commented this yet - WIP

Comment on lines +95 to +97
n_years = tas.sizes["year"]
n_closest = closest_locations.sizes["closest_gridcells"]
n_cov = 1 + 2 * n_closest
Member

I have not looked at this in detail, but it is very use-case specific. Check whether we could move it out of the function somehow.

return param


# class XarrayPipeline:
Member

Can we delete this commented-out block?

Author

yes

@sarasita
Author

Related to your initial questions:

  • example_mesmer_mtp: contains the temperature emulation technique I used in the appendix of the MESMER-M-TP paper; at the time, a working MESMER-M version was not available, but we needed to assess the errors in the precipitation emulations when the framework was forced with emulated (rather than actual) temperatures.
  • example_mesmer_mtp_precipitation: contains the actual precipitation emulation framework.
  • two ways to compute a PCA: that is correct. I find a class-based approach more intuitive; however, when I first implemented this in March 25, Victoria suggested that the structure follow the Yeo-Johnson transformer module, which at the time consisted of individual functions. I kept both until we decide which we prefer.

@sarasita
Author

Rewrote parts of the code related to the precipitation emulations:

  • adjusted the xarray transformer and added tests
  • rewrote the Gamma GLM (changed from the statsmodels implementation to an sklearn implementation, replaced ElasticNet with Ridge, added k-fold cross-validation; the code is now much faster)

The variability routine has not been changed yet.
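
As a rough illustration of the kind of setup described above, and not the code in this PR, the following sketch fits scikit-learn's `GammaRegressor` (whose `alpha` multiplies an L2, i.e. ridge-type, penalty) and selects `alpha` by k-fold cross-validation. The synthetic data, the grid values, and the use of `GridSearchCV` with the `neg_mean_gamma_deviance` scorer are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import GammaRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic predictors and a strictly positive, Gamma-distributed target
rng = np.random.default_rng(0)
n_samples = 300
X = rng.normal(size=(n_samples, 3))
mu = np.exp(0.5 + 0.3 * X[:, 0] - 0.2 * X[:, 1])  # log link
shape = 2.0
y = rng.gamma(shape, mu / shape)  # mean of each draw equals mu

# Candidate L2 penalty strengths (hypothetical grid)
alpha_grid = [1e-3, 1e-2, 1e-1, 1.0]

# 5-fold cross-validation over the grid, scored by Gamma deviance
search = GridSearchCV(
    GammaRegressor(max_iter=1000),
    param_grid={"alpha": alpha_grid},
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="neg_mean_gamma_deviance",
)
search.fit(X, y)

best_alpha = search.best_params_["alpha"]
coef = search.best_estimator_.coef_
pred = search.best_estimator_.predict(X)
```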

@sarasita sarasita closed this Feb 26, 2026
@sarasita sarasita reopened this Feb 26, 2026
def __init__(
self,
design_rule,
alpha_grid,
Member

@mathause mathause Mar 6, 2026

Can alpha_grid be an argument to fit? Then you don't need to "dummy initialize" it on L236.

Member

But it's an argument to __init__ in GammaRegressor, so for consistency we could also keep it...

Author

Mh... I suppose either way would be fine.

Member

OK, let's keep the current approach; the 'ugly' part is only internal.
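
For context, the scikit-learn convention referred to in this thread is that `__init__` only stores hyperparameters, while `fit` creates the fitted attributes (named with a trailing underscore). The sketch below illustrates that split with a hypothetical, pure-Python class `RidgeSlopeCV`; it is not the estimator from this PR, and the single-feature ridge formula and the fold layout are assumptions chosen to keep the example short.

```python
from statistics import fmean


class RidgeSlopeCV:
    """Hypothetical 1-D ridge (no intercept); penalty chosen by k-fold CV."""

    def __init__(self, alpha_grid, n_splits=3):
        # __init__ only stores hyperparameters, nothing is computed here
        self.alpha_grid = alpha_grid
        self.n_splits = n_splits

    @staticmethod
    def _slope(x, y, alpha):
        # Closed-form ridge solution for a single feature
        num = sum(a * b for a, b in zip(x, y))
        return num / (sum(a * a for a in x) + alpha)

    def fit(self, x, y):
        n = len(x)
        # Interleaved folds: fold i holds indices i, i + n_splits, ...
        folds = [list(range(i, n, self.n_splits)) for i in range(self.n_splits)]
        scores = {}
        for alpha in self.alpha_grid:
            errors = []
            for held_out in folds:
                train = [i for i in range(n) if i not in held_out]
                slope = self._slope(
                    [x[i] for i in train], [y[i] for i in train], alpha
                )
                errors.append(
                    fmean([(y[i] - slope * x[i]) ** 2 for i in held_out])
                )
            scores[alpha] = fmean(errors)
        # Fitted state exists only after fit and carries a trailing underscore
        self.alpha_ = min(scores, key=scores.get)
        self.coef_ = self._slope(x, y, self.alpha_)
        return self


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0 * v for v in x]  # exact linear relation, so no penalty is best
model = RidgeSlopeCV(alpha_grid=[0.0, 1.0, 10.0]).fit(x, y)
```

With this layout the grid is available to `fit` through `self.alpha_grid`, and the selected penalty is stored as `alpha_` only once fitting has run.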
