add EAMxx support for cmorization #340
|
@zhangshixuan1987 and @crterai just an FYI: this is a PR to add EAMxx support for converting variables to a format supported by community diagnostics packages such as PMP and ILAMB. I will test with an existing EAMxx run from you and tag reviewers after a test run. |
|
@copilot can you try to fix the build error? |
|
@chengzhuzhang I've opened a new pull request, #341, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
Both Claude Code and Copilot found the root cause: pd.DataFrame converts YAML null → pd.NA in pandas 3.x, and pd.NA is not None evaluates to True, incorrectly triggering the ValueError on abs550aer. However, the solution provided by Claude Code (40c65bf) iterates directly over the YAML list of dicts, so YAML null stays as Python None with no conversion artifacts. It seems slightly cleaner and more elegant than the one provided by Copilot in #341. |
|
@zhangshixuan1987 I think this PR to add e3sm_to_cmip conversion for EAMxx variables is close to being done. Below is a list of variables being generated. Some are missing because of a lack of native output.
EAMxx to CMIP Variable Mapping
2D variables
3D variables
|
|
@tomvothecoder This is mostly an FYI, but it would be great if you could review the code to see if something is missing. |
Pull request overview
Adds EAMxx variable-name support to the cmorization handler system so EAMxx
outputs can be mapped to CMIP6 variables (issue #339) alongside existing EAM
mappings.
Changes:
- Added a new block of EAMxx handler entries to map descriptive EAMxx variable names to CMIP6 targets.
- Extended cmor handler formula support (notably pr) and added a clwvi formula helper for EAMxx condensed water path.
- Refactored YAML handler loading to remove the pandas dependency; updated tests and added debug scripts for Perlmutter workflows.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| e3sm_to_cmip/cmor_handlers/handlers.yaml | Adds EAMxx handler entries for CMIP6 variables using EAMxx raw variable names. |
| e3sm_to_cmip/cmor_handlers/_formulas.py | Extends pr formula for EAMxx precip fluxes and adds clwvi sum for EAMxx water paths. |
| e3sm_to_cmip/cmor_handlers/utils.py | Removes pandas-based YAML parsing and iterates YAML entries directly. |
| tests/cmor_handlers/test_utils.py | Updates expected handler list to include the new EAMxx pr handler. |
| scripts/debug/339-eamxx-handlers/run_eamxx_vars.py | Adds a Perlmutter-only debug driver script for generating/regridding and cmorizing EAMxx variables. |
| scripts/debug/339-eamxx-handlers/prep_vert_remap.sh | Adds a helper script to build vert_L128.nc by extracting hybrid-coord vars and injecting P0. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
@zhangshixuan1987 thank you for reviewing the datasets and for the suggestions! I added one commit (c28a069) to fix the precipitation handler and another (f56e5f6) to restore the vertical interpolation options so they match zppy's workflow. This will make EAM and EAMxx results more comparable, as you suggested. |
Description
Summary
Adds handler entries to map EAMxx output variable names to CMIP6 variables, resolving
#339. EAMxx uses descriptive variable names (e.g., LW_flux_up_at_model_top,
surf_radiative_T) rather than the short names used by EAM (e.g., FLUT, TS). The new entries coexist with existing EAM entries — the handler system selects whichever raw_variables are present in the input data.
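The selection behaviour described above can be sketched as follows. This is an assumption-laden illustration, not the project's implementation: `select_handler` is a hypothetical helper, the "first matching entry wins" rule is assumed, and the handler dicts only carry the fields needed for the example.

```python
def select_handler(candidates, available_vars):
    """Return the first candidate whose raw_variables are all available.

    Hypothetical helper: the real handler system may resolve ties or
    report errors differently.
    """
    available = set(available_vars)
    for handler in candidates:
        if set(handler["raw_variables"]) <= available:
            return handler
    return None


# Two coexisting entries for the same CMIP variable: one keyed on the
# EAM short name, one on the descriptive EAMxx name.
rlut_candidates = [
    {"name": "rlut", "raw_variables": ["FLUT"]},
    {"name": "rlut", "raw_variables": ["LW_flux_up_at_model_top"]},
]

# EAM-style input picks the first entry, EAMxx-style input the second.
eam_choice = select_handler(rlut_candidates, ["FLUT", "TS"])
eamxx_choice = select_handler(
    rlut_candidates, ["LW_flux_up_at_model_top", "surf_radiative_T"]
)
```

Because each entry is matched only against the variables present in the input, adding the EAMxx entries does not change which entry is chosen for existing EAM data in this sketch.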
31 new handler entries covering:
| Category | CMIP variables | EAMxx source variables |
|---|---|---|
| Precipitation | pr, prw | precip_liq_surf_mass_flux, precip_ice_surf_mass_flux, VapWaterPath |
| Near-surface state | ts, tas, huss, sfcWind, psl | surf_radiative_T, T_2m, qv_2m, wind_speed_10m, SeaLevelPressure |
| Surface fluxes | hfls, hfss, evspsbl, tauu, tauv | surface_upward_latent_heat_flux, surf_sens_flux, surf_evap, surf_mom_flux_U/V |
| TOA radiation | rsdt, rsut, rsutcs, rlut, rlutcs | SW/LW_flux_at_model_top, clrsky |
| Surface radiation | rsds, rsus, rsdscs, rsuscs, rlds, rlus, rldscs | SW/LW_flux_at_model_bot, clrsky |
| Cloud water | clwvi, clivi | LiqWaterPath + IceWaterPath, IceWaterPath |
| Aerosol | od550aer | AerosolOpticalDepth550nm |
| 3D pressure-level | ta, hus, hur, wap, zg | T_mid, qv, RelativeHumidity, omega, z_mid |
Key implementation notes
- pr: the EAMxx precipitation fluxes (precip_liq_surf_mass_flux, precip_ice_surf_mass_flux) are already in kg m-2 s-1, so no unit scaling is needed (unlike EAM's PRECC+PRECL, which requires ×1000).
- clwvi: per the CMIP6 definition (condensed water path = liquid + ice), both LiqWaterPath and IceWaterPath must be summed. This differs from EAM, where TGCLDCWP already contains the total.
- Radiation: […] direction of the EAMxx variable name. For several EAM variables that require derived formulas (e.g., rlus = FLDS + FLNS, rsus = FSDS - FSNS), EAMxx provides direct output variables, simplifying the handlers.
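The arithmetic in these notes can be sketched as plain functions. The function names are hypothetical and are not the helpers in `_formulas.py`; the inputs are scalars for brevity, whereas the real handlers operate on gridded arrays.

```python
def pr_eam(precc, precl):
    # EAM precipitation rates need the ×1000 factor noted above
    # to reach kg m-2 s-1.
    return (precc + precl) * 1000.0


def pr_eamxx(precip_liq_surf_mass_flux, precip_ice_surf_mass_flux):
    # EAMxx fluxes are already in kg m-2 s-1: sum them, no scaling.
    return precip_liq_surf_mass_flux + precip_ice_surf_mass_flux


def clwvi_eamxx(liq_water_path, ice_water_path):
    # Condensed water path = liquid + ice.
    return liq_water_path + ice_water_path


def rlus_eam(flds, flns):
    # Upwelling surface longwave from downwelling plus net flux.
    return flds + flns


def rsus_eam(fsds, fsns):
    # Upwelling surface shortwave from downwelling minus net flux.
    return fsds - fsns


# Illustrative numbers only, chosen to be exactly representable.
example_clwvi = clwvi_eamxx(0.25, 0.5)
example_rlus = rlus_eam(350.0, 50.0)
example_rsus = rsus_eam(200.0, 170.0)
```

For EAMxx, the rlus and rsus derivations are not needed because the corresponding variables are output directly.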
Mappings verified against
DERIVED_VARIABLES in https://github.com/E3SM-Project/e3sm_diags/blob/main/e3sm_diags/derivations/derivations.py — all EAMxx entries (marked EAMxx) cross-checked.