Skip to content

cmip6plus support #335

Draft
chengzhuzhang wants to merge 9 commits intoE3SM-Project:masterfrom
ogaruba:cmip6plus
Draft

cmip6plus support #335
chengzhuzhang wants to merge 9 commits intoE3SM-Project:masterfrom
ogaruba:cmip6plus

Conversation

@chengzhuzhang
Copy link
Copy Markdown
Collaborator

Description

This PR is created from @ogaruba's fork. Yemi worked on cmorizing LESFMIP simulations and published them to CMIP6Plus project.

This update may be useful for ecosystem projects and for the transition to support CMIP7.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have added tests that prove my fix is effective or that my feature works
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

updated  mapping of cmip6 to cmip6plus for all variables
@chengzhuzhang
Copy link
Copy Markdown
Collaborator Author

@tomvothecoder when you have a chance, could you help to evaluate if we can consider merge this branch authored by Yemi into main branch?
We should focus on if backward compatibility can be remained with minimal change. Thanks!

Copy link
Copy Markdown
Collaborator

@tomvothecoder tomvothecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @chengzhuzhang, here is my initial, quick review. I left some PR review comments.

Overall Analysis

The implemented logic to support a configurable mip_era does seem like it can work, but it does not scale well. The logic selects the CMIP table for each variable handler based on the mip_era argument with the assumption that variable handlers order tables as [CMIP6, CMIP6+]. If we add more tables or support more mip_era options, it will break.

Suggestions

One suggestion is to extend the util.py definition for frequencies, tables, etc. using a mapping table to the mip_era argument. Then we can perform the correct parsing logic in a scalable way.

Here's a rough example demonstrating this concept:

MIP_ERA_METADATA = 
	{
		cmip6: {
			frequencies: [...],
			tables: {
					"atmos": [...],
					"land": [...],
					...
				}
		},
		cmip6plus: {
			frequencies: [...],
			tables: {
					"atmos": [...],
					"land": [...],
					...
				}
		}
	}


# We can keep how it is implemented now, but it is cleaner to extend the above mapping tables to do this for us.
FREQUENCY_TO_CMIP_TABLES = {
   ...
}

Of course, this will involve downstream refactoring for code to reference MIP_ERA_METADATA.

I'll think more for better solutions if there are any and am open to ideas too.

Questions

  1. Do we plan on supporting mip_era for atm/land handlers at some point too (in addition to MPAS handlers in this PR)?
  2. Is there already a mapping table that maps frequencies and tables to mip_era? I recall repos such as https://github.com/E3SM-Project/CMIP6-Metadata used by e3sm_to_cmip.

Comment on lines +124 to +130
for var_handlers in var_handler:
if type(var_handlers["table"]) is list:
if mip_era == "cmip6" and len(var_handlers["table"]) > 1:
var_handlers["table"] = str(var_handlers["table"][0])

elif mip_era == "cmip6plus" and len(var_handlers["table"]) > 1:
var_handlers["table"] = str(var_handlers["table"][1])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate logic from load_all_handlers(). Can extract as a re-usable private function.

Comment on lines +215 to +221
for var_handler in var_handlers:
if type(var_handler["table"]) is list:
if mip_era == "cmip6" and len(var_handler["table"]) > 1:
var_handler["table"]= str(var_handler["table"][0])

elif mip_era == "cmip6plus" and len(var_handler["table"]) > 1:
var_handler["table"] = str(var_handler["table"][1])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate logic from load_all_handlers(). Can extract as a re-usable private function.

Comment on lines +257 to +262
if type(table_name) is list:
if mip_era == "cmip6" and len(table_name) > 1:
table_name = str(table_name[0])

elif mip_era == "cmip6plus" and len(table_name) > 1:
table_name = str(table_name[1])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate logic as mentioned earlier, can be extracted to re-usable private function.

Comment on lines +74 to +80
for var_handlers in var_handler:
if type(var_handlers["table"]) is list:
if mip_era == "cmip6" and len(var_handlers["table"]) > 1:
var_handlers["table"] = str(var_handlers["table"][0])

elif mip_era == "cmip6plus" and len(var_handlers["table"]) > 1:
var_handlers["table"] = str(var_handlers["table"][1])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logical assumption that the first table is a CMIP6 table and the second table is a CMIP6+ table can easily break and does not scale well.

It is a good idea to further categorize CMIP tables by mip_era.

Comment on lines 18 to 186
@@ -32,16 +67,53 @@
"CMIP6_fx.json",
]

LAND_TABLES = ["CMIP6_Lmon.json", "CMIP6_LImon.json"]
OCEAN_TABLES = ["CMIP6_Omon.json", "CMIP6_Ofx.json"]
SEAICE_TABLES = ["CMIP6_SImon.json"]
LAND_TABLES = ["MIP_LI3hrPt.json", "MIP_LI6hrPt.json", "MIP_LIday.json", "MIP_LIfx.json", "MIP_LImon.json", "MIP_LIsubhrPtSite.json", "MIP_LP3hr.json", "MIP_LP3hrPt.json", "MIP_LP6hrPt.json", "MIP_LPday.json", "MIP_LPfx.json", "MIP_LPmon.json", "MIP_LPyr.json", "MIP_LPyrPt.json", "CMIP6_Lmon.json", "CMIP6_LImon.json"]
OCEAN_TABLES = ["MIP_OBday.json", "MIP_OBmon.json", "MIP_OBmonLev.json", "MIP_OByr.json", "MIP_OByrLev.json", "MIP_OP3hrPt.json", "MIP_OPday.json", "MIP_OPdec.json", "MIP_OPdecLev.json", "MIP_OPdecZ.json", "MIP_OPfx.json", "MIP_OPmonClim.json", "MIP_OPmonClimLev.json", "MIP_OPmon.json", "MIP_OPmonLev.json", "MIP_OPmonZ.json", "MIP_OPyr.json", "MIP_OPyrLev.json", "CMIP6_Omon.json", "CMIP6_Ofx.json"]
SEAICE_TABLES = ["MIP_SIday.json", "MIP_SImon.json", "MIP_SImonPt.json", "CMIP6_SImon.json"]

# Mapping from CMIP6 table names to their associated frequency.
# This dictionary helps determine the frequency category (e.g., "mon", "day", etc.)
# for a given CMIP6 table name when deriving or validating variable handlers.
# Keys are CMIP6 table filenames, values are their frequency strings.
FREQUENCY_TO_CMIP_TABLES = {
"yr": [
"MIP_LPyr.json",
"MIP_LPyrPt.json",
"MIP_OByr.json",
"MIP_OByrLev.json",
"MIP_OPyr.json",
"MIP_OPyrLev.json",
],
"mon": [
"MIP_ACmon.json",
"MIP_ACmonZ.json",
"MIP_AEmon.json",
"MIP_AEmonLev.json",
"MIP_AEmonZ.json",
"MIP_APmonClim.json",
"MIP_APmonClimLev.json",
"MIP_APmonDiurnal.json",
"MIP_APmon.json",
"MIP_APmonLev.json",
"MIP_APmonZ.json",
"MIP_APfx.json",
"MIP_LPmon.json",
"MIP_LPfx.json",
"MIP_LImon.json",
"MIP_LIfx.json",
"MIP_OBmon.json",
"MIP_OBmonLev.json",
"MIP_OPmonClim.json",
"MIP_OPdec.json",
"MIP_OPdecLev.json",
"MIP_OPdecZ.json",
"MIP_OPmonClimLev.json",
"MIP_OPmon.json",
"MIP_OPmonLev.json",
"MIP_OPmonZ.json",
"MIP_OPfx.json",
"MIP_SImon.json",
"MIP_SImonPt.json",
"CMIP6_Amon.json",
"CMIP6_Lmon.json",
"CMIP6_LImon.json",
@@ -53,26 +125,64 @@
"CMIP6_SImon.json",
],
"day": [
"MIP_AEday.json",
"MIP_APday.json",
"MIP_APdayLev.json",
"MIP_APdayZ.json",
"MIP_LPday.json",
"MIP_LIday.json",
"MIP_OBday.json",
"MIP_OPday.json",
"MIP_SIday.json",
"CMIP6_day.json",
"CMIP6_CFday.json",
"CMIP6_AERday.json",
],
"3hr": [
"MIP_AE1hr.json",
"MIP_AE3hrPt.json",
"MIP_AE3hrPtLev.json",
"MIP_AP3hr.json",
"MIP_AP3hrPt.json",
"MIP_AP3hrPtLev.json",
"MIP_LP3hr.json",
"MIP_LP3hrPt.json",
"MIP_LI3hrPt.json",
"MIP_OP3hrPt.json",
"CMIP6_3hr.json",
"CMIP6_CF3hr.json",
],
"6hrLev": [
"MIP_AE6hr.json",
"MIP_AP6hr.json",
"CMIP6_6hrLev.json",
],
"6hrPlev": [
"MIP_AE6hrPt.json",
"MIP_AP6hrPt.json",
"MIP_LP6hrPt.json",
"MIP_LI6hrPt.json",
"CMIP6_6hrPlev.json",
],
"6hrPlevPt": [
"MIP_AE6hrPtLev.json",
"MIP_AP6hrPtLev.json",
"MIP_AP6hrPtZ.json",
"CMIP6_6hrPlevPt.json",
],
"1hr": [
"MIP_AE1hr.json",
"MIP_AP1hr.json",
"MIP_AP1hrPt.json",
"CMIP6_AERhr.json",
],
"subhr": [
"MIP_AEsubhrPt.json",
"MIP_AEsubhrPtSite.json",
"MIP_APsubhrPt.json",
"MIP_APsubhrPtLev.json",
"MIP_LIsubhrPtSite.json",
]
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combining CMIP6 tables and frequencies with CMIP6+ can work, but it makes downstream parsing logic more convoluted.

As mentioned here, it is a good idea to categorize CMIP tables by MIP ERA. Otherwise, we pass mip_era then make the assumption that the first table in a variable handler is CMIP6 and the second is CMIP6+ to get the correct table.

@ogaruba
Copy link
Copy Markdown

ogaruba commented Jan 14, 2026

@tomvothecoder Thanks for reviewing the PR request. To answer your question about supporting atm/lnd handlers. Yes code changes for atm/lnd handlers are included in this PR. The changes in the cmor_handlers/utils.py file include changes for the atm/lnd files. I have also added changes to the handler.yaml file to map all the atm/lnd CMIP6 variables in the file to include their CMIP6Plus equivalents.

The mapping between CMIP6 and CMIP6Plus variables can be found here: https://github.com/PCMDI/mip-cmor-tables/blob/main/src/Mapping%20between%20CMIP6%20and%20CMIP6Plus%20MIP%20tables.ipynb

ogaruba and others added 5 commits January 14, 2026 12:38
Co-authored-by: Tom Vo <tomvothecoder@gmail.com>
Co-authored-by: Tom Vo <tomvothecoder@gmail.com>
Co-authored-by: Tom Vo <tomvothecoder@gmail.com>
Co-authored-by: Tom Vo <tomvothecoder@gmail.com>
Co-authored-by: Tom Vo <tomvothecoder@gmail.com>
@tomvothecoder
Copy link
Copy Markdown
Collaborator

@tomvothecoder Thanks for reviewing the PR request. To answer your question about supporting atm/lnd handlers. Yes code changes for atm/lnd handlers are included in this PR. The changes in the cmor_handlers/utils.py file include changes for the atm/lnd files. I have also added changes to the handler.yaml file to map all the atm/lnd CMIP6 variables in the file to include their CMIP6Plus equivalents.

The mapping between CMIP6 and CMIP6Plus variables can be found here: PCMDI/mip-cmor-tables@main/src/Mapping%20between%20CMIP6%20and%20CMIP6Plus%20MIP%20tables.ipynb

Thank you @ogaruba for the work and I appreciate your clarification! I will do another review and will try to fit in time to extend this PR with the mip era mapping.

@chengzhuzhang Any particular deadline/priority for this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

3 participants