Modify benchmark scripts to look for species_database.yml files in Ref and Dev rundirs #389
Merged
gcpy/benchmark/modules/benchmark_scrape_gchp_timers.py
- Now import ENCODING from gcpy.constants and use that instead of
  the hardwired "utf-8" in open statements
- Added routine "check_file_for_timing_info", which tests if a text
  file has GCHP timing info. If not, it returns the file path to
  "allPEs.log". This is necessary due to an update in behavior
  in MAPL 2.59.
- In routine "read_one_text_file":
  - Now call "check_file_for_timing_info" before reading the file.
  - Strip out MAPL text that is usually placed into allPEs.log.
  - Skip all lines after the GCHPchem section until
    the Summary section.
  - Change the marker for the Summary section to "Report on process:"
    (i.e. without a number).
- Changed the break-out-of-loop command for cloud benchmark log files
  to "++", which occurs after the timers section. This is only for
  MAPL versions prior to 2.59.
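The fallback described in this commit can be sketched roughly as follows. This is a minimal illustration and not the code in the PR: it assumes that the "Report on process:" marker is what identifies timing output, that allPEs.log sits in the same directory as the log file, and it uses a local ENCODING constant in place of the one imported from gcpy.constants.

```python
import os

ENCODING = "utf-8"                    # assumption: stand-in for gcpy.constants.ENCODING
TIMING_MARKER = "Report on process:"  # Summary-section marker named in the commit message


def check_file_for_timing_info(text_file):
    """
    Return text_file if it contains the timing marker; otherwise
    return the path to "allPEs.log" in the same directory.
    (Hypothetical body; only the routine name comes from the commit.)
    """
    with open(text_file, encoding=ENCODING) as ifile:
        for line in ifile:
            if TIMING_MARKER in line:
                return text_file

    # No timers found: assume they were written to allPEs.log (MAPL 2.59+)
    rundir = os.path.dirname(os.path.abspath(text_file))
    return os.path.join(rundir, "allPEs.log")
```

A caller would then read timers from whatever path the routine returns, so the rest of the parsing logic does not need to know which file was chosen.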
This merge brings PR #388 (Update "benchmark_scrape_gchp_timers.py" to look for timing information in "allPEs.log" if not found in the GCHP log file, by @yantosca) into the GCPy 1.7.0 development stream. PR #388 updates the "benchmark_scrape_gchp_timers.py" script to check if the GCHP timers output is present in the log file. If not, it will read it from the allPEs.log file in the run directory. This is needed because GCHP simulations with MAPL v2.59+ now send timers output to the allPEs.log file. Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/benchmark_utils.py
- Import verify_variable_type from gcpy.util
- Added function get_species_database_files, which reads the config object and returns paths to the species_database.yml files in Ref and Dev rundirs
gcpy/util.py
- Added function read_species_metadata, which returns the species metadata for the union of species in multiple species_database.yml files.
- Added code updates suggested by Pylint
- Updated comments
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/cloud/*.yml
gcpy/benchmark/config/*.yml
- Removed "paths:spcdb_dir" YAML tag
- Added "ref:gcc:species_metadata", "ref:gchp:species_metadata", "dev:gcc:species_metadata", and "dev:gchp:species_metadata" YAML tags to specify separate species_database.yml files for Ref and Dev rundirs
CHANGELOG.md
- Updated accordingly
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/plot/compare_single_level.py
gcpy/plot/compare_zonal_mean.py
- Import "read_species_metadata" from gcpy.util
- Add "spcdb_files=None" as a keyword argument
- Throw an error if spcdb_files is None while convert_to_ugm3=True
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/oh_metrics.py
- Import constants from gcpy.constants by name
- Import functions from gcpy.util by name
- Updated PyDoc headers
- Removed special handling for xr.open_mfdataset
- Renamed "ds" to "data"
- Implemented suggestions from Pylint:
  - Snake case naming for variables
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/budget_ox.py
gcpy/benchmark/modules/budget_tt.py
- Import constants from gcpy.constants by name
- Pass spcdb_file as an argument; remove spcdb_dir
- Updated PyDoc headers
- Implemented suggestions from Pylint:
  - Snake case/lower case for variables
  - Do not use a list as a default value for keyword arguments
  - Specify an encoding in open() statements
  - Use f-strings instead of ".format" in print statements
gcpy/benchmark/ste_flux.py
- Removed "import os"
- Import constants from gcpy.constants by name
- Removed special handling for xr.open_mfdataset
- Implemented suggestions from Pylint:
  - Snake case/lower case for variables
  - Do not use a list as a default value for keyword arguments
  - Specify an encoding in open() statements
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/benchmark_funcs.py
- Pass spcdb_files as a positional argument
- Removed spcdb_dir keyword argument
- Import functions by name from gcpy.util
- Updated PyDoc headers
- Verify argument types with verify_variable_type()
- Use "read_species_metadata" function to read the species database files from Ref & Dev and return the union of species
- Cosmetic changes (indentation, line breaks, comments)
- Pass spcdb_files as a keyword argument to compute_single_level
- Pass spcdb_files as a keyword argument to compute_zonal_mean
- Implemented suggestions from Pylint:
  - Use snake case/lower case for variables
  - Specify encoding when opening text files for output
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/benchmark_drydep.py
- Add spcdb_files as a positional argument
- Remove spcdb_dir keyword argument
- Updated PyDoc headers
- Pass spcdb_files as a keyword argument to make_benchmark_drydep_plots
gcpy/benchmark/modules/benchmark_mass_cons_table.py
- Use "read_species_metadata" to read species database files from Ref & Dev, and return the union of all species
- Pass spcdb_files as an argument
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/benchmark_scrape_gchp_timers.py
- Added "check=False" as an argument to subprocess.run, as suggested by Pylint.
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/benchmark_species_changes.py
- Pass spcdb_files as a keyword argument
- Use "read_species_metadata" to read the species_database.yml files for Ref & Dev, and return the union of all species
- Updated PyDoc headers
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/run_1yr_fullchem_benchmark.py
gcpy/modules/run_1yr_tt_benchmark.py
gcpy/benchmark/run_benchmark.py
- Remove "import get_species_database_dir" from benchmark_funcs.py
- Added "import get_species_database_files" from benchmark_utils.py
- Call get_species_database_files at the beginning of GCC vs GCC, GCHP vs GCC, GCHP vs GCHP and Diff of Diffs to get species_database.yml files from Ref & Dev folders. Store in spcdb_files.
- Pass spcdb_files as an argument to benchmarking routines
- Remove spcdb_dir keyword argument from calls to benchmarking routines
- Implement fixes and suggestions from Pylint
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/benchmark_funcs.py
- In make_benchmark_conc_plots and make_benchmark_emis_plots, test if ref and dev are of types (str, list)
gcpy/benchmark/modules/benchmark_utils.py
- Added missing print statement for the path to the species_database.yml in the Ref run directory
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/ste_flux.py
- Fixed typo "is_TransportTracers" -> "is_transport_tracers"
- Cosmetic changes (removed line break in import statement)
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/run_1yr_tt_benchmark.py
- Bug fix: Make sure "spcdb_files" is the 5th argument (following dev_label) in the call to make_benchmark_mass_conservation_table for GCC vs GCC.
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/benchmark_categories.yml
- Removed the st_Ox species, which is no longer included in TransportTracers simulations
CHANGELOG.md
- Updated accordingly
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/config/1yr_fullchem_benchmark.yml
- Restored the following YAML tags, which were inadvertently omitted from a previous commit:
  - "plot_models_vs_obs: True"
  - "aer_budget_table: True"
  - "Ox_budget_table: True"
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
3d5b94d to 56f1893
gcpy/benchmark/modules/benchmark_funcs.py
- In routine "make_benchmark_aerosol_tables", renamed the "spcdb" variable to "properties", as this is the dict that stores the species database metadata.
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/cstools.py
- Changed "x_lat = i_lat + i_vert in (0, 3)" to "x_lat = i_lat + (i_vert in (0, 3))" in order to enforce proper evaluation order. This fixes an error that was introduced in commit 2b0c6ad, when we added some improvements suggested by Pylint.
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
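To illustrate why the parentheses matter: in Python, "+" binds more tightly than "in", so the unparenthesized expression tests whether the sum is a member of the tuple and yields a bool rather than an index. The two wrapper functions below are hypothetical and exist only to contrast the two expressions from the commit message.

```python
def x_lat_without_parens(i_lat, i_vert):
    # Parsed as (i_lat + i_vert) in (0, 3): a membership test on the sum
    return i_lat + i_vert in (0, 3)


def x_lat_with_parens(i_lat, i_vert):
    # Adds 1 to i_lat when i_vert is 0 or 3 (True counts as 1), else adds 0
    return i_lat + (i_vert in (0, 3))


print(x_lat_without_parens(5, 0))  # False
print(x_lat_with_parens(5, 0))     # 6
print(x_lat_with_parens(5, 1))     # 5
```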
gcpy/benchmark/modules/benchmark_funcs.py
- Now include old and new dust species names in species_list
- Update logic so that all dust species are included in the aerosol burdens table
- Skip dust species (DST* or DSTbin*) that are not found in the data files
- Renamed spc2name to full_names, and use the species database to get the long name for each species
- Updated the logic so that DST1 is not hardwired as a species name in the AOD table section
- Added internal routine print_aods to print the global AOD table separately from routine print_aerosol_metrics
CHANGELOG.md
- Updated accordingly
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
Samples of the output:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Annual average global aerosol burdens for 2019 in gcc.14.7.0-rc.0
(weighted by the number of days per month)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
                                                                  Strat        Trop        Strat+Trop
                                                                  -----------  ----------  ----------
Hydrophilic black carbon aerosol          (BCPI   ) burden [Tg] : 0.002118068  0.09570427  0.09782234
Hydrophilic organic carbon aerosol        (OCPI   ) burden [Tg] : 0.007041742  0.41990561  0.42694735
Sulfate                                   (SO4    ) burden [Tg] : 0.359306033  1.39562352  1.75492956
Dust aerosol, Reff = 0.151 microns        (DSTbin1) burden [Tg] : 0.000355288  0.03622999  0.03658528
Dust aerosol, Reff = 0.253 microns        (DSTbin2) burden [Tg] : 0.001523766  0.16645760  0.16798137
Dust aerosol, Reff = 0.402 microns        (DSTbin3) burden [Tg] : 0.009651065  1.17880181  1.18845288
Dust aerosol, Reff = 0.818 microns        (DSTbin4) burden [Tg] : 0.017383738  2.88607972  2.90346345
Dust aerosol, Reff = 1.491 microns        (DSTbin5) burden [Tg] : 0.009823191  5.54217088  5.55199407
Dust aerosol, Reff = 2.417 microns        (DSTbin6) burden [Tg] : 0.002355801  6.25728160  6.25963740
Dust aerosol, Reff = 3.721 microns        (DSTbin7) burden [Tg] : 0.000706825  7.97738362  7.97809044
Fine (0.01-0.05 microns) sea salt aerosol (SALA   ) burden [Tg] : 0.000915225  0.32516280  0.32607802
Coarse (0.5-8 microns) sea salt aerosol   (SALC   ) burden [Tg] : 0.000464604  2.69733069  2.69779530
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Annual average global AODs for 2019 in gcc.14.7.0-rc.0
(weighted by the number of days per month)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
                               Strat        Trop        Strat+Trop
                               -----------  ----------  ----------
Dust column optical depth [1]: 0.000099603  0.02248508  0.02258468
BCPI column optical depth [1]: 0.000027332  0.00146519  0.00149252
OCPI column optical depth [1]: 0.000087401  0.01970445  0.01979185
SALA column optical depth [1]: 0.000018285  0.01478862  0.01480691
SALC column optical depth [1]: 0.000002156  0.02047868  0.02048084
SO4  column optical depth [1]: 0.000330299  0.03358278  0.03391308
lizziel requested changes on Jan 12, 2026
gcpy/benchmark/modules/benchmark_funcs.py
- Obtain separate species metadata from Ref & Dev for use in unit conversions (MW_g is the most relevant)
- Removed some superfluous error checks
- In routine make_benchmark_operations_budget, use metadata from both Ref & Dev to determine if it is a wet depositing species
- Updated comments
gcpy/benchmark/modules/benchmark_utils.py
- Added constant SPECIES_DATABASE
- Use SPECIES_DATABASE constant when constructing file paths in function get_species_database_files
gcpy/benchmark/modules/oh_metrics.py
gcpy/benchmark/modules/budget_tt.py
gcpy/benchmark/modules/benchmark_mass_cons_table.py
- For now, use only the Dev species metadata (e.g. mol. wt.)
gcpy/plot/compare_single_level.py
gcpy/plot/compare_zonal_mean.py
- Now obtain species metadata for Ref & Dev separately
- Now use molecular weights from the Ref and Dev species metadata when converting units to ug/m3 (via get_molwt_from_metadata function)
gcpy/util.py
- In routine "read_species_metadata", return Ref and Dev species metadata separately instead of taking the union. If only one file is passed, return the same metadata for Ref and Dev.
- Added routine "get_molwt_from_metadata"
CHANGELOG.md
- Updated accordingly
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
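The "one file or two" behavior and the per-side molecular weight lookup can be sketched as below. The helper names split_spcdb_files and lookup_molwt are hypothetical and are not the functions added in this PR; the sketch also assumes the metadata is a dict keyed by species name with an "MW_g" entry, and it skips the YAML reading step entirely.

```python
def split_spcdb_files(spcdb_files):
    """
    Return (ref_path, dev_path) from either a single path (str)
    or a list of one or two paths. (Hypothetical helper.)
    """
    if isinstance(spcdb_files, str):
        return spcdb_files, spcdb_files          # same file for Ref and Dev
    if isinstance(spcdb_files, list) and len(spcdb_files) == 1:
        return spcdb_files[0], spcdb_files[0]
    if isinstance(spcdb_files, list) and len(spcdb_files) == 2:
        return spcdb_files[0], spcdb_files[1]    # Ref first, then Dev
    raise ValueError("spcdb_files must be a str or a list of 1 or 2 paths")


def lookup_molwt(ref_metadata, dev_metadata, species):
    """
    Return (ref_mw, dev_mw) in g/mol, with None where a side has no
    "MW_g" entry for the species. (Hypothetical helper.)
    """
    ref_mw = ref_metadata.get(species, {}).get("MW_g")
    dev_mw = dev_metadata.get(species, {}).get("MW_g")
    return ref_mw, dev_mw


# Illustrative metadata: the species exists only in Dev
ref_meta = {"CO": {"MW_g": 28.01}}
dev_meta = {"CO": {"MW_g": 28.01}, "NEWSPC": {"MW_g": 100.0}}
print(lookup_molwt(ref_meta, dev_meta, "NEWSPC"))  # (None, 100.0)
```

Returning the pair in Ref-then-Dev order matches the ordering that the later bug-fix commit restores.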
1197b16 to 2dfbf86
gcpy/benchmark/cloud/template*.yml
gcpy/benchmark/config/*.yml
- Removed the species_metadata tags, as we assume that the species database name is always species_database.yml
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/util.py
- Bug fix: Function get_molwt_from_metadata now returns the Ref species metadata first, followed by Dev. The order had been reversed inadvertently.
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/plot/compare_single_level.py
gcpy/plot/compare_zonal_mean.py
- Edited the program logic when converting units to ug/m3 so that:
  - If neither the Ref nor the Dev metadata contains a molecular weight,
    print a warning message and skip to the next species.
  - If the Ref metadata contains a molecular weight but Dev does not,
    do the unit conversion for Ref, but set Dev to np.nan. Also
    print a warning message.
  - If the Dev metadata contains a molecular weight but Ref does not,
    do the unit conversion for Dev, but set Ref to np.nan. Also
    print a warning message.
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
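The three cases above amount to a small decision table, sketched here under some assumptions: convert_pair and to_ugm3 are hypothetical names, the conversion itself is passed in as a callable rather than reproduced, and math.nan stands in for np.nan so the example needs only the standard library.

```python
import math


def convert_pair(ref_value, dev_value, ref_mw, dev_mw, to_ugm3):
    """
    Return (ref, dev) after unit conversion, using NaN on a side whose
    molecular weight is missing, or None if the species should be skipped.
    (Hypothetical helper; to_ugm3(value, mw) is supplied by the caller.)
    """
    if ref_mw is None and dev_mw is None:
        print("WARNING: no molecular weight in Ref or Dev; skipping species")
        return None
    if dev_mw is None:
        print("WARNING: no molecular weight in Dev; setting Dev to NaN")
        return to_ugm3(ref_value, ref_mw), math.nan
    if ref_mw is None:
        print("WARNING: no molecular weight in Ref; setting Ref to NaN")
        return math.nan, to_ugm3(dev_value, dev_mw)
    return to_ugm3(ref_value, ref_mw), to_ugm3(dev_value, dev_mw)


def placeholder_conversion(value, mw):
    # Placeholder arithmetic only; not the real ug/m3 conversion
    return value * mw


print(convert_pair(1.0, 2.0, None, None, placeholder_conversion))  # None
```

Keeping the side with a valid molecular weight lets the plot for a species that exists in only one of Ref or Dev still be produced, with the other panel empty.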
…s code
gcpy/benchmark/modules/benchmark_funcs.py
- Call "compare_varnames" and construct the "cvars", "refonly", and "devonly" variables before adding missing variables (as NaN fields) to refdata and devdata. This fixes a problem where all of the variables are considered common to both Ref & Dev even when they are not.
CHANGELOG.md
- Updated accordingly
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
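The ordering problem fixed here can be reproduced with plain dicts. In this sketch, compare_names and add_missing_as_nan are hypothetical stand-ins for the comparison and padding steps, and the variable names are illustrative only.

```python
import math


def compare_names(ref, dev):
    """Return (common, ref_only, dev_only) variable names, each sorted."""
    ref_names, dev_names = set(ref), set(dev)
    return (
        sorted(ref_names & dev_names),
        sorted(ref_names - dev_names),
        sorted(dev_names - ref_names),
    )


def add_missing_as_nan(ref, dev):
    """Add each variable that is missing from one dict to it as NaN."""
    missing_in_ref = set(dev) - set(ref)
    missing_in_dev = set(ref) - set(dev)
    for name in missing_in_ref:
        ref[name] = math.nan
    for name in missing_in_dev:
        dev[name] = math.nan


ref = {"EmisCO": 1.0, "EmisNO": 2.0}
dev = {"EmisCO": 1.1, "EmisSO2": 3.0}

# Compare first, then pad: the three lists reflect the original contents
cvars, refonly, devonly = compare_names(ref, dev)
add_missing_as_nan(ref, dev)
print(cvars, refonly, devonly)  # ['EmisCO'] ['EmisNO'] ['EmisSO2']

# Comparing after padding would report every variable as common
print(compare_names(ref, dev)[0])  # ['EmisCO', 'EmisNO', 'EmisSO2']
```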
gcpy/benchmark/modules/budget_ox.py
- Renamed "spcdb_file" -> "spcdb_files" and updated comments
- In routine "self.get_conv_factors", skip the spcdb (species metadata) from Ref and take the spcdb from Dev
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/mass_cons_tables.py
- Fixed typo "species_datbase.yml" -> "species_database.yml"
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
lizziel approved these changes on Jan 21, 2026
Name and Institution (Required)
Name: Bob Yantosca
Institution: Harvard + GCST
Describe the update
In this PR, we have done the following:
- Removed the paths:spcdb_dir YAML tag in *benchmark.yml config files.
- Added function get_species_database_files in gcpy/benchmark/modules/benchmark_utils.py, which returns the absolute paths to species_database.yml files in Ref and Dev.
- Added function read_species_metadata to gcpy/util.py, which accepts either a single file path to species_database.yml or a list of file paths to the species_database.yml files in the Ref & Dev rundirs, and returns the Ref and Dev species database dicts. If only one file path is passed, the same species database will be returned for both Ref and Dev.
- Replaced the spcdb_dir keyword argument in several functions with spcdb_files, which can be of type str or list.
- Modified gcpy/plot/compare_single_level.py and gcpy/plot/compare_zonal_mean.py to accept spcdb_files instead of spcdb_dir. Added corresponding logic so that the Ref species database is used with Ref data and the Dev species database is used with Dev data.
- Fixed several issues:
  - Renamed spcdb to metadata in make_benchmark_aerosol_tables
  - Updated face_area (in cstools.py) to force correct operator order
  - Updated make_benchmark_aerosol_tables to include all dust species in the aerosol burdens table
  - Updated make_benchmark_aerosol_tables to remove hardwiring in the computation of global AOD
  - Restored omitted YAML tags in the 1yr_fullchem_benchmark.yml configuration file
  - […] create_benchmark_emission_tables
Expected changes
These updates should allow you to create benchmark plots in the case where Dev contains some species which Ref does not (or vice versa).