Allow streams to operate on reduced number of PETs #360

@ekluzek

I think this may be something that's really done in the host code that calls streams. But I'm adding it here in case changes are required in CDEPS, and because the errors show up in the stream code.

If you have more processors than mesh elements in the stream file, ESMF dies with an error. At first I thought I just needed to get an ESMF_DistGrid for the mesh file that was no larger than the number of elements in the file. However, after trying this I see the error below that tells us that the VM for the

Errors in the PET file look like this:

20251116 014306.030 ERROR PET127 /glade/derecho/scratch/csgteam/temp/spack/derecho/24.12/builds/spack-stage-esmf-8.8.1-oigqhlcifmbqcxchruwcjowzoxz5sa4j/spack-src/src/Infrastructure/Mesh/src/ESMCI_Mesh_FileIO.C:456 ESMCI_mesh_create_from_ESMFMesh_ Value unrecognized or out of range - Can't create a Mesh from a file in a VM when that VM contains more PETs than elements in the file.
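To make the request concrete, here is a minimal sketch of the bookkeeping a "reduced number of PETs" approach would need: cap the number of participating PETs at the number of mesh elements and leave the rest idle. This is plain Python and does not call ESMF or CDEPS; the function name `plan_reduced_pets` and the PET and element counts are hypothetical, chosen only for illustration.

```python
def plan_reduced_pets(n_pets: int, n_elements: int) -> list[range]:
    """Return, for every PET, the range of element indices it would own.

    Hypothetical helper: PETs beyond the element count get an empty range,
    so no more PETs take part in the mesh than there are elements.
    """
    if n_pets < 1 or n_elements < 1:
        raise ValueError("need at least one PET and one element")

    # Never use more PETs than there are elements in the file.
    n_active = min(n_pets, n_elements)
    base, extra = divmod(n_elements, n_active)

    plan = []
    start = 0
    for pet in range(n_pets):
        if pet < n_active:
            # Spread any remainder over the first `extra` active PETs.
            count = base + (1 if pet < extra else 0)
        else:
            # Idle PET: owns no elements of this stream mesh.
            count = 0
        plan.append(range(start, start + count))
        start += count
    return plan


# Illustrative numbers only: many more PETs than mesh elements.
plan = plan_reduced_pets(n_pets=128, n_elements=4)
active = [pet for pet, owned in enumerate(plan) if len(owned) > 0]
print(f"{len(active)} active PETs, {len(plan) - len(active)} idle PETs")
```

Whether the idle PETs would be excluded through a smaller VM in the host code or handled inside CDEPS is exactly the open question of this issue; the sketch only shows the decomposition arithmetic.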

cesm.log shows it failing in the streams initialization:

dec0745.hsn.de.hpc.ucar.edu 87:   Streams initialization failing for Delta14co2_in_air stream file =
dec0745.hsn.de.hpc.ucar.edu 87:  /glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/isotopes/ctsmforc.Graven.atm_d
dec0745.hsn.de.hpc.ucar.edu 87:  elta_C14_CMIP7_4x1_global_1700-2023_yearly_v3.0_c251013.nc
dec0745.hsn.de.hpc.ucar.edu 87:  ENDRUN:
dec0745.hsn.de.hpc.ucar.edu 87:  ERROR in CTSMForce2DStreamBaseType.F90 at line 197
.
.
.
dec0745.hsn.de.hpc.ucar.edu 107: cesm.exe           00000000011D80E0  shr_abort_mod_mp_          67  shr_abort_mod.F90
dec0745.hsn.de.hpc.ucar.edu 107: cesm.exe           00000000005EBAB0  abortutils_mp_end          63  abortutils.F90
dec0745.hsn.de.hpc.ucar.edu 107: cesm.exe           000000000094CC14  ctsmforce2dstream         197  CTSMForce2DStreamBaseType.F90
dec0745.hsn.de.hpc.ucar.edu 107: cesm.exe           0000000000FAAA08  atmcarbonisotopes         171  AtmCarbonIsotopeStreamType.F90
dec0745.hsn.de.hpc.ucar.edu 107: cesm.exe           00000000007E780A  cisoatmtimeseries         563  CNCIsoAtmTimeSeriesReadMod.F90
dec0745.hsn.de.hpc.ucar.edu 107: cesm.exe           0000000000603196  clm_initializemod         481  clm_initializeMod.F90
dec0745.hsn.de.hpc.ucar.edu 107: cesm.exe           00000000005A691B  lnd_comp_nuopc_mp         677  lnd_comp_nuopc.F90

The lnd.log shows the error as well.

Labels

Responsibility: CTSM (Responsibility to manage and accomplish this issue is the CTSM Software group); answers are bfb; enhancement (New feature or request)