
Conversation


@gold2718 gold2718 commented Dec 6, 2020

Initial implementation of weak scaling physics infrastructure

  • Implement scalable physics grid with no load balancing
  • Implement scalable gmean algorithm that does not need global data
  • Modify SE/dyn_grid.F90 and related files for weak scaling interface
  • Add repro sum variables if weak-scaling fix is on
  • Add memory-use reporting for phys_grid_init and gmean timing
  • Add cam_repro_sum to test namelists to ensure BFB compatibility
  • Add checks to phys_debug_lat and phys_debug_lon
  • Note: The repro_sum version of gmean is not scalable
  • Note: create_native_mapping_files contains non-scalable memory use

fixes #127
fixes #259

Steve Goldhaber added 3 commits December 6, 2020 14:01
Implement scalable physics grid with no load balancing
Implement scalable gmean algorithm that does not need global data
Set up directory for weak scaling version of SE dycore interface
Modify SE/dyn_grid.F90 for weak scaling interface
Add repro sum variables if weak-scaling fix is on
Added memory use for phys_grid_init and gmean timing
Add cam_repro_sum to test namelists to ensure BFB compatibility
Add checks to phys_debug_lat and phys_debug_lon
Add src/physics/weak_scaling to TR8.sh
Note that the repro_sum version of gmean is not scalable
Note that create_native_mapping_files contains non-scalable memory use
@gold2718 gold2718 added the enhancement New feature or request label Dec 6, 2020
@gold2718 gold2718 added this to the CESM2.3 milestone Dec 6, 2020
@gold2718 gold2718 self-assigned this Dec 6, 2020
@cacraigucar cacraigucar changed the title Initial implementation of weak scaling physics infrastructure cam6_3_007?: Initial implementation of weak scaling physics infrastructure Dec 7, 2020
@cacraigucar cacraigucar left a comment

First round of reviews

bld/configure Outdated
my @dirs = split ',', $usr_src;
while ( my $dir = shift @dirs ) {
print $fh "$dir\n";
print $fh "$dir xx\n";
Collaborator

This looks like a typo.

Comment on lines 412 to 413
The value is the maximum number of double-precision fields on a task
for global calculations.
Collaborator

I didn't really understand this explanation until I read the following in the ChangeLog:
Maximum number of global fields to sum at one time

@fvitt fvitt left a comment

What is the purpose of the "-weak_scaling_fix" configure option?

Comment on lines 6 to 10
One-line Summary: Weak scaling infrastructure and bug fixes
Github PR URL:

Purpose of changes:
#127: Fix weak scaling issues in physics decomp
Collaborator

Can you elaborate on the purpose of these changes? I gather the infrastructure changes are to improve the performance scaling of CAM physics with the number of MPI tasks.

Collaborator Author

The ChangeLog_template says to copy the issue number and title text. The explanation is in the issue: #127

#259: _Fillvalue in cam history files (remove fill value for coordinates)

Describe any changes made to build system:
- Add weak scaling directory to override several files (see below)
Collaborator

I find the directory name "weak_scaling" to be confusing. Would something like "improved_scaling" be better?

Collaborator Author

Well, the project was to solve the weak scaling issue, but it also addresses strong scaling issues.
@cacraigucar, any thoughts?

Collaborator

How ingrained is the name "weak scaling"? I know it has been used for quite some time.

On the other hand, now is the time to give it a name change, if that is appropriate. Since it sounds like this is more than weak scaling, I do like @fvitt's suggestion of "improved_scaling" as it covers both the weak and strong scaling issues.

Collaborator

This project is replacing the dynamics-physics coupler with a new version that has better weak scaling properties. The current coupler code lives in physics/cam/phys_grid.F90. This would be a good time to move the new implementation into its own directory, say dyn_phys_coupling, at the same level as the dynamics and physics subdirectories under src/. There doesn't seem to be any good logic behind having the coupling code in the directory meant for physics parameterizations.

Collaborator Author

@brian-eaton, phys_grid.F90 does a lot more than dynamics-physics coupling. It is also where the cam_grid information for history output from physics code is defined. With the ionosphere PR, it also becomes the center for regridding from physics to the ionosphere and for other uses, such as inline regridding of emissions data onto the physics grid.
Also, while I agree that physics/cam is not a great place, that will be reformed as the physics becomes CCPP-ized and moves into its own repository.

Finally, only two of the files are about the new infrastructure, the others are temporary until all dycores use this interface.

How about new_phys_grid or physics_infrastructure? improved_scaling is okay as well. Consensus? Feedback on the suggestion to move it up a level (to src instead of in src/physics)?

Collaborator

I like "src/physics/infrastructure" or "src/physics_infrastructure"

Collaborator

Because it's coupling between dynamics and physics, or physics and ionosphere, I like having the directory at the same level as those 3 rather than inside physics. Calling the directory "infrastructure" would be OK. MPAS has a directory called "framework" which is similar. We have a lot of infrastructure type stuff in control/ and utils/ as well which could eventually migrate. Anything we do at this point will not be entirely consistent with the historical evolution of our directory structure, but that's OK.

@cacraigucar cacraigucar left a comment

Round 2 comments

Comment on lines 2062 to 2063
! write(6, *) 'XXG ',iam,') ',ie,i,j,elem(ie)%spherep(i,j)%lat,dbuf2(indx,ie)*deg2rad
! call shr_sys_flush(6)
Collaborator

Should these lines be deleted?

Collaborator Author

No idea. Brian left them in when he upgraded the implementation. @brian-eaton, any love left for these debug lines?

Collaborator

This output only occurs before endrun is called. It is more sanity check than debug code, so it seems like good info to put in the log file. Rather than commenting them out, the hardcoded unit 6 should be changed to iulog, the XXG changed to something like 'file lat not correct on task', and text added describing the other output data.
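
A minimal sketch of that suggestion, using the variables visible in the snippet quoted above (the message wording and the use of iam are assumptions, not the actual fix):

   write(iulog, *) 'dyn_grid: file lat not correct on task ', iam, ': ie,i,j = ', ie, i, j, &
        ' elem lat = ', elem(ie)%spherep(i,j)%lat, ' file lat (rad) = ', dbuf2(indx,ie)*deg2rad
   call shr_sys_flush(iulog)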

Comment on lines 2088 to 2089
! write(6, *) 'XXG ',iam,') ',ie,i,j,elem(ie)%spherep(i,j)%lon,dbuf2(indx,ie)*deg2rad
! call shr_sys_flush(6)
Collaborator

Should these lines be deleted?

close(unitn)
call freeunit(unitn)
! Check inputs
if (phys_debug_lat /= uninit_r8) then
Collaborator

I found this statement confusing at first glance. I tried to figure out why you were adding the r8 to the unit number. It was only on closer inspection that I saw the slight difference in characters. While I know the name uninit_r8 has been around for a while in this routine, I would suggest renaming it to uninitr8 so that it doesn't look like a real number being promoted to type r8. This is mainly because it immediately follows the use of unitn. Maybe I'm the only one who will be tripped up by this.

Comment on lines 5 to 6
! Perform mixed layer global calculations for energy conservation checks.
!
Collaborator

What is meant by "mixed layer"? Is this exclusively for energy conservation checks? Are other parts of the code using this utility to calculate global means?

Collaborator

The comments in src/utils/gmean_mod.F90 indicate that the code originated from SOM, in which case the mixed layer refers to the ocean. Just a case of old comments not being updated.

Doing a global search on 'call gmean' indicates that the primary purpose of the code is for diagnostics and the check_energy code. There's also some use in chemistry. The use of gmean in the check_energy code would cause answers to depend on task count if the means were not calculated in a task independent way. Since CAM is currently giving task count independent answers I have to assume that src/utils/gmean_mod.F90 is working as designed. But what's not clear is whether a call to gmean is relying on the non-scalable calculation to achieve this result. The current implementation is that a scalable algorithm is used first (shr_reprosum_calc called from gmean_fixed_repro), and the scalable algorithm makes an error estimate which is subsequently used to decide whether the non-scalable (but always task count independent) calculation should be used. However, as far as I can tell the check of the error estimate is not done because shr_reprosum_reldiffmax defaults to -1.0. So it appears that the current scalable code is being used and is producing task count independent results. @gold2718, did you come to a different conclusion after examining the current gmean_mod.F90?

Collaborator Author

@brian-eaton, I had not noticed the settings of shr_reprosum_reldiffmax and shr_reprosum_recompute.
Not only is shr_reprosum_reldiffmax set to -1.0 (which means do not make an estimate), but shr_reprosum_recompute is set to .false., meaning that even if the reprosum fails, CAM uses the answer anyway!

It looks to me as though our global means are reproducible, but we have no idea whether they are correct. Ouch!

Unless you know this is not happening, I am going to throw in some tests to see how we are doing.

Collaborator

I believe the default settings in shr_reprosum_mod came from Pat Worley. I assume that when Pat and other DOE folks implemented and tested this code they found that the algorithm in shr_reprosum_int worked well enough that there was no need to go to the expense of producing the error estimate in the default mode of running. The error estimate is made by redoing the sum using mpi_allreduce (so not reproducible, but an independent estimate) and comparing that with the results from shr_reprosum_int.

The thing that makes shr_reprosum_int susceptible to error is that the reproducibility is obtained by converting the floating point values to integer representations, and doing the mpi_allreduce on the integers. Then you convert back to floating point. I'm not familiar with the algorithmic details, but depending on the properties of the floats I imagine there could be pathologies when converting to integers. My guess as to why this scheme works well for this application is that the sums involve values that are all positive, e.g., energy of a column or the mass of a constituent in a column.
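
For readers unfamiliar with the idea being described, here is a toy illustration of a fixed-point reproducible sum. This is not the shr_reprosum_int algorithm; it only shows why summing scaled integers makes the result independent of the MPI reduction order. The scale factor, kinds, and routine name are arbitrary assumptions.

subroutine toy_fixed_point_sum(local_val, global_sum, mpicom)
   ! Toy illustration only: convert to a scaled integer, sum the integers
   ! (integer addition is exact and order independent), then convert back.
   ! Assumes every summand fits within the chosen fixed-point range.
   use mpi
   implicit none
   integer, parameter :: r8 = selected_real_kind(12)
   integer, parameter :: i8 = selected_int_kind(18)
   real(r8), intent(in)  :: local_val   ! this task's contribution
   real(r8), intent(out) :: global_sum  ! order-independent global sum
   integer,  intent(in)  :: mpicom
   real(r8), parameter   :: scale = 2.0_r8**40
   integer(i8) :: ival, isum
   integer     :: ierr

   ival = int(local_val * scale, kind=i8)
   call MPI_Allreduce(ival, isum, 1, MPI_INTEGER8, MPI_SUM, mpicom, ierr)
   global_sum = real(isum, r8) / scale
end subroutine toy_fixed_point_sum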

Collaborator Author

The tests I have been running are telling me that shr_reprosum is pretty good if given positive-definite fields, even with a fairly high dynamic range.
However, it is easy to break if you give it inputs that gyrate around zero and end up with a small result (compared to the size of the gyrations). In those situations, it computes incorrect answers and is not even reproducible.
Since we are currently using gmean on non-negative fields and are not checking whether or not shr_reprosum is working, I propose the following steps to resolve this issue:

  • Replace the unused safety net (the backup algorithm, which is never reached because we do not check the error estimate) with a call to endrun
  • Always use shr_reprosum (i.e., remove the new cam_repro_sum namelist variable I introduced).
  • Turn on the shr_reprosum check for CAM regression tests. This is to prevent a future use of gmean on oscillating fields.

Thoughts? @brian-eaton? @fvitt?

Collaborator

Steve, what value did you use for reprosum_diffmax? And what were the means computed by the reprosum calc and allreduce code for, say, the worst case?

Francis, I wouldn't expect there to be any significant error for non-negative fields. But if you want to check, you could add the optional arg rel_diff to the shr_reprosum_calc calls. To get the rel_diff values you also need to set the reprosum_diffmax namelist variable to some positive value.
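
A rough sketch of what such a check might look like in gmean. The shr_reprosum_calc argument list below is schematic, and the shape of rel_diff, the tolerance, and the message text are assumptions to be verified against shr_reprosum_mod:

   real(r8) :: rel_diff(2, nflds)            ! assumed: (1,:) relative, (2,:) absolute diffs
   real(r8), parameter :: tol = 1.0e-14_r8   ! tolerance value is an assumption

   ! Note: per the comment above, reprosum_diffmax must also be set to a
   ! positive value for rel_diff to be filled in.
   call shr_reprosum_calc(arr_local, arr_gsum, ncols, ncols_max, nflds, &
        commid=mpicom, rel_diff=rel_diff)
   if (any(rel_diff(1, :) > tol)) then
      write(iulog, *) 'gmean: reprosum relative difference exceeds ', tol
      call endrun('gmean: reproducible sum may be inaccurate')
   end if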

Collaborator Author

First, some of the baseline diffs were as high as 5.E-01, although for not-so-small absolute values it still reached 1.7E-01.
I used reprosum_diffmax=1.0e-14

I do not see either the value computed or the backup value computed by par_xsum. The output looks something like:

 par_xsum: difference in fixed and floating point sums  exceeds tolerance in  71  fields.
   Maximum relative diff: (rel)   0.9694656488549618  (abs)    2.8876456781290472E-11
   Maximum absolute diff: (rel)   6.9519501266768202E-11  (abs)    6.9519501266768202E-11

par_xsum does not use allreduce; it has some other reproducible algorithm.

Collaborator

I'm not quite understanding the par_xsum output. I think to pursue those errors you'd need to set up par_xsum the same way we discussed setting up gmean so that when the error tolerance is exceeded we print the means from both estimates and call endrun.

Did you see any failures related to gmean calls? 1.e-14 is a pretty tight tolerance.

Collaborator Author

That output is from the shr_reprosum_calc calls made by par_xsum. The (abs) numbers are just abs(fast - allreduce) while the (rel) numbers are abs(fast - allreduce) / abs(allreduce) (unless abs(allreduce) is too small in which case (rel) is just (abs)).

  • "Maximum relative diff" is the case (out of the 71 listed above) which had the largest (rel) value.
  • "Maximum absolute diff" is the case (out of the 71 listed above) which had the largest (abs) value.

To get an idea of the (absolute) value of the allreduce method, I would just look at (abs) / (rel), which gives 2.9786E-11 for the "Maximum relative diff" case (the absolute diff case did not compute a relative value since the absolute value was too small: abs(allreduce) <= abs(fast - allreduce)).

I did not see any gmean errors which is expected because they are all summing non-negative values.

Collaborator

It's hard to interpret relative errors when the values are so close to zero. To be sure I'd still want to see the means computed by both methods. But I think there's no need to pursue this. When Pat Worley updated par_xsum to use the shr_reprosum_mod code he did plenty of testing. Just for the record, here is part of the relevant ChangeLog history:


Tag name: cam3_6_37
Originator(s): pworley
Date: Mon Apr 27 2009
One-line Summary: more accurate reproducible sum; additional OpenMP
parallelism.

Purpose of changes:

  1. Modify repro_sum fixed-point algorithm to improve accuracy.
    In certain situations, numerical cancellation can degrade accuracy
    of original repro_sum algorithm as compared to floating-point-based
    algorithms. New algorithm is nearly exact, employing variable
    precision as needed. This results in roundoff level differences.

The ChangeLog file contains a lot more details about the introduction of and improvements to the scalable reproducible sums code, going back to July 2008.

end if
! Normalize
do fld_ind = 1, nflds
arr_gmean(fld_ind) = arr_gmean(fld_ind) / (4.0 * pi)
Collaborator

Any reason to not multiply by inv4pi rather than divide by (4.0*pi)?

Collaborator Author

This is hopefully temporary. This code is BFB with the baselines; multiplying by inv4pi was not.
In order to get this code in, we decided to go with the BFB version and introduce the faster, scalable version in the future.
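
A tiny standalone illustration of why the two forms need not be BFB (the constants here are arbitrary; the point is only that the two expressions perform different sequences of rounded operations):

program bfb_demo
   implicit none
   integer, parameter :: r8 = selected_real_kind(12)
   real(r8), parameter :: pi = 3.14159265358979323846_r8
   real(r8), parameter :: inv4pi = 1.0_r8 / (4.0_r8 * pi)
   real(r8) :: x

   x = 1.234567890123456_r8
   ! x / (4.0_r8*pi) and x * inv4pi round differently, so the difference
   ! printed below can be nonzero in the last bit for some values of x.
   print *, x / (4.0_r8 * pi), x * inv4pi, (x / (4.0_r8 * pi)) - (x * inv4pi)
end program bfb_demo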

Comment on lines 518 to 528
! Non reproducible algorithm
do fld_ind = 1, nflds
   ! Get weighted sum from physics
   call weighted_sum_p(arr(:, :, fld_ind), wsum(fld_ind))
end do
call MPI_allreduce(wsum, arr_gmean, nflds, MPI_REAL8, MPI_SUM, &
                   mpicom, ierr)
! Normalize
do fld_ind = 1, nflds
   arr_gmean(fld_ind) = arr_gmean(fld_ind) * inv4pi
end do
Collaborator

Is this reproducible if PE layout does not change?

Collaborator Author

It should be, although the MPI standard does not guarantee that the MPI_allreduce call will always return the same sum (the network could end up doing the sum in a different order if you end up on different nodes).
All my tests have at most roundoff differences with a PE change -- I did not try repeatedly running one layout to see if it ever varies.

Collaborator

The potential for simulation results that are not reproducible does not seem right.

Collaborator

After spending more time looking at both the gmean_mod.F90 in src/utils/ and the new one in weak_scaling/ I find that I don't understand what problem the new version is trying to solve. The issue comments indicate that a scalable version is the goal, but the current code already provides both scalable and non-scalable versions. CAM is currently set up to use the scalable version which is also reproducible.

If I'm missing what's going on and the new code is improving gmean_mod, then I think it should be implemented in shr_reprosum_mod.F90. The code there originally came from CAM, but was moved so that all components had access to scalable reproducible sums. I don't think we should be making improvements to those algorithms in a CAM specific location.

!========================================================================
!

subroutine test_gmean(max_diff)
Collaborator

Do you have a stand-alone unit test for gmean?

Collaborator Author

Not currently. I tested it by inserting calls in the init code.
We will have an easier-to-use unit test system in the new infrastructure.

end if
end function bits_to_string

!!XXgoldyXX: v remove once new algorithm is in place
Collaborator

Is the new algorithm in place? Is this stuff ready to remove?

Collaborator Author

No. See note above about getting this PR pushed out to allow MPAS-A to begin converting.
This will be done later (or maybe removed entirely if we decide to abandon the notion of BFB runs for high-resolution configurations).

integer :: max_nflds = -1 ! # fields for disp. decomp
type(column_redist_t) :: column_reorder

private :: test_gmean
Collaborator

How is this used if it is private?

Collaborator Author

With code insertion (see the note above about how it was used).
I left the code in so that we can use it when we have an easier-to-set-up unit test system.

Comment on lines 43 to 46
!!XXgoldyXX: v remove once new algorithm is in place
private :: gmean_float_repro
private :: gmean_fixed_repro
!!XXgoldyXX: ^ remove once new algorithm is in place
Collaborator

Ready to remove?

Collaborator Author

Nope.

fld_dyn(blk_ind(1), 2, k, ie) = fldv(icol, k, lchnk-begchunk+1)
end do
end do
deallocate(fld_dyn)
Collaborator

This section just looks wrong. You allocate, set a default value, fill the field, and then promptly deallocate. It looks like the deallocate should not be there. Not sure how this is working, because there is a subsequent reshape of fld_dyn right after this.

Collaborator

I agree. This deallocate is misplaced.

Collaborator

Which makes me think this code isn't being tested by our regression tests. I know people often use the interpolated output with SE rather than deal with native grid output, so interpolated output should be added to an SE test.

Collaborator Author

I have added tests for interpolated output (from an SE run) and also zonal mean output (from an FV run). I did this by adding a new file to existing tests (see ChangeLog).

Comment on lines 174 to 175
! Initialize the gmean module (must be after phys_grid_init)
call gmean_init(pcnst) !!XXgoldyXX: Is that a good value to use?
Collaborator

Right now, this is the only line that is different from the standard cam_comp.F90 routine. Instead of copying this routine in its entirety just to add this one line, couldn't we hide it with an #ifdef for weak_scaling? I believe this is one of the limited times when an #ifdef is valid. Alternatively, there could be an empty gmean_init routine for non-weak-scaling cases. Curious why you took the approach of copying the entire cam_comp.F90 to insert this one line.

Collaborator Author

There is no #ifdef for weak scaling, but an empty routine could be added to the non-weak-scaling gmean_mod.
As for reasons, I suppose at the time I did not know what the full impact would be.
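
A sketch of the no-op stub mentioned here, for the non-weak-scaling gmean_mod (the routine name and argument follow the call shown in the snippet above; the dummy-argument name is an assumption):

subroutine gmean_init(nflds)
   ! Stub so a single cam_comp.F90 can call gmean_init unconditionally;
   ! this version of gmean_mod needs no setup.
   integer, intent(in) :: nflds
end subroutine gmean_init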

fi

##setup CESM work directory
## If this is a Nag test, run the r8 and git tests
Collaborator

To confirm, the r8 and git test results will end up in the aux_cam_nag_XXXX.log file? Since we don't typically look in this file for test results, if they end up there we will need to augment the wiki page.

@fvitt fvitt left a comment

I see a bunch of XXgoldyXX comments in the code. Will these be cleaned up?


implicit none
private
save
Collaborator

Is this needed?

Collaborator Author

Nope (implied at module scope).

Collaborator

Steve, I thought the save attribute was only implied for module variables that have an initializer. No?

Collaborator Author

That used to be true, but no compiler ever actually treated module variables as non-saved, so the 2008 standard made it official: all data objects at module scope have the save attribute.
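
A small example of the point being made (the module and variable names here are hypothetical):

module save_demo_mod
   implicit none
   private
   save   ! redundant under Fortran 2008: module variables are implicitly
          ! saved whether or not they have an initializer

   integer         :: ncalls = 0  ! saved in any standard (has an initializer)
   integer, public :: max_nflds   ! also saved under F2008, even without one
end module save_demo_mod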

Comment on lines 85 to 95
!!XXgoldyXX: v try to remove?
public :: get_horiz_grid_dim_d
public :: dyn_grid_get_colndx ! get element block/column and MPI process indices
public :: get_dyn_grid_parm
!!XXgoldyXX: ^ try to remove?
public :: dyn_grid_get_elem_coords ! get coords of a specified block element

!!XXgoldyXX: v MUST BE DELETED
public :: get_horiz_grid_d
public :: get_dyn_grid_parm_real1d
!!XXgoldyXX: ^ MUST BE DELETED
Collaborator

Will these be removed?

Collaborator Author

Yeah, I could swear I did that, but obviously it did not make its way to the PR. I will clean up this file.

Collaborator Author

There is still a block of comments, but hopefully it is a bit clearer how these interfaces need to be treated down the road.

Collaborator

I see that there are still a few calls to get_horiz_grid_d in places where there is no conditional logic to restrict the calls to rectangular grids, i.e., in src/ionosphere/waccmx/ionosphere_interface.F90 and in src/physics/carma/models/meteor_impact/carma_model_mod.F90. Are these physics configurations not supported for the SE dycore?

Collaborator Author

The ionosphere code currently only runs on FV. The call to get_horiz_grid_d is removed in #264 which also enables SE support by moving the coupling to the physics grid.

On the other hand, CARMA is a hot mess. I do not know if anyone has tried it with SE but I would be surprised if it ever ran on an unstructured dycore. There has been talk (from ACOM folks) of reimplementing CARMA but I have not heard from them for a few months.

Collaborator

I'm aware of the CARMA resurrection efforts, which is why I wanted to keep it on our radar. The use of get_horiz_grid_d in the CARMA code is minimal, and it could easily be updated to just use info in the local chunks plus shr_reprosum_calc to compute the area of interest.
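
A rough sketch of that idea, assuming the usual phys_grid chunk accessors (get_ncols_p, get_area_all_p) and a schematic shr_reprosum_calc call; the names, argument order, and units are assumptions to be checked against the actual interfaces:

   real(r8), allocatable :: area(:, :)   ! (columns on this task, 1 field)
   real(r8) :: global_area(1)
   integer  :: lchnk, ncol, icol, ntot

   ntot = 0
   do lchnk = begchunk, endchunk
      ntot = ntot + get_ncols_p(lchnk)
   end do
   allocate(area(ntot, 1))
   icol = 0
   do lchnk = begchunk, endchunk
      ncol = get_ncols_p(lchnk)
      call get_area_all_p(lchnk, ncol, area(icol+1:icol+ncol, 1))
      icol = icol + ncol
   end do
   ! Reproducible, scalable global sum of the column areas owned by this task
   call shr_reprosum_calc(area, global_area, ntot, ntot, 1, commid=mpicom)
   deallocate(area)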

Comment on lines 315 to 317
!!XXgoldyXX: v debug only
character(len=256) :: errmsg
!!XXgoldyXX: ^ debug only
Collaborator

Should this be removed?

Collaborator Author

Yes, thanks for noticing.

@cacraigucar

@gold2718 Have you found the updated file(s) that you thought was(were) committed? It would be good to get them updated ASAP.

@nusbaume nusbaume left a comment

My first round of reviews. Hope they are helpful (and don't show too much of my own ignorance)!


!========================================================================

subroutine get_rlat_all_p(lcid, rlatdim, rlats)
Collaborator

It seems that most of the get_XXX_all_p subroutines here and below are missing any sort of phys_grid_initialized() check. I assume they should probably have one, just to be safe?

Collaborator Author

That check happens in the chunk_info_to_index_p call.

allocate(areaA(ngcols_d))
allocate(clat(ngcols_d),clon(ngcols_d))
call get_horiz_grid_d(ngcols_d, clat_d_out=clat, clon_d_out=clon, area_d_out=areaA)
call get_horiz_grid_int(ngcols_d, clat_d_out=clat, clon_d_out=clon, area_d_out=areaA)
Collaborator

Are the native mapping files still used? It seems like this functionality has been superseded by the ESMF remapping tools. Maybe this whole conditional can go away. @PeterHjortLauritzen what do you think?

owners = (igcol * 0) -1 ! Kill compiler warnings
col = -1 ! Kill compiler warnings
lbk = -1 ! Kill compiler warnings
call endrun('dyn_grid_get_colndx: not implemented for unstructured grids')
Collaborator

This routine is called by the sat_hist module. It looks like the old code supported the GLL grid, but not the FVM grid. Now the routine doesn't support either. Does this mean that it's no longer possible to have sathist output with SE?

@brian-eaton brian-eaton left a comment

A few more nits.

Remove weak-scaling-fix as an option
Move weak scale fix code to infrastructure directory
Add interpolated output and zonal mean processing to tested history options
Use num_global_phys_cols all the time in place of ngcols_p
Always use get_grid_dims in place of get_horiz_grid_dim_d
Always call gmean_init (Allows using only one cam_comp module)
Add checks for repro sum failures
Fix PLB test to use one build (same config), just test load balance option
cime_config/testdefs/testmods_dirs/cam/waccmx_weimer/user_nl_cpl
- Add shr_reprosum difference check
doc/ChangeLog
- Waxed poetic about modifying code
Collaborator

Is this poetry?

Collaborator Author

Everybody's a critic :-)

Comment on lines 323 to 325
<entry id="weak_scale_fix" valid_values="0,1" value="0">
Use weak scaling version of physics grid and d_p coupling: 0=off, 1=on.
</entry>
Collaborator

Is this configure option used? It seems that the src/infrastructure code is always used when the dycore is SE.

Collaborator Author

Oops, I removed it in the last round and forgot to take it out here. Thanks for noticing.

doc/ChangeLog Outdated
Comment on lines 17 to 18
- phys_global_max_fields: Maximum number of global fields to sum at one time
- cam_repro_sum: Flag to enable reproducible sum
Collaborator

These have been removed.

Collaborator Author

Thanks for that (and all your other comments -- even the nits :-)

@fvitt fvitt left a comment

weak_scale_fix needs to be removed from config_files/definition.xml

@nusbaume nusbaume left a comment

Looks good to me now! Just one grammar fix and a request for additional error checks.

CAM_CONFIG_OPTS = self._case1.get_value("CAM_CONFIG_OPTS")
self._case.set_value("CAM_CONFIG_OPTS",CAM_CONFIG_OPTS.replace(' -nadv_tt 5 -cppdefs -DTRACER_CHECK',''))
comments = "Setting phys_loadbalance to value that's set in the testmod."
comments = "Using phys_loadbalance to value that's set in the testmod."
Collaborator

I think grammatically this new comment should be:

"Using phys_loadbalance value that's set in the testmod."

Collaborator Author

No, I just want to express how much I 'value' the 'set' belonging to 'that' in the testmod :-)


!========================================================================

subroutine get_rlat_all_p(lcid, rlatdim, rlats)
Collaborator

I believe all of the get_XXX_all_p subroutines still need a phys_grid_initialized() check, as they do not call chunk_info_to_index_p.
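
A sketch of the requested guard for these accessors (the argument intents and the error-message text are assumptions):

subroutine get_rlat_all_p(lcid, rlatdim, rlats)
   integer,  intent(in)  :: lcid            ! local chunk id
   integer,  intent(in)  :: rlatdim         ! declared size of the output array
   real(r8), intent(out) :: rlats(rlatdim)  ! chunk latitudes (radians)

   if (.not. phys_grid_initialized()) then
      call endrun('get_rlat_all_p: phys_grid not initialized')
   end if
   ! ... existing body that fills rlats ...
end subroutine get_rlat_all_p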

Remove unused weak-scale-fix keyword from definition.xml
Make sure phys_grid is initialized in access functions in phys_grid
Fix comments in plb.py
Update ChangeLog with test results
@gold2718 gold2718 dismissed cacraigucar’s stale review January 1, 2021 22:15

Change request fulfilled, reviewer not available for re-review

@gold2718 gold2718 changed the title cam6_3_007?: Initial implementation of weak scaling physics infrastructure cam6_3_007: Initial implementation of weak scaling physics infrastructure Jan 1, 2021
@gold2718 gold2718 merged commit e3630ff into ESCOMP:cam_development Jan 1, 2021
@gold2718 gold2718 deleted the weak_scaling branch June 13, 2021 17:28