CAM debugging techniques

This page contains techniques that are useful when the model is crashing. Note that it does not cover parallel debuggers, since their availability is often limited.

Debugging Techniques

First Steps
CAM Snapshot
Stopping the model
cam_pio_dump_field
pbuf_dump_pbuf

First Steps

1. Build and run the model with compiler debug flags enabled:

If you are running into either build or run-time errors that you don't understand, then first try building and running the model with compiler debug flags enabled. This can be done by setting DEBUG to TRUE in env_build.xml in your case directory, and then re-building and re-running the model.
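The steps above can be sketched as a small shell helper. This is a minimal sketch, assuming a standard CIME case directory that contains the `xmlchange`, `case.build`, and `case.submit` scripts; the function name `rebuild_with_debug` is hypothetical, and the clean step is included on the assumption that a full rebuild is wanted after changing DEBUG.

```shell
# Hypothetical helper: turn on DEBUG in an existing case, then rebuild and rerun.
# Assumes $1 is a CIME case directory containing xmlchange, case.build, case.submit.
rebuild_with_debug() (
  # The body runs in a subshell so the cd does not affect the calling shell.
  casedir=$1
  cd "$casedir" || exit 1
  ./xmlchange DEBUG=TRUE || exit 1      # sets DEBUG in env_build.xml
  ./case.build --clean-all || exit 1    # discard objects built without debug flags
  ./case.build || exit 1
  ./case.submit
)

# Example (path is a placeholder):
#   rebuild_with_debug /path/to/your/case
```

Editing env_build.xml by hand, as described above, has the same effect as the `xmlchange` call.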

The model should still fail, but the logs should be more informative and may even point you to the exact line of code that has an error. Note that if the model fails at run time, this "stack trace" to the problematic line will most likely be in the cesm.log.XXXX file, which is usually located in your case's run directory.
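One way to locate a stack trace in a long log is to search for text that compilers commonly print alongside it. The helper below is only a sketch: the function name `find_trace_lines` is hypothetical, and the search patterns are assumptions that depend on the compiler (for example, `forrtl` is printed by the Intel runtime and `Backtrace` by gfortran), so adjust them for your setup.

```shell
# Hypothetical helper: print numbered log lines that often accompany a
# Fortran stack trace. The patterns are assumptions and vary by compiler.
find_trace_lines() {
  logfile=$1
  # -n: show line numbers, -i: ignore case, -E: extended regular expression
  grep -n -i -E 'backtrace|traceback|forrtl|\.f90' "$logfile"
}

# Example (file name is a placeholder):
#   find_trace_lines /path/to/run/cesm.log.XXXX
```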

2. Try a different compiler, if possible:

If running with debug flags did not help you track down the issue, then, if possible, try running with a different Fortran compiler, ideally with DEBUG still set to TRUE. Often one compiler will report an error that another compiler simply ignores or tries to manage on its own. A different compiler might even generate a proper stack trace where the first compiler failed to do so.

The easiest way to change your compiler (when on a supported machine) is to create a new CAM case using create_newcase with the --compiler flag that specifies which compiler to use. If you aren't sure which compilers are available on your machine, then run the command:

query_config --machines

and search for your particular machine's name; its entry should include a "compilers" line that lists all available compiler options. This command is located in the same directory as create_newcase. Again, please note that these instructions will only work for machines where CAM, CESM, or CIME have been properly ported.
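Creating the new case with an explicit compiler might look like the sketch below. The function name `new_case_with_compiler` is hypothetical and every argument is a placeholder supplied by the user. The `--case`, `--compset`, and `--res` options are the usual required `create_newcase` arguments and are not specific to this page.

```shell
# Hypothetical helper: create a new case that uses a specific compiler.
# All five arguments are placeholders chosen by the user.
new_case_with_compiler() (
  # The body runs in a subshell so the cd does not affect the calling shell.
  scripts_dir=$1   # directory containing create_newcase and query_config
  case_dir=$2      # where the new case will be created
  compset=$3
  res=$4
  compiler=$5      # one of the names on the machine's "compilers" line
  cd "$scripts_dir" || exit 1
  ./create_newcase --case "$case_dir" --compset "$compset" --res "$res" --compiler "$compiler"
)

# Example (all values are placeholders):
#   new_case_with_compiler /path/to/cime/scripts /path/to/new_case <compset> <res> <compiler>
```

After the case is created, DEBUG can be set to TRUE in the new case directory as described in step 1.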

CAM Snapshot

cam_snapshot is a set of routines which will write out all of the fields in state, constituents, tend, ptend, cam_in, cam_out and pbuf along with a few fields that are just local to tphysac and tphysbc. The times that these fields are written out are controlled by the "cam_snapshot_before" and "cam_snapshot_after" types of variables. "cam_snapshot_before" variables are used to capture the model variables before a particular physics parameterization is called and "cam_snapshot_after" is used to capture variables after the parameterization. cam_snapshot is controlled by four namelist variables:

cam_snapshot_before_num - the output file number for the before snapshots (for example, setting to 6 will result in the values being written to the h5 file)
cam_snapshot_after_num - the output file number for the after snapshots (for example, setting to 7 will result in the values being written to the h6 file)
cam_take_snapshot_before - the name of the parameterization before which all fields will be output
cam_take_snapshot_after - the name of the parameterization after which all fields will be output

In addition, a user will almost always want this information written out on every time step, so the corresponding elements of nhtfrq should be set to 1 in user_nl_cam.

If the model is crashing, also set the corresponding elements of mfilt to 1 in user_nl_cam, so that each time sample is written to its own file.

If cam_take_snapshot_before and cam_take_snapshot_after are set to the same parameterization, then the changes made by that particular parameterization are isolated. If they are set to different parameterizations, then the values will be output before the parameterization specified by cam_take_snapshot_before is called and after the parameterization specified by cam_take_snapshot_after completes.
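Putting the settings above together, the following sketch appends a snapshot configuration to user_nl_cam for a single parameterization. The function name `add_snapshot_settings` is hypothetical, and the parameterization name must be supplied by the user. The first five entries of nhtfrq and mfilt are assumed placeholder values for the other history files; replace them with whatever your case already uses.

```shell
# Hypothetical helper: append cam_snapshot settings to a user_nl_cam file.
add_snapshot_settings() {
  nl_file=$1   # path to the case's user_nl_cam
  param=$2     # name of the parameterization to isolate (user supplied)
  cat >> "$nl_file" <<EOF
! Snapshot all fields before and after the same parameterization
cam_take_snapshot_before = '$param'
cam_take_snapshot_after  = '$param'
cam_snapshot_before_num  = 6
cam_snapshot_after_num   = 7
! Files 6 and 7: write every time step, one time sample per file.
! Entries 1-5 are assumed placeholders; keep your case's own values.
nhtfrq = 0,-24,-24,-24,-24,1,1
mfilt  = 1,30,30,30,30,1,1
EOF
}

# Example (arguments are placeholders):
#   add_snapshot_settings /path/to/your/case/user_nl_cam <parameterization>
```

With these file numbers, the "before" values go to the h5 file and the "after" values go to the h6 file, matching the examples in the variable descriptions above.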

Stopping the model

`cam_pio_dump_field`

`cam_pio_dump_field` is a routine which immediately writes a NetCDF file with information from a field. For example:

call cam_pio_dump_field('CLD', 1, pcols, 1, pver, cld)

will write the field, `cld`, to a file called `CLD_dump_<##>.nc` where `<##>` is a number starting at one and increasing each time this call is repeated. The file simply contains the contents as a 3-dimensional array where the first two dimensions are given by the bounds (`1:pcols` and `1:pver`) and the third dimension is the MPI task number (`1:npes`).

`cam_pio_dump_field` can also handle 3, 4, and 6-dimensional fields; just call the routine with the appropriate number of bounds for the field.

Note that by default, `cam_pio_dump_field` collects the bounds from all MPI tasks and uses the largest range for the NetCDF file. To skip this step, set the optional variable, `compute_maxdim_in`, to `.false.`.
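Once dump files exist, their dimensions can be checked with the standard NetCDF `ncdump` utility. The sketch below prints the header of every dump file for one field. The function name `inspect_dumps` is hypothetical, and the directory argument is an assumption, since this page does not say where the files are written.

```shell
# Hypothetical helper: print the NetCDF header of each dump file for a field.
inspect_dumps() {
  dump_dir=$1   # directory holding the dump files (assumed; often the run directory)
  field=$2      # name passed as the first argument of the call, e.g. CLD
  for f in "$dump_dir"/"$field"_dump_*.nc; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
    echo "== $f"
    ncdump -h "$f"            # -h prints the header (dimensions, variables) only
  done
}

# Example (directory is a placeholder):
#   inspect_dumps /path/to/run CLD
```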

`pbuf_dump_pbuf`

`pbuf_dump_pbuf` is similar to `cam_pio_dump_field` in that it immediately writes NetCDF files. The main difference is that it cannot be called from a threaded region and requires access to the full `pbuf` (aka the `pbuf2d` variable). The call is:

pbuf_dump_pbuf(pbuf2d, name, num)

where `pbuf2d` is the full pbuf, `name` is an optional name to be added to each filename, and `num` is an optional integer to be added to each filename.

`pbuf_dump_pbuf` then writes a NetCDF file for each field in the pbuf for this run. The file format is the same as for `cam_pio_dump_field` (see above).
