@gmao-qliu
Contributor

@gmao-qliu gmao-qliu commented Apr 7, 2025

Add Python scripts for computing ObsFcstAna statistics.

Processing is broken down into three steps:

  1. Create intermediate monthly stats data (nc4).
  2. Compute final stats from intermediate monthly data.
  3. Aggregate stats across different species and/or remap stats data.

Steps 1) and 2) are generic for all ObsFcstAna output. Stats are computed in the tile space of the DA experiment and for all obs species (separately) in the associated obs_param file.

Step 3) depends on the assimilation experiment and use case scenario. E.g., SMAP L4_SM stats are defined in the EASEv2 M09 tile space but are usually aggregated across H/V-pol and AM/PM ops and visualized on the EASEv2 M36 grid.
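As a rough illustration of the species aggregation in Step 3), one option is to pool the intermediate sums across the species columns of interest (e.g., H/V-pol and AM/PM). This is only a sketch under stated assumptions: the function name `aggregate_species` and the dict-of-arrays layout are hypothetical and are not taken from the scripts in this PR, and the remapping to another grid is not shown.

```python
import numpy as np

def aggregate_species(sums, species_idx):
    """Pool the sums of selected species into one combined column.

    `sums` maps numeric variable names (e.g. "N_data", "obs_fcst_sum") to
    arrays of shape (tile, species); `species_idx` lists the species
    columns to combine. Both inputs are hypothetical, for illustration.
    """
    # Sums and counts are additive, so pooling is a plain sum over columns.
    return {name: arr[:, species_idx].sum(axis=1) for name, arr in sums.items()}
```

Because counts and sums are additive, the pooled values can be fed into the same mean and variance formulas as the per-species values. Non-numeric variables such as `obs_param_assim` would have to be handled separately.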

The main script serves as a template to show how to use the pre-processed data to compute and plot the usual O-F and O-A statistical metrics.

The intermediate monthly stats nc4 files contain:

dimensions:

tile    = [NUMBER_OF_TILES] ;
species = [NUMBER_OF_SPECIES] ;

variables:

char obs_param_assim(species) ;      

int N_data(tile, species) ;                ! number of obs

float obsxfcst_sum(tile, species) ;        ! sums of products of variables
float obsxana_sum(tile, species) ;
float fcstxana_sum(tile, species) ;

float obs_obs_sum(tile, species) ;         ! sums of variables
float obs_obsvar_sum(tile, species) ;
float obs_fcst_sum(tile, species) ;
float obs_fcstvar_sum(tile, species) ;
float obs_ana_sum(tile, species) ;
float obs_anavar_sum(tile, species) ;

float obs_obs2_sum(tile, species) ;        ! sums of squared variables
float obs_obsvar2_sum(tile, species) ;
float obs_fcst2_sum(tile, species) ;
float obs_fcstvar2_sum(tile, species) ;
float obs_ana2_sum(tile, species) ;
float obs_anavar2_sum(tile, species) ;

To facilitate the aggregation of the intermediate stats in Step 2), zeros are not replaced with NaNs in the "sum" variables even when the corresponding N_data value is zero.
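A minimal sketch of how the aggregation in Step 2) could use these variables, assuming the monthly sums have already been read into NumPy arrays of shape (tile, species). The function name `final_omf_stats` and the list-of-dicts layout are hypothetical, and the sketch assumes that `obs_obs_sum`, `obs_fcst_sum`, `obs_obs2_sum`, `obs_fcst2_sum`, and `obsxfcst_sum` hold the sums of obs, fcst, obs squared, fcst squared, and obs times fcst, respectively, as the comments in the listing suggest.

```python
import numpy as np

def final_omf_stats(monthly):
    """Combine monthly sums and return the O-F mean and standard deviation.

    `monthly` is a list of dicts (hypothetical layout), one per month, with
    arrays of shape (tile, species) named after the nc4 variables.
    """
    names = ["N_data", "obs_obs_sum", "obs_fcst_sum",
             "obs_obs2_sum", "obs_fcst2_sum", "obsxfcst_sum"]
    # Empty cells hold zeros (not NaNs), so months can simply be added.
    tot = {k: np.sum([m[k] for m in monthly], axis=0).astype(float) for k in names}

    n = tot["N_data"]
    n_safe = np.where(n > 0, n, np.nan)  # NaN only where there are no obs at all

    omf_mean = (tot["obs_obs_sum"] - tot["obs_fcst_sum"]) / n_safe
    # sum((o - f)^2) = sum(o^2) - 2 * sum(o * f) + sum(f^2)
    omf_sq = tot["obs_obs2_sum"] - 2.0 * tot["obsxfcst_sum"] + tot["obs_fcst2_sum"]
    omf_var = omf_sq / n_safe - omf_mean**2  # population variance
    omf_std = np.sqrt(np.maximum(omf_var, 0.0))  # clip tiny negative round-off
    return omf_mean, omf_std
```

The NaNs appear only in the final division, which is why keeping zeros in the monthly files makes the accumulation step trivial.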

Related issue: GEOS-ESM/GEOSldas#798 (closed)

@gmao-qliu gmao-qliu added 0-diff trivial very, very obvious 0-diff change post-processing labels Apr 7, 2025
@gmao-rreichle gmao-rreichle changed the title from "Feature/qliu/add postproc scripts" to "add python scripts for processing ObsFcstAna output into monthly (intermediate) stats data" Apr 7, 2025
@gmao-rreichle gmao-rreichle changed the title from "add python scripts for processing ObsFcstAna output into monthly (intermediate) stats data" to "add python scripts for computing stats of ObsFcstAna output (via intermediate monthly files)" Apr 7, 2025
Collaborator

@gmao-rreichle gmao-rreichle left a comment

@gmao-qliu, I added a few thoughts. Please take a look and let me know if you have any questions or concerns. Thanks!

@gmao-rreichle
Collaborator

@gmao-qliu : I removed the binary files in the `__pycache__/` directories from the repository. I don't think these should be under version control.

@gmao-qliu
Contributor Author

Multi-experiment cross-masking is now enabled. @amfox37 @gmao-rreichle
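For readers unfamiliar with the term, the sketch below shows one plausible reading of cross-masking: only entries that are available in every experiment are kept, so that all experiments are compared on the same sample. This interpretation, the function name `cross_mask`, and the NaN-for-missing convention are assumptions for illustration and may differ from what the scripts in this PR actually do.

```python
import numpy as np

def cross_mask(obs_by_exp):
    """Keep only entries that are valid (finite) in every experiment.

    `obs_by_exp` is a list of same-shaped float arrays, one per experiment,
    with NaN marking missing observations (hypothetical layout).
    """
    stacked = np.stack(obs_by_exp)                 # (experiment, ...)
    common = np.all(np.isfinite(stacked), axis=0)  # True where all have data
    masked = [np.where(common, a, np.nan) for a in obs_by_exp]
    return masked, common
```

Applying the common mask before accumulating the sums would keep `N_data` identical across experiments, which makes differences in the O-F and O-A metrics easier to interpret.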

@amfox37
Contributor

amfox37 commented May 6, 2025

I have gone ahead and added the capability to calculate monthly OmF statistics (mean, etc.) from the precomputed sums and sums of squares. I'm not entirely sure it belongs in Main_example.py where I've inserted it; let's discuss.
Also, I have only run a very limited test case locally, so I will now do some more substantial testing on Discover.
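One way such a monthly statistic might be derived from the precomputed sums is sketched below: a pooled O-F mean per month and species, with each tile weighted by its number of obs. The function name `monthly_omf_mean_series`, the (month, tile, species) stacking, and the pooling over tiles are all assumptions for illustration, not a description of the code added to Main_example.py.

```python
import numpy as np

def monthly_omf_mean_series(n_data, obs_sum, fcst_sum):
    """Return one pooled O-F mean per month and species.

    Inputs have shape (month, tile, species), a hypothetical stacking of
    the monthly files. Summing over tiles weights each tile by its obs count.
    """
    n = n_data.sum(axis=1).astype(float)                   # (month, species)
    diff = (obs_sum - fcst_sum).sum(axis=1).astype(float)  # sum of O-F
    out = np.full_like(diff, np.nan)                       # NaN where no obs
    return np.divide(diff, n, out=out, where=n > 0)
```

Months without any obs come back as NaN instead of triggering a division-by-zero warning.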

Collaborator

@gmao-rreichle gmao-rreichle left a comment

@gmao-qliu : I added a few more commits:

  1. Here's a link that shows the cumulative changes (white-space differences are hidden): https://github.com/GEOS-ESM/GEOSldas_GridComp/compare/f02efb4..fe2b15a?w=1
    The most likely source of errors in my changes is in my addition of a check for obsparam across the exp_list: d1852a8
    Please take a look and let me know if anything needs fixing.
  2. What did we decide to do about the stats of the normalized O-F? If I recall correctly, as implemented the stats differ from what we did with the old Matlab scripts. Are we still thinking of trying to change this so it becomes identical to what we did in Matlab? If not, we need to add documentation in Plot_stats_maps.py.
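The kind of obsparam check across exp_list mentioned in item 1 could look roughly like the following. This is a sketch only: the function name `check_obsparam_consistent` and the representation of each experiment's obs_param as a list of species description strings are hypothetical and do not reproduce commit d1852a8.

```python
def check_obsparam_consistent(obsparam_by_exp):
    """Raise if experiments do not share the same ordered species names.

    `obsparam_by_exp` maps an experiment tag to its list of species
    description strings (hypothetical representation of obs_param).
    """
    items = list(obsparam_by_exp.items())
    ref_tag, ref = items[0]
    for tag, names in items[1:]:
        # Order matters because species are addressed by column index.
        if list(names) != list(ref):
            raise ValueError(f"obs_param mismatch between {ref_tag!r} and {tag!r}")
    return list(ref)
```

Failing early on a mismatch avoids silently combining species columns that mean different things in different experiments.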

@gmao-qliu
Contributor Author

> 2. What did we decide to do about the stats of the normalized O-F? If I recall correctly, as implemented the stats differ from what we did with the old Matlab scripts. Are we still thinking of trying to change this so it becomes identical to what we did in Matlab? If not, we need to add documentation in Plot_stats_maps.py.

I added some comments regarding the current approach. We can change the approach later if needed.

@gmao-rreichle
Collaborator

@gmao-qliu : I made a few more changes to clean up and clarify things. Here's a link to the cumulative differences of my commits of today: https://github.com/GEOS-ESM/GEOSldas_GridComp/compare/84c3b58..5d42392

This leaves the following comment as the only one that may still need resolution: #87 (comment) (re. the potentially excessive length of the names of the files that hold the monthly sums)

@gmao-qliu
Contributor Author

> @gmao-qliu : I made a few more changes to clean up and clarify things. Here's a link to the cumulative differences of my commits of today: https://github.com/GEOS-ESM/GEOSldas_GridComp/compare/84c3b58..5d42392
>
> This leaves the following comment as the only one that may still need resolution: #87 (comment) (re. the potentially excessive length of the names of the files that hold the monthly sums)

@gmao-rreichle, Please review my latest commit to address this issue.

@gmao-rreichle

This comment was marked as duplicate.

@gmao-qliu
Contributor Author

gmao-qliu commented Jun 11, 2025

> @gmao-qliu, I added one more commit with clean-up: 990bce2. Besides adding new and clarifying existing comments, I simplified the variables related to exptag, exptag_list, and outid. I think we only need one string for all of this, which I called "exptag". The others seem to be superfluous. Please double check and let me know, thanks!

@gmao-rreichle It all looks good to me, and my tests pass.

@gmao-rreichle gmao-rreichle marked this pull request as ready for review June 11, 2025 21:43
@gmao-rreichle gmao-rreichle requested a review from a team as a code owner June 11, 2025 21:43
@gmao-rreichle gmao-rreichle merged commit 0425274 into develop Jun 11, 2025
8 checks passed
@gmao-rreichle gmao-rreichle deleted the feature/qliu/add_postproc_scripts branch June 11, 2025 21:43