Commit c7b5c45

Add regression test setup, scripts, and HISTORY templates
1 parent 754e2eb commit c7b5c45

5 files changed: +1727 -0 lines changed

GEOSldas_App/ldas_setup

Lines changed: 26 additions & 0 deletions
@@ -3,8 +3,10 @@
 import sys
 import argparse
 import resource
+import os, shutil
 from setup_utils import *
 from ldas import *
+from pathlib import Path
 
 def parseCmdLine():
     """
@@ -176,3 +178,27 @@ if __name__=='__main__':
     print ("creating batch Run scripts")
     status = ldasObj.createBatchRun()
     assert (status)
+
+# --- Install regression driver into this experiment (copy from source tree) ---
+def _copy_regression_from_source(expdir: Path):
+    """
+    Copy util/postproc/regression from the GEOSldas source tree
+    into <EXPDIR>/regress.
+    """
+    src = Path(__file__).resolve().parents[2] / "src" / "Components" / "@GEOSldas_GridComp" / "GEOSldas_App" / "util" / "postproc" / "regression"
+    dst = expdir / "regress"
+    if not src.is_dir():
+        print(f"WARNING: regression source not found: {src}")
+        return
+
+    for root, dirs, files in os.walk(src):
+        rel = Path(root).relative_to(src)
+        (dst / rel).mkdir(parents=True, exist_ok=True)
+        for f in files:
+            srcf = Path(root) / f
+            dstf = dst / rel / f
+            shutil.copy2(srcf, dstf)
+            if dstf.suffix == ".sh":
+                mode = os.stat(dstf).st_mode
+                os.chmod(dstf, mode | 0o111)
+
Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
GEOSldas Global Regression: Model Start/Stop & Layout (6-hour tavg profile)

Overview

This regression is run after you have already built and executed a working GEOSldas experiment.

You must have:

- A complete experiment directory containing:
  run/, input/, build/, and output/<DOMAIN>/

- Valid restart files under output/<DOMAIN>/rs/ens0000/
  (e.g., CURRENT.catch_internal_rst.*, CURRENT.landice_internal_rst.*)

- A run/LDAS.rc that defines your grid type (CF or EASE)

The regression does not modify your experiment.
It makes a self-contained sandbox copy, runs start/stop tests, and compares results.

This regression runs GEOSldas in an isolated sandbox cloned from your experiment,
forces a 6-hour time-averaged HISTORY profile (small & fast), and verifies that:

- Restarts are identical for a 24 h run vs 12 h + 12 h split.

- HISTORY (6-hour centers) is identical for the same 24 h window.

It is grid-agnostic:

- CF (cubed-sphere): tavg24_2d_*_Nx

- EASE (1-D grids): tavg24_1d_*_Nt

Both are normalized to 6-hour frequency: 060000 with ref_time: 000000.

Your real experiment is not modified.
Everything runs in regress/sandbox/<EXPID> (comment out the cleanup line to keep it).

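The prerequisites listed in the overview can be checked before the driver is launched. The following is a minimal sketch and not part of start_stop_model.sh: the function name `missing_prerequisites` and the `*_rst*` glob pattern are assumptions made for illustration, chosen to match the example restart names given above.

```python
from pathlib import Path


def missing_prerequisites(expdir, domain):
    """Return the required paths that are absent from an experiment directory."""
    expdir = Path(expdir)
    missing = []
    # Top-level directories named in the overview.
    for name in ("run", "input", "build"):
        if not (expdir / name).is_dir():
            missing.append(name)
    # The rc file that defines the grid type (CF or EASE).
    if not (expdir / "run" / "LDAS.rc").is_file():
        missing.append("run/LDAS.rc")
    # At least one restart file under output/<DOMAIN>/rs/ens0000/.
    # The "*_rst*" pattern is an assumption based on the example names.
    rs_dir = expdir / "output" / domain / "rs" / "ens0000"
    if not rs_dir.is_dir() or not any(rs_dir.glob("*_rst*")):
        missing.append(f"output/{domain}/rs/ens0000/*_rst*")
    return missing
```

An empty list means the experiment directory has everything the overview asks for; otherwise the returned entries name what still needs to be produced by a normal GEOSldas run.
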
Regression package layout

util/postproc/regression/
├─ start_stop_model.sh      # regression driver
├─ templates/
│  ├─ HISTORY_2d.rc         # CF (2d/Nx) 6-hour tavg only
│  └─ HISTORY_1d.rc         # EASE (1d/Nt) 6-hour tavg only
└─ README.md                # this file

When a regression run starts, this structure appears under your experiment:

<EXPID>/
├─ run/                     # original job files (unchanged)
├─ input/                   # restart, tile, forcing, etc.
├─ build/                   # model binaries
├─ output/<DOMAIN>/         # real experiment outputs
│  ├─ rs/ens0000/           # restarts (catch, land-ice)
│  ├─ cat/ens0000/          # HISTORY (tavg24_*.nc4)
│  └─ rc_out/               # category files
└─ regress/
   ├─ logs/                 # regression stdout/stderr with timestamps
   ├─ sets/                 # collected results per segment:
   │  ├─ T1_*               # 24 h run
   │  ├─ T2_*               # 12 h first half
   │  └─ T3_*               # 12 h second half
   └─ sandbox/<EXPID>/      # isolated copy used for the run
      ├─ run/               # patched job/rc files
      ├─ build/             # symlink to ../build
      ├─ output/<DOMAIN>/   # new outputs written here
      └─ scratch/           # Slurm log/stdout/err for sandbox runs

To inspect the sandbox after a run, comment out the final cleanup line
in start_stop_model.sh.
By default, the sandbox is deleted after a PASS.

Quick start

Run your experiment once so that restart files and outputs exist.
The regression uses these restarts as inputs.

Run the regression driver:

    cd util/postproc/regression
    ./start_stop_model.sh

Run with layout test

To check layout invariance (different 1-D axis decomposition):

    RUN_LAYOUT=1 ALT_1D=120 ./start_stop_model.sh

where ALT_1D can be 84, 120, 126, etc., depending on grid resolution.

What the regression does

1. Creates regress/sandbox/<EXPID> and copies your run directory.

2. Detects grid type (CF or EASE) and applies the correct 6-hour HISTORY template.

3. Adjusts environment variables:

       DO_HISTORY=TRUE
       DO_HIST=TRUE
       POSTPROC_HIST=0

4. Runs:

   - T1 – single 24-hour job
   - T2 – 12-hour run to mid-time
   - T3 – 12-hour run to final time

5. Compares:

   - RESTARTS: T1 (24 h) vs T3 (12 h + 12 h)
   - HISTORY: T1 vs [T2 ∪ T3] at 03/09/15/21 Z centers

Environment variables

Variable              Description                                       Default
EXPDIR                Experiment root (run/, input/, build/, output/)   auto-detected
EXPDOMAIN             Domain under output/                              auto-detected
SUBMIT                Batch command (Slurm only)                        sbatch
ALT_1D                Alternate 1-D task count for layout test          required if RUN_LAYOUT=1
NCCMP_FLAGS_TOL       Tolerant compare flags                            -dmfgqMNS -t 1e-12 -T 1e-6
HIST_STEP_SEC         Step for HISTORY collect                          21600 (6 h)
HIST_STEP_OFFSET_SEC  Center offset (+3 h)                              10800

Example:

    export EXPDIR=/discover/nobackup/borescan/par/global_regress_test/CURRENT
    export EXPDOMAIN=CF0090x6C_GLOBAL
    RUN_LAYOUT=0 ./start_stop_model.sh

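The HIST_STEP_SEC and HIST_STEP_OFFSET_SEC defaults determine which HISTORY centers each segment should produce. The sketch below works through that arithmetic. It is an illustration only, not code from start_stop_model.sh: the function `history_centers` and the start date are hypothetical, and it assumes each averaging interval must fit entirely inside the segment.

```python
from datetime import datetime, timedelta

HIST_STEP_SEC = 21600         # default from the environment variables table (6 h)
HIST_STEP_OFFSET_SEC = 10800  # default from the environment variables table (+3 h)


def history_centers(start, hours, step_sec=HIST_STEP_SEC,
                    offset_sec=HIST_STEP_OFFSET_SEC):
    """Return the centers of the averaging intervals that fit in the segment."""
    end = start + timedelta(hours=hours)
    step = timedelta(seconds=step_sec)
    centers = []
    t = start
    while t + step <= end:
        centers.append(t + timedelta(seconds=offset_sec))
        t += step
    return centers


start = datetime(2000, 1, 1, 0)                        # hypothetical start date
t1 = history_centers(start, 24)                        # single 24 h job
t2 = history_centers(start, 12)                        # first 12 h segment
t3 = history_centers(start + timedelta(hours=12), 12)  # second 12 h segment

print([c.strftime("%HZ") for c in t1])  # ['03Z', '09Z', '15Z', '21Z']
assert t1 == t2 + t3                    # the split run covers the same centers
```

With the defaults, T2 contributes the 03Z and 09Z centers and T3 the 15Z and 21Z centers, which together match the four centers of T1.
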
Comparison logic

Restarts are first compared strictly with nccmp -dmfgqMNS.
If the strict comparison fails, the script falls back to a tolerant comparison using NCCMP_FLAGS_TOL.

The HISTORY comparison covers all 6-hour stamps in the same 24-hour window.

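The strict-then-tolerant order can be sketched as follows. This is a hedged illustration rather than the logic in start_stop_model.sh: the function `compare_files`, its return strings, and the injectable `run` argument are hypothetical, and it assumes that nccmp exits with status 0 when the files match. The flags are the ones quoted in this README.

```python
import subprocess

STRICT_FLAGS = ["-dmfgqMNS"]
# Documented default of NCCMP_FLAGS_TOL.
TOLERANT_FLAGS = ["-dmfgqMNS", "-t", "1e-12", "-T", "1e-6"]


def compare_files(file_a, file_b, run=subprocess.run):
    """Classify two NetCDF files as identical, within tolerance, or different."""
    # Strict pass: any difference gives a non-zero exit status (assumption).
    strict = run(["nccmp", *STRICT_FLAGS, file_a, file_b])
    if strict.returncode == 0:
        return "identical"
    # Tolerant pass: only reached when the strict pass fails.
    tolerant = run(["nccmp", *TOLERANT_FLAGS, file_a, file_b])
    if tolerant.returncode == 0:
        return "within tolerance"
    return "different"
```

Passing the runner as an argument keeps the sketch testable on a machine without nccmp; in normal use the default `subprocess.run` would be left in place.
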
Notes

The 6-hour profile is used for both CF (2d/Nx) and EASE (1d/Nt).
It reduces runtime and I/O while staying bit-for-bit safe for segmented runs.

For EASE daily tavg24 tests, use day-aligned 24 h jobs only.
Do not test sub-day segments in one job with daily tavg.

If restart diffs appear only in diagnostic counters, enable:

    MAPL_R8_BFB=1
    MAPL_BFB_REDUCTIONS=1

or restrict comparison to prognostic variables.

Maintenance

Templates (templates/HISTORY_1d.rc, templates/HISTORY_2d.rc) are version-controlled.
If land-ice is disabled, the glc stream is ignored automatically by GEOSldas.
