| 1 | +""" |
| 2 | + Study |
| 3 | + |
| 4 | +A structure for managing parameter studies with multiple peridynamic simulations. |
| 5 | + |
| 6 | +# Fields |
| 7 | +- `create_job::Function`: A function with signature `create_job(setup::NamedTuple)` that |
| 8 | + creates a [`Job`](@ref) from a setup configuration. |
| 9 | +- `setups::Vector{NamedTuple}`: A vector of setup configurations. Each setup must be a |
| 10 | + `NamedTuple` with the same field names. |
| 11 | +- `jobs::Vector{Job}`: A vector of jobs created from the setups. |
| 12 | +- `submission_status::Vector{Bool}`: Status vector indicating whether each job was |
| 13 | + submitted successfully (`true`) or encountered an error (`false`). |
| 14 | +- `postproc_status::Vector{Bool}`: Status vector indicating whether post-processing for |
| 15 | + each job was successful (`true`) or encountered an error (`false`). |
| 16 | +- `results::Vector{Vector{NamedTuple}}`: Storage for results from post-processing. Each |
| 17 | + element corresponds to a simulation and contains a vector of `NamedTuple`s returned by |
| 18 | + the processing function for each time step. |
| 19 | + |
| 20 | +# Example |
| 21 | +```julia |
| 22 | +function create_job(setup::NamedTuple) |
| 23 | + # Create body, solver, etc. using setup parameters |
| 24 | + body = ... |
| 25 | + solver = VelocityVerlet(steps=setup.n_steps) |
| 26 | + job = Job(body, solver; path=setup.path, freq=setup.freq) |
| 27 | + return job |
| 28 | +end |
| 29 | + |
| 30 | +setups = [ |
| 31 | + (; n_steps=1000, path="sim1", freq=10), |
| 32 | + (; n_steps=2000, path="sim2", freq=20), |
| 33 | +] |
| 34 | + |
| 35 | +study = Study(create_job, setups) |
| 36 | +``` |
| 37 | + |
| 38 | +See also: [`submit!`](@ref), [`postproc!`](@ref) |
| 39 | +""" |
| 40 | +struct Study |
| 41 | + create_job::Function |
| 42 | + setups::Vector{NamedTuple} |
| 43 | + jobs::Vector{Job} |
| 44 | + submission_status::Vector{Bool} |
| 45 | + postproc_status::Vector{Bool} |
| 46 | + results::Vector{Vector{NamedTuple}} |
| 47 | + |
| 48 | + function Study(create_job::Function, setups::Vector{<:NamedTuple}) |
| 49 | + check_setups(setups) |
| 50 | + n = length(setups) |
| 51 | + jobs = Vector{Job}(undef, n) |
| 52 | + submission_status = fill(false, n) |
| 53 | + postproc_status = fill(false, n) |
| 54 | + results = [NamedTuple[] for _ in 1:n] |
| 55 | + |
| 56 | + for (i, setup) in enumerate(setups) |
| 57 | + try |
| 58 | + job = create_job(setup) |
| 59 | + if !(job isa Job) |
| 60 | + msg = "create_job function must return a Job object!\n" |
| 61 | + msg *= "For setup $i, got $(typeof(job)) instead.\n" |
| 62 | + throw(ArgumentError(msg)) |
| 63 | + end |
| 64 | + jobs[i] = job |
| 65 | + catch err |
| 66 | + msg = "Error creating job for setup $i:\n" |
| 67 | + msg *= " Setup: $setup\n" |
| 68 | + msg *= " Error: $err\n" |
| 69 | + throw(ArgumentError(msg)) |
| 70 | + end |
| 71 | + end |
| 72 | + |
| 73 | + return new(create_job, setups, jobs, submission_status, postproc_status, results) |
| 74 | + end |
| 75 | +end |
| 76 | + |
| 77 | +function check_setups(setups::Vector{<:NamedTuple}) |
| 78 | + if isempty(setups) |
| 79 | + throw(ArgumentError("setups vector cannot be empty!\n")) |
| 80 | + end |
| 81 | + |
| 82 | + # Check that all setups have the same field names |
| 83 | + first_keys = keys(setups[1]) |
| 84 | + for (i, setup) in enumerate(setups[2:end]) |
| 85 | + if keys(setup) != first_keys |
| 86 | + msg = "All setups must have the same field names!\n" |
| 87 | + msg *= " First setup has fields: $first_keys\n" |
| 88 | + msg *= " Setup $(i+1) has fields: $(keys(setup))\n" |
| 89 | + throw(ArgumentError(msg)) |
| 90 | + end |
| 91 | + end |
| 92 | + |
| 93 | + return nothing |
| 94 | +end |
| 95 | + |
| 96 | +function Base.show(io::IO, @nospecialize(study::Study)) |
| 97 | + n_jobs = length(study.jobs) |
| 98 | + n_submitted = count(study.submission_status) |
| 99 | + n_postproc = count(study.postproc_status) |
| 100 | + print(io, "Study with $n_jobs simulations ($n_submitted submitted, $n_postproc post-processed)") |
| 101 | + return nothing |
| 102 | +end |
| 103 | + |
| 104 | +function Base.show(io::IO, ::MIME"text/plain", @nospecialize(study::Study)) |
| 105 | + if get(io, :compact, false) |
| 106 | + show(io, study) |
| 107 | + else |
| 108 | + n_jobs = length(study.jobs) |
| 109 | + n_submitted = count(study.submission_status) |
| 110 | + n_postproc = count(study.postproc_status) |
| 111 | + println(io, "Study:") |
| 112 | + println(io, " Number of simulations: $n_jobs") |
| 113 | + println(io, " Successfully submitted: $n_submitted") |
| 114 | + println(io, " Successfully post-processed: $n_postproc") |
| 115 | + if n_jobs > 0 |
| 116 | + println(io, " Setup parameters: $(keys(study.setups[1]))") |
| 117 | + end |
| 118 | + end |
| 119 | + return nothing |
| 120 | +end |
| 121 | + |
| 122 | +""" |
| 123 | + submit!(study::Study; kwargs...) |
| 124 | + |
| 125 | +Submit all jobs in the study for simulation. Jobs are executed **sequentially**, with each |
| 126 | +job running to completion before the next begins. Each individual job can utilize MPI or |
| 127 | +multithreading as configured through the [`submit`](@ref) function. |
| 128 | + |
| 129 | +Each job is run independently, and if one job encounters an error, the remaining jobs will |
| 130 | +still be executed. The submission status for each job is stored in `study.submission_status`. |
| 131 | + |
| 132 | +!!! note "Execution model" |
| 133 | + Jobs in a study run sequentially, not in parallel. This is by design because: |
| 134 | + - Each simulation typically uses all available computational resources (MPI ranks/threads) |
| 135 | + - Running multiple MPI jobs simultaneously from one process is problematic |
| 136 | + - For true parallel execution of parameter studies, submit separate jobs to your HPC scheduler |
| 137 | + |
| 138 | +# Arguments |
| 139 | +- `study::Study`: The study containing the jobs to submit. |
| 140 | + |
| 141 | +# Keywords |
| 142 | +- `quiet::Bool`: If `true`, suppress output for each individual job. (default: `false`) |
| 143 | + |
| 144 | +# Returns |
| 145 | +- `nothing` |
| 146 | + |
| 147 | +# Example |
| 148 | +```julia |
| 149 | +study = Study(create_job, setups) |
| 150 | +submit!(study) |
| 151 | + |
| 152 | +# Check which jobs completed successfully |
| 153 | +successful_jobs = findall(study.submission_status) |
| 154 | +``` |
| 155 | + |
| 156 | +See also: [`Study`](@ref), [`submit`](@ref) |
| 157 | +""" |
| 158 | +function submit!(study::Study; quiet::Bool=false) |
| 159 | + n_jobs = length(study.jobs) |
| 160 | + |
| 161 | + println("Starting parameter study with $n_jobs simulations...") |
| 162 | + |
| 163 | + for (i, job) in enumerate(study.jobs) |
| 164 | + println("\n" * "="^80) |
| 165 | + println("Simulation $i of $n_jobs") |
| 166 | + println("Setup: $(study.setups[i])") |
| 167 | + println("="^80) |
| 168 | + |
| 169 | + try |
| 170 | + submit(job; quiet=quiet) |
| 171 | + study.submission_status[i] = true |
| 172 | + println("✓ Simulation $i completed successfully") |
| 173 | + catch err |
| 174 | + study.submission_status[i] = false |
| 175 | + println("✗ Simulation $i encountered an error:") |
| 176 | + println(" Error type: $(typeof(err))") |
| 177 | + println(" Error message: $err") |
| 178 | + if err isa Exception |
| 179 | + println(" Stacktrace:") |
| 180 | + for (exc, bt) in Base.current_exceptions() |
| 181 | + showerror(stdout, exc, bt) |
| 182 | + println() |
| 183 | + end |
| 184 | + end |
| 185 | + println("Continuing with remaining simulations...") |
| 186 | + end |
| 187 | + end |
| 188 | + |
| 189 | + println("\n" * "="^80) |
| 190 | + n_successful = count(study.submission_status) |
| 191 | + println("Parameter study completed: $n_successful of $n_jobs simulations successful") |
| 192 | + println("="^80) |
| 193 | + |
| 194 | + return nothing |
| 195 | +end |
| 196 | + |
| 197 | +""" |
| 198 | + postproc!(proc_func::Function, study::Study; kwargs...) |
| 199 | + |
| 200 | +Apply a post-processing function to all successfully submitted jobs in the study. Results |
| 201 | +are stored in `study.results` as a vector of vectors of `NamedTuple`s. |
| 202 | + |
| 203 | +With the default `serial=false`, each individual simulation is post-processed in parallel |
| 204 | +(multithreaded or MPI). The parallelization happens across the time steps of one |
| 205 | +simulation, not across different simulations. |
| 206 | + |
| 207 | +# Arguments |
| 208 | +- `proc_func::Function`: A function with signature `proc_func(r0, r, id)` where: |
| 209 | + - `r0`: Reference results (from [`read_vtk`](@ref) of the initial export) |
| 210 | + - `r`: Current time step results (from [`read_vtk`](@ref)) |
| 211 | + - `id`: File ID / time step number |
| 212 | + The function should return either a `NamedTuple` with results or `nothing`. |
| 213 | +- `study::Study`: The study to post-process. |
| 214 | + |
| 215 | +# Keywords |
| 216 | +- `serial::Bool`: If `true`, process results serially on a single thread. (default: `false`) |
| 217 | + |
| 218 | +# Returns |
| 219 | +- `nothing`: Results are stored in `study.results`. For simulation `i`, access results via |
| 220 | + `study.results[i]`, which contains a vector of `NamedTuple`s (one per time step). |
| 221 | + |
| 222 | +# Example |
| 223 | +```julia |
| 224 | +function proc_func(r0, r, id) |
| 225 | + # Calculate some quantity |
| 226 | + max_displacement = maximum(r[:displacement]) |
| 227 | + return (; time_step=id, max_disp=max_displacement) |
| 228 | +end |
| 229 | +
|
| 230 | +study = Study(create_job, setups) |
| 231 | +submit!(study) |
| 232 | +postproc!(proc_func, study) |
| 233 | + |
| 234 | +# Access results for first simulation |
| 235 | +results_sim1 = study.results[1] |
| 236 | +``` |
| 237 | + |
| 238 | +See also: [`Study`](@ref), [`process_each_export`](@ref) |
| 239 | +""" |
| 240 | +function postproc!(proc_func::Function, study::Study; serial::Bool=false) |
| 241 | + check_process_function(proc_func) |
| 242 | + |
| 243 | + n_jobs = length(study.jobs) |
| 244 | + n_successful = count(study.submission_status) |
| 245 | + |
| 246 | + if n_successful == 0 |
| 247 | + @warn "No successfully submitted jobs to post-process!" |
| 248 | + return nothing |
| 249 | + end |
| 250 | + |
| 251 | + println("\nStarting post-processing for $n_successful successful simulations...") |
| 252 | + |
| 253 | + # Flags shared by all simulations: the first value returned by proc_func decides whether results are collected |
| 254 | + collect_results = Ref(false) |
| 255 | + first_result_type_checked = Ref(false) |
| 256 | + |
| 257 | + for (i, job) in enumerate(study.jobs) |
| 258 | + # Clear previous results for this simulation |
| 259 | + empty!(study.results[i]) |
| 260 | + |
| 261 | + # Skip jobs that weren't submitted successfully |
| 262 | + if !study.submission_status[i] |
| 263 | + study.postproc_status[i] = false |
| 264 | + continue |
| 265 | + end |
| 266 | + |
| 267 | + println("\nPost-processing simulation $i of $n_jobs...") |
| 268 | + |
| 269 | + try |
| 270 | + function wrapper_func(r0, r, id) |
| 271 | + result = proc_func(r0, r, id) |
| 272 | + |
| 273 | + # On first result, check if we should collect |
| 274 | + if !first_result_type_checked[] |
| 275 | + if result isa NamedTuple |
| 276 | + collect_results[] = true |
| 277 | + elseif !isnothing(result) |
| 278 | + @warn "proc_func returned $(typeof(result)) instead of NamedTuple or nothing. Results will not be collected." |
| 279 | + end |
| 280 | + first_result_type_checked[] = true |
| 281 | + end |
| 282 | + |
| 283 | + # Collect if appropriate |
| 284 | + if collect_results[] && result isa NamedTuple |
| 285 | + push!(study.results[i], result) |
| 286 | + end |
| 287 | + |
| 288 | + return nothing |
| 289 | + end |
| 290 | + |
| 291 | + process_each_export(wrapper_func, job; serial=serial) |
| 292 | + |
| 293 | + study.postproc_status[i] = true |
| 294 | + println("✓ Post-processing simulation $i completed successfully") |
| 295 | + |
| 296 | + catch err |
| 297 | + study.postproc_status[i] = false |
| 298 | + println("✗ Post-processing simulation $i encountered an error:") |
| 299 | + println(" Error type: $(typeof(err))") |
| 300 | + println(" Error message: $err") |
| 301 | + println("Continuing with remaining simulations...") |
| 302 | + end |
| 303 | + end |
| 304 | + |
| 305 | + println("\n" * "="^80) |
| 306 | + n_postproc_successful = count(study.postproc_status) |
| 307 | + println("Post-processing completed: $n_postproc_successful of $n_successful simulations successful") |
| 308 | + println("="^80) |
| 309 | + |
| 310 | + return nothing |
| 311 | +end |
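
The `Study` constructor expects a vector of `NamedTuple`s whose field names all match, and `check_setups` compares them with `keys(setup) != first_keys`. The following is a minimal sketch of one way to build such a vector from a parameter grid using only Base Julia. The helper `make_setup`, the parameter values, and the path naming scheme are hypothetical and not part of the diff above.

```julia
# Hypothetical parameter values; the field names mirror the docstring example.
n_steps_values = [1000, 2000]
freq_values = [10, 20]

# Hypothetical helper: builds one setup with a fixed field order.
make_setup(n_steps, freq) = (; n_steps, freq, path="sim_$(n_steps)_$(freq)")

# One NamedTuple per combination; vec flattens the 2x2 matrix into a Vector.
setups = vec([make_setup(n, f) for n in n_steps_values, f in freq_values])

# Same comparison as in check_setups: the keys of every setup must equal
# the keys of the first one.
same_keys = all(keys(s) == keys(first(setups)) for s in setups)

# Field order is part of that comparison, because keys returns an ordered
# tuple of symbols.
order_matters = keys((; a=1, b=2)) != keys((; b=2, a=1))
```

Building every setup through a single helper keeps the field order identical, which matters because two setups with the same names in a different order would be rejected by the key comparison.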
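
In `postproc!`, the wrapper pushes each result into `study.results[i]`. If `process_each_export` calls the wrapper from several threads when `serial=false` (an assumption based on the docstring, not checked against its implementation), concurrent `push!` calls on the same vector would need synchronization, and results could arrive out of time-step order. The sketch below shows one possible way to handle both with a lock and a final sort. `collect_threadsafe` and `toy_proc` are hypothetical names, and the threaded loop only stands in for the iteration over exported files.

```julia
# Hypothetical stand-in for a threaded loop over exported time steps.
function collect_threadsafe(proc_func, ids)
    results = NamedTuple[]
    lk = ReentrantLock()
    Threads.@threads for id in ids
        result = proc_func(id)
        if result isa NamedTuple
            # Only one thread at a time may modify the shared vector.
            lock(lk) do
                push!(results, result)
            end
        end
    end
    # Threads finish in arbitrary order, so restore time-step order.
    sort!(results; by=r -> r.time_step)
    return results
end

# Toy processing function: returns a NamedTuple for odd ids, nothing otherwise.
toy_proc(id) = isodd(id) ? (; time_step=id, value=id^2) : nothing

results = collect_threadsafe(toy_proc, 1:20)
```

The sort relies on each result carrying its own `time_step` field, as in the `proc_func` example of the docstring. Results without such a field could not be reordered this way.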