| 1 | +PLEIADES YAML config + workflow (draft) |
| 2 | +====================================== |
| 3 | + |
| 4 | +Purpose |
| 5 | +------- |
| 6 | +This note defines the draft YAML structure that instructs PLEIADES how to process data, |
| 7 | +configure SAMMY, and execute fitting routines. It reflects the desired directory layout |
| 8 | +for multi-fit workflows and serves as a working specification for end-to-end operation. |
| 9 | + |
| 10 | +Backbone + reproducibility intent |
| 11 | +--------------------------------- |
| 12 | +The YAML file is intended to be the backbone structure for running PLEIADES. It should: |
| 13 | +- Define the complete workspace layout for consistent file placement. |
| 14 | +- Declare datasets and fit routines in a single, structured source of truth. |
| 15 | +- Record each run as an append-only entry for analysis provenance and reproducibility. |
| 16 | +- Capture configuration inputs (fit options, nuclear parameters, data sources) alongside |
| 17 | +  execution details (backend, paths, outputs) to enable re-running or auditing results. |
| 18 | +This makes the config both an operational entry point and a durable record of analysis. |
| 19 | + |
| 20 | +Directory layout |
| 21 | +---------------- |
| 22 | +working_dir/ |
| 23 | +  endf_dir/ |
| 24 | +    isotope_dir_1/ |
| 25 | +      results_dir/ |
| 26 | +      dummy.inp |
| 27 | +      dummy.par |
| 28 | +    isotope_dir_2/ |
| 29 | +      ... |
| 30 | +  fitting_dir/ |
| 31 | +    <routine_id>/ |
| 32 | +      results_dir/ |
| 33 | +      input.inp |
| 34 | +      params.par |
| 35 | +  results_dir/ |
| 36 | +    run_results_*.json |
| 37 | +    results_map.json |
| 38 | +  data_dir/ |
| 39 | +    <routine_id>.dat |
| 40 | +  image_dir/ |
| 41 | +    ... |
| 42 | +  config.yaml |
| 43 | + |
| 44 | +Notes: |
| 45 | +- The SAMMY fit directory is named after the routine_id. |
| 46 | +- The data file for a run is keyed by routine_id: data_dir/<routine_id>.dat |
| 47 | +- endf_dir should map to PleiadesConfig.nuclear_data_cache_dir so NuclearDataManager |
| 48 | +  uses it for ENDF caching. |
| 49 | +- For the docker backend, ``sammy.docker.image_name`` should be digest-pinned |
| 50 | +  (``repo/image@sha256:...``) or at least use an explicit non-mutable version |
| 51 | +  tag (for example ``repo/image:1.2.3``); unpinned or mutable tags are rejected. |
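
The image-name rule in the last note can be sketched as a small check. This is only an illustration: ``classify_image_name`` is a hypothetical helper, not part of PLEIADES, and the pattern used to decide what counts as an explicit version tag (numeric ``MAJOR.MINOR[.PATCH]`` with an optional suffix) is an assumption about how the rule might be enforced.

```python
import re

# Assumption: a digest pin is any reference ending in "@sha256:<64 hex chars>".
_DIGEST_RE = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")
# Assumption: an "explicit version tag" looks like 1.2 or 1.2.3, optionally
# prefixed with "v" and followed by a suffix such as "-rc1".
_VERSION_TAG_RE = re.compile(r"^v?\d+\.\d+(\.\d+)?([-+.][0-9A-Za-z.]+)?$")


def classify_image_name(image_name: str) -> str:
    """Return "digest" or "version_tag"; raise ValueError for anything else."""
    if _DIGEST_RE.match(image_name):
        return "digest"
    # Look for the tag only in the last path component, so a registry port
    # (for example "localhost:5000/repo/image") is not mistaken for a tag.
    last_component = image_name.rsplit("/", 1)[-1]
    if ":" not in last_component:
        raise ValueError(f"unpinned image (no tag or digest): {image_name!r}")
    tag = last_component.rsplit(":", 1)[1]
    if not _VERSION_TAG_RE.match(tag):
        raise ValueError(f"mutable or non-version tag {tag!r} in {image_name!r}")
    return "version_tag"
```

Under these assumptions the example value ``kedokudo/sammy-docker:1.0.0`` is classified as a version tag, while ``repo/image:latest`` and a bare ``repo/image`` are rejected.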
| 52 | + |
| 53 | +Draft YAML schema (example) |
| 54 | +--------------------------- |
| 55 | +pleiades_version: 2 |
| 56 | + |
| 57 | +workspace: |
| 58 | +  root: /path/to/working_dir |
| 59 | +  endf_dir: ${workspace.root}/endf_dir |
| 60 | +  fitting_dir: ${workspace.root}/fitting_dir |
| 61 | +  results_dir: ${workspace.root}/results_dir |
| 62 | +  data_dir: ${workspace.root}/data_dir |
| 63 | +  image_dir: ${workspace.root}/image_dir |
| 64 | + |
| 65 | +nuclear: |
| 66 | +  sources: |
| 67 | +    DIRECT: https://www-nds.iaea.org/public/download-endf |
| 68 | +    API: https://www-nds.iaea.org/exfor/servlet |
| 69 | +  default_library: ENDF-B-VIII.0 |
| 70 | +  isotopes: |
| 71 | +    - isotope: "U-235" |
| 72 | +      abundance: 0.0072 |
| 73 | +      vary_abundance: 0 |
| 74 | +      endf_library: ENDF-B-VIII.0 |
| 75 | +    - isotope: "U-238" |
| 76 | +      abundance: 0.9928 |
| 77 | +      vary_abundance: 0 |
| 78 | + |
| 79 | +sammy: |
| 80 | +  backend: local # local | docker | nova |
| 81 | +  local: |
| 82 | +    sammy_executable: /path/to/sammy |
| 83 | +    shell_path: /bin/bash |
| 84 | +    env_vars: {} |
| 85 | +  docker: |
| 86 | +    # Use a pinned digest when possible; vetted version tags are an acceptable fallback. |
| 87 | +    image_name: kedokudo/sammy-docker:1.0.0 |
| 88 | +    container_working_dir: /sammy/work |
| 89 | +    container_data_dir: /sammy/data |
| 90 | +  nova: |
| 91 | +    url: ${NOVA_URL} |
| 92 | +    api_key: ${NOVA_API_KEY} |
| 93 | +    tool_id: neutrons_imaging_sammy |
| 94 | +    timeout: 3600 |
| 95 | + |
| 96 | +datasets: |
| 97 | +  example_dataset: |
| 98 | +    description: "Natural Si transmission" |
| 99 | +    data_kind: raw_imaging # raw_imaging | sammy_dat | sammy_twenty |
| 100 | +    raw: |
| 101 | +      facility: ornl |
| 102 | +      sample_folders: |
| 103 | +        - /path/to/sample/run_1 |
| 104 | +      ob_folders: |
| 105 | +        - /path/to/ob/run_1 |
| 106 | +      nexus_dir: /path/to/nexus |
| 107 | +      roi: |
| 108 | +        x1: 0 |
| 109 | +        y1: 0 |
| 110 | +        width: 512 |
| 111 | +        height: 512 |
| 112 | +      image_dir: ${workspace.image_dir} |
| 113 | +    processed: |
| 114 | +      transmission_files: [] |
| 115 | +      energy_units: eV |
| 116 | +      cross_section_units: barn |
| 117 | +      path_to_data_files: ${workspace.data_dir}/example_fit.dat |
| 118 | +    metadata: {} |
| 119 | + |
| 120 | +fit_routines: |
| 121 | +  example_fit: |
| 122 | +    dataset_id: example_dataset |
| 123 | +    mode: fitting # fitting | endf_extraction | multi_isotope |
| 124 | +    update_from_results: false |
| 125 | +    fit_config: |
| 126 | +      fit_title: "SAMMY Fit" |
| 127 | +      tolerance: null |
| 128 | +      max_iterations: 1 |
| 129 | +      i_correlation: 50 |
| 130 | +      max_cpu_time: null |
| 131 | +      max_wall_time: null |
| 132 | +      max_memory: null |
| 133 | +      max_disk: null |
| 134 | +      nuclear_params: {} # pleiades.nuclear.models.nuclearParameters |
| 135 | +      physics_params: {} # pleiades.experimental.models.PhysicsParameters |
| 136 | +      data_params: {} # pleiades.sammy.data.options.SammyData |
| 137 | +      options_and_routines: {} # pleiades.sammy.fitting.options.FitOptions |
| 138 | +runs: |
| 139 | +  - run_id: run_001 |
| 140 | +    routine_id: example_fit |
| 141 | +    dataset_id: example_dataset |
| 142 | +    created_at: "2026-01-14T12:00:00Z" |
| 143 | +    fit_dir: ${workspace.fitting_dir}/example_fit |
| 144 | +    results_dir: ${workspace.fitting_dir}/example_fit/results_dir |
| 145 | +    input_files: |
| 146 | +      inp: ${workspace.fitting_dir}/example_fit/input.inp |
| 147 | +      par: ${workspace.fitting_dir}/example_fit/params.par |
| 148 | +      data: ${workspace.data_dir}/example_fit.dat |
| 149 | +    output_files: |
| 150 | +      lpt: ${workspace.fitting_dir}/example_fit/results_dir/SAMMY.LPT |
| 151 | +      lst: ${workspace.fitting_dir}/example_fit/results_dir/SAMMY.LST |
| 152 | +      sammy_par: ${workspace.fitting_dir}/example_fit/results_dir/SAMMY.PAR |
| 153 | +    sammy_execution: |
| 154 | +      backend: local |
| 155 | +      success: false |
| 156 | +      console_output: ${workspace.fitting_dir}/example_fit/results_dir/sammy_console.txt |
| 157 | +    results: |
| 158 | +      run_results_path: ${workspace.results_dir}/run_results_001.json |
| 159 | +      summary: |
| 160 | +        chi_squared: null |
| 161 | +        dof: null |
| 162 | +        reduced_chi_squared: null |
| 163 | + |
| 164 | +results_index: |
| 165 | +  per_fit: [] |
| 166 | +  aggregate: ${workspace.results_dir}/results_map.json |
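
The schema relies on ``${...}`` placeholders such as ``${workspace.root}``, but the note does not say how they are resolved. The sketch below shows one possible approach on an already-parsed config dictionary. It rests on two assumptions: dotted names refer to other keys in the same config, and names without a dot (such as ``${NOVA_URL}``) are environment variables that are left untouched when unset. ``resolve`` is a hypothetical helper, not a PLEIADES function.

```python
import os
import re
from typing import Any, Mapping, Optional

_REF = re.compile(r"\$\{([^}]+)\}")


def _lookup(config: Mapping[str, Any], dotted: str) -> Any:
    """Follow a dotted key path such as "workspace.root" through nested dicts."""
    node: Any = config
    for key in dotted.split("."):
        node = node[key]
    return node


def resolve(value: Any, config: Mapping[str, Any],
            env: Optional[Mapping[str, str]] = None, _seen: tuple = ()) -> Any:
    """Return a copy of value with ${...} placeholders expanded."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {k: resolve(v, config, env, _seen) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, config, env, _seen) for v in value]
    if not isinstance(value, str):
        return value

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if "." in name:  # assumption: dotted names are config references
            if name in _seen:
                raise ValueError(f"circular reference: {name}")
            target = _lookup(config, name)
            return str(resolve(target, config, env, _seen + (name,)))
        # Assumption: plain names are environment variables; keep the
        # placeholder as-is when the variable is not set.
        return env.get(name, match.group(0))

    return _REF.sub(substitute, value)
```

Calling ``resolve(config, config)`` expands nested references in one pass, so ``${workspace.fitting_dir}/example_fit`` picks up the value of ``workspace.root`` as well.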
| 167 | + |
| 168 | +How this config is used |
| 169 | +----------------------- |
| 170 | +1) Load config.yaml into PleiadesConfig (workspace + nuclear + sammy + datasets + routines). |
| 171 | +2) Resolve dataset inputs: |
| 172 | +   - raw_imaging: run normalization to produce transmission data, then export |
| 173 | +     to data_dir/<routine_id>.dat (or .twenty). |
| 174 | +   - sammy_dat/sammy_twenty: use path_to_data_files or input_files.data directly. |
| 175 | +3) Cache isotope data with NuclearDataManager: |
| 176 | +   - Use nuclear.isotopes for FitConfig population. |
| 177 | +   - If isotope data is not already cached, download it into nuclear.data_cache_dir |
| 178 | +     (default: ~/.pleiades/nuclear_data) using default_library. |
| 179 | +4) Create a run record: |
| 180 | +   - Append a new entry to runs with run_id, routine_id, dataset_id, and paths. |
| 181 | +   - Capture runtime metadata (timestamps, user, host, software versions). |
| 182 | +5) Build SAMMY inputs: |
| 183 | +   - Construct FitConfig from fit_routines.<routine_id>.fit_config. |
| 184 | +   - Write input.inp and params.par via InpManager and ParManager into the fit_dir. |
| 185 | +6) Execute SAMMY: |
| 186 | +   - Instantiate SammyRunner via SammyFactory using sammy.backend. |
| 187 | +   - Run SAMMY with SammyFiles; collect output files in results_dir. |
| 188 | +7) Parse outputs: |
| 189 | +   - LptManager and LstManager create RunResults. |
| 190 | +   - Serialize RunResults to JSON and store the path in runs[].results. |
| 191 | +8) Record provenance and reproducibility: |
| 192 | +   - Persist config snapshot, SAMMY outputs, and run metadata together. |
| 193 | +   - Store git commit, environment, and dependency versions for re-running. |
| 194 | +9) Optional iteration: |
| 195 | +   - If update_from_results is true, update FitConfig for the next run. |
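
Step 4 (append a run record keyed by routine_id) can be sketched with plain dictionaries. This is an illustration only: ``append_run_record`` is a hypothetical helper, the sequential ``run_NNN`` numbering is an assumption taken from the ``run_001`` example, and the workspace paths are assumed to be already resolved. The runtime metadata listed in step 4 (user, host, software versions) is omitted for brevity.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import Optional


def append_run_record(config: dict, routine_id: str,
                      created_at: Optional[datetime] = None) -> dict:
    """Append a new entry to config["runs"] and return it (append-only)."""
    routine = config["fit_routines"][routine_id]
    workspace = config["workspace"]
    fit_dir = PurePosixPath(workspace["fitting_dir"]) / routine_id
    results_dir = fit_dir / "results_dir"
    runs = config.setdefault("runs", [])
    number = len(runs) + 1  # assumption: run ids are sequential
    stamp = (created_at or datetime.now(timezone.utc)).astimezone(timezone.utc)
    record = {
        "run_id": f"run_{number:03d}",
        "routine_id": routine_id,
        "dataset_id": routine["dataset_id"],
        "created_at": stamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "fit_dir": str(fit_dir),
        "results_dir": str(results_dir),
        "input_files": {
            "inp": str(fit_dir / "input.inp"),
            "par": str(fit_dir / "params.par"),
            # The data file is keyed by routine_id, as in the layout notes.
            "data": str(PurePosixPath(workspace["data_dir"]) / f"{routine_id}.dat"),
        },
        "sammy_execution": {
            "backend": config["sammy"]["backend"],
            "success": False,  # updated only after SAMMY has run
            "console_output": str(results_dir / "sammy_console.txt"),
        },
        "results": {
            "run_results_path": str(
                PurePosixPath(workspace["results_dir"]) / f"run_results_{number:03d}.json"
            ),
            "summary": {"chi_squared": None, "dof": None, "reduced_chi_squared": None},
        },
    }
    runs.append(record)  # existing entries are never modified
    return record
```

Because earlier entries are never rewritten, re-running the same routine produces a second record (``run_002``) that points at the same fit_dir but a new results file.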
| 196 | + |
| 197 | +ENDF integration |
| 198 | +---------------- |
| 199 | +- NuclearDataManager uses PleiadesConfig.nuclear_data_cache_dir as its cache root. |
| 200 | +- If nuclear.data_cache_dir is omitted, it defaults to ~/.pleiades/nuclear_data. |
| 201 | +- When provided, nuclear.data_cache_dir overrides the default and can be placed |
| 202 | +  under workspace.endf_dir or any other location. |
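
The override rule above amounts to a one-line fallback. The following sketch assumes the ``nuclear`` section has been loaded as a dictionary; ``resolve_cache_dir`` is a hypothetical name and does not describe how PleiadesConfig actually implements this.

```python
from pathlib import Path

# Default cache location stated in this note.
DEFAULT_CACHE_DIR = Path("~/.pleiades/nuclear_data")


def resolve_cache_dir(nuclear_section: dict) -> Path:
    """Return nuclear.data_cache_dir if set, otherwise the default location."""
    configured = nuclear_section.get("data_cache_dir")
    # An empty or missing value falls back to the default.
    return Path(configured if configured else DEFAULT_CACHE_DIR).expanduser()
```

Pointing ``data_cache_dir`` at ``workspace.endf_dir`` keeps the ENDF cache inside the working directory, which makes the workspace self-contained.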