|
| 10 | + |
| 11 | +# Catalog headers |
| 12 | +# The headerlist gives the expected column names in your catalog/CSV file. These are usually determined by the users in conjunction |
| 13 | +# with the ESM collection specification standards and the appropriate workflows. |
| 14 | + |
| 15 | +headerlist: ["activity_id", "institution_id", "source_id", "experiment_id", |
| 16 | + "frequency", "realm", "table_id", |
| 17 | + "member_id", "grid_label", "variable_id", |
| 18 | + "time_range", "chunk_freq", "platform", "dimensions", "cell_methods", "standard_name", "path"] |
| 19 | + |
| 20 | +# What kind of directory structure should the builder expect? |
| 21 | +# For a directory structure like /archive/am5/am5/am5f3b1r0/c96L65_am5f3b1r0_pdclim1850F/gfdl.ncrc5-deploy-prod-openmp/pp |
| 22 | +# the input_path_template is set as follows. |
| 23 | +# We use NA for those directories that do not match any of the expected headerlist values (CSV columns); otherwise we |
| 24 | +# specify the associated header name in the corresponding position. E.g. the third directory in the PP path example |
| 25 | +# above is the model (source_id), so the third list value in input_path_template is set to 'source_id'. We make sure |
| 26 | +# this is a valid value in headerlist as well. |
| 27 | +# The fourth directory is am5f3b1r0, which does not map to an existing header value, so we set the fourth value in |
| 28 | +# input_path_template to NA. |
| 29 | + |
| 30 | +input_path_template: ['NA','NA','source_id','NA','experiment_id','platform','custom_pp','realm','cell_methods','frequency','chunk_freq'] |
| 31 | + |
| 32 | +input_file_template: ['realm','time_range','variable_id'] |
| 33 | + |
| 34 | +# Output file info is currently passed as a command-line argument. |
| 35 | +# We will revisit adding csvfile, jsonfile and logfile settings to the builder configuration file in the future. |
| 36 | +#csvfile = #jsonfile = #logfile = |
| 37 | + |
| 38 | +####################################################### |
| 39 | + |
| 40 | +schema: "/home/a1r/git/forkCatalogBuilder-/catalogbuilder/cats/mdtf_template.json" # If your JSON schema is slightly different but vetted with MSD, you may use your JSON schema here |
| 41 | +input_path: "/archive/am5/am5/am5f7b10r0/c96L65_am5f7b10r0_amip/gfdl.ncrc5-deploy-prod-openmp/pp/" |
| 42 | +output_path: "/home/a1r/github/noaa-gfdl/catalogs/c96L65_am5f7b10r0_amip30_test" # Enter the base name for the CSV and JSON, without a file extension, e.g. catalog (the builder then generates catalog.csv and catalog.json). This can also be an absolute path. |
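
The comments in this config describe a positional mapping: each directory level is paired with the entry at the same position in `input_path_template`, and `NA` entries are skipped. The following is a minimal Python sketch of that idea only, not the catalog builder's actual implementation. The helper names `map_directories` and `map_filename`, the dot-separated filename convention, and the sample filename `atmos.185001-185412.tas.nc` are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Templates copied from the configuration above.
INPUT_PATH_TEMPLATE = ['NA', 'NA', 'source_id', 'NA', 'experiment_id', 'platform',
                       'custom_pp', 'realm', 'cell_methods', 'frequency', 'chunk_freq']
INPUT_FILE_TEMPLATE = ['realm', 'time_range', 'variable_id']


def map_directories(path, template):
    """Pair each directory in `path` with the template entry at the same
    position and drop the positions marked 'NA' (hypothetical helper)."""
    parts = [p for p in PurePosixPath(path).parts if p != '/']
    return {name: part for name, part in zip(template, parts) if name != 'NA'}


def map_filename(filename, template, sep='.'):
    """Pair filename fields with template entries.
    Assumption: fields are dot-separated (realm.time_range.variable_id.nc)."""
    return dict(zip(template, filename.split(sep)))


# Example path taken from the comments in the config.
pp_root = ("/archive/am5/am5/am5f3b1r0/c96L65_am5f3b1r0_pdclim1850F/"
           "gfdl.ncrc5-deploy-prod-openmp/pp")
columns = map_directories(pp_root, INPUT_PATH_TEMPLATE)
print(columns)

# Hypothetical filename, used only to illustrate input_file_template.
print(map_filename("atmos.185001-185412.tas.nc", INPUT_FILE_TEMPLATE))
```

With the example path, the third directory (`am5`) lands in `source_id` and the fourth (`am5f3b1r0`) is dropped because its template entry is `NA`, which matches the description in the comments. The path stops at `pp`, so the remaining template entries (`realm` through `chunk_freq`) would only be filled by deeper directories.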