Commit fa84244

Ciheim Brown authored and committed
Updating generation page
1 parent e4cdb7f commit fa84244

1 file changed

+18
-6
lines changed

doc/generation.rst

Lines changed: 18 additions & 6 deletions
@@ -61,7 +61,11 @@ A default configuration is used for catalog generation unless a custom configura
6161

6262
`Here <https://github.com/NOAA-GFDL/CatalogBuilder/blob/main/catalogbuilder/tests/config-cfname.yaml>`_ is an example configuration file.
6363

64-
Catalog headers (column names) are set with the *HEADER LIST* variable. The *OUTPUT PATH TEMPLATE* variable controls the expected directory structure of input data.
64+
65+
**HEADERLIST**
66+
67+
Catalog headers (column names) are set with the *HEADER LIST* variable. The headerlist contains the expected column names of your catalog/CSV file. This is usually determined by the users in conjunction
68+
with the ESM collection specification standards and the appropriate workflows.
6569

6670
.. code-block:: yaml
6771
@@ -71,8 +75,10 @@ Catalog headers (column names) are set with the *HEADER LIST* variable. The *OUT
7175
"member_id", "grid_label", "variable_id",
7276
"time_range", "chunk_freq","platform","dimensions","cell_methods","standard_name","path"]
7377
74-
The headerlist contains the expected column names of your catalog/csv file. This is usually determined by the users in conjuction
75-
with the ESM collection specification standards and the appropriate workflows.
78+
79+
**OUTPUT PATH TEMPLATE**
80+
81+
The *OUTPUT PATH TEMPLATE* variable controls the expected directory structure of input data.
7682

7783
.. code-block:: yaml
7884
@@ -82,20 +88,26 @@ with the ESM collection specification standards and the appropriate workflows.
8288
For a directory structure like /archive/am5/am5/am5f3b1r0/c96L65_am5f3b1r0_pdclim1850F/gfdl.ncrc5-deploy-prod-openmp/pp the output_path_template is set as above.
8389

8490
We have NA in those values that do not match up with any of the expected headerlist (CSV columns), otherwise we
85-
simply specify the associated header name in the appropriate place. E.g. The third directory in the PP path example above is the model (source_id), so the third list value in output_path_template is set to 'source_id'. We make sure this is a valid value in headerlist as well. The fourth directory is am5f3b1r0 which does not map to an existing header value. So we simply add NA in output_path_template for the fourth value.
91+
simply specify the associated header name in the appropriate place. For example, the third directory in the PP path example above is the model (source_id), so the third list value in output_path_template is set to 'source_id'. We make sure this is a valid value in headerlist as well. The fourth directory, 'am5f3b1r0', does not map to an existing header value, so we simply add NA in output_path_template for the fourth value.
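
The positional mapping described above can be sketched in a few lines of Python. This is only an illustration, not the Catalog Builder's own code: ``map_path_to_headers`` is a hypothetical helper, the ``headerlist`` is a shortened subset, and the template covers just the first four directories, of which only the third ('source_id') and fourth (NA) entries come from the text.

```python
# Hypothetical, shortened values: only positions 3 and 4 of the template
# are taken from the documentation; everything else is a placeholder.
headerlist = ["source_id", "variable_id", "path"]
output_path_template = ["NA", "NA", "source_id", "NA"]


def map_path_to_headers(path, template, headers):
    """Pair each leading directory with its template entry, skipping NA."""
    parts = [p for p in path.split("/") if p]
    record = {}
    for name, value in zip(template, parts):
        if name == "NA":
            continue  # this directory does not correspond to a catalog column
        if name not in headers:
            raise ValueError(f"{name!r} is not a valid value in headerlist")
        record[name] = value
    return record


example = "/archive/am5/am5/am5f3b1r0/c96L65_am5f3b1r0_pdclim1850F/gfdl.ncrc5-deploy-prod-openmp/pp"
print(map_path_to_headers(example, output_path_template, headerlist))
# {'source_id': 'am5'}
```

The check against ``headerlist`` mirrors the advice above: every non-NA template entry should also be a catalog column.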
8692

87-
We have NA in values that do not match up with any of the expected headerlist (CSV columns), otherwise we simply specify the associated header name in the appropriate place. E.g. The third directory in the PP path example above is the model (source_id), so the third list value in output_path_template is set to 'source_id'. We make sure this is a valid value in headerlist as well.
93+
**OUTPUT FILE TEMPLATE**
94+
95+
The *OUTPUT FILE TEMPLATE* variable controls the expected directory structure of the input data. This is used to navigate within the post-processed (pp) directory where files are stored.
8896

8997
.. code-block:: yaml
9098
9199
#Filename information
92100
output_file_template = ['realm','temporal_subset','variable_id']
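
As a rough illustration of how such a template could be applied, the sketch below splits a file name into the three template fields. It assumes dot-separated file names ending in ``.nc``. Both the example file name and the ``parse_filename`` helper are hypothetical and are not part of the Catalog Builder API.

```python
output_file_template = ["realm", "temporal_subset", "variable_id"]


def parse_filename(filename, template):
    """Split a dot-separated file name into the fields named by the template."""
    # Assumption: files carry a ".nc" extension that is not part of the template.
    stem = filename[:-3] if filename.endswith(".nc") else filename
    parts = stem.split(".")
    if len(parts) != len(template):
        raise ValueError(
            f"{filename!r} has {len(parts)} fields, expected {len(template)}"
        )
    return dict(zip(template, parts))


# Hypothetical file name, used only to illustrate the field order.
print(parse_filename("atmos.185001-185412.tas.nc", output_file_template))
# {'realm': 'atmos', 'temporal_subset': '185001-185412', 'variable_id': 'tas'}
```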
93101
102+
**INPUT/OUTPUT PATH**
103+
104+
The *INPUT/OUTPUT PATH* variables are used by the Catalog Builder to locate input data and store output to the proper location.
105+
94106
.. code-block:: yaml
95107
96108
#Input directory and output info
97109
input_path: "/archive/am5/am5/am5f7b10r0/c96L65_am5f7b10r0_amip/gfdl.ncrc5-deploy-prod-openmp/pp/"
98-
output_path: "/home/a1r/github/noaa-gfdl/catalogs/c96L65_am5f7b10r0_amip" # ENTER NAME OF THE CSV AND JSON, THE SUFFIX ALONE. This can be an absolute or a relative path
110+
output_path: "/home/a1r/github/noaa-gfdl/catalogs/c96L65_am5f7b10r0_amip" # ENTER NAME OF THE CSV AND JSON, THE SUFFIX ALONE. This can be an absolute or a relative path.
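
One reading of the comment on ``output_path`` is that the value is a base name shared by the CSV and JSON outputs, with the extensions added by the tool. The sketch below shows that interpretation only. ``catalog_files`` is a hypothetical helper, and the way the Catalog Builder actually derives its output file names is an assumption here.

```python
from pathlib import PurePosixPath


def catalog_files(output_path):
    """Return the assumed CSV and JSON paths that share one base name."""
    base = PurePosixPath(output_path)
    # Append extensions by hand so dots inside the base name are preserved.
    csv_path = base.parent / (base.name + ".csv")
    json_path = base.parent / (base.name + ".json")
    return csv_path, json_path


csv_path, json_path = catalog_files(
    "/home/a1r/github/noaa-gfdl/catalogs/c96L65_am5f7b10r0_amip"
)
print(csv_path)
print(json_path)
```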
99111
100112
Creating a data catalog
101113
=======================
