30 changes: 17 additions & 13 deletions docs/products/data-management/dataaccess/index.mdx
@@ -7,7 +7,7 @@ tags: [Products, CIROH, National Water Model, AWS, Google Cloud, NOAA]

# NWM Data Access

Within the CIROH projects, we encounter a wide range of data resources and data access inquiries. One of the most frequently asked questions is, "How can I obtain access to xyz-resource?". To help with answering that question, we have documented some of the most common data access methods and resources here, with links to additional sites to dive deeper.
Within the [CIROH projects](https://ciroh.ua.edu/about/), we encounter a wide range of data resources and data access inquiries. One of the most frequently asked questions is, "How can I obtain access to xyz-resource?" To help answer that question, we have documented some of the most common data access methods and resources here, with links to additional sites for further detail.
Collaborator


Are you sure this link is necessary? Given the nature of CIROH Hub, it seems reasonable to treat the consortium itself as common knowledge for users who are this deep into the site.


## Input and Output Data of the National Water Model

@@ -29,28 +29,30 @@ post-processed/ 02-Nov-2020 14:31 -
prod/ 24-Oct-2023 00:18 -
v3.0/ 24-Oct-2023 00:18 -
```
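Listing rows like the ones above (name, date, time, size) are easy to turn into structured records when you want to check which directories were updated most recently. The following is a minimal sketch that assumes each row has exactly those four whitespace-separated fields and that month abbreviations are in English; `ListingEntry`, `parse_listing_line`, and `parse_listing` are hypothetical helper names, not part of any NOAA tooling.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class ListingEntry(NamedTuple):
    name: str
    modified: datetime
    size: Optional[str]  # None for directories, which show "-" in the listing
    is_dir: bool


def parse_listing_line(line: str) -> Optional[ListingEntry]:
    """Parse one row such as 'prod/ 24-Oct-2023 00:18 -'; return None if it does not match."""
    parts = line.split()
    if len(parts) != 4:
        return None  # header, blank line, or some other non-row text
    name, date_part, time_part, size = parts
    try:
        # Assumption: English month abbreviations (depends on the active locale).
        modified = datetime.strptime(f"{date_part} {time_part}", "%d-%b-%Y %H:%M")
    except ValueError:
        return None
    return ListingEntry(name, modified, None if size == "-" else size, name.endswith("/"))


def parse_listing(text: str) -> List[ListingEntry]:
    """Parse every recognizable row in a block of listing text."""
    parsed = (parse_listing_line(line) for line in text.splitlines())
    return [entry for entry in parsed if entry is not None]


# Rows copied from the listing shown above.
sample = """\
post-processed/ 02-Nov-2020 14:31 -
prod/ 24-Oct-2023 00:18 -
v3.0/ 24-Oct-2023 00:18 -
"""

for entry in parse_listing(sample):
    print(entry.name, entry.modified.isoformat())
```

Rows that do not match the expected shape are skipped rather than raising, so the same function can be fed a whole copied page without cleaning it first.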
The `para_post-processed` directory lacks specific documentation, although the "para" designation suggests it is a "parallel" execution, indicating a candidate production run under testing for operational use. In the post-processed dataset, you will find the following subdirectories:
The `para_post-processed` directory lacks specific documentation, although the "para" designation suggests it is a "parallel" execution, indicating a candidate production run under testing for operational use. When the NWC is experimenting with new or proposed products, these will often be placed in the `para_post-processed` folder for examination before official adoption.

In the post-processed dataset, you will find the following subdirectories:

- [NOMADS post-processed](https://nomads.ncep.noaa.gov/pub/data/nccf/com/nwm/post-processed/)
- RFC: Outputs filtered down to RFC locations.
- WMS: Contains re-indexed/reformatted outputs in per-forecast netCDF files suitable for rapid querying and responsive graph visualizations on the water.noaa.gov/map site.
- IMAGES: .png-formatted renderings of NWM output for various domains and variables.
- logs: Logs. :)
- logs: Logs.

### NODD - NOAA Open Data Dissemination Program
Collaborator


With the smiley face removed, this bullet point no longer has a punchline, which means it actually needs to bear weight. What info is being logged in this folder?

(Granted, the logs folder linked currently contains nothing at all, logs or otherwise. Not sure whether the historical purpose of this folder needs to be mentioned, or if it can just be specified as empty.)

"The NOAA Open Data Dissemination (NODD) Program provides public access to NOAA's open data on commercial cloud platforms through public-private partnerships. These partnerships remove obstacles to public use of NOAA data, help avoid costs and risks associated with federal data access services, and leverage operational public-private partnerships with the cloud computing and information services industries."
(For more information, visit [NODD](https://www.noaa.gov/information-technology/open-data-dissemination))

The NODD datasets made available through several public cloud vendors are an incredible resource for accessing NWM data for research and evaluative purposes. The NWS NODD datasets are listed on [this page](https://www.noaa.gov/nodd/datasets) and include the following:

#### AWS
#### Amazon Web Services

AWS hosts two repositories as part of their sustainability data initiative.
Amazon Web Services (AWS) hosts two repositories as part of their sustainability data initiative.

The first repository contains the operational data (now hosts 4 week rolling collection of all output; it used to only be short range and the registry entry retains the description only for the short_range data [here](https://registry.opendata.aws/noaa-nwm-pds/); alternatively, the same resource is described under the sustainability initiative page [here](https://aws.amazon.com/marketplace/pp/prodview-73iwu7dcfuge2).)
The first repository contains the operational data. It now hosts a four-week rolling collection of all output. It previously held only short-range data, and the registry entry [here](https://registry.opendata.aws/noaa-nwm-pds/) still describes only the short_range data. Alternatively, the same resource is described under the sustainability initiative page [here](https://aws.amazon.com/marketplace/pp/prodview-73iwu7dcfuge2).
- The catalog of AWS-hosted operational NWM data can be browsed [here](https://noaa-nwm-pds.s3.amazonaws.com/index.html).
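Besides the browser index, a public bucket like this one can usually be listed programmatically through the standard S3 `ListObjectsV2` REST call without credentials. The sketch below only builds the request URL and parses the XML response using the standard library. It assumes the bucket allows anonymous listing and that keys are grouped under date prefixes such as `nwm.YYYYMMDD/`; both are assumptions to verify against the catalog. `list_url`, `parse_list_response`, and `fetch_listing` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Bucket endpoint taken from the catalog link above.
BUCKET_URL = "https://noaa-nwm-pds.s3.amazonaws.com/"
# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def list_url(prefix: str = "", delimiter: str = "/") -> str:
    """Build an anonymous ListObjectsV2 request URL for the bucket."""
    query = urlencode({"list-type": "2", "prefix": prefix, "delimiter": delimiter})
    return f"{BUCKET_URL}?{query}"


def parse_list_response(xml_text: str):
    """Return (sub-prefixes, object keys) from a ListObjectsV2 XML body."""
    root = ET.fromstring(xml_text)
    prefixes = [el.text for el in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    keys = [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]
    return prefixes, keys


def fetch_listing(prefix: str = ""):
    """Fetch and parse one page of results (requires network access)."""
    with urlopen(list_url(prefix)) as response:
        return parse_list_response(response.read().decode("utf-8"))


# Assumed layout: one top-level prefix per day, e.g. "nwm.20240101/".
print(list_url("nwm.20240101/"))
```

Each response returns at most 1,000 keys, so a complete listing would need to follow the continuation token in the response; that step is left out here. Because the collection is a rolling four-week window, older date prefixes will not be present.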

The second (and more useful) AWS repository contains several versions of the retrospective dataset each described on the main page under the open data registry [here](https://registry.opendata.aws/nwm-archive/).
The second AWS repository contains several versions of the retrospective dataset, each described on the main page under the open data registry [here](https://registry.opendata.aws/nwm-archive/).
Collaborator


Possible loss of information here. Is there a reason that the second repo was considered "more useful"? If so, it may be worth mentioning.

{/* (The same information is also on the AWS sustainability initiative webpage [here](https://aws.amazon.com/marketplace/pp/prodview-g6lcchc7brshwa) )
Commenting this part since the AWS link throws a 404 error */}

@@ -64,14 +66,14 @@ The different catalogs of those [currently] five versions of that resource are l
- NWM v1.2 retrospective data
- netCDF, [here](https://nwm-archive.s3.amazonaws.com/index.html)

The AWS retrospective resource is the primary publicly available source for the version 1.0 of the “AORC” Analysis of Record for Calibration dataset, which is a 40-year best-available estimate of most common meteorological parameters required for hydrological modeling. Version 1.1 of the dataset will accompany the release of the NWM model version 3.0 retrospective (or 2.2 version??), hopefully in the next few weeks.
The AWS retrospective resource is the primary publicly available source for version 1.0 of the Analysis of Record for Calibration (AORC) dataset, a 40-year best-available estimate of the most common meteorological parameters required for hydrological modeling. Version 1.1 of the dataset will accompany the release of the NWM model version 3.0 retrospective (or 2.2 version??), hopefully in the next few weeks.

Jupyter notebook instructions for processing NWM Zarr and NetCDF output formats are available [here](https://github.com/CIROH-UA/data_access_example/).

<details>
Contributor Author


We realized the `<br/>` break is unnecessary, so this commit removes it.
Curiously, the GitHub rendering of the break is a bit odd. If there isn't a space between the summary and the collapsed content, it will not render correctly. See below:

This is the rendering with proper spacing

There must be a space after this line for the collapsing to work correctly
#this is fake python for testing

print("Hello, World!")
This is a test showing the bad Github Rendering ```py #this is fake python for testing

print("Hello, World!")

</details>
When you expand the details, you can see the rest of the page is absorbed into the collapsed tab. 

This comment was marked as resolved.

<summary> Click to see an example of pulling data from the channel output zarr 2.1 archive and writing the results to csv. </summary>
<br/>
```py

```python
'''
#install these libraries if they aren't already installed
!pip install zarr
@@ -125,14 +127,16 @@ dan.ames@byu.edu
'''

```

</details>

#### Google – Operational NWM Data

Google hosts the most complete operational data archive of inputs and outputs from the National Water Model, with nearly every file since August 2018. The Google open data registry provides additional explanations [here](https://console.cloud.google.com/marketplace/product/noaa-public/national-water-model?project=explore-ai-387703).
- Operational data can be browsed [here](https://console.cloud.google.com/storage/browser/national-water-model).
- Google also hosts a copy of the NWM v1.2 retrospective [here](https://console.cloud.google.com/storage/browser/national-water-model-reanalysis).
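For scripted access, public Cloud Storage buckets can generally be read over plain HTTPS: objects through `https://storage.googleapis.com/<bucket>/<object>` and listings through the JSON API `objects.list` endpoint. The sketch below builds both URLs and parses a listing response. The bucket name is taken from the console link above; anonymous listing being enabled and the `nwm.YYYYMMDD/` prefix layout are assumptions, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode

API_ROOT = "https://storage.googleapis.com"
BUCKET = "national-water-model"  # bucket name from the console URL above


def object_url(name: str, bucket: str = BUCKET) -> str:
    """Direct download URL for one object in a public bucket."""
    return f"{API_ROOT}/{bucket}/{quote(name)}"


def list_objects_url(prefix: str = "", delimiter: str = "/", bucket: str = BUCKET) -> str:
    """JSON API objects.list URL restricted to one prefix level."""
    query = urlencode({"prefix": prefix, "delimiter": delimiter})
    return f"{API_ROOT}/storage/v1/b/{bucket}/o?{query}"


def parse_list_response(body: str):
    """Return (sub-prefixes, object names, next page token or None)."""
    data = json.loads(body)
    prefixes = data.get("prefixes", [])
    names = [item["name"] for item in data.get("items", [])]
    return prefixes, names, data.get("nextPageToken")


# Assumed layout: one top-level prefix per day, e.g. "nwm.20180917/".
print(list_objects_url("nwm.20180917/"))
```

When a response includes a page token, passing it back as the `pageToken` query parameter retrieves the next page. The same functions should work for the retrospective bucket by passing `bucket="national-water-model-reanalysis"`.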

Coming soon: Big Query
#### Coming soon: BigQuery

Efforts are underway to make some of the datasets from the NWM operational and retrospective simulations available on BigQuery for ultra-high-bandwidth access. Stay tuned...
Collaborator


Question: am I mistaken, or is this section outdated? I'm pretty sure BigQuery support for NWM retrospectives has been alive and well for some time, now.


@@ -168,4 +172,4 @@ A detailed description of various aspects of the WRF-Hydro code, which produces

import DocCardList from '@theme/DocCardList';

<DocCardList />
<DocCardList />