This repository contains helper scripts and a guided notebook for preparing hydrologic model inputs for EF5-style flash flood simulation workflows.
The core intent is to:
- pull forcing/observation data (MRMS precipitation and USGS streamflow),
- convert and organize files into model-ready formats, and
- run a repeatable, notebook-driven preprocessing workflow.
EF5 is a distributed hydrologic modeling framework commonly used for event-based and real-time flood simulation. In many operational or research workflows, EF5 is paired with high-frequency precipitation products (such as MRMS) and streamflow observations (such as USGS NWIS Instantaneous Values) to support setup, calibration, and evaluation tasks.
In that context, this project focuses on the data preparation side of an EF5-FLASH-style workflow:
- preparing precipitation forcing inputs from MRMS PrecipRate archives,
- preparing observed streamflow time series for comparison,
- and walking through raster/model input preparation in the notebook.
- `prepare_model.ipynb` - Primary workflow notebook.
  - Contains the step-by-step process to prepare model inputs (raster clipping, conversion, and related preprocessing tasks).
  - Recommended path: follow this notebook sequentially from top to bottom.
- `download_mrms_preciprate.sh` - Bash helper script to download MRMS PrecipRate `.gz` files from IEM mtarchive for a date range.
  - Supports dry-run mode and skips files that already exist locally.
  - Decompresses downloaded `.gz` files at the end of a real run.
- `fetch_usgs_from_control.py` - Python helper script to download USGS NWIS instantaneous streamflow (`parameterCd=00060`) for one gauge and date range.
  - Converts discharge from cfs to cms and writes a CSV.
  - Exports all available timesteps returned by USGS in the requested interval.
- `requirements.txt` - Frozen Python package list exported from the working environment used for this project.
  - Use this file to recreate a compatible environment for notebook and script execution.
- `control_files/` - Example control/configuration files used for EF5 execution contexts.
  - Useful as templates/reference when connecting prepared inputs into an EF5 run.
- `__pycache__/` - Python bytecode cache artifacts.

Suggested order of use:

- Start with `prepare_model.ipynb` and follow the cells in order.
- Use the helper scripts to fetch raw forcing/observation data.
- Return to the notebook steps for conversion/formatting and final model input preparation.
The notebook is the orchestrator for the full pipeline; the scripts are supporting utilities.
Script: download_mrms_preciprate.sh
Make executable (one-time):

```bash
chmod +x download_mrms_preciprate.sh
```

Show help:

```bash
./download_mrms_preciprate.sh --help
```

Dry-run first (recommended):

```bash
./download_mrms_preciprate.sh \
  --start-date 2022-07-27 \
  --end-date 2022-07-30 \
  --dest-dir ~/MRMS_preciprate \
  --dry-run
```

Run actual download:

```bash
./download_mrms_preciprate.sh \
  --start-date 2022-07-27 \
  --end-date 2022-07-30 \
  --dest-dir ~/MRMS_preciprate
```

Behavior notes:

- Creates the destination directory if it does not exist.
- Skips files that already exist locally (either the `.gz` file or its already-decompressed version).
- Prints a compact summary with skipped/downloaded counts.
- In dry-run mode, no files are downloaded or decompressed.
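The skip-if-exists check can be summarized in a small sketch. This is written in Python for readability (the actual logic lives in the bash script), and the file name in the usage example is illustrative, not a guaranteed archive name:

```python
from pathlib import Path

def should_download(dest_dir: str, gz_name: str) -> bool:
    """Return True only if neither the .gz file nor its decompressed
    counterpart already exists in dest_dir."""
    dest = Path(dest_dir)
    decompressed = gz_name[:-3] if gz_name.endswith(".gz") else gz_name
    return not (dest / gz_name).exists() and not (dest / decompressed).exists()
```

This is why re-running the script after a real run is cheap: files decompressed by the previous run still count as "already present," so they are not fetched again.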
Script: fetch_usgs_from_control.py
Show help:

```bash
python3 fetch_usgs_from_control.py --help
```

Example:

```bash
python3 fetch_usgs_from_control.py \
  --gauge 04085200 \
  --start-date 2022-07-27 \
  --end-date 2022-07-30 \
  --outdir ~/Kewaunee/observations
```

Behavior notes:

- Accepted date formats: `YYYYMMDDHHMMSS`, `YYYY-MM-DD`, or ISO-8601.
- Output is UTC and includes all available USGS timesteps in the requested interval.
- Output file pattern: `Streamflow_Time_Series_CMS_UTC_USGS_<gauge>.csv`
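The cfs-to-cms step the script performs is a fixed unit conversion. A minimal sketch (the function name is illustrative, not taken from the script):

```python
# USGS parameterCd=00060 reports discharge in cubic feet per second (cfs);
# the output CSV uses cubic meters per second (cms). The factor is exact:
# (0.3048 m/ft)^3 ~= 0.0283168.
CFS_TO_CMS = 0.3048 ** 3

def cfs_to_cms(discharge_cfs: float) -> float:
    """Convert a discharge value from cfs to cms."""
    return discharge_cfs * CFS_TO_CMS
```

For example, a 1,000 cfs reading becomes roughly 28.32 cms in the output file.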
- `download_mrms_preciprate.sh` requires common shell tools and `wget`, `grep`, `sed`, `gunzip`.
- `fetch_usgs_from_control.py` uses the Python 3 standard library only.
- Notebook and geospatial preprocessing steps rely on additional Python packages listed in `requirements.txt`.
From the EF5-RTI repository root:
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
```

Optional (for Jupyter notebook kernel selection):

```bash
python -m ipykernel install --user --name ef5-rti --display-name "Python (ef5-rti)"
```

Then open `prepare_model.ipynb` and select the installed Python (ef5-rti) kernel.
The notebook references the following external sources.
| Source (webpage/repo) | What it is and how it is used |
|---|---|
| IEM MTArchive (MRMS PrecipRate) | Iowa State IEM archive hosting historical MRMS PrecipRate files, used by download_mrms_preciprate.sh to pull precipitation forcing by date. |
| USGS StreamStats | USGS watershed delineation and basin data portal used to obtain basin boundaries and related geospatial inputs for model setup. |
| USGS WaterData station 04085200 | Station information page for the example gage in this workflow, used to verify gauge metadata and context for observation downloads. |
| HyDROSLab/EF5-US-Parameters | Parameter dataset repository referenced for national-scale EF5 parameter layers used as source inputs before clipping/preprocessing. |
| HyDROSLab/EF5-dockerized | Companion repository referenced for prebuilt CONUS CREST/SAC/KW parameter resources and broader EF5 workflow support assets. |
This project is practical and useful for iterative modeling work, but it should be treated as an evolving workflow rather than a fully hardened production system.
Please keep in mind:
- some paths, assumptions, and examples are environment-specific,
- edge cases across all basins/events may not be fully tested,
- upstream data service behavior/availability can change,
- there is room for improvement in robustness, error handling, and broader test coverage.
Before operational use, validate outputs for your basin/event and review intermediate products in the notebook.
- Add automated tests for date parsing, download logic, and CSV output schema.
- Add retry/backoff logic for transient network/API failures.
- Add structured logging and optional verbose/quiet modes.
- Add notebook checks to verify required files/directories before heavy processing steps.
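As a concrete shape for the retry/backoff item, one common pattern is exponential backoff around the network call. This is illustrative only; none of it exists in the current scripts, and the function names are hypothetical:

```python
import time

def fetch_with_retry(fetch, retries: int = 3, base_delay: float = 1.0):
    """Call fetch(); on a transient failure, sleep base_delay * 2**attempt
    seconds and retry, re-raising after the final attempt."""
    for attempt in range(retries):
        try:
            return fetch()
        except OSError:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

Wrapping the USGS request (or each `wget` call, via an equivalent shell loop) this way would make the scripts tolerant of brief service outages without changing their interfaces.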