PyBFS App

An interactive web application for exploring and analyzing baseflow separation across thousands of USGS stream gages in the contiguous United States. Built on the PyBFS physically-based algorithm and backed by USGS WaterServices, this tool lets researchers and practitioners calibrate models, visualize flow components, detect anomalies, and fetch live data — all from a browser.

Quick Start

git clone git@github.com:BYU-Hydroinformatics/baseflow_analyst.git
cd baseflow_analyst
python -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate
pip install -r requirements.txt
python app.py

Then open http://127.0.0.1:5000 in your browser. The app ships with pre-calibrated results for 8,729 USGS gages, so no data pipeline steps are required to start exploring.

Overview

PyBFS App combines a physically-based baseflow separation model with real-time USGS data access into a single Flask application. The map view displays every active USGS stream gage that has daily discharge records, color-coded by calibration status and watershed behavior. Clicking a gage loads interactive Plotly charts without any page reload, running BFS on-the-fly from pre-calibrated parameters stored on disk.

The backend pipeline consists of four standalone scripts that can be run once to bootstrap the dataset, after which the web app serves everything dynamically.

Features

Baseflow Separation

PyBFS algorithm — physically-based coupled surface/subsurface reservoir model separating total streamflow into baseflow, surface flow, and direct runoff
Traditional method comparisons — Chapman, Lyne-Hollick (LH), and Eckhardt filter overlaid on the same chart
90% Bayesian credible intervals on baseflow estimates via pybfs.bf_ci

Forecasting

90-day baseflow forecast projected from the last observed state, displayed alongside the historical record

Analytics

Annual Baseflow Index (BFI) — year-by-year bar chart with long-term mean
Flow component fractions — BFF / SFF / DRF displayed as a donut summary
Flow anomaly detection — top extreme low-flow and high-flow events identified by percentile thresholds, ranked by severity, with narrative descriptions

Data Integration

US Drought Monitor (USDM) — county-level weekly drought severity timeline correlated with detected low-flow anomalies
National Water Model (NWM) classification — Natural vs. Artificial channel behavior
GAGES-II reference / non-reference status for each gage
Live USGS update — fetch the latest full discharge record from USGS WaterServices, recalibrate, and view updated results in the browser without touching local files

Map Interface

Leaflet-based interactive map with marker clustering
Filter panel: calibration status, NWM behavior, GAGES-II class, low-flow gages
Side panel with gage metadata, calibration trigger, and tabbed chart view

Architecture

pybfs_app/
├── app.py                      # Flask application (entry point)
├── usgs_download_all_daily.py  # Pipeline step 1: download USGS data
├── detect_low_flow_gages.py    # Pipeline step 2: classify low-flow gages
├── calibrate_all_gages.py      # Pipeline step 3: calibrate BFS parameters
├── run_bfs_all_gages.py        # Pipeline step 4: pre-compute results & plots
├── templates/
│   └── index.html              # Single-page map application
├── behavior/
│   ├── NWM_USGS_Natural_Flow.csv
│   ├── NWM_USGS_Artificial_Path.csv
│   └── GAGES-II_ref_non_ref.csv
├── pybfs/                      # PyBFS library (submodule)
├── baseflow/                   # Baseflow library (submodule)
├── usgs_daily_streamflow/      # Downloaded gage CSVs (generated)
├── usgs_calibration_results/   # Calibrated parameters (generated)
├── usgs_bfs_results/           # Pre-computed plots & CSVs (generated)
├── temp_results/               # Live-update working directory (generated)
└── low_flow_gages.csv          # Low-flow classification (generated)

The app reads calibration parameters from usgs_calibration_results/params/ and runs BFS in-process on every chart request. Results are cached in memory for 5 minutes (BFS_CACHE_TTL). The "Update Data" feature runs a full download + calibration cycle in a background thread, writing to temp_results/ so original data is never modified.

Prerequisites

Requirement	Version
Python	≥ 3.9
pip	≥ 22

Python package dependencies (install via requirements.txt):

flask
pandas
numpy
matplotlib
scipy
statsmodels
numba

The pybfs and baseflow libraries ship as subdirectories under the project root and are added to sys.path automatically by app.py.

Installation

# Clone the repository
git clone git@github.com:BYU-Hydroinformatics/baseflow_analyst.git
cd baseflow_analyst

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Note: numba requires a C compiler. On macOS install Xcode Command Line Tools (xcode-select --install). On Linux ensure gcc is available.

Data Pipeline

Run these scripts once to build the dataset. Each step is idempotent — re-running skips already-completed work so you can safely resume after interruptions.

Step 1 — Download USGS Streamflow

Downloads all available daily discharge records for every active USGS stream gage in CONUS (48 states + DC).

python usgs_download_all_daily.py

Output (usgs_daily_streamflow/):

site_info.csv — gage metadata (coordinates, drainage area, …)
<site_no>.csv — one CSV per gage with date and streamflow (cfs) columns
gauges_with_data.csv — list of gages that returned data
gauges_without_data.csv — list of gages with no available records

This step queries the USGS WaterServices REST API state-by-state with a 1-second delay between requests and a 0.5-second delay per gage download to respect rate limits. Expect several hours for a full CONUS run (~10 000 gages).

Step 2 — Detect Low-Flow Gages

Classifies gages that exhibit sustained low-flow periods (e.g. regulated rivers, intermittent streams). Uses multiprocessing to analyze all gages in parallel.

python detect_low_flow_gages.py [--pct 10] [--cv 0.15] [--dur 60]

Flag	Default	Description
`--pct`	10	Percentile threshold (flow below this is "low")
`--cv`	0.15	Max coefficient of variation to consider flow "constant"
`--dur`	60	Minimum consecutive days below threshold

Output: low_flow_gages.csv with columns site_no, has_lowflow, max_lowflow_duration, max_run_cv, threshold_cfs, record_days.

Step 3 — Calibrate All Gages

Runs the PyBFS calibration routine for every gage that has both streamflow data and a known drainage area. Saves per-site parameter files used by the web app. Skips already-calibrated sites and logs failures for easy resumption.

python calibrate_all_gages.py

Output (usgs_calibration_results/):

params/params_<site_no>.csv — calibrated basin and groundwater parameters
bff/bff_<site_no>.csv — baseflow, surface flow, and direct runoff fractions
all_params.csv — combined parameter table for all sites
all_bff.csv — combined BFF table
calibration_failed.csv — sites that could not be calibrated

Calibration takes roughly 60–90 seconds per gage on a modern CPU.

Step 4 — Pre-compute BFS Results (optional)

Generates static PNG plots and result CSVs for every calibrated gage. The web app runs BFS on-the-fly, so this step is optional but useful for producing publication-quality figures in bulk.

python run_bfs_all_gages.py

Output per gage (usgs_bfs_results/<site_no>/):

bfs_results.csv — daily baseflow separation time series
baseflow_separation.png — observed streamflow vs. flow components
flow_fractions.png — BFF / SFF / DRF pie chart
annual_bfi.png — annual Baseflow Index bar chart
confidence_intervals.png — baseflow with 90% credible interval band
ci_table.csv / ci_daily.csv — credible interval tables
forecast.png / forecast.csv — 90-day baseflow forecast

A .done marker file is written per site so the script can be interrupted and resumed safely.

Step 5 — Run the App

python app.py

Open http://localhost:5000 in your browser. The app starts by loading all gage metadata, NWM behavior data, and low-flow classifications into memory.

For production deployments, serve with Gunicorn:

pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:5000 app:app

Usage

Map View

The map loads all gages as clustered markers. Marker color indicates status:

Color	Meaning
Green	Calibrated — charts available
Gray	Pending — calibration not yet run

Use the filter panel (top-right) to show only:

Calibrated gages
Natural-channel gages (NWM classification)
GAGES-II reference gages
Gages without low-flow periods

Gage Detail Panel

Click any marker to open the side panel:

Info tab — site name, USGS ID, coordinates, drainage area, NWM behavior, GAGES-II class, calibration status
Charts tab — interactive Plotly charts loaded on demand:
- Baseflow separation time series with traditional method overlay
- Flow component fractions (BFF / SFF / DRF)
- Annual BFI bar chart
- Confidence interval band
- 90-day forecast
- County drought monitor timeline
Anomalies tab — ranked extreme flow events with narrative descriptions and drought context

Calibrating a Gage

If a gage shows "Pending," click Run Calibration. A progress bar tracks the background process (~90 seconds). Once complete, the charts load automatically.

Live Data Update

Click Update Data on any gage to:

Download the complete historical record fresh from USGS
Re-calibrate on the updated data
View charts based on the new calibration (written to temp_results/)

Original calibration files are never overwritten.

API Reference

All endpoints return JSON unless noted otherwise.

GET `/api/gages`

Returns all gages with coordinates, status, NWM behavior, and low-flow flag.

[
  {
    "site_no": "12167000",
    "name": "SAUK RIVER NEAR SAUK CITY, WA",
    "lat": 48.47,
    "lng": -121.59,
    "status": "calibrated",
    "behavior": "Natural",
    "ref_status": "Ref",
    "has_lowflow": false
  }
]

GET `/api/gage/<site_no>/info`

Returns detailed metadata and BFF/SFF/DRF fractions for a single gage.

GET `/api/gage/<site_no>/data?source=original|updated`

Runs BFS on-the-fly and returns time series data for all interactive charts:

{
  "dates": ["2000-01-01", "..."],
  "qob": [12.4, "..."],
  "baseflow": [8.1, "..."],
  "surface_flow": [3.1, "..."],
  "direct_runoff": [1.2, "..."],
  "ci_lower": [7.5, "..."],
  "ci_upper": [8.7, "..."],
  "forecast_dates": ["2025-01-01", "..."],
  "forecast_baseflow": [7.9, "..."],
  "annual_years": [2000, 2001, "..."],
  "annual_bfi": [0.652, "..."],
  "mean_bfi": 0.641,
  "bff": 0.641, "sff": 0.253, "drf": 0.106
}

Large datasets (> 5 000 points) are automatically downsampled for chart performance; total_points reports the full record length.

POST `/api/gage/<site_no>/process`

Triggers background calibration. Returns {"status": "started"} or {"status": "already_done"}.

GET `/api/gage/<site_no>/progress?source=original|updated`

Server-Sent Events (SSE) stream reporting calibration / update progress:

{"stage": "calibrating", "progress": 10, "message": "Calibrating...", "done": false}

POST `/api/gage/<site_no>/update`

Fetches the latest USGS data and runs a full calibration cycle in temp_results/.

GET `/api/gage/<site_no>/drought`

Returns US Drought Monitor weekly severity percentages (D0–D4) for the gage's county since January 2000, plus the current drought level string.

GET `/api/gage/<site_no>/anomalies`

Detects and ranks extreme low-flow and high-flow events, correlates with drought data, and returns narrative descriptions.

{
  "events": [
    {
      "type": "low_flow",
      "start_date": "2015-07-01",
      "end_date": "2015-09-28",
      "duration": 89,
      "min_flow": 0.12,
      "threshold": 0.43,
      "severity": 58.3,
      "drought_context": "This coincided with D3 (Extreme Drought) conditions affecting 74% of the county.",
      "narrative": "Extreme low-flow period from 2015-07-01 to 2015-09-28..."
    }
  ]
}

GET `/plots/<site_no>/<filename>?source=original|updated`

Serves pre-computed static plot PNGs.

Directory Structure

pybfs_app/
├── app.py                          # Flask application
├── usgs_download_all_daily.py      # Step 1: download USGS data
├── detect_low_flow_gages.py        # Step 2: low-flow classification
├── calibrate_all_gages.py          # Step 3: BFS calibration
├── run_bfs_all_gages.py            # Step 4: batch BFS + plots
│
├── templates/
│   └── index.html                  # Map SPA (Leaflet + Plotly)
│
├── behavior/                       # Static reference datasets
│   ├── NWM_USGS_Natural_Flow.csv
│   ├── NWM_USGS_Artificial_Path.csv
│   └── GAGES-II_ref_non_ref.csv
│
├── pybfs/                          # PyBFS library
├── baseflow/                       # Baseflow separation library
│
├── low_flow_gages.csv              # Generated by step 2
│
├── usgs_daily_streamflow/          # Generated by step 1
│   ├── site_info.csv
│   ├── gauges_with_data.csv
│   └── <site_no>.csv  (one per gage)
│
├── usgs_calibration_results/       # Generated by step 3
│   ├── params/params_<site_no>.csv
│   ├── bff/bff_<site_no>.csv
│   ├── all_params.csv
│   ├── all_bff.csv
│   └── calibration_failed.csv
│
├── usgs_bfs_results/               # Generated by step 4 (optional)
│   └── <site_no>/
│       ├── bfs_results.csv
│       ├── baseflow_separation.png
│       ├── flow_fractions.png
│       ├── annual_bfi.png
│       ├── confidence_intervals.png
│       ├── forecast.png
│       └── ci_table.csv
│
└── temp_results/                   # Live-update scratch space
    └── <site_no>/

Configuration

Key constants at the top of app.py:

Constant	Default	Description
`FORECAST_DAYS`	90	Number of days projected in the baseflow forecast
`BFS_CACHE_TTL`	300	Seconds to cache on-the-fly BFS results in memory
`CFS_TO_M3_PER_DAY`	2446.58	Unit conversion factor (cfs → m³/day)
`SQMI_TO_M2`	2 589 988.11	Unit conversion factor (mi² → m²)

Directory paths (STREAMFLOW_DIR, CALIB_DIR, RESULTS_DIR, TEMP_DIR) are derived from the location of app.py and can be adjusted at the top of the file if you keep data in a different location.

Contributing

Fork the repository and create a feature branch.
Follow existing code style (no external formatter required).
Keep new dependencies minimal — the app intentionally avoids heavy frameworks.
Open a pull request with a clear description of the change and any relevant screenshots.

For bug reports, please include the USGS site number that triggered the issue, the Python version, and the full traceback.

License

This project is licensed under the MIT License. See LICENSE for details.

The PyBFS and baseflow libraries ship as subdirectories and carry their own respective licenses.

Acknowledgements

PyBFS — BYU Hydroinformatics, physically-based baseflow separation algorithm
USGS WaterServices — daily streamflow data and site metadata
US Drought Monitor (USDM) — county-level drought severity data
FCC Census Bureau Area API — lat/lon to county FIPS conversion
National Water Model (NWM) — channel behavior classification
GAGES-II — reference/non-reference gage classification

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
baseflow		baseflow
behavior		behavior
templates		templates
usgs_calibration_results		usgs_calibration_results
usgs_daily_streamflow		usgs_daily_streamflow
.gitignore		.gitignore
README.md		README.md
app.py		app.py
calibrate_all_gages.py		calibrate_all_gages.py
detect_low_flow_gages.py		detect_low_flow_gages.py
low_flow_gages.csv		low_flow_gages.csv
requirements.txt		requirements.txt
run_bfs_all_gages.py		run_bfs_all_gages.py
usgs_download_all_daily.py		usgs_download_all_daily.py

BYU-Hydroinformatics/baseflow_analyst

Folders and files

Latest commit

History

Repository files navigation

PyBFS App

Quick Start

Table of Contents

Overview

Features

Baseflow Separation

Forecasting

Analytics

Data Integration

Map Interface

Architecture

Prerequisites

Installation

Data Pipeline

Step 1 — Download USGS Streamflow

Step 2 — Detect Low-Flow Gages

Step 3 — Calibrate All Gages

Step 4 — Pre-compute BFS Results (optional)

Step 5 — Run the App

Usage

Map View

Gage Detail Panel

Calibrating a Gage

Live Data Update

API Reference

GET /api/gages

GET /api/gage/<site_no>/info

GET /api/gage/<site_no>/data?source=original|updated

POST /api/gage/<site_no>/process

GET /api/gage/<site_no>/progress?source=original|updated

POST /api/gage/<site_no>/update

GET /api/gage/<site_no>/drought

GET /api/gage/<site_no>/anomalies

GET /plots/<site_no>/<filename>?source=original|updated

Directory Structure

Configuration

Contributing

License

Acknowledgements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

GET `/api/gages`

GET `/api/gage/<site_no>/info`

GET `/api/gage/<site_no>/data?source=original|updated`

POST `/api/gage/<site_no>/process`

GET `/api/gage/<site_no>/progress?source=original|updated`

POST `/api/gage/<site_no>/update`

GET `/api/gage/<site_no>/drought`

GET `/api/gage/<site_no>/anomalies`

GET `/plots/<site_no>/<filename>?source=original|updated`

Packages