This repository contains:
- A quarterly pipeline that downloads and derives invasive species datasets and supporting spatial layers.
- A Dash dashboard that visualizes the latest derived outputs.
- `derived_data/` — the latest snapshot (stable paths). The dashboard always reads from here.
- `outputs/<run_id>/` — archived artifacts per run (traceability / rollback).
A typical quarterly run:
- writes to `outputs/<run_id>/...` (archive)
- then updates `derived_data/...` (latest snapshot)
- `config/settings.yaml` — all paths + runtime parameters
- `input_data/` — static inputs (AOI, boundaries, base hexbins, FLAM raster, etc.)
- `derived_data/` — latest snapshot created by the pipeline (app reads here)
- `outputs/` — archived quarterly run outputs
- `src/tbep_invasives/`
  - `steps/` — individual pipeline steps (each has a `run(cfg)` entrypoint)
  - `pipeline/` — orchestrators (quarterly runner + preflight checks)
  - `app/` — dashboard app (Dash factory + runner + WSGI entrypoint)
src/tbep_invasives/pipeline/quarterly.py
- `main()`:
  - loads config via `tbep_invasives.paths.load_config()`
  - calls `run(cfg)`
- `run(cfg)`:
  - (optional) calls `pipeline.preflight.preflight(cfg, mode="quarterly")`
  - calls each step's `run(cfg)` in sequence:
    - `steps.download_invasives.run(cfg)`
    - `steps.flam_overlay.run(cfg)`
    - `steps.hex_enrichment.run(cfg)`
    - `steps.report_cards.run(cfg)`
Each step reads from `input_data/` and/or previously generated `derived_data/` files,
writes archived outputs into `outputs/<run_id>/...`, then updates `derived_data/...`.
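A minimal sketch of this orchestration pattern — the stub steps below stand in for the real `steps.*` modules, and the config dict is simplified:

```python
from types import SimpleNamespace

def run(cfg: dict, steps) -> dict:
    """Call each step's run(cfg) in order and collect the returned path dicts."""
    results = {}
    for step in steps:
        results[step.name] = step.run(cfg)
    return results

# Stub steps standing in for steps.download_invasives, steps.flam_overlay, etc.
download = SimpleNamespace(name="download_invasives",
                           run=lambda cfg: {"csv": "derived_data/invasives.csv"})
overlay = SimpleNamespace(name="flam_overlay",
                          run=lambda cfg: {"png": "derived_data/flam_overlay.png"})

outputs = run({"run": {"run_id": "2025Q1"}}, [download, overlay])
```

Running steps in a fixed sequence keeps later steps (hex enrichment, report cards) free to read the snapshot files written by earlier ones.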
src/tbep_invasives/app/run_app.py
- `main()`:
  - loads config via `load_config()`
  - calls `dashboard.create_app(cfg)`
  - starts a dev server (Dash) using `app.run(...)`
src/tbep_invasives/app/wsgi.py
- module-level:
  - loads config via `load_config()`
  - calls `dashboard.create_app(cfg)`
  - exposes `server = dash_app.server` for Gunicorn
Docker runs Gunicorn, which serves the Dash/Flask server via WSGI.
All steps follow the same interface:
- `run(cfg) -> dict` does the work and returns output paths.
- `main()` loads config and calls `run(cfg)` so the file can be run directly.
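A step module following this interface might look like the sketch below — file and key names are hypothetical, and the real steps load config via `tbep_invasives.paths.load_config()`:

```python
"""Template for a pipeline step following the run(cfg) -> dict interface."""
from pathlib import Path

def run(cfg: dict) -> dict:
    """Do the step's work; return the paths it wrote."""
    out_dir = Path(cfg["paths"]["derived_data_dir"])
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / "example_output.txt"  # hypothetical output file
    out_file.write_text("done")
    return {"example_output": str(out_file)}

def main():
    # Stand-in config; the real main() calls load_config()
    cfg = {"paths": {"derived_data_dir": "derived_data"}}
    return run(cfg)

if __name__ == "__main__":
    main()
```

Returning the output paths from `run(cfg)` lets the orchestrator log and archive exactly what each step produced.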
Purpose: Download/filter invasive observations, standardize fields, assign hexbinID by spatial join, and export a dataset the app can consume.
Called by: pipeline/quarterly.py (Step 1)
Primary inputs (config):
- `paths.hexbins_shp` (base hexbins with `id`)
- `paths.invasives_concern_csv` (top-concern species list)
- other parameters under `parameters:` (HUCs, min_year, etc.)
Outputs:
- Latest snapshot:
  - `derived.invasives_csv`
  - `derived.invasives_geojson`
- Archive:
  - `outputs/<run_id>/invasives/...`
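The hexbin-ID assignment can be sketched as a point-in-polygon spatial join — GeoPandas is assumed here, and the toy layers and column names are illustrative only:

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Toy hexbin layer: one square cell standing in for a hexbin with an `id` field
hexbins = gpd.GeoDataFrame(
    {"id": [1]},
    geometry=[Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])],
    crs="EPSG:4326",
)

# Toy observations; the real ones come from the filtered invasives download
obs = gpd.GeoDataFrame(
    {"species": ["Iguana iguana"]},
    geometry=[Point(5, 5)],
    crs="EPSG:4326",
)

# Assign each observation the id of the hexbin that contains it
joined = gpd.sjoin(obs, hexbins[["id", "geometry"]], how="left", predicate="within")
```

A left join keeps observations that fall outside every hexbin (their `id` comes back null), which makes filtering or QA straightforward.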
Purpose: Clip FLAM raster to AOI and create a transparent PNG overlay + bounds metadata for map display.
Called by: pipeline/quarterly.py (Step 2)
Inputs:
- `paths.flam_raster`
- `paths.aoi_shp`
- `flam_overlay.colormap`, `flam_overlay.alpha`
Outputs:
- Latest snapshot:
  - `derived.flam_overlay_png`
  - `derived.flam_meta_json`
- Archive:
  - `outputs/<run_id>/rasters/...`
Note: This step can be rerun at any time; it doesn’t depend on invasives downloads.
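The overlay generation can be sketched as: normalize the clipped raster, apply the configured colormap and alpha, make nodata fully transparent, and write both the PNG and the bounds metadata. Function and file names below are hypothetical; the real step also handles the AOI clip:

```python
import json
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

def write_overlay(arr, bounds, colormap="viridis", alpha=0.6,
                  png_path="flam_overlay.png", meta_path="flam_meta.json"):
    """Render a 2-D array to a transparent RGBA PNG plus bounds metadata."""
    # Scale values to 0..1 for the colormap, ignoring NaN (nodata)
    lo, hi = np.nanmin(arr), np.nanmax(arr)
    norm = (arr - lo) / (hi - lo) if hi > lo else np.zeros_like(arr)
    rgba = matplotlib.colormaps[colormap](norm)
    # Fixed alpha for data pixels; fully transparent where nodata
    rgba[..., 3] = np.where(np.isnan(arr), 0.0, alpha)
    plt.imsave(png_path, rgba)
    # Bounds let the map place the image, e.g. [[south, west], [north, east]]
    with open(meta_path, "w") as f:
        json.dump({"bounds": bounds}, f)
    return {"png": png_path, "meta": meta_path}
```

Writing the bounds alongside the PNG is what lets the dashboard georeference a plain image without reading the raster itself.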
Purpose: Enrich hexbins with spatial attributes and FLAM stats (dominant muni/segment, zonal mean, FLAM rank). This is used by the dashboard to compute priority hexbins.
Called by: pipeline/quarterly.py (Step 3)
Inputs:
- `paths.hexbins_shp`
- `paths.municipalities_shp`
- `paths.bay_segments_shp`
- `paths.flam_raster`
Outputs:
- Latest snapshot:
  - `derived.hexbins_v2_shp` (shapefile set)
- Archive:
  - `outputs/<run_id>/shp/...`
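The FLAM zonal statistics can be sketched as: average the raster pixels each hexbin covers, then rank hexbins by that mean. This pure-NumPy stand-in assumes the hexbins have already been rasterized into a zone-id grid aligned with the FLAM raster:

```python
import numpy as np

def zonal_mean_and_rank(values, zone_ids):
    """Mean raster value per zone, plus a rank per zone (1 = highest mean)."""
    means = {}
    for z in np.unique(zone_ids):
        means[int(z)] = float(np.nanmean(values[zone_ids == z]))
    # Rank zones from highest to lowest mean FLAM value
    order = sorted(means, key=means.get, reverse=True)
    ranks = {z: i + 1 for i, z in enumerate(order)}
    return means, ranks
```

The resulting per-hexbin means and ranks are the kind of attributes the dashboard can combine with observation counts to compute priority hexbins.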
Purpose: Generate summary “report card” PNGs (abundance/richness) from the latest invasives dataset.
Called by: pipeline/quarterly.py (Step 4)
Inputs:
- `derived.invasives_csv`
- `paths.aoi_shp`
- `paths.bay_segments_shp`
- `report_cards.*` parameters
Outputs:
- Archive:
  - `outputs/<run_id>/plots/report_cards/...png`
- Optional latest snapshot:
  - `derived.report_cards_dir/...png` (if enabled in config)
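An abundance report card can be sketched as a per-species count plot saved to PNG — the function name and styling are illustrative, not the project's actual plotting code:

```python
from collections import Counter
import matplotlib
matplotlib.use("Agg")  # headless rendering, as in a pipeline or CI run
import matplotlib.pyplot as plt

def abundance_report_card(species_column, out_png):
    """Bar chart of observation counts per species, saved as a PNG."""
    counts = Counter(species_column)
    names, values = zip(*counts.most_common())
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(names, values)
    ax.set_ylabel("Observations")
    ax.set_title("Invasive species abundance")
    fig.tight_layout()
    fig.savefig(out_png, dpi=150)
    plt.close(fig)
    return out_png
```

Richness plots follow the same shape, counting distinct species per area instead of observations per species.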
Purpose: The Dash “app factory.”
Purpose: The Dash “app factory.”
- `create_app(cfg)` loads:
  - derived invasives datasets (`derived.*`)
  - derived FLAM overlay (`derived.*`)
  - derived enriched hexbins (`derived.hexbins_v2_shp`)
  - boundary layers (`paths.*`)
- and returns a Dash `app` object.
Purpose: Development runner.
- Calls `create_app(cfg)` and runs a dev server.
Purpose: Production runner entrypoint (Gunicorn).
- Exposes `server = dash_app.server`.
- `project.name` — project identifier (informational)
- `requests.verify_ssl` — set `false` only if necessary behind corporate SSL interception
- `requests.ca_bundle` — optional path to a corporate CA bundle PEM
Input directories
- `paths.input_data_dir`
- `paths.shp_dir`, `paths.rasters_dir`, `paths.tables_dir`
Input files
- `paths.aoi_shp`
- `paths.municipalities_shp`
- `paths.bay_segments_shp`
- `paths.hexbins_shp`
- `paths.flam_raster`
- `paths.invasives_concern_csv`
Outputs
- `paths.derived_data_dir`
- `paths.outputs_dir`
- `derived.invasives_csv`, `derived.invasives_geojson`, `derived.flam_overlay_png`, `derived.flam_meta_json`, `derived.hexbins_v2_shp`, `derived.report_cards_dir` (optional convenience)
- `run.run_id` — set e.g. `"2025Q1"`; if null, a date-based id is used
- `run.archive_outputs` — write to `outputs/<run_id>/...`
- `run.update_derived` — update the `derived_data/...` latest snapshot
- `parameters.*` — download filters (min_year, HUCs, etc.)
- `flam_overlay.*` — colormap, alpha
- `hex_enrichment.*` — logging frequency, etc.
- `report_cards.*` — plotting options
- `app.*` — dev server host/port/debug (used by `run_app.py`)
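A condensed `config/settings.yaml` illustrating these keys — all values below are placeholders, not the project's actual settings:

```yaml
project:
  name: tbep_invasives
requests:
  verify_ssl: true
  ca_bundle: null
paths:
  input_data_dir: input_data
  derived_data_dir: derived_data
  outputs_dir: outputs
run:
  run_id: "2025Q1"       # null -> date-based id
  archive_outputs: true
  update_derived: true
parameters:
  min_year: 2010         # example download filter
flam_overlay:
  colormap: viridis
  alpha: 0.6
app:
  host: 127.0.0.1
  port: 8050
  debug: true
```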
```powershell
py -3.9 -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -U pip
python -m pip install -r requirements.txt
```

This repo includes manual GitHub Actions workflows used to validate the system on a clean Linux runner.
Runs the full quarterly pipeline end-to-end and uploads artifacts.
- Produces: `derived_data/` (latest snapshot) and `outputs/` (archived run outputs)
- Artifacts are downloadable from the workflow run summary.
How to run:
- GitHub → Actions → Quarterly Pipeline E2E → Run workflow
- Optionally provide `run_id` (e.g., `2025Q1`)
Validates Docker deployment by:
- Building the Docker image
- Running the quarterly pipeline inside the container
- Starting Gunicorn and verifying Dash endpoints respond
The workflow checks:
- `/`
- `/_dash-layout`
- `/_dash-dependencies`
A successful run indicates the containerized app is deployable and healthy.
Artifacts:
- `derived_data` (latest snapshot)
- `outputs` (including CI logs when enabled)
Contact: Bud Davis (bdavis@drummondcarpenter.com)