| 1 | +<div align="center"> |
| 2 | + <img src="assets/branding/zyra-logo.png" width="180" alt="Zyra Logo" /> |
| 3 | + <h1>Zyra: Modular, Reproducible Data Workflows for Science</h1> |
| 4 | + <p><em>An Open-Source Python Framework by NOAA Global Systems Laboratory</em></p> |
| 5 | + <p> |
| 6 | + <strong>Eric Hackathorn</strong> · NOAA GSL · |
| 7 | + <a href="https://orcid.org/0000-0002-9693-2093">ORCID</a> |
| 8 | + </p> |
| 9 | + <p> |
| 10 | + <a href="https://pypi.org/project/zyra/"><img src="https://img.shields.io/pypi/v/zyra?color=%231A5A69" alt="PyPI" /></a> |
| 11 | + <a href="https://noaa-gsl.github.io/zyra/"><img src="https://img.shields.io/badge/docs-Sphinx-blue" alt="Docs" /></a> |
| 12 | + <a href="https://doi.org/10.5281/zenodo.16923323"><img src="https://img.shields.io/badge/DOI-10.5281%2Fzenodo.16923323-blue" alt="DOI" /></a> |
| 13 | + <a href="https://github.com/NOAA-GSL/zyra"><img src="https://img.shields.io/github/license/NOAA-GSL/zyra" alt="License" /></a> |
| 14 | + </p> |
| 15 | +</div> |
| 16 | + |
| 17 | +--- |
| 18 | + |
| 19 | +## The Challenge |
| 20 | + |
| 21 | +Environmental and scientific workflows span heterogeneous data sources (HTTP, FTP, S3, APIs) and formats (GRIB2, NetCDF, GeoTIFF). They require repeatable transformation chains and produce diverse outputs -- static maps, animations, interactive pages, and datasets. Existing approaches often rely on ad-hoc scripts that break when data sources or formats change and that are hard to reproduce across teams and environments. |
| 22 | + |
| 23 | +**Zyra** provides a lightweight, CLI-first framework that standardizes common steps while remaining extensible for domain-specific logic. |
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +## The Pipeline: 8 Composable Stages |
| 28 | + |
| 29 | +```mermaid |
| 30 | +graph LR |
| 31 | + A["1. Import\n(acquire)"] --> B["2. Process\n(transform)"] |
| 32 | + B --> C["3. Simulate"] |
| 33 | + C --> D["4. Decide\n(optimize)"] |
| 34 | + D --> E["5. Visualize\n(render)"] |
| 35 | + E --> F["6. Narrate"] |
| 36 | + F --> G["7. Verify"] |
| 37 | + G --> H["8. Export\n(disseminate)"] |
| 38 | + |
| 39 | + style A fill:#00529E,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 40 | + style B fill:#2C670C,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 41 | + style C fill:#8B8985,stroke:#A3A29D,color:#FEFEFE,stroke-width:1px,stroke-dasharray:5 5 |
| 42 | + style D fill:#8B8985,stroke:#A3A29D,color:#FEFEFE,stroke-width:1px,stroke-dasharray:5 5 |
| 43 | + style E fill:#576216,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 44 | + style F fill:#1A5A69,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 45 | + style G fill:#5F9DAE,stroke:#00172D,color:#00172D,stroke-width:2px |
| 46 | + style H fill:#00529E,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 47 | +``` |
| 48 | + |
| 49 | +| Stage | Purpose | CLI | |
| 50 | +|-------|---------|-----| |
| 51 | +| **Import** | Fetch from HTTP/S, S3, FTP, REST API | `zyra acquire` | |
| 52 | +| **Process** | Decode, subset, convert (GRIB2, NetCDF, GeoTIFF) | `zyra process` | |
| 53 | +| **Simulate** | Generate synthetic/test data | *planned* | |
| 54 | +| **Decide** | Parameter optimization and selection | *planned* | |
| 55 | +| **Visualize** | Static maps, plots, animations, interactive | `zyra visualize` | |
| 56 | +| **Narrate** | AI-driven captions, summaries, reports | `zyra narrate` | |
| 57 | +| **Verify** | Quality checks and metadata validation | `zyra verify` | |
| 58 | +| **Export** | Push to S3, FTP, Vimeo, local, HTTP POST | `zyra export` | |
| 59 | + |
| 60 | +Stages are **composable** -- use only what you need. They also support **streaming** via stdin/stdout for Unix-style chaining: |
| 61 | + |
| 62 | +```bash |
| 63 | +zyra acquire http $URL -o - | zyra process convert-format - netcdf --stdout | zyra visualize heatmap --input - --var TMP -o plot.png |
| 64 | +``` |
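The same chain can also be driven from Python using only the standard library. This is a minimal sketch, assuming `zyra` is on `PATH`; `build_chain` and `run_chain` are hypothetical helper names for illustration, not part of Zyra's API, and the argument lists simply copy the shell command above.

```python
import subprocess
import sys


def build_chain(url):
    """Return the three commands from the shell example as argv lists."""
    return [
        ["zyra", "acquire", "http", url, "-o", "-"],
        ["zyra", "process", "convert-format", "-", "netcdf", "--stdout"],
        ["zyra", "visualize", "heatmap", "--input", "-", "--var", "TMP", "-o", "plot.png"],
    ]


def run_chain(commands):
    """Connect each command's stdout to the next command's stdin; return exit codes."""
    procs = []
    upstream = None
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        proc = subprocess.Popen(
            argv,
            stdin=upstream,
            # The last command writes to the terminal (or to its own -o target).
            stdout=None if is_last else subprocess.PIPE,
        )
        if upstream is not None:
            # Close our copy so the upstream process sees a broken pipe
            # if the downstream process exits early.
            upstream.close()
        upstream = proc.stdout
        procs.append(proc)
    return [proc.wait() for proc in procs]


# Self-check with two throwaway Python processes standing in for zyra.
producer = [sys.executable, "-c", "print('TMP')"]
consumer = [
    sys.executable,
    "-c",
    "import sys; sys.exit(0 if sys.stdin.read().strip() == 'TMP' else 3)",
]
print(run_chain([producer, consumer]))
```

With Zyra installed, `run_chain(build_chain(url))` would be the Python equivalent of the one-line pipeline; data flows directly between the processes, so nothing is buffered in the Python interpreter.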
| 65 | + |
| 66 | +--- |
| 67 | + |
| 68 | +## Use Case: HRRR Weather Model Processing |
| 69 | + |
| 70 | +Acquire the latest High-Resolution Rapid Refresh forecast, convert to NetCDF, and visualize temperature: |
| 71 | + |
| 72 | +```bash |
| 73 | +zyra acquire http https://noaa-hrrr-bdp-pds.s3.amazonaws.com/hrrr.20240101/conus/hrrr.t00z.wrfsfcf00.grib2 -o hrrr.grib2 |
| 74 | +zyra process convert-format hrrr.grib2 netcdf -o hrrr.nc |
| 75 | +zyra visualize heatmap --input hrrr.nc --var TMP --colorbar --output hrrr_temp.png |
| 76 | +``` |
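The command above hard-codes one run (1 January 2024, 00z cycle, forecast hour 0). To target a different run, the URL can be built from its parts. The sketch below assumes the path pattern generalizes from the single URL shown, with zero-padded cycle and forecast hours; `hrrr_surface_url` is a hypothetical helper, not a Zyra function.

```python
from datetime import date

# Bucket host copied from the example command above.
HRRR_BUCKET = "https://noaa-hrrr-bdp-pds.s3.amazonaws.com"


def hrrr_surface_url(run_date, cycle_hour=0, forecast_hour=0):
    """Build a CONUS surface-file URL, assuming the pattern of the example URL."""
    if not 0 <= cycle_hour <= 23:
        raise ValueError("cycle_hour must be between 0 and 23")
    if forecast_hour < 0:
        raise ValueError("forecast_hour must be non-negative")
    return (
        f"{HRRR_BUCKET}/hrrr.{run_date:%Y%m%d}/conus/"
        f"hrrr.t{cycle_hour:02d}z.wrfsfcf{forecast_hour:02d}.grib2"
    )


# Reproduces the URL used in the acquire step above.
print(hrrr_surface_url(date(2024, 1, 1)))
```

The resulting string can be passed to `zyra acquire http` in place of the literal URL.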
| 77 | + |
| 78 | +<div align="center"> |
| 79 | + <img src="assets/generated/heatmap.png" width="500" alt="Heatmap visualization" /> |
| 80 | + <br/><em>Heatmap rendered by <code>zyra visualize heatmap</code></em> |
| 81 | +</div> |
| 82 | + |
| 83 | +--- |
| 84 | + |
| 85 | +## Use Case: Drought Animation Pipeline |
| 86 | + |
| 87 | +A production workflow syncs weekly drought-risk frames from NOAA FTP, fills gaps, and composes a video -- all defined in a declarative YAML swarm manifest: |
| 88 | + |
| 89 | +```mermaid |
| 90 | +graph TD |
| 91 | + DL["download_frames\n(import / ftp-sync)"] --> SC["scan_frames\n(transform / metadata)"] |
| 92 | + SC --> FM["fill_missing\n(process / pad-missing)"] |
| 93 | + FM --> CA["compose_animation\n(visualize / compose-video)"] |
| 94 | + CA --> SL["save_local\n(export / local)"] |
| 95 | + |
| 96 | + style DL fill:#00529E,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 97 | + style SC fill:#2C670C,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 98 | + style FM fill:#2C670C,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 99 | + style CA fill:#576216,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 100 | + style SL fill:#00529E,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 101 | +``` |
| 102 | + |
| 103 | +```bash |
| 104 | +zyra swarm samples/swarm/drought_animation.yaml --parallel --memory provenance.sqlite |
| 105 | +``` |
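To illustrate what `--parallel` can and cannot do for this graph, the sketch below groups the agents from the diagram into batches whose members have no unmet dependencies. It uses Python's standard `graphlib` module and is only an illustration of dependency scheduling; how Zyra's swarm runner actually schedules agents is not shown in this document, and `execution_batches` is a hypothetical name.

```python
from graphlib import TopologicalSorter

# Each agent maps to the set of agents it depends on (edges from the diagram).
DEPENDENCIES = {
    "download_frames": set(),
    "scan_frames": {"download_frames"},
    "fill_missing": {"scan_frames"},
    "compose_animation": {"fill_missing"},
    "save_local": {"compose_animation"},
}


def execution_batches(dependencies):
    """Return lists of agents; agents within one list could run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # mark the whole batch as finished
    return batches


for batch in execution_batches(DEPENDENCIES):
    print(batch)
```

Because the drought graph is a straight chain, every batch holds a single agent; parallelism only pays off in manifests where several agents share the same upstream dependency.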
| 106 | + |
| 107 | +Each agent logs provenance (start time, duration, command, exit code) to a SQLite store for full reproducibility. |
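A provenance record of this kind can be sketched with the standard `sqlite3` module. The table layout, column names, and the `open_store` / `run_logged` helpers below are assumptions made for illustration; they are not Zyra's actual schema or API.

```python
import shlex
import sqlite3
import subprocess
import sys
import time

# Hypothetical schema holding the four fields named above, plus the agent name.
SCHEMA = """
CREATE TABLE IF NOT EXISTS provenance (
    agent      TEXT    NOT NULL,
    command    TEXT    NOT NULL,
    started_at REAL    NOT NULL,  -- Unix timestamp
    duration_s REAL    NOT NULL,
    exit_code  INTEGER NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open (or create) the provenance database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def run_logged(conn, agent, argv):
    """Run a command and record when it started, how long it took, and how it exited."""
    started_at = time.time()
    t0 = time.perf_counter()
    completed = subprocess.run(argv)
    duration = time.perf_counter() - t0
    conn.execute(
        "INSERT INTO provenance VALUES (?, ?, ?, ?, ?)",
        (agent, shlex.join(argv), started_at, duration, completed.returncode),
    )
    conn.commit()
    return completed.returncode


# Demo with a no-op Python process standing in for a zyra command.
store = open_store()
run_logged(store, "demo_agent", [sys.executable, "-c", "pass"])
print(store.execute("SELECT agent, exit_code FROM provenance").fetchall())
```

Storing the full command line alongside the exit code is what makes a run replayable later: the same string can be re-executed and its outcome compared with the logged one.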
| 108 | + |
| 109 | +--- |
| 110 | + |
| 111 | +## Use Case: AI/LLM Narration Swarm |
| 112 | + |
| 113 | +Zyra orchestrates multi-agent workflows where LLM-powered agents generate, critique, and refine narrative outputs: |
| 114 | + |
| 115 | +```mermaid |
| 116 | +graph TD |
| 117 | + UI["User Intent\n(natural language)"] --> PL["Planner\n(zyra plan)"] |
| 118 | + PL --> VE["Value Engine\n(suggest augmentations)"] |
| 119 | + VE --> DAG["Execution DAG\n(parallel / sequential)"] |
| 120 | + DAG --> A1["Stage Agent\n(acquire)"] |
| 121 | + DAG --> A2["Stage Agent\n(process)"] |
| 122 | + DAG --> A3["Stage Agent\n(visualize)"] |
| 123 | + DAG --> A4["LLM Agent\n(narrate)"] |
| 124 | + A1 --> PR["Provenance\n(SQLite)"] |
| 125 | + A2 --> PR |
| 126 | + A3 --> PR |
| 127 | + A4 --> PR |
| 128 | +
|
| 129 | + style UI fill:#50452C,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 130 | + style PL fill:#1A5A69,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 131 | + style VE fill:#FFC107,stroke:#00172D,color:#00172D,stroke-width:2px |
| 132 | + style DAG fill:#00529E,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 133 | + style A1 fill:#2C670C,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 134 | + style A2 fill:#2C670C,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 135 | + style A3 fill:#576216,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 136 | + style A4 fill:#1A5A69,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 137 | + style PR fill:#5F9DAE,stroke:#00172D,color:#00172D,stroke-width:2px |
| 138 | +``` |
| 139 | + |
| 140 | +The narration swarm chains **context**, **summary**, **critic**, and **editor** agents, each backed by configurable LLM providers: |
| 141 | + |
| 142 | +| Provider | Usage | |
| 143 | +|----------|-------| |
| 144 | +| OpenAI | `--provider openai --model gpt-4` | |
| 145 | +| Ollama | `--provider ollama --model gemma` | |
| 146 | +| Gemini | `--provider gemini` | |
| 147 | +| Mock | `--provider mock` (offline testing) | |
| 148 | + |
| 149 | +Outputs are validated against Pydantic schemas with optional guardrails (RAIL files) for structured, reproducible results. |
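The agent chain and the validation step can be sketched without any LLM access. In the example below, `mock_provider` plays the role of an offline provider, and a plain dataclass check stands in for the Pydantic schema so the snippet runs on the standard library alone. All names here (`Narration`, `mock_provider`, `narrate`) are hypothetical and do not reflect Zyra's internal classes.

```python
from dataclasses import dataclass


@dataclass
class Narration:
    """Structured result of one narration pass (stand-in for a Pydantic model)."""

    context: str
    summary: str
    critique: str
    final_text: str

    def __post_init__(self):
        # Minimal validation: every field must be a non-empty string.
        for name in ("context", "summary", "critique", "final_text"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")


def mock_provider(role, prompt):
    """Deterministic offline 'model': echoes the prompt tagged with the agent role."""
    return f"[{role}] {prompt}"


def narrate(intent, provider=mock_provider):
    """Chain the context, summary, critic, and editor agents in order."""
    context = provider("context", intent)
    summary = provider("summary", context)
    critique = provider("critic", summary)
    final_text = provider("editor", f"{summary} | {critique}")
    return Narration(context, summary, critique, final_text)


result = narrate("HRRR surface temperature, 00z run")
print(result.final_text)
```

Because each agent only sees the previous agent's output, swapping the provider callable changes the model without touching the chain, and a provider that returns empty text fails validation instead of producing a silent gap in the report.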
| 150 | + |
| 151 | +--- |
| 152 | + |
| 153 | +## Use Case: Reproducible Pipeline Configs |
| 154 | + |
| 155 | +Define multi-stage pipelines as YAML -- no scripting required: |
| 156 | + |
| 157 | +```yaml |
| 158 | +name: FTP to Local Video |
| 159 | +stages: |
| 160 | + - stage: acquire |
| 161 | + command: ftp |
| 162 | + args: |
| 163 | + path: ftp://ftp.nnvl.noaa.gov/SOS/DroughtRisk_Weekly |
| 164 | + sync_dir: ./frames |
| 165 | + since_period: "P1Y" |
| 166 | + - stage: visualize |
| 167 | + command: compose-video |
| 168 | + args: |
| 169 | + frames: ./frames |
| 170 | + output: video.mp4 |
| 171 | + fps: 4 |
| 172 | + - stage: export |
| 173 | + command: local |
| 174 | + args: |
| 175 | + input: video.mp4 |
| 176 | + path: /output/video.mp4 |
| 177 | +``` |
| 178 | + |
| 179 | +```bash |
| 180 | +zyra run pipeline.yaml # execute |
| 181 | +zyra run pipeline.yaml --dry-run # preview commands |
| 182 | +zyra run pipeline.yaml --set visualize.fps=8 # override parameters |
| 183 | +``` |
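The override syntax `stage.key=value` can be understood as a targeted edit to the parsed configuration. The sketch below mirrors the YAML above as a Python dict (to avoid a YAML dependency) and applies one override before printing a preview. `apply_override` and `preview` are hypothetical helpers; the real `--set` and `--dry-run` behavior may differ, and the preview lines are a plain summary, not actual Zyra command lines.

```python
import copy

# The pipeline from the YAML example, written as the dict a YAML parser would yield.
PIPELINE = {
    "name": "FTP to Local Video",
    "stages": [
        {
            "stage": "acquire",
            "command": "ftp",
            "args": {
                "path": "ftp://ftp.nnvl.noaa.gov/SOS/DroughtRisk_Weekly",
                "sync_dir": "./frames",
                "since_period": "P1Y",
            },
        },
        {
            "stage": "visualize",
            "command": "compose-video",
            "args": {"frames": "./frames", "output": "video.mp4", "fps": 4},
        },
        {
            "stage": "export",
            "command": "local",
            "args": {"input": "video.mp4", "path": "/output/video.mp4"},
        },
    ],
}


def apply_override(pipeline, override):
    """Return a copy of the pipeline with 'stage.key=value' applied."""
    target, raw_value = override.split("=", 1)
    stage_name, key = target.split(".", 1)
    value = int(raw_value) if raw_value.isdigit() else raw_value
    updated = copy.deepcopy(pipeline)  # leave the original config untouched
    matches = [s for s in updated["stages"] if s["stage"] == stage_name]
    if not matches:
        raise KeyError(f"no stage named {stage_name!r}")
    for stage in matches:
        stage["args"][key] = value
    return updated


def preview(pipeline):
    """Summarize each stage as 'stage command key=value ...' without running it."""
    lines = []
    for stage in pipeline["stages"]:
        args = " ".join(f"{k}={v}" for k, v in stage["args"].items())
        lines.append(f"{stage['stage']} {stage['command']} {args}")
    return lines


for line in preview(apply_override(PIPELINE, "visualize.fps=8")):
    print(line)
```

Returning a modified copy rather than editing in place keeps the file on disk as the single source of truth, so an override affects one run only.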
| 184 | + |
| 185 | +--- |
| 186 | + |
| 187 | +## Building Off the Foundation |
| 188 | + |
| 189 | +Zyra provides three layers of access -- from terminal commands to autonomous AI agents -- all sharing the same 8-stage pipeline architecture: |
| 190 | + |
| 191 | +```mermaid |
| 192 | +graph BT |
| 193 | + CLI["1. CLI\nzyra [command]"] --> API["2. Python API\nimport zyra"] |
| 194 | + API --> MCP["3. MCP + AI Agents\ntools/discover"] |
| 195 | +
|
| 196 | + style CLI fill:#2C670C,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 197 | + style API fill:#00529E,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 198 | + style MCP fill:#1A5A69,stroke:#00172D,color:#FEFEFE,stroke-width:2px |
| 199 | +``` |
| 200 | + |
| 201 | +| Layer | Description | |
| 202 | +|-------|-------------| |
| 203 | +| **CLI** | Scriptable, streaming commands via `stdin/stdout` for Unix-style pipeline composition | |
| 204 | +| **Python API** | Programmatic access via `import zyra` for custom modules and automated workflows | |
| 205 | +| **MCP + AI Agents** | Every pipeline stage exposed as an MCP tool for LLM agent discovery and execution | |
| 206 | + |
| 207 | +Whether invoked from bash, Python, REST (`zyra serve`), or an AI agent, every execution follows the same architecture with full provenance tracking. |
| 208 | + |
| 209 | +--- |
| 210 | + |
| 211 | +## Visualization Gallery |
| 212 | + |
| 213 | +<table> |
| 214 | + <tr> |
| 215 | + <td align="center"> |
| 216 | + <img src="assets/generated/heatmap.png" width="380" alt="Heatmap" /><br/> |
| 217 | + <code>zyra visualize heatmap</code> |
| 218 | + </td> |
| 219 | + <td align="center"> |
| 220 | + <img src="assets/generated/contour.png" width="380" alt="Contour" /><br/> |
| 221 | + <code>zyra visualize contour</code> |
| 222 | + </td> |
| 223 | + </tr> |
| 224 | + <tr> |
| 225 | + <td align="center"> |
| 226 | + <img src="assets/generated/vector.png" width="380" alt="Vector Field" /><br/> |
| 227 | + <code>zyra visualize vector</code> |
| 228 | + </td> |
| 229 | + <td align="center"> |
| 230 | + <img src="assets/generated/timeseries.png" width="380" alt="Time Series" /><br/> |
| 231 | + <code>zyra visualize timeseries</code> |
| 232 | + </td> |
| 233 | + </tr> |
| 234 | +</table> |
| 235 | + |
| 236 | +--- |
| 237 | + |
| 238 | +## Key Features |
| 239 | + |
| 240 | +- **Scientific formats**: GRIB2, NetCDF, GeoTIFF with xarray, cfgrib, rasterio |
| 241 | +- **Connectors**: HTTP/S, S3, FTP, REST API, Vimeo |
| 242 | +- **Visualization**: heatmaps, contours, vectors, particles, animations, interactive maps (Folium, Plotly) |
| 243 | +- **AI integration**: multi-agent narration swarm, planning engine, value engine, guardrails |
| 244 | +- **Provenance**: SQLite-based event logging for full reproducibility |
| 245 | +- **Service mode**: FastAPI REST API + MCP tools for LLM integration |
| 246 | +- **Modular extras**: `pip install "zyra[visualization]"`, `"zyra[processing]"`, `"zyra[llm]"`, or `"zyra[all]"` |
| 247 | +- **Python 3.10+** · **Apache 2.0** · **CLI-first** · **Streaming-friendly** |
| 248 | + |
| 249 | +--- |
| 250 | + |
| 251 | +## Get Started |
| 252 | + |
| 253 | +```bash |
| 254 | +pip install zyra # core |
| 255 | +pip install "zyra[all]" # everything |
| 256 | +zyra --help # explore commands |
| 257 | +``` |
| 258 | + |
| 259 | +| Resource | Link | |
| 260 | +|----------|------| |
| 261 | +| GitHub | [github.com/NOAA-GSL/zyra](https://github.com/NOAA-GSL/zyra) | |
| 262 | +| PyPI | [pypi.org/project/zyra](https://pypi.org/project/zyra/) | |
| 263 | +| Documentation | [noaa-gsl.github.io/zyra](https://noaa-gsl.github.io/zyra/) | |
| 264 | +| Wiki | [github.com/NOAA-GSL/zyra/wiki](https://github.com/NOAA-GSL/zyra/wiki) | |
| 265 | +| DOI | [10.5281/zenodo.16923323](https://doi.org/10.5281/zenodo.16923323) | |