feat(tutorials): ship tutorial data + downloadable example systems #706

Merged
FBumann merged 3 commits into main from feat/tutorial-data-fetch on Jun 16, 2026
Conversation

@FBumann

@FBumann FBumann commented Jun 16, 2026

Member

Problem

Students following the FlixOpt notebooks hit dead imports (from data.tutorial_data import ...) on a plain pip install flixopt — the tutorial data lives under docs/ (pruned from the package), so they had to clone the repo or copy files out of GitHub by hand.

Solution

A public fx.tutorials API that makes every notebook standalone, in two tiers:

  • get_data(name) — the synthetic datasets for notebooks 01–07. Pure numpy/pandas, no files, no network, works out of a bare install.
  • load_example(name) — the six realistic example systems for notebooks 08–09. Pre-built once, serialized with FlowSystem.to_netcdf, hosted on a GitHub release, and downloaded + cached + hash-verified via pooch. The heavy demandlib/pvlib/holidays profile generation no longer runs at user runtime. Gated behind a new flixopt[tutorials] extra.

list_data() / list_examples() and validation all derive from the DataName / ExampleName Literals, which are the single source of truth for the names (each name written exactly once; builders resolved by convention).
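
A minimal sketch of the single-source-of-truth pattern described above, assuming the dispatch works roughly as stated (each name listed once in a Literal, builders found through a `_get_<name>_data` naming convention). The `investment` name and both placeholder builders are made up for illustration; they are not the PR's actual datasets.

```python
from typing import Literal, get_args

# Hypothetical subset of names; the real Literal lists every dataset once.
DataName = Literal['heat_system', 'investment']


def _get_heat_system_data() -> dict:
    # Placeholder builder; the real one generates numpy/pandas profiles.
    return {'heat_demand': [30.0, 45.0, 40.0]}


def _get_investment_data() -> dict:
    # Placeholder builder for the made-up 'investment' name.
    return {'solar_profile': [0.0, 0.6, 0.2]}


def list_data() -> list[str]:
    """Names come straight from the Literal, so they are written only once."""
    return list(get_args(DataName))


def get_data(name: str) -> dict:
    """Validate against the Literal, then resolve the builder by convention."""
    if name not in get_args(DataName):
        raise ValueError(f'Unknown dataset {name!r}. Available: {list_data()}')
    return globals()[f'_get_{name}_data']()
```

With this layout, adding a dataset means adding one Literal entry and one matching builder function; the listing and the validation message pick it up without further changes.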

Contents

  • flixopt/tutorials/: get_data, load_example, list_data, list_examples, DataName, ExampleName.
  • pyproject.toml: tutorials extra (pooch); added to full/dev.
  • scripts/build_tutorial_datasets.py — builds the .nc artefacts + registry.txt.
  • .github/workflows/tutorial-data.yaml — manual workflow that builds and uploads the assets to the data release.
  • All 14 notebooks migrated to the new API; orphaned docs/notebooks/data/tutorial_data.py removed.
  • tests/test_tutorials.py.

Verification

  • tests/test_tutorials.py passes; ruff/pre-commit clean.
  • Full download path exercised against the live tutorial-data-v1 release: all six examples download, hash-verify, deserialize, and cache.

Rollout — done ✅

The tutorial-data-v1 release has been published with all seven assets (registry.txt + six .nc), so the 08–09 notebooks download successfully and Build documentation is green.

Review follow-ups (CodeRabbit)

  • 404 on tutorial-data-v1 assets (Critical/Major) — resolved by publishing the release above; those comments reflected the pre-publish state.
  • Workflow hardening (Major) — added persist-credentials: false to the checkout step (write-scoped job authenticates releases via explicit GH_TOKEN). Action version tags left as @v* to match the repo-wide convention; SHA-pinning is better handled as a separate repo-wide PR.

CI note

The only remaining red check is the pre-existing flaky test test_heatmap_reshape::test_with_irregular_data (module-level np.random + pytest-xdist ordering, ~0.7% fail rate) — unrelated to this PR. Fixed in #707; once that merges, rebasing this branch turns CI green.

Open question

get_data (in-memory) vs load_example (IO/network) is an intentional naming asymmetry. Happy to switch to load_data/load_example symmetry if preferred.

🤖 Generated with Claude Code

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Contributor


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ed328223-5c65-462f-a159-dcfc16198ef9

📥 Commits

Reviewing files that changed from the base of the PR and between bd36609 and 4e477d4.

📒 Files selected for processing (22)
  • .github/workflows/tutorial-data.yaml
  • docs/notebooks/02-heat-system.ipynb
  • docs/notebooks/03-investment-optimization.ipynb
  • docs/notebooks/04-operational-constraints.ipynb
  • docs/notebooks/05-multi-carrier-system.ipynb
  • docs/notebooks/06a-time-varying-parameters.ipynb
  • docs/notebooks/07-scenarios-and-periods.ipynb
  • docs/notebooks/08a-aggregation.ipynb
  • docs/notebooks/08b-rolling-horizon.ipynb
  • docs/notebooks/08c-clustering.ipynb
  • docs/notebooks/08c2-clustering-storage-modes.ipynb
  • docs/notebooks/08d-clustering-multiperiod.ipynb
  • docs/notebooks/08e-clustering-internals.ipynb
  • docs/notebooks/08f-clustering-segmentation.ipynb
  • docs/notebooks/09-plotting-and-data-access.ipynb
  • flixopt/__init__.py
  • flixopt/tutorials/__init__.py
  • flixopt/tutorials/_examples.py
  • flixopt/tutorials/_tutorial_data.py
  • pyproject.toml
  • scripts/build_tutorial_datasets.py
  • tests/test_tutorials.py
📝 Walkthrough

Walkthrough

Introduces flixopt.tutorials, a new subpackage exposing get_data(name) for synthetic notebook datasets and load_example(name) for pre-built FlowSystem examples downloaded via pooch from GitHub Releases. A build script and CI workflow generate and publish the .nc dataset artifacts. All 14 tutorial notebooks are migrated from local helper imports to the new API.
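
To illustrate what "downloaded, cached and hash-verified" means in practice, here is a standard-library sketch of that behaviour. It is a stand-in for the idea only, not pooch's implementation and not the code in `_examples.py`; `fetch_verified`, `sha256_of`, and the in-memory `registry` dict are hypothetical names.

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open('rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_verified(fname: str, base_url: str, registry: dict[str, str], cache_dir: Path) -> Path:
    """Return a local path to fname, downloading only if the cache is missing or stale."""
    target = cache_dir / fname
    expected = registry[fname]
    if target.exists() and sha256_of(target) == expected:
        return target  # cache hit: no network access needed
    cache_dir.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(base_url + fname) as response:
        target.write_bytes(response.read())
    if sha256_of(target) != expected:
        target.unlink()  # never leave an unverified file in the cache
        raise ValueError(f'Hash mismatch for {fname}')
    return target
```

The second call for the same file returns the cached path without touching the network, which is why the notebooks only pay the download cost once per machine.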

Changes

flixopt.tutorials API, build tooling, and notebook migrations

Layer / File(s) Summary
tutorials API: data module, example loader, package init, and tests
flixopt/tutorials/_tutorial_data.py, flixopt/tutorials/_examples.py, flixopt/tutorials/__init__.py, flixopt/__init__.py, pyproject.toml, tests/test_tutorials.py
_tutorial_data.py privatises individual get_*_data builder functions and adds DataName Literal, list_data(), and get_data(name) via globals() dispatch. _examples.py adds ExampleName, list_examples(), and load_example(name) backed by pooch with a hosted registry.txt for hash verification and an env-var base-URL override. tutorials/__init__.py re-exports both namespaces; flixopt/__init__.py adds tutorials to imports and __all__. pyproject.toml adds a new tutorials optional-dependency extra plus pooch >=1.8.0,<2 to full and dev. Tests verify module exposure, dataset loading, Literal alignment, and ValueError messages for unknown names.
Dataset build script and CI publish workflow
scripts/build_tutorial_datasets.py, .github/workflows/tutorial-data.yaml
build_tutorial_datasets.py enumerates examples via tutorials.list_examples(), constructs each FlowSystem via create_<name>_system, serialises to .nc, computes SHA-256, and writes registry.txt. The manual workflow_dispatch CI job verifies DATA_RELEASE matches the requested tag, runs the build script, creates the GitHub Release if absent, and uploads all artifacts with --clobber.
Tutorial notebook migrations
docs/notebooks/02-heat-system.ipynb, docs/notebooks/03-investment-optimization.ipynb, docs/notebooks/04-operational-constraints.ipynb, docs/notebooks/05-multi-carrier-system.ipynb, docs/notebooks/06a-time-varying-parameters.ipynb, docs/notebooks/07-scenarios-and-periods.ipynb, docs/notebooks/08a-aggregation.ipynb, docs/notebooks/08b-rolling-horizon.ipynb, docs/notebooks/08c-clustering.ipynb, docs/notebooks/08c2-clustering-storage-modes.ipynb, docs/notebooks/08d-clustering-multiperiod.ipynb, docs/notebooks/08e-clustering-internals.ipynb, docs/notebooks/08f-clustering-segmentation.ipynb, docs/notebooks/09-plotting-and-data-access.ipynb
All 14 tutorial notebooks remove imports of data.tutorial_data and data.generate_example_systems helpers and replace each data-loading or system-construction call with fx.tutorials.get_data(name) (notebooks 02–07) or fx.tutorials.load_example(name) (notebooks 08a–09).
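
The registry step of the build script summarised above can be sketched as follows. This assumes the plain two-column `<filename> <sha256>` layout that pooch's registry loader reads; `write_registry` is a hypothetical helper, and the real script may differ in naming and details.

```python
import hashlib
from pathlib import Path


def write_registry(artifact_dir: Path, registry_name: str = 'registry.txt') -> Path:
    """Write one '<filename> <sha256>' line per .nc file in artifact_dir."""
    lines = []
    for path in sorted(artifact_dir.glob('*.nc')):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f'{path.name} {digest}')
    registry = artifact_dir / registry_name
    registry.write_text('\n'.join(lines) + '\n')
    return registry
```

Because the hashes are computed from the exact bytes that get uploaded, any later change to a release asset shows up as a verification failure on the client side.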

Sequence Diagram

sequenceDiagram
  participant Notebook
  participant flixopt.tutorials
  participant _tutorial_data
  participant _examples
  participant pooch
  participant GitHubReleases

  rect rgba(100, 149, 237, 0.5)
    Note over Notebook,_tutorial_data: Synthetic dataset path (notebooks 02–07)
    Notebook->>flixopt.tutorials: get_data('heat_system')
    flixopt.tutorials->>_tutorial_data: get_data('heat_system')
    _tutorial_data->>_tutorial_data: validate DataName, call _get_heat_system_data()
    _tutorial_data-->>Notebook: dict{timesteps, heat_demand, gas_price}
  end

  rect rgba(60, 179, 113, 0.5)
    Note over Notebook,GitHubReleases: Pre-built example path (notebooks 08a–09)
    Notebook->>flixopt.tutorials: load_example('district_heating')
    flixopt.tutorials->>_examples: load_example('district_heating')
    _examples->>_examples: validate ExampleName
    _examples->>pooch: fetch('district_heating.nc')
    pooch->>GitHubReleases: GET district_heating.nc (if not cached)
    GitHubReleases-->>pooch: district_heating.nc
    pooch-->>_examples: local cache path
    _examples->>_examples: FlowSystem.from_netcdf(path)
    _examples-->>Notebook: FlowSystem
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 A new tutorials burrow, dug with care,
get_data and load_example waiting there.
No more local imports to hunt and seek —
fx.tutorials is all the help you need!
The notebooks rejoice, the CI sings,
Pooch fetches datasets on fluffy wings. 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 61.90% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The pull request title 'feat(tutorials): ship tutorial data + downloadable example systems' accurately and concisely summarizes the main change: introducing a tutorials API with both inline data and downloadable example systems. |
| Description check | ✅ Passed | PR description comprehensively covers problem, solution, contents, verification, and rollout status with clear examples and implementation details. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Notebook tutorial data was only reachable by cloning the repo or copying
files out of GitHub, so users hit dead imports on a plain `pip install`.

Add a public `fx.tutorials` API that makes every notebook standalone:

- `get_data(name)` returns the synthetic datasets for notebooks 01-07,
  generated from numpy/pandas with no files or network.
- `load_example(name)` downloads a pre-built FlowSystem for notebooks
  08-09 from the project's GitHub releases (cached + hash-verified via
  pooch), so the heavy demandlib/pvlib/holidays generation no longer runs
  at user runtime. Gated behind the new `flixopt[tutorials]` extra.

The dataset/example names live once each in the `DataName`/`ExampleName`
Literals; lists, validation and builder dispatch all derive from them.

Add `scripts/build_tutorial_datasets.py` and a manual `tutorial-data`
workflow to build and upload the example artefacts to the data release,
and migrate all notebooks to the new API.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@FBumann FBumann force-pushed the feat/tutorial-data-fetch branch from bd36609 to df34a1a Compare June 16, 2026 13:17
@FBumann FBumann changed the title feat(tutorials): ship tutorial data + downloadable example systems docs(tutorials): ship tutorial data + downloadable example systems Jun 16, 2026
@FBumann FBumann changed the title docs(tutorials): ship tutorial data + downloadable example systems feat(tutorials): ship tutorial data + downloadable example systems Jun 16, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/tutorial-data.yaml:
- Around line 26-37: In the tutorial-data.yaml workflow file, replace the three
mutable action version tags with pinned commit SHAs for security hardening:
change actions/checkout@v6 to use its full commit SHA, astral-sh/setup-uv@v7 to
use its full commit SHA, and actions/setup-python@v6 to use its full commit SHA.
Additionally, add persist-credentials: false as a parameter to the
actions/checkout step to disable credential persistence and reduce token
exposure risk for this write-scoped job.

In `@docs/notebooks/08a-aggregation.ipynb`:
- Line 63: The notebook at line 63 calls
fx.tutorials.load_example('district_heating'), which depends on data files
(district_heating.nc, registry.txt, and other referenced tutorial files) that
must be available through the tutorial-data-v1 release. Currently, these files
are returning HTTP 404 errors, meaning the release has not been published.
Before merging this PR, you must publish the tutorial-data-v1 release containing
all required tutorial data files (registry.txt, district_heating.nc,
operational.nc, simple.nc, complex.nc, seasonal_storage.nc, and multiperiod.nc)
so that the load_example function can successfully retrieve the data at runtime.

In `@flixopt/tutorials/_examples.py`:
- Around line 31-35: The DATA_RELEASE variable is currently set to
'tutorial-data-v1', but this GitHub release tag does not have the required
registry.txt asset published, causing 404 errors when load_example() tries to
download from the _DEFAULT_BASE_URL. Either publish the missing assets to the
'tutorial-data-v1' release tag on GitHub before merging, or temporarily update
the DATA_RELEASE variable to point to an existing release tag that already has
all required assets (registry.txt and data files) published.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8c4729a7-e182-409b-8af3-c2076d795936

📥 Commits

Reviewing files that changed from the base of the PR and between 82d6ee2 and bd36609.


Comment on lines +26 to +37
- uses: actions/checkout@v6
  with:
    fetch-depth: 0

- uses: astral-sh/setup-uv@v7
  with:
    version: "0.10.9"
    enable-cache: true

- uses: actions/setup-python@v6
  with:
    python-version: ${{ env.PYTHON_VERSION }}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Mutable action refs (should be SHA-pinned):"
rg -n '^\s*-\s*uses:\s*.+@v[0-9]+' .github/workflows/tutorial-data.yaml || true

echo
echo "Checkout hardening block:"
rg -n -A5 -B1 'actions/checkout@' .github/workflows/tutorial-data.yaml

echo
echo "Expected: no @v* refs, and persist-credentials: false present under checkout."

Repository: flixOpt/flixopt

Length of output: 482


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full workflow file to check permissions
cat -n .github/workflows/tutorial-data.yaml | head -50

Repository: flixOpt/flixopt

Length of output: 2057


Pin action revisions to commit SHAs and disable checkout credential persistence in this write-scoped job.

This workflow has permissions: contents: write at the job level. The three action references use mutable @v* version tags rather than pinned commit SHAs:

  • actions/checkout@v6 (line 26)
  • astral-sh/setup-uv@v7 (line 30)
  • actions/setup-python@v6 (line 35)

Additionally, the checkout action does not set persist-credentials: false. For a write-scoped job, pin each action to a full commit SHA and disable credential persistence to reduce supply-chain and token exposure risk.
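
For a repo-wide follow-up, the same check as the `rg` pattern in the analysis above can be sketched in Python. `find_unpinned` is a hypothetical helper, and the pattern only covers `- uses:` steps whose ref is a `v<number>` tag, which is an assumption about how the workflows are written.

```python
import re

# A `- uses:` step whose ref is a version tag such as @v6 rather than a commit SHA.
UNPINNED = re.compile(r'^\s*-\s*uses:\s*(\S+)@(v\d+\S*)\s*$')


def find_unpinned(workflow_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, action, tag) for every tag-pinned `uses:` step."""
    hits = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = UNPINNED.match(line)
        if match:
            hits.append((number, match.group(1), match.group(2)))
    return hits
```

Steps pinned to a full 40-character commit SHA do not match, since a hex digest never starts with `v`.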

🧰 Tools
🪛 zizmor (1.25.2)

[warning] 26-28: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 26-26: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 30-30: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 35-35: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


Source: Linters/SAST tools

"from data.generate_example_systems import create_district_heating_system\n",
"\n",
"flow_system = create_district_heating_system()\n",
"flow_system = fx.tutorials.load_example('district_heating')\n",


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

BASE_URL="https://github.com/flixOpt/flixopt/releases/download/tutorial-data-v1"
files=(
  "registry.txt"
  "district_heating.nc"
  "operational.nc"
  "simple.nc"
  "complex.nc"
  "seasonal_storage.nc"
  "multiperiod.nc"
)

for f in "${files[@]}"; do
  code=$(curl -s -o /dev/null -w "%{http_code}" -L "${BASE_URL}/${f}")
  echo "${f} -> HTTP ${code}"
done

Repository: flixOpt/flixopt

Length of output: 247


Tutorial release assets are missing; publish tutorial-data-v1 release before merging.

Line 63 uses fx.tutorials.load_example('district_heating'), which requires district_heating.nc and registry.txt from the tutorial-data-v1 release. Verification confirms all tutorial data files (registry.txt, district_heating.nc, operational.nc, simple.nc, complex.nc, seasonal_storage.nc, multiperiod.nc) are currently unavailable (HTTP 404). The notebook will fail at runtime if this release is not published before merge.


Comment on lines +31 to +35
DATA_RELEASE = 'tutorial-data-v1'

_BASE_URL_ENV = 'FLIXOPT_DATA_BASE_URL' # override for testing / self-hosting
_DEFAULT_BASE_URL = f'https://github.com/flixOpt/flixopt/releases/download/{DATA_RELEASE}/'
_REGISTRY_FILENAME = 'registry.txt'


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

DATA_RELEASE currently resolves to missing assets (404), breaking notebook execution.

The docs pipeline is failing on registry.txt download for tutorial-data-v1, so load_example()-backed notebooks cannot run until that release is populated. Publish assets for this tag before merge (or temporarily point to an existing populated tag).


Source: Pipeline failures

The tutorial-data job is `contents: write` and authenticates its release
steps via an explicit GH_TOKEN, so the checkout action's persisted git
credentials are unnecessary. Set persist-credentials: false to reduce token
exposure (per CodeRabbit review on #706). Action version tags left as-is to
match the repo-wide convention.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@FBumann

FBumann commented Jun 16, 2026

Member Author

@PStange This should make the tutorials easier to use. It also exposes the pre-built systems more generally.

@FBumann FBumann enabled auto-merge (squash) June 16, 2026 13:56
@FBumann FBumann merged commit c9652bc into main Jun 16, 2026
11 checks passed