
Commit 0618914 (1 parent: 6176535)

review preparation, many changes see the changelog

26 files changed: +697 −286 lines
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+name: Build and Publish Docker Images
+
+on:
+  push:
+    branches: [main, update]
+    paths:
+      - 'docker/**'
+  release:
+    types: [published]
+  workflow_dispatch:
+
+env:
+  REGISTRY: ghcr.io
+  IMAGE_NAME: ${{ github.repository }}
+
+jobs:
+  build-and-push:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    strategy:
+      matrix:
+        component: [hydromt, surrogate, wflow]
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Log in to GitHub Container Registry
+        uses: docker/login-action@v3
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Extract metadata
+        id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/${{ matrix.component }}
+          tags: |
+            type=raw,value=latest,enable={{is_default_branch}}
+            type=sha
+
+      - name: Build and push Docker image
+        uses: docker/build-push-action@v5
+        with:
+          context: ./docker/${{ matrix.component }}
+          file: ./docker/${{ matrix.component }}/Dockerfile
+          push: true
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
+          platforms: linux/amd64,linux/arm64
+          cache-from: type=gha
+          cache-to: type=gha,mode=max

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@
 .env*
 tests/*.z*
 
+scripts/
+
+.secrets.baseline
 /openeo/hydromt/hydromt-output/
 /openeo/wflow/wflow-output/
 /openeo/surrogate/surrogate-output/

CHANGELOG.md

Lines changed: 19 additions & 0 deletions
@@ -5,6 +5,25 @@
 Since we are working on 1 branch now and my commit messages are getting \
 ridiculously long, I thought it would be a good idea to start a changelog.
 
+## 01/11/2025 preparation for the review
+
+### Changes
+
+- Move old stuff to the archive
+- Rework the README to reflect the current state of the project
+- Update the environment.yaml to include all necessary packages for local development
+- Add `example/usecase.ipynb` with a simple OpenEO workflow to run the use case
+- Add a GitHub Action to build and push the Docker images
+- Add documentation for the new features and changes
+- Update the labels of the Docker images in the Dockerfiles
+
+### In progress
+
+- Final fixes for the use case containers
+- Final testing and preparation of the use case example
+- Final review of the documentation
+- SQAaaS integration after this update
+
 ## 10/07/2024 Reworking the project structure in preparation for OpenEO demo
 
 ### Fixes

README.md

Lines changed: 57 additions & 18 deletions
@@ -4,6 +4,21 @@
 
 ## Table of Contents
 
+- [Introduction](#introduction)
+- [Repository structure](#repository-structure)
+- [Environment setup](#environment-setup)
+- [Use case components](#use-case-components)
+  - [HydroMT](#hydromt)
+  - [Wflow](#wflow)
+  - [Surrogate model based on ItwinAI](#surrogate-model-based-on-itwinai)
+- [OSCAR](#oscar)
+- [Running the use case using openEO and OSCAR](#running-the-use-case-using-openeo-and-oscar)
+- [openEO OSCAR integration](#openeo-oscar-integration)
+- [Tests](#tests)
+- [License](#license)
+- [Project framework](#project-framework)
+
 ## Introduction
 
 HyDroForM stands for "Hydrological Drought Forecasting Model with HydroMT and Wflow". It is a Digital Twin for Drought Early Warning in the Alps developed as a use case for the [InterTwin project](https://www.intertwin.eu/). The details of the use case are also available online [here](https://www.intertwin.eu/intertwin-use-case-a-digital-twin-for-drought-early-warning-in-the-alps).
@@ -19,7 +34,17 @@ InterTwin components used in this use case are:
 - [ItwinAI](https://github.com/interTwin-eu/itwinai)
 - [OSCAR](https://github.com/grycap/oscar)
 - [Hython](https://github.com/interTwin-eu/hython)
-- [InterLink](https://github.com/interTwin-eu/interLink)
+
+## Repository structure
+
+- `Archive`: contains older versions and CWL descriptions of the use case
+- `docker`: contains the Dockerfiles and scripts to build and run the components of the use case
+- `docs`: documentation and images
+- `example`: example files for running the use case
+- `OSCAR`: contains the OSCAR deployment files for the use case
+- `scripts`: helper scripts used during the development of the use case
+- `tests`: scripts to test the components of the use case
+- `environment.yaml`: conda environment file to set up the development environment
 
 ## Environment setup
 
@@ -30,10 +55,6 @@ conda env create -f environment.yaml
 conda activate hydroform
 ```
 
-## TODO: Use case diagram
-
-## TODO: System design diagram
-
 ## Use case components
 
 There are **three main components** in the HyDroForM use case:
@@ -42,27 +63,41 @@ There are **three main components** in the HyDroForM use case:
 
 HydroMT (Hydro Model Tools) is an open-source Python package that facilitates the process of building and analyzing spatial geoscientific models with a focus on water system models. It does so by automating the workflow to go from raw data to a complete model instance which is ready to run and to analyse model results once the simulation has finished. HydroMT builds on the latest packages in the scientific and geospatial python eco-system including xarray, rasterio, rioxarray, geopandas, scipy and pyflwdir. Source: [Deltares HydroMT](https://deltares.github.io/hydromt/latest/)
 
-#### Running HydroMT
-
-To run HydroMT from start to finish you can use the `validation` script which is located in `/docker/hydromt/validation.sh`. This script will run the HydroMT validation test which includes the following steps:
+### Wflow
 
-1. Update the configuration file of HydroMT
-2. Run HydroMT using the configuration file
-3. Convert the output Wflow configuration file to lowercase letters
-4. Wrap the outputs into STAC collections
+Wflow is Deltares’ solution for modelling hydrological processes, allowing users to account for precipitation, interception, snow accumulation and melt, evapotranspiration, soil water, surface water and groundwater recharge in a fully distributed environment. Successfully applied worldwide for analyzing flood hazards, drought, climate change impacts and land use changes, wflow is growing to be a leader in hydrology solutions. Wflow is conceived as a framework, within which multiple distributed model concepts are available, which maximizes the use of open earth observation data, making it the hydrological model of choice for data scarce environments. Based on gridded topography, soil, land use and climate data, wflow calculates all hydrological fluxes at any given grid cell in the model at a given time step.
 
-### Wflow
+Source: [Deltares Wflow](https://deltares.github.io/Wflow.jl/stable/)
 
-Wflow is Deltares’ solution for modelling hydrological processes, allowing users to account for precipitation, interception, snow accumulation and melt, evapotranspiration, soil water, surface water and groundwater recharge in a fully distributed environment. Successfully applied worldwide for analyzing flood hazards, drought, climate change impacts and land use changes, wflow is growing to be a leader in hydrology solutions. Wflow is conceived as a framework, within which multiple distributed model concepts are available, which maximizes the use of open earth observation data, making it the hydrological model of choice for data scarce environments. Based on gridded topography, soil, land use and climate data, wflow calculates all hydrological fluxes at any given grid cell in the model at a given time step. Source: [Deltares Wflow](https://deltares.github.io/Wflow.jl/stable/)
+### Surrogate model based on ItwinAI
 
-#### Running Wflow
+`itwinai` is a Python toolkit designed to help scientists and researchers streamline AI and machine learning workflows, specifically for digital twin applications. It provides easy-to-use tools for distributed training, hyper-parameter optimization on HPC systems, and integrated ML logging, reducing engineering overhead and accelerating research. Developed primarily by CERN, in collaboration with Forschungszentrum Jülich (FZJ), itwinai supports modular and reusable ML workflows, with the flexibility to be extended through third-party plugins, empowering AI-driven scientific research in digital twins.
 
-### TODO: Surrogate model
+Source: [ItwinAI](https://github.com/interTwin-eu/itwinai)
 
 ## OSCAR
 
 OSCAR is an open-source platform to support the event-driven serverless computing model for data-processing applications. It can be automatically deployed on multi-Clouds, and even on low-powered devices, to create highly-parallel event-driven data-processing serverless applications along the computing continuum. These applications execute on customized runtime environments provided by Docker containers that run on elastic Kubernetes clusters. It is also integrated with the SCAR framework, which supports a High Throughput Computing Programming Model to create highly-parallel event-driven data-processing serverless applications that execute on customized runtime environments provided by Docker containers run on AWS Lambda and AWS Batch. [OSCAR](https://github.com/grycap/oscar)
 
+## Running the use case using openEO and OSCAR
+
+The `OSCAR` directory contains the files necessary to deploy the use case on the OSCAR platform. There are two main components to do so: a bash script and a YAML service definition file.
+
+These can be found in the respective subdirectories:
+`OSCAR/oscar_hydromt`, `OSCAR/oscar_wflow`, and `OSCAR/oscar_surrogate`
+
+To run the use case we have created a sample Jupyter notebook, `example/usecase.ipynb`, which drives the workflow through the openEO API.
+
+The example shows how the three components are linked together to create a drought forecasting workflow.
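The notebook chains three OSCAR-backed steps into one workflow. A rough sketch of what the resulting openEO-style process graph could look like is below; the node ids, the `run_oscar` process id, and the argument names are illustrative assumptions, not the actual definitions used by the backend:

```python
def run_oscar_node(node_id, service, input_stac):
    """Build one openEO-style process-graph node for a hypothetical
    `run_oscar` process wrapping an OSCAR service."""
    return {
        node_id: {
            "process_id": "run_oscar",  # assumed process id
            "arguments": {"service": service, "input_stac": input_stac},
        }
    }


def build_use_case_graph():
    """Chain HydroMT -> Wflow -> surrogate; each step consumes the STAC
    collection URL produced by the previous node via a from_node reference."""
    graph = {}
    graph.update(run_oscar_node("hydromt1", "oscar_hydromt", None))
    graph.update(run_oscar_node("wflow1", "oscar_wflow", {"from_node": "hydromt1"}))
    graph.update(run_oscar_node("surrogate1", "oscar_surrogate", {"from_node": "wflow1"}))
    graph["surrogate1"]["result"] = True  # the final node's output is returned
    return graph


graph = build_use_case_graph()
```

The `{"from_node": ...}` references follow the openEO process-graph convention for wiring one node's output into the next node's arguments.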
+
+## openEO OSCAR integration
+
+The integration of openEO with OSCAR is implemented in `openeo-processes-dask`, the dask/xarray implementation of openEO processes. The openEO backend is the main orchestration component of the use case and is responsible for managing the execution of the different components on OSCAR.
+
+The backend now uses the `oscar_python` library to submit tasks to OSCAR from the process graph.
+
+When the process graph is executed, the `run_oscar` process authenticates with the OSCAR platform, validates the service definition file and submits the job to OSCAR. The process then monitors the job status and retrieves the results once the job is completed. If the service definition references a service not yet registered in OSCAR, it is created on the fly. The process parameters are passed as environment variables to the container where the scripts are executed. The results are stored as STAC collections and returned to openEO as a string URL to the collection.
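The submit-and-poll behaviour described above (ensure the service exists, submit, monitor, return the STAC URL) can be sketched as follows. This is a simplified stand-in, not the real implementation: the `client` object and its method names are assumptions, not the actual `oscar_python` API.

```python
import time


def run_oscar(client, service_def, poll_interval=1.0, timeout=60.0):
    """Sketch of run_oscar: ensure the service exists, submit a job,
    poll its status, and return the STAC collection URL as a string."""
    if not client.has_service(service_def["name"]):
        # A service missing from OSCAR is created on the fly.
        client.create_service(service_def)
    # Process parameters travel to the container as environment variables.
    job_id = client.submit(service_def["name"], env=service_def.get("env", {}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.status(job_id)
        if status == "completed":
            return client.result(job_id)  # string URL to the STAC collection
        if status == "failed":
            raise RuntimeError(f"OSCAR job {job_id} failed")
        time.sleep(poll_interval)
    raise TimeoutError(f"OSCAR job {job_id} did not finish within {timeout}s")


class _FakeClient:
    """Toy client used only to demonstrate the control flow above."""

    def __init__(self):
        self._polls = 0

    def has_service(self, name):
        return False

    def create_service(self, service_def):
        self.created = service_def["name"]

    def submit(self, name, env):
        return "job-1"

    def status(self, job_id):
        self._polls += 1
        return "completed" if self._polls >= 2 else "running"

    def result(self, job_id):
        return "https://stac.example.org/collections/demo"


url = run_oscar(_FakeClient(), {"name": "oscar_wflow"}, poll_interval=0.01)
```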
+
 ## Tests
 
 The components of the use case are set up in `Docker containers`. We have a set of scripts available to build and run the base images. These can be found in the `/tests` directory and can be run from the `root` directory of the repository.
@@ -73,8 +108,12 @@ For example:
 ./tests/test_hydromt.sh
 ```
 
-## TODO: Use case demonstration
-
 ## License
 
 This project is licensed under the Apache 2.0 - see the [LICENSE](LICENSE) file for details.
+
+## Project framework
+
+interTwin is an EU-funded project with the goal to co-design and implement the prototype of an interdisciplinary Digital Twin Engine – an open source platform based on open standards that offers the capability to integrate with application-specific Digital Twins.
+
+interTwin is funded by the European Union Grant Agreement Number 101058386

docker/dummy/Dockerfile

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+FROM python:3.13-slim
+
+WORKDIR /app
+
+COPY script.py .

docker/dummy/script.py

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+import os
+import time
+import logging
+
+logging.basicConfig(
+    level=logging.INFO,
+    format="%(asctime)s - %(levelname)s - %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+)
+
+logger = logging.getLogger(__name__)
+
+
+def main():
+    # Read environment variables (simulate required inputs)
+    input1 = os.getenv("DUMMY_INPUT1", "default1")
+    input2 = os.getenv("DUMMY_INPUT2", "default2")
+    input_stac = os.getenv("INPUT_STAC", "default_stac")
+    logger.info(f"Received inputs: DUMMY_INPUT1={input1},\n DUMMY_INPUT2={input2},\n INPUT_STAC={input_stac}")
+
+    # Simulate processing
+    logger.info("Simulating processing...")
+    time.sleep(2)
+
+    # Return a fixed URL as output
+    output_url = "https://stac.intertwin.fedcloud.eu/collections/8db57c23-4013-45d3-a2f5-a73abf64adc4_WFLOW_FORCINGS_STATICMAPS"
+    logger.info(f"STAC OUTPUT URL {output_url}")
+
+if __name__ == "__main__":
+    main()

docker/hydromt/Dockerfile

Lines changed: 7 additions & 3 deletions
@@ -1,8 +1,8 @@
 FROM python:3.10-bullseye AS build
 
-LABEL version="EC Demo Review"
+LABEL version="1.0"
 LABEL description="Hydromt Docker image for building and updating hydromt models"
-LABEL maintainer="Juraj Zvolensky"
+LABEL maintainer="Juraj Zvolensky, Iacopo Ferrario"
 LABEL organization="Eurac Research"
 
 WORKDIR /hydromt
@@ -28,6 +28,8 @@ RUN cd .
 
 RUN pip uninstall -y pydantic && pip install pydantic==2.8.2 openeo_pg_parser_networkx==2024.10.0
 
+RUN pip install pystac==1.14.0
+
 ##################### HydroMT Components setup #####################
 
 RUN mkdir -p /hydromt/output /hydromt/data
@@ -39,13 +41,15 @@ COPY data_catalog.yaml /hydromt/data_catalog.yaml
 COPY stac.py /hydromt/stac.py
 COPY config_gen.py /hydromt/config_gen.py
 COPY convert_lowercase.py /hydromt/convert_lowercase.py
+COPY decode_keys.sh /hydromt/decode_keys.sh
 COPY build.sh /hydromt/build.sh
 ##################### Set executables #####################
 
 RUN chmod +x /hydromt/stac.py \
     /hydromt/config_gen.py \
    /hydromt/build.sh \
-    /hydromt/convert_lowercase.py
+    /hydromt/convert_lowercase.py \
+    /hydromt/decode_keys.sh
 
 FROM python:3.10-bullseye
