This program is a machine learning model designed to predict flood inundation coverage over the INFLOW study area. It leverages satellite data, transformer models, and Monte Carlo simulations to generate 2-month predictions with 95% confidence intervals. It also automates the processing, normalisation, and visualisation of data, providing actionable insights into flood dynamics in the White Nile basin.
Inundation masks of the White Nile basin from the MODIS satellite mission.
- Python 3.11 or higher
- Required Python libraries:
```
datetime==5.5
geopandas==1.0.1
h5py==3.12.1
loguru==0.7.3
matplotlib==3.10.0
netCDF4==1.7.2
numpy==1.26.4
pandas==2.2.2
pathlib==1.0.1
py_hydroweb==1.0.2
rasterio==1.4.3
requests==2.32.3
scikit-learn==1.6.1
scipy==1.13.1
tensorflow==2.18.0
tqdm>=4.66.1
typer==0.15.1
wget==3.2
xarray==2025.1.1
```
- Operating system: Windows, macOS, or Linux
- Clone the repository:
```
git clone https://github.com/your-repository/flood-inundation-prediction.git
cd flood-inundation-prediction
```
- Download the data:
  - Data can be downloaded from the Google Drive
  - Save the folder as `data` in the parent directory
- Set up a virtual environment (optional but recommended):
```
python3 -m venv env
source env/bin/activate  # On Windows, use env\Scripts\activate
```
- Install the required libraries:
```
pip install -r requirements.txt
```
- Data Preparation: Ensure your input data file `temporal_data_seasonal_df.csv` is correctly formatted and located in the designated folder (`/data` by default).
- Run the Program: Use the following command to start the program:
```
python __main__.py
```
- Visualisation: After execution, view the generated graphs and reports in the `/output` directory.
- Data Ingestion:
- The program reads temporal and baseline data from CSV files.
- Data Normalisation:
- Data is scaled to ensure compatibility with machine learning models.
- Model Training:
- Trains a regression model on the baseline data.
- Monte Carlo Simulation:
- Generates predictions and confidence intervals based on 10,000 iterations.
- Visualisation:
- Produces heatmaps and time-series plots of flood inundation predictions.
- Output:
- Saves prediction data and visualisations in the `/output` directory.
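As a sketch of the Monte Carlo step above: the per-iteration forecasts can be summarised into a mean prediction and a 95% band by taking percentiles across iterations. This is an illustrative NumPy formulation with synthetic data, not the program's exact aggregation; the helper name `monte_carlo_ci` and the 8-step horizon are assumptions.

```python
import numpy as np

def monte_carlo_ci(samples: np.ndarray, alpha: float = 0.05):
    """Summarise Monte Carlo samples of shape (n_iterations, n_steps)
    into a mean forecast and a (1 - alpha) percentile confidence band."""
    mean = samples.mean(axis=0)
    lower = np.percentile(samples, 100 * alpha / 2, axis=0)
    upper = np.percentile(samples, 100 * (1 - alpha / 2), axis=0)
    return mean, lower, upper

# Illustrative use: 10,000 simulated forecasts over an assumed 8-step horizon
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=0.2, size=(10_000, 8))
mean, lower, upper = monte_carlo_ci(samples)  # 95% band brackets the mean
```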
This model is a Temporal Transformer-based Recurrent Model designed for sequence prediction. It leverages the transformer architecture for capturing long-range dependencies and incorporates Monte Carlo Dropout for uncertainty estimation. Below is a detailed breakdown of its components and functionality.
- Input: The model takes in sequences of temporal data, where each sequence consists of multiple features. The input shape is `(seq_len, num_features)`, where `seq_len` is the length of the input sequence (e.g., the number of time steps) and `num_features` is the number of features at each time step.
- Positional Encoding:
- The model incorporates positional encoding to capture the order of the time steps in the input sequence. This encoding is added to the input data before being fed into the transformer encoder. The positional encoding is generated using a sinusoidal function, which is common in transformer models for sequence processing.
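The standard sinusoidal encoding can be sketched in NumPy as follows; the model builds this inside its own layers, and the `seq_len`/`d_model` values here are only illustrative.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard transformer positional encoding:
    PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    d_model is assumed even."""
    positions = np.arange(seq_len)[:, None]                       # (seq_len, 1)
    div = np.power(10000.0, np.arange(0, d_model, 2) / d_model)   # (d_model/2,)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(positions / div)   # even channels: sine
    pe[:, 1::2] = np.cos(positions / div)   # odd channels: cosine
    return pe

pe = sinusoidal_positional_encoding(seq_len=24, d_model=16)
# pe is added element-wise to the sequence inputs before the encoder
```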
- Transformer Encoder:
- The core of the model is the Transformer Encoder, which processes the sequential input data. The encoder consists of multi-head self-attention layers and feed-forward layers, both equipped with dropout regularisation and layer normalisation.
- The attention mechanism allows the model to focus on different parts of the sequence when making predictions, and the feed-forward layers learn non-linear relationships.
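For intuition, a single-head scaled dot-product self-attention step can be sketched in NumPy as below; the actual model uses Keras multi-head attention with learned projections, which this deliberately omits.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilised softmax
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(1)
x = rng.normal(size=(6, 8))                        # toy (seq_len, d_model) input
out, w = scaled_dot_product_attention(x, x, x)     # self-attention: Q = K = V
```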
- Feed-forward Network:
- After the transformer encoder, a feed-forward network is applied, consisting of dense layers with ReLU activations. This network helps the model to learn more complex patterns in the data.
- Output:
- The final output layer produces `predict_ahead` predictions, which represent the forecasted values for the next `predict_ahead` time steps. The output is a dense layer with no activation function (i.e., linear output).
The model uses a custom loss function that combines the Mean Squared Error (MSE) with two additional penalties:
- Sign Penalty: A penalty term that penalises predictions where the sign of the predicted values does not match the sign of the true values. This helps the model to maintain consistency in the directionality of the predictions.
- Sum Penalty: A penalty that ensures the sum of predicted values closely matches the sum of true values across all time steps in the sequence. This is particularly useful in temporal data where the overall trend or aggregate behavior of the series is important.
The final loss is calculated as a weighted combination of the MSE, sign penalty, and sum penalty.
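One plausible NumPy formulation of this combined loss is sketched below; the exact penalty definitions and the weights `w_sign` and `w_sum` are assumptions, not the project's actual values.

```python
import numpy as np

def combined_loss(y_true, y_pred, w_sign=0.1, w_sum=0.1):
    """MSE plus a sign-mismatch penalty and a sum-mismatch penalty.
    Weights are illustrative placeholders."""
    mse = np.mean((y_true - y_pred) ** 2)
    # fraction of steps whose predicted sign disagrees with the true sign
    sign_penalty = np.mean(np.sign(y_true) != np.sign(y_pred))
    # squared gap between the aggregates of predicted and true values
    sum_penalty = (np.sum(y_pred) - np.sum(y_true)) ** 2
    return mse + w_sign * sign_penalty + w_sum * sum_penalty

y = np.array([1.0, -2.0, 3.0])
perfect = combined_loss(y, y)            # 0.0: no error, no penalties
imperfect = combined_loss(y, np.zeros(3))
```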
- Data Preparation: The data is preprocessed into overlapping sequences with a specified `look_back` window (past observations) and a `predict_ahead` horizon (future steps to predict).
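This windowing can be sketched as follows for a 1-D series; the helper name `make_sequences` is illustrative, not the project's function.

```python
import numpy as np

def make_sequences(series: np.ndarray, look_back: int, predict_ahead: int):
    """Slice a series into overlapping (look_back -> predict_ahead)
    input/target windows for supervised sequence learning."""
    X, y = [], []
    for i in range(len(series) - look_back - predict_ahead + 1):
        X.append(series[i : i + look_back])                          # inputs
        y.append(series[i + look_back : i + look_back + predict_ahead])  # targets
    return np.array(X), np.array(y)

series = np.arange(10.0)
X, y = make_sequences(series, look_back=3, predict_ahead=2)
# X.shape == (6, 3), y.shape == (6, 2); e.g. X[0] = [0, 1, 2], y[0] = [3, 4]
```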
- Monte Carlo Dropout: The model employs Monte Carlo Dropout during inference to estimate uncertainty in the predictions. Dropout is kept active during prediction to generate multiple stochastic predictions, from which the mean and standard deviation are computed to capture the uncertainty in the model's forecasts.
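The idea can be sketched with a toy NumPy model below, where inverted dropout on the inputs supplies the stochasticity. In the real model, dropout stays active via the Keras layers themselves; the function name, dropout rate, and toy weights here are all illustrative.

```python
import numpy as np

def mc_dropout_predict(predict_fn, x, n_samples=100, rate=0.2, rng=None):
    """Monte Carlo Dropout at inference: run predict_fn n_samples times
    with dropout left active, then summarise the stochastic outputs."""
    rng = rng or np.random.default_rng(0)
    outs = []
    for _ in range(n_samples):
        mask = rng.random(x.shape) >= rate              # Bernoulli keep-mask
        outs.append(predict_fn(x * mask / (1 - rate)))  # inverted dropout
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.std(axis=0)          # forecast + uncertainty

# Toy deterministic "model": a fixed linear map; dropout makes it stochastic
w = np.array([0.5, -0.2, 0.1])
mean, std = mc_dropout_predict(lambda v: v @ w, np.ones(3), n_samples=500)
```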
- Early Stopping: During training, early stopping is used to prevent overfitting by monitoring the validation loss and stopping training when the performance plateaus for a specified number of epochs.
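A minimal sketch of the early-stopping logic, assuming a patience of 2 epochs (the project's actual patience is not stated); in practice this is handled by the Keras `EarlyStopping` callback.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_loss): training stops at the first epoch
    after the validation loss has failed to improve for `patience`
    consecutive epochs."""
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0   # improvement: reset the patience counter
        else:
            wait += 1
            if wait >= patience:
                return epoch, best
    return len(val_losses) - 1, best

stop_epoch, best = early_stopping_epoch([1.0, 0.9, 0.95, 0.96, 0.8], patience=2)
# stops at epoch 3 (two epochs without improvement); best loss seen is 0.9
```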
- Optimisation: The model is compiled using the Adam optimiser with a learning rate of 0.0001 and trained to minimise the custom loss function.
- After training, the model is evaluated using the mean squared error (MSE) and mean absolute error (MAE) metrics on the test data.
- The model’s predictions are compared against the actual values, and the performance metrics are printed for analysis.
- The trained model can be saved for later use with the `.save()` method and loaded back with the `load_model()` function from TensorFlow/Keras for inference.
- Transformer-based architecture with multi-head attention.
- Monte Carlo Dropout for uncertainty estimation.
- Custom loss function with penalties for sign consistency and sum preservation.
- Early stopping to prevent overfitting.
- Model evaluation using MSE and MAE.
This project is licensed under the MIT License.
For questions or issues, please contact:
- Email: jessicakristenr@gmail.com


