
ISRO_BAH Proof Of Concept

Chase the Cloud: Leveraging Diffusion Models for Cloud Motion Prediction using INSAT-3DR/3DS Imagery

Conditional Diffusion Model for Satellite Frame Prediction

This repository contains a prototype implementation of a conditional diffusion model that predicts the next geostationary satellite image frame from past satellite observations. The primary input is the TIR1 (thermal infrared) channel, with the WV (water vapour) channel available as an optional conditioning feature.

Objective

The goal is to explore generative modeling for short-term cloud motion forecasting using satellite imagery. This proof of concept predicts the next TIR1 frame from previous timesteps using a simple UNet-based architecture within a diffusion framework.

Directory Structure

The data is organized in the following format:

output/
  YYYYMMDD/
    HHMM/
      IMG_TIR1.png
      IMG_TIR2.png
      IMG_WV.png
      IMG_MIR.png
      IMG_SWIR.png
      IMG_VIS.png

Each subfolder contains six grayscale PNG images representing different spectral channels at that timestep.
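A timestep folder with this layout can be loaded into a channel-stacked array roughly as follows. This is a minimal sketch: the `load_timestep` helper and the [0, 1] scaling are illustrative, not part of the repository's code.

```python
import numpy as np
from PIL import Image
from pathlib import Path

# The six spectral channels stored in each HHMM folder.
CHANNELS = ["TIR1", "TIR2", "WV", "MIR", "SWIR", "VIS"]

def load_timestep(folder):
    """Load the six grayscale channel PNGs from one HHMM folder
    into a (6, H, W) float32 array scaled to [0, 1]."""
    folder = Path(folder)
    arrays = []
    for ch in CHANNELS:
        img = Image.open(folder / f"IMG_{ch}.png").convert("L")
        arrays.append(np.asarray(img, dtype=np.float32) / 255.0)
    return np.stack(arrays)
```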

Features

  • Conditional diffusion model using UNet
  • TIR1 as the base prediction target
  • WV used as an optional conditional feature
  • Sliding window frame input for temporal context
  • Evaluation using SSIM, PSNR, and MAE
  • Trained on geostationary satellite imagery at its native spatial resolution

How It Works

  • The model takes in three consecutive timesteps of TIR1 and WV images
  • It predicts the next TIR1 frame
  • The model is trained using a sliding window strategy over the available dataset
  • During training, the prediction for a fixed sample (the first four timesteps of 1 June) is saved every epoch for visual validation
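The sliding-window strategy above can be sketched as a small PyTorch dataset. The `(T, 2, H, W)` tensor layout (channel 0 = TIR1, channel 1 = WV) and the class name are assumptions for illustration, not the repository's actual interface.

```python
import torch
from torch.utils.data import Dataset

class SlidingWindowFrames(Dataset):
    """Yields (input, target) pairs: three consecutive TIR1+WV timesteps
    stacked into a 6-channel input, and the following TIR1 frame as target.
    `frames` is a (T, 2, H, W) tensor ordered by time (hypothetical layout)."""

    def __init__(self, frames, window=3):
        self.frames = frames
        self.window = window

    def __len__(self):
        # One sample per window that still has a "next" frame to predict.
        return self.frames.shape[0] - self.window

    def __getitem__(self, idx):
        # Flatten the window's (window, 2, H, W) slice into (2*window, H, W).
        x = self.frames[idx : idx + self.window].reshape(-1, *self.frames.shape[2:])
        y = self.frames[idx + self.window, 0:1]  # next TIR1 frame, shape (1, H, W)
        return x, y
```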

Model

The model is a shallow convolutional encoder-decoder with 6 input channels (three timesteps of TIR1 and WV) and 1 output channel (the predicted TIR1 frame). It is trained with MSE loss and optimized with Adam.
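A shallow encoder-decoder of this shape, with the MSE loss and Adam setup described above, might look like the sketch below. Layer widths, kernel sizes, and the learning rate are illustrative assumptions, not the repository's exact configuration.

```python
import torch
import torch.nn as nn

class ShallowEncoderDecoder(nn.Module):
    """Sketch of a shallow conv encoder-decoder: 6 input channels
    (3 timesteps x TIR1/WV), 1 output channel (next TIR1 frame)."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # downsample 2x
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # upsample 2x
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = ShallowEncoderDecoder()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption
```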

Evaluation

To evaluate a saved model:

  • Load a specific triplet of input timesteps
  • Predict the next frame using the trained model
  • Compute SSIM, PSNR, and MAE against the ground truth
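The metrics step can be implemented with scikit-image, which the Requirements section lists. A minimal sketch, assuming both frames are float arrays scaled to [0, 1]:

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def evaluate_frame(pred, target):
    """Compute SSIM, PSNR, and MAE between a predicted frame and the
    ground-truth frame; both are 2-D float arrays in [0, 1]."""
    ssim = structural_similarity(target, pred, data_range=1.0)
    psnr = peak_signal_noise_ratio(target, pred, data_range=1.0)
    mae = float(np.mean(np.abs(target - pred)))
    return ssim, psnr, mae
```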

Planned Extensions

  • Add support for more conditional channels (e.g. MIR, SWIR)
  • Train or integrate a self-supervised encoder and decoder setup for latent conditioning
  • Explore improved temporal encoders (ConvLSTM, 3D CNN)
  • Replace UNet with a deeper or hierarchical model
  • Switch to latent diffusion for efficiency and scalability

Requirements

  • Python 3.8+
  • PyTorch
  • torchvision
  • scikit-image
  • tqdm
  • Pillow
  • OpenCV

Notes

  • Adjust input paths and device configuration to match your system
  • Training time depends on image resolution and available hardware

License

This project is for research and prototyping purposes only.