
nikhil-agrawal123/Cloud-diffusion


ISRO_BAH Proof Of Concept

Chase the Cloud: Leveraging Diffusion Models for Cloud Motion Prediction using INSAT-3DR/3DS Imagery

Conditional Diffusion Model for Satellite Frame Prediction

This repository contains a prototype implementation of a conditional diffusion model that predicts the next geostationary satellite image frame from past observations. The primary input is the TIR1 (thermal infrared) channel, with the WV (water vapour) channel optionally used as a conditioning feature.

Objective

The goal is to explore generative modeling for short-term cloud motion forecasting using satellite imagery. This proof of concept predicts the next TIR1 frame from previous timesteps using a simple UNet-based architecture within a diffusion framework.

Directory Structure

The data is organized in the following format:

output/
  YYYYMMDD/
    HHMM/
      IMG_TIR1.png
      IMG_TIR2.png
      IMG_WV.png
      IMG_MIR.png
      IMG_SWIR.png
      IMG_VIS.png

Each subfolder contains six grayscale PNG images representing different spectral channels at that timestep.
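For illustration, a helper like the following could load selected channels for one timestep, given the directory layout above. This is a hypothetical sketch, not code from the repository; the channel names follow the file naming shown.

```python
import numpy as np
from PIL import Image
from pathlib import Path

def load_timestep(root, date, time, channels=("TIR1", "WV")):
    """Load the selected channels for one timestep as a (C, H, W)
    float32 array scaled to [0, 1]."""
    frames = []
    for ch in channels:
        path = Path(root) / date / time / f"IMG_{ch}.png"
        img = Image.open(path).convert("L")  # force grayscale
        frames.append(np.asarray(img, dtype=np.float32) / 255.0)
    return np.stack(frames)
```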

Features

  • Conditional diffusion model using UNet
  • TIR1 as the base prediction target
  • WV used as an optional conditional feature
  • Sliding window frame input for temporal context
  • Evaluation using SSIM, PSNR, and MAE
  • Trained on geostationary satellite imagery with real spatial resolution

How It Works

  • The model takes in three consecutive timesteps of TIR1 and WV images
  • It predicts the next TIR1 frame
  • The model is trained using a sliding window strategy over the available dataset
  • During training, the prediction for a fixed validation sample (the first four timesteps of 1 June) is saved every epoch for visual validation
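The sliding window strategy above can be sketched as a simple generator: with a context of three timesteps, the model sees three consecutive frames and is supervised on the fourth. The function below is illustrative, not the repository's implementation.

```python
def sliding_windows(timesteps, context=3):
    """Yield (input_timesteps, target_timestep) pairs over a
    chronologically ordered list of timestep identifiers."""
    for i in range(len(timesteps) - context):
        yield timesteps[i:i + context], timesteps[i + context]
```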

Model

The model is a shallow convolutional encoder-decoder with 6 input channels (three timesteps × two spectral channels) and 1 output channel. It is trained with MSE loss and optimized with Adam.
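A minimal sketch of such an encoder-decoder follows. Layer widths, kernel sizes, and the learning rate are assumptions for illustration, not the repository's exact configuration; only the 6-in/1-out channel layout, MSE loss, and Adam optimizer come from the description above.

```python
import torch
import torch.nn as nn

class ShallowEncoderDecoder(nn.Module):
    """Illustrative shallow conv encoder-decoder: 6 input channels
    (3 timesteps x {TIR1, WV}) -> 1 predicted TIR1 frame."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = ShallowEncoderDecoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption
criterion = nn.MSELoss()
```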

Evaluation

To evaluate a saved model:

  • Load a specific triplet of input timesteps
  • Predict the next frame using the trained model
  • Compute SSIM, PSNR, and MAE against the ground truth
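The metric step above maps directly onto scikit-image and NumPy; a sketch, assuming predicted and ground-truth frames are arrays scaled to [0, 1]:

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def evaluate(pred, truth):
    """Compute SSIM, PSNR, and MAE between two frames in [0, 1]."""
    ssim = structural_similarity(truth, pred, data_range=1.0)
    psnr = peak_signal_noise_ratio(truth, pred, data_range=1.0)
    mae = float(np.abs(truth - pred).mean())
    return ssim, psnr, mae
```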

Planned Extensions

  • Add support for more conditional channels (e.g. MIR, SWIR)
  • Train or integrate a self-supervised encoder and decoder setup for latent conditioning
  • Explore improved temporal encoders (ConvLSTM, 3D CNN)
  • Replace UNet with a deeper or hierarchical model
  • Switch to latent diffusion for efficiency and scalability

Requirements

  • Python 3.8+
  • PyTorch
  • torchvision
  • scikit-image
  • tqdm
  • Pillow
  • OpenCV

Notes

  • Adjust input paths and device configuration for your system
  • Training may take time depending on image resolution and hardware

License

This project is for research and prototyping purposes only.
