This repository contains the official PyTorch implementation of the paper "Improved Multi-modal Image Fusion with Attention and Dense Networks: Visual and Quantitative Evaluation", published in Communications in Computer and Information Science, Springer (2024).
This paper proposes a novel deep learning architecture for fusing multi-modal images (such as Infrared and Visible). The model integrates DenseNet blocks for robust feature extraction and Convolutional Block Attention Modules (CBAM) to focus on salient spatial and channel-wise features. The approach demonstrates superior performance in both visual quality and quantitative metrics compared to existing state-of-the-art methods.
The network consists of:
- Dual-Branch Feature Extraction: Two DenseNet branches process Infrared and Visible images independently.
- Attention Mechanism: CBAM blocks refine the features by emphasizing important channels and spatial regions.
- Reconstruction: A series of convolutional layers merge the features to generate the final fused image.
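The CBAM refinement described above can be sketched as follows. This is a minimal, generic CBAM block (channel attention followed by spatial attention); the class name and hyper-parameters are illustrative and are not taken from model_attention_dense.py:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Generic CBAM sketch: channel attention, then spatial attention."""

    def __init__(self, channels, reduction=8, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP applied to avg- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: conv over stacked channel-wise avg and max maps
        self.spatial = nn.Conv2d(2, 1, spatial_kernel,
                                 padding=spatial_kernel // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # Channel attention weights from global average and max pooling
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * self.sigmoid(avg + mx)
        # Spatial attention weights from per-pixel channel statistics
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.sigmoid(self.spatial(s))
```

In the dual-branch design, a block like this would sit after the DenseNet feature extractors of each branch, reweighting the extracted feature maps before they are merged by the reconstruction layers.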
.
├── checkpoints/              # Saved model weights
├── datasets/                 # Dataset directory
│   ├── train/
│   │   ├── IR/
│   │   └── VIS/
│   └── test/
│       ├── IR/
│       └── VIS/
├── model_attention_dense.py  # Model architecture (CBAMFuse)
├── input_data.py             # Dataloader
├── pytorch_ssim.py           # SSIM loss function
├── train_cbam.py             # Training script
├── test_cbam.py              # Testing/inference script
└── requirements.txt          # Dependencies
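The SSIM loss listed above (pytorch_ssim.py) is typically a windowed, Gaussian-weighted implementation. For intuition only, a simplified global-statistics SSIM (no sliding window; not the repository's actual implementation) looks like this:

```python
import torch

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified SSIM over whole images (illustrative, not windowed)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(unbiased=False), y.var(unbiased=False)
    cov = ((x - mx) * (y - my)).mean()
    # Luminance/contrast/structure terms combined, stabilized by c1 and c2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

An SSIM-based loss is usually written as 1 - SSIM(fused, source), so that maximizing structural similarity between the fused image and each input modality minimizes the loss.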
Install the required dependencies:
pip install -r requirements.txt
Dataset Preparation
Organize your data into train and test folders. Ensure that the Infrared (IR) and Visible (VIS) images either share matching filenames or form corresponding pairs when sorted alphabetically.
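A pairing check along these lines (a minimal sketch, not the repository's actual dataloader in input_data.py) can catch mismatched folders before training:

```python
import os

def paired_files(ir_dir, vis_dir):
    """Pair IR and VIS images by sorted filename; counts must match."""
    ir = sorted(f for f in os.listdir(ir_dir) if not f.startswith('.'))
    vis = sorted(f for f in os.listdir(vis_dir) if not f.startswith('.'))
    assert len(ir) == len(vis), "IR and VIS folders contain different image counts"
    return list(zip(ir, vis))
```

Running it on your datasets/train/IR and datasets/train/VIS folders returns the (IR, VIS) filename pairs the dataloader would consume, in alphabetical order.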
Training
To train the model from scratch:
python train_cbam.py --ir_dataroot ./datasets/train/IR --vis_dataroot ./datasets/train/VIS --epoch 20
Testing / Inference
To test the model using pre-trained weights:
python test_cbam.py --ir_dataroot ./datasets/test/IR --vis_dataroot ./datasets/test/VIS --output_root ./results/
If you find this work useful in your research, please cite:
@InProceedings{10.1007/978-3-031-58535-7_20,
author="Banerjee, Ankan
and Patra, Dipti
and Roy, Pradipta",
title="Improved Multi-modal Image Fusion with Attention and Dense Networks: Visual and Quantitative Evaluation",
booktitle="Communications in Computer and Information Science",
year="2024",
publisher="Springer Nature Switzerland",
pages="242--255",
isbn="978-3-031-58535-7"
}