Hi mentors and Emory BMI community,
I'm Rishav Aryan, a Research Assistant in the Department of Bioengineering at George Mason University and a graduate student in the Data Analytics & Engineering program. I am thrilled to apply for the GSoC 2025 project titled "Advancing Brain Decoding and Cognitive Analysis: Leveraging Diffusion Models for Spatiotemporal Pattern Recognition in fMRI Data", proposed by Emory BMI.
Motivation & Background
I’m deeply passionate about combining neuroscience with deep learning. My research over the past year has focused on designing and optimizing 3D segmentation models for medical MRI. I have hands-on experience developing architectures that handle spatiotemporal features and attention modules, and in working with challenging real-world biomedical data.
Here’s a quick snapshot of my recent contributions:
Developed a novel 3D Attention U-Net for segmenting extraocular muscles from ocular MRI, integrating attention blocks for spatial focus and robustness across patient datasets.
Designed a custom loss function combining Dice, cross-entropy, and centroid alignment loss to enhance anatomical consistency.
Conducted in-depth validation with visualizations, slice-wise comparisons, volume-wise video creation, and automated evaluation scripts (F1, accuracy, PR-curve).
Proposed VBUNet, a novel hybrid CNN for breast tumor segmentation, accepted in Neural Computing and Applications (Springer, SCIE-indexed).
Authored a deep learning pipeline for left atrium segmentation, achieving state-of-the-art metrics (IEEE INDICON).
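As a minimal sketch of how such a composite loss could be structured: the weighting coefficients and the soft-centroid formulation below are illustrative assumptions, not the exact ones from my implementation.

```python
import torch
import torch.nn.functional as F

def soft_dice_loss(probs, target, eps=1e-6):
    # probs, target: (N, C, D, H, W); target is one-hot
    dims = (0, 2, 3, 4)
    inter = (probs * target).sum(dims)
    denom = probs.sum(dims) + target.sum(dims)
    return 1.0 - ((2 * inter + eps) / (denom + eps)).mean()

def centroid_loss(probs, target):
    """Penalise displacement between predicted and true soft centroids."""
    N, C, D, H, W = probs.shape
    grids = torch.meshgrid(
        torch.arange(D, dtype=probs.dtype),
        torch.arange(H, dtype=probs.dtype),
        torch.arange(W, dtype=probs.dtype),
        indexing="ij",
    )
    loss = 0.0
    for g in grids:  # one spatial axis at a time
        g = g[None, None]  # (1, 1, D, H, W), broadcasts over N and C
        c_pred = (probs * g).sum((2, 3, 4)) / probs.sum((2, 3, 4)).clamp_min(1e-6)
        c_true = (target * g).sum((2, 3, 4)) / target.sum((2, 3, 4)).clamp_min(1e-6)
        loss = loss + F.mse_loss(c_pred, c_true)
    return loss / 3.0

def combined_loss(logits, target_idx, w_dice=1.0, w_ce=1.0, w_cen=0.1):
    # target_idx: (N, D, H, W) integer class labels
    probs = logits.softmax(1)
    onehot = F.one_hot(target_idx, logits.shape[1]).permute(0, 4, 1, 2, 3).float()
    return (w_dice * soft_dice_loss(probs, onehot)
            + w_ce * F.cross_entropy(logits, target_idx)
            + w_cen * centroid_loss(probs, onehot))
```

The centroid term nudges the predicted mask toward the right anatomical location even when overlap is poor, which is where Dice alone gives little gradient signal.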
These experiences have not only improved my deep learning skills but also taught me the importance of clean pipeline design, experiment reproducibility, and cross-validation in biomedical research.
Why This Project?
The Emory BMI project directly aligns with both my research background and long-term goals.
I’m particularly interested in:
How DDPMs can be extended to 4D neuroimaging data
The challenge of encoding spatial and temporal brain activity patterns
Task-conditioned diffusion modeling for cognitive state decoding
Having already worked on 3D CNNs, attention fusion, and segmentation on MR volumes, I see this project as a natural extension of my prior work, now moving into generative modeling and neural signal decoding.
Technical Plan (Brief Overview)
Preprocessing: Use NiBabel, Nilearn, and FSL to preprocess BOLD data, align subjects, normalize intensities, and format into 4D tensors
Model Architecture: A hybrid design, combining a 3D U-Net for spatial encoding with a Transformer encoder for temporal BOLD signals
Diffusion Process: Implement the DDPM forward and reverse noise processes, applying classifier-free guidance using task labels
Loss Functions: Mix of denoising MSE + classification loss (cross-entropy)
Evaluation: MSE, PSNR, and SSIM for reconstruction; F1 and ROC-AUC for classification
Visualization: Overlay predicted vs. actual BOLD using Nilearn, plus slice-wise GIFs
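To make the preprocessing step concrete, here is a minimal sketch of loading a 4D BOLD run with NiBabel and z-scoring each voxel's time series; the filename is a placeholder, and the full pipeline would additionally use Nilearn/FSL for alignment.

```python
import numpy as np
import nibabel as nib

def load_bold_tensor(path):
    """Load a 4D BOLD NIfTI and return a per-voxel z-scored (T, X, Y, Z) array."""
    img = nib.load(path)                    # e.g. "sub-01_task-rest_bold.nii.gz"
    data = img.get_fdata(dtype=np.float32)  # NIfTI stores BOLD as (X, Y, Z, T)
    data = np.moveaxis(data, -1, 0)         # (T, X, Y, Z): time-first for the model
    mean = data.mean(axis=0, keepdims=True)
    std = data.std(axis=0, keepdims=True)
    return (data - mean) / np.clip(std, 1e-6, None)  # z-score each voxel over time
```

Z-scoring per voxel removes baseline intensity differences across subjects before the volumes are batched into 4D tensors.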
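The diffusion and loss items above can be sketched together as a single training step: the DDPM forward process, a denoising MSE plus auxiliary cross-entropy, and random label dropout so the same network supports classifier-free guidance at sampling time. The schedule constants, the dropout rate, and the model interface are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule (assumed)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t, noise):
    """Forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    ab = alphas_bar[t].view(-1, *([1] * (x0.dim() - 1)))
    return ab.sqrt() * x0 + (1 - ab).sqrt() * noise

def diffusion_step_loss(model, x0, labels, p_uncond=0.1, lam=0.1):
    """One training step: denoising MSE + weighted cross-entropy.

    Labels are randomly replaced by a null token so the model also learns
    an unconditional score, enabling classifier-free guidance later.
    """
    t = torch.randint(0, T, (x0.shape[0],))
    noise = torch.randn_like(x0)
    x_t = q_sample(x0, t, noise)
    drop = torch.rand(x0.shape[0]) < p_uncond
    cond = labels.clone()
    cond[drop] = -1                              # -1 = "null" label (assumed convention)
    eps_pred, class_logits = model(x_t, t, cond) # assumed two-headed model interface
    return F.mse_loss(eps_pred, noise) + lam * F.cross_entropy(class_logits, labels)
```

At sampling time, guidance would blend the two passes as `eps = eps_uncond + w * (eps_cond - eps_uncond)` for a guidance weight `w`.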
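The evaluation item could be wired up roughly as follows, with NumPy for the reconstruction metrics and scikit-learn for the classification metrics; the 0.5 decision threshold is an assumption, and SSIM would come from `skimage.metrics.structural_similarity` (omitted here to keep the sketch small).

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

def psnr(ref, recon, data_range=None):
    """Peak signal-to-noise ratio of a reconstruction against its reference."""
    mse = np.mean((ref - recon) ** 2)
    if data_range is None:
        data_range = ref.max() - ref.min()
    return 10.0 * np.log10(data_range ** 2 / mse)

def evaluate(recon, ref, y_true, y_score):
    """Reconstruction quality (MSE, PSNR) plus decoding quality (F1, ROC-AUC)."""
    return {
        "mse": float(np.mean((recon - ref) ** 2)),
        "psnr": float(psnr(ref, recon)),
        "f1": f1_score(y_true, (y_score >= 0.5).astype(int)),
        "roc_auc": roc_auc_score(y_true, y_score),
    }
```

Reporting both families of metrics matters here: a diffusion model can reconstruct BOLD volumes well while the task-decoding head underperforms, and vice versa.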
Dataset Plan
OpenNeuro: Access task-fMRI datasets such as ds002748 and ds003701
Human Connectome Project (HCP): For high-res BOLD datasets with task labels