| 1 | +--- |
| 2 | +type: "project" # DON'T TOUCH THIS ! :) |
| 3 | +date: "2025-06-13" # Date you first upload your project. |
| 4 | +title: "CVAE-based ADHD neuroimaging analysis" |
| 5 | +names: [Cian-Ya Lan, Jia-Ling Sun] |
| 6 | +github_repo: https://github.com/Cleo-Lan-school/BHS_2025-project |
| 7 | +website: |
| 8 | +tags: [adhd, mri, cvae, iq] |
| 9 | +summary: "This project applies a contrastive variational autoencoder (CVAE) to Burner-preprocessed MRI data from the ADHD-200 dataset to disentangle ADHD-specific brain features from shared anatomical variation. We explore latent representations using RSA and clustering to better understand neuroanatomical heterogeneity in ADHD." |
| 10 | +image: "cover.png" |
| 11 | +--- |
| 12 | + |
| 13 | +## Project definition |
| 14 | + |
| 15 | +### Background |
| 16 | + |
| 17 | +- Reconstruct 3D brain MRIs using CVAEs. |
| 18 | +- Disentangle "salient" ADHD-related features from "background" features common to both ADHD and typically developing children. |
| 19 | +- Assess how well the learned latent spaces reflect behavioral and clinical variation using: |
| 20 | + - **Silhouette Analysis** |
| 21 | + - **Representational Similarity Analysis (RSA)** |
| 22 | + |
| 23 | +### Tools |
| 24 | + |
| 25 | +This project used: |
| 26 | +- Python (NumPy, Pandas, SciPy, Scikit-learn, Matplotlib, Seaborn) |
| 27 | +- Keras (TensorFlow backend) for deep learning |
| 28 | +- UMAP for latent space visualization |
| 29 | +- Representational Similarity Analysis (RSA) using Kendall’s tau |
| 30 | +- Silhouette analysis for latent space separability |
| 31 | +- GitHub for version control |
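
The silhouette analysis listed above can be illustrated with a minimal sketch. It assumes the latent vectors are available as a NumPy array with one diagnosis label per participant; the synthetic data and the names `latents` and `labels` are hypothetical and not taken from the project code.

```python
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Hypothetical latent vectors: 40 "TD" and 40 "ADHD" participants, 8 dimensions each.
td = rng.normal(loc=0.0, scale=1.0, size=(40, 8))
adhd = rng.normal(loc=3.0, scale=1.0, size=(40, 8))
latents = np.vstack([td, adhd])
labels = np.array([0] * 40 + [1] * 40)

# Mean silhouette coefficient: values near 1 indicate well separated groups,
# values near 0 indicate overlapping groups.
score_separated = silhouette_score(latents, labels)

# Shuffling the labels destroys the group structure and should pull the score toward 0.
score_shuffled = silhouette_score(latents, rng.permutation(labels))

print(f"separated: {score_separated:.2f}, shuffled: {score_shuffled:.2f}")
```

Comparing against shuffled labels gives a simple reference point for judging whether a silhouette score reflects real group separation in a latent space.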
| 32 | + |
| 33 | +### Data |
| 34 | + |
| 35 | +This project used the publicly available [ADHD-200 dataset](http://fcon_1000.projects.nitrc.org/indi/adhd200/), specifically the **Burner-preprocessed version**: |
| 36 | +- Structural MRI data processed via voxel-based morphometry (VBM) |
| 37 | +- Normalized 3D gray matter volumes (64×64×64) |
| 38 | +- Accompanied by phenotypic variables: age, sex, diagnosis, subtype, medication, IQ, etc. |
| 39 | + |
| 40 | +### Deliverables |
| 41 | + |
| 42 | +At the end of the project, we produced: |
| 43 | +- A working CVAE framework for modeling neuroanatomical variation |
| 44 | +- Visualizations of latent features and synthetic brain reconstructions |
| 45 | +- RSA and clustering results relating brain features to clinical data |
| 46 | +- Well-documented code and reproducible analysis notebooks |
| 47 | + |
| 48 | +## Results |
| 49 | + |
| 50 | +### Progress overview |
| 51 | + |
| 52 | +We trained a CVAE model with two latent components: |
| 53 | +- `s` (salient features): ADHD-specific anatomical variation |
| 54 | +- `z` (background features): common/shared variation |
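
The split between the two latent components can be sketched as follows. This assumes the commonly used contrastive VAE design, in which ADHD scans are reconstructed from both `z` and `s` while typically developing scans are reconstructed from `z` alone, with the salient part set to zero. The linear stand-in decoder, the latent sizes, and the function name `decode` are hypothetical; the actual model is a deep Keras network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, far smaller than a 64x64x64 volume.
Z_DIM, S_DIM, N_VOXELS = 4, 2, 16

# Stand-in linear "decoder" (assumption: the real decoder is a deep network).
decoder_weights = rng.normal(size=(Z_DIM + S_DIM, N_VOXELS))


def decode(z, s):
    """Reconstruct a flattened volume from background (z) and salient (s) latents."""
    return np.concatenate([z, s], axis=-1) @ decoder_weights


z = rng.normal(size=(3, Z_DIM))  # background latents for 3 participants
s = rng.normal(size=(3, S_DIM))  # salient latents for the same participants

# ADHD scans: reconstructed from both background and salient latents.
adhd_recon = decode(z, s)

# Typically developing scans: the salient part is zeroed, so only z can explain them.
td_recon = decode(z, np.zeros_like(s))

# The difference isolates what the salient latents contribute to the reconstruction.
salient_contribution = adhd_recon - td_recon
```

Zeroing `s` for the comparison group is what pushes variation shared by both groups into `z`, leaving `s` to capture what is specific to the ADHD group.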
| 55 | + |
| 56 | +We evaluated the model using: |
| 57 | +- **Silhouette scores** for latent clustering |
| 58 | +- **RSA** to compare pairwise dissimilarities in each latent space with pairwise differences in phenotypic variables |
| 59 | +- **GMM clustering and BIC** to test for discrete vs. continuous subtype structure |
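
As a rough illustration of the RSA step, the sketch below builds a dissimilarity matrix from latent vectors and another from a phenotypic score, then compares them with Kendall's tau. The synthetic data, the choice of Euclidean distance, and the variable names are assumptions made for the example, not the project's actual settings.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import kendalltau

rng = np.random.default_rng(0)

# Hypothetical data: 30 participants with one phenotypic score each.
n = 30
score = rng.normal(size=n)

# Hypothetical 5D latents whose first dimension encodes the score, plus noise dimensions.
latents = np.column_stack([score, 0.3 * rng.normal(size=(n, 4))])

# Dissimilarity matrices in condensed (upper-triangle) form, one entry per participant pair.
latent_rdm = pdist(latents, metric="euclidean")
score_rdm = pdist(score[:, None], metric="euclidean")  # |score_i - score_j|

# Rank correlation between the two dissimilarity structures.
tau, p_value = kendalltau(latent_rdm, score_rdm)

# Reference point: a variable unrelated to the latents should give tau near 0.
unrelated = rng.normal(size=n)
tau_null, _ = kendalltau(latent_rdm, pdist(unrelated[:, None], metric="euclidean"))

print(f"tau (related): {tau:.2f}, tau (unrelated): {tau_null:.2f}")
```

Because the entries of a dissimilarity matrix are not independent of one another, the parametric p-value returned by `kendalltau` should be read with caution; permutation tests over participants are a common alternative.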
| 60 | + |
| 61 | +### Tools we learned during this project |
| 62 | + |
| 63 | +- Contrastive representation learning in generative models |
| 64 | +- Implementation of CVAEs in Keras |
| 65 | +- Preprocessing and working with VBM MRI data |
| 66 | +- Representational Similarity Analysis |
| 67 | +- Model interpretability techniques for brain data |
| 68 | + |
| 69 | +### Results |
| 70 | + |
| 71 | +#### Deliverable 1: CVAE analysis |
| 72 | + |
| 73 | +- Salient features (`s`) were significantly correlated with: |
| 74 | + - **ADHD Index** |
| 75 | + - **Inattentive and Hyperactive/Impulsive scores** |
| 76 | + - **Medication status** and **age** |
| 77 | + |
| 78 | +- Shared features (`z`) were more related to: |
| 79 | + - **IQ**, **gender**, and **scan site** |
| 80 | + |
| 81 | +#### Deliverable 2: Clustering analysis |
| 82 | + |
| 83 | +- Gaussian Mixture Models (GMM) with Bayesian Information Criterion (BIC) showed: |
| 84 | + - **Lowest BIC at 1 cluster**, suggesting **continuous heterogeneity** |
| 85 | + - Results align with dimensional models of psychiatric disorders |
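
The model-selection logic behind this result can be sketched as follows: fit GMMs with an increasing number of components and keep the one with the lowest BIC. The synthetic single-cloud data, the range of component counts, and the `full` covariance setting are assumptions for illustration only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical salient latents: one continuous Gaussian cloud with no subgroups.
salient = rng.normal(size=(300, 2))

bic_by_k = {}
for k in range(1, 6):
    gmm = GaussianMixture(n_components=k, covariance_type="full", random_state=0)
    gmm.fit(salient)
    # Lower BIC means a better trade-off between fit and model complexity.
    bic_by_k[k] = gmm.bic(salient)

best_k = min(bic_by_k, key=bic_by_k.get)
print(f"best number of components by BIC: {best_k}")
```

When the data contain no discrete subgroups, extra components add parameters without a matching gain in likelihood, so BIC tends to favor a single component.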
| 86 | + |
| 87 | +#### Deliverable 3: Code and notebook |
| 88 | + |
| 89 | +- Full training pipeline in `Train-CVAE-ADHD200.ipynb` |
| 90 | +- Visualization and RSA in helper scripts |
| 91 | +- README documentation and reproducibility checklist included |
| 92 | + |
| 93 | +## Conclusion and acknowledgement |
| 94 | + |
| 95 | +This project demonstrates the potential of contrastive deep generative models such as CVAEs to disentangle disorder-specific neuroanatomical features from shared variation. Because the GMM analysis favored a single cluster, our findings suggest that ADHD-related neuroanatomical variation may be better described along a continuum than as discrete subtypes. We thank the Brainhack School instructors and the open neuroimaging community for providing the tools and data that made this project possible. |