
NoMAISI: Nodule-Oriented Medical AI for Synthetic Imaging

PiNS Logo

Nodule-Oriented Medical AI for Synthetic Imaging and Augmentation in Chest CT

License: CC BY-NC 4.0 Docker Python Medical Imaging PyTorch MONAI PiNS CaNA

Abstract

Medical imaging datasets are increasingly available, yet abnormal and annotation-intensive cases such as lung nodules remain underrepresented. We introduce NoMAISI (Nodule-Oriented Medical AI for Synthetic Imaging), a generative framework built on foundational backbones with flow-based diffusion and ControlNet conditioning. Using NoMAISI, we curated a large multi-cohort lung nodule dataset and applied context-aware nodule volume augmentation, including relocation, shrinkage to simulate early-stage disease, and expansion to model progression. Each case was rendered into multiple synthetic variants, producing a diverse and anatomically consistent dataset. Fidelity was evaluated with cross-cohort similarity metrics, and downstream integration into lung nodule detection and classification tasks demonstrated improved external test performance, particularly in underrepresented lesion categories. These results show that nodule-oriented synthetic imaging and curated augmentation can complement clinical data, reduce annotation demands, and expand the availability of training resources for healthcare AI.

🧩 Workflow Overview

The overall pipeline for organ, body, and nodule segmentation with alignment is shown below:

Segmentation Pipeline

Workflow for constructing the NoMAISI development dataset. The pipeline includes (1) organ segmentation using AI models, (2) body segmentation with algorithmic methods, (3) nodule segmentation through AI-assisted and ML-based refinement, and (4) segmentation alignment to integrate organ, body, and nodule segmentations into anatomically consistent volumes.
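The alignment step (4) can be sketched as a label-precedence merge of the three masks. This is a minimal NumPy illustration, not the exact NoMAISI implementation; the label IDs and precedence order are assumptions:

```python
# Sketch of segmentation alignment: merge body, organ, and nodule label
# maps into one anatomically consistent volume. Precedence (last writer
# wins): body < organs < nodules. Label values are illustrative.
import numpy as np

def align_segmentations(body: np.ndarray,
                        organs: np.ndarray,
                        nodules: np.ndarray) -> np.ndarray:
    """Combine three same-shape label maps into a single volume."""
    assert body.shape == organs.shape == nodules.shape
    combined = body.copy()
    combined[organs > 0] = organs[organs > 0]      # organs overwrite body
    combined[nodules > 0] = nodules[nodules > 0]   # nodules overwrite organs
    # Clip nodules to the body mask so no lesion sits outside the patient.
    combined[(nodules > 0) & (body == 0)] = 0
    return combined
```

The body mask acts as the outer envelope: any nodule voxel that falls outside it is discarded rather than rendered in air.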

NoMAISI_train_and_infer

Overview of our flow-based latent diffusion model with ControlNet conditioning for AI-based CT generation. The pipeline consists of three stages: (top) Pretrained VAE for image compression, where CT images are encoded into latent features using a frozen VAE; (middle) Model fine-tuning, where a Rectified Flow ODE sampler, conditioned on segmentation masks and voxel spacing through a fine-tuned ControlNet, predicts velocity fields in latent space and is optimized with a region-specific contrastive loss emphasizing ROI sensitivity and background consistency; and (bottom) Inference, where segmentation masks and voxel spacing guide latent sampling along the ODE trajectory to obtain a clean latent representation, which is then decoded by the VAE into full-resolution AI-generated CT images conditioned by body and lesion masks.
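The inference stage above integrates an ODE from noise back to a clean latent. The toy sketch below shows Euler integration of a rectified-flow velocity field; the closed-form `velocity` function stands in for the ControlNet-conditioned network and exists purely for illustration:

```python
# Toy sketch of Rectified-Flow ODE sampling: start from Gaussian noise
# x_1 and integrate the learned velocity field back to a clean latent x_0
# with Euler steps. `velocity` stands in for the ControlNet-conditioned
# network used by NoMAISI.
import numpy as np

def euler_sample(velocity, x1: np.ndarray, num_steps: int = 30) -> np.ndarray:
    """Integrate dx/dt = velocity(x, t) from t=1 (noise) down to t=0."""
    x = x1.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = 1.0 - i * dt               # current time in [0, 1]
        x = x - dt * velocity(x, t)    # Euler step toward t=0
    return x

# With perfectly straight (rectified) trajectories the velocity field is
# constant, v = x_1 - x_0, so Euler integration recovers x_0 exactly.
rng = np.random.default_rng(0)
x0_target = rng.normal(size=(4, 4))    # pretend "clean latent"
x1_noise = rng.normal(size=(4, 4))     # sampling starts from noise
v = lambda x, t: x1_noise - x0_target  # constant rectified velocity
x0_est = euler_sample(v, x1_noise, num_steps=10)
assert np.allclose(x0_est, x0_target)
```

In the real pipeline the recovered latent is then decoded by the frozen VAE into a full-resolution CT volume.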

📊 Dataset Composition

The table below summarizes the datasets included in this project, with their split sizes (Patients, CT scans, and Nodules) and the annotation types available.

| Dataset | Patients n (%) | CT Scans n (%) | Nodules n (%) | Organ Seg | Nodule Seg | Nodule CCC | Nodule Box |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LNDbv4 | 223 (3.17) | 223 (2.52) | 1132 (7.84) |  |  |  |  |
| NSCLC-R | 415 (5.89) | 415 (4.69) | 415 (2.87) |  |  |  |  |
| LIDC-IDRI | 870 (12.35) | 870 (9.84) | 2584 (17.89) |  |  |  |  |
| DLCS-24 | 1605 (22.79) | 1605 (18.15) | 2478 (17.16) |  |  |  |  |
| Intgmultiomics | 1936 (27.49) | 1936 (21.90) | 1936 (13.40) |  |  |  |  |
| LUNA-25 | 1993 (28.30) | 3792 (42.89) | 5899 (40.84) |  |  |  |  |
| **TOTAL** | 7042 (100) | 8841 (100) | 14444 (100) |  |  |  |  |

Notes

  • Percentages indicate proportion relative to the total for each column.
  • ✔︎ = annotation available, ✗ = annotation not available.
  • “Nodule CCC” = nodule center coordinates.
  • “Nodule Box” = bounding-box annotations.

📚 Dataset Citations

AI-Generated CT Evaluations

📉 Fréchet Inception Distance (FID) Results

Fréchet Inception Distance (FID) of the MAISI-V2 baseline and NoMAISI models, with multiple public clinical test datasets as references (lower is better).

| FID (Avg.) | LNDbv4 | NSCLC-R | LIDC-IDRI | DLCS-24 | Intgmultiomics | LUNA-25 |
| --- | --- | --- | --- | --- | --- | --- |
| Real LNDbv4 |  | 5.13 | 1.49 | 1.05 | 2.40 | 1.98 |
| Real NSCLC-R | 5.13 |  | 3.12 | 3.66 | 1.56 | 2.65 |
| Real LIDC-IDRI | 1.49 | 3.12 |  | 0.79 | 1.44 | 0.75 |
| Real DLCS-24 | 1.05 | 3.66 | 0.79 |  | 1.56 | 1.00 |
| Real Intgmultiomics | 2.40 | 1.56 | 1.44 | 1.56 |  | 1.57 |
| Real LUNA-25 | 1.98 | 2.65 | 0.75 | 1.00 | 1.57 |  |
| AI-Generated MAISI-V2 | 3.15 | 5.21 | 2.70 | 2.32 | 2.82 | 1.69 |
| AI-Generated NoMAISI (ours) | 2.99 | 3.05 | 2.31 | 2.27 | 2.62 | 1.18 |
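The FID values above follow the standard formula FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2(S1 S2)^(1/2)), computed between Gaussian fits to encoder features of two image sets. A minimal NumPy sketch (the feature encoder is omitted; in practice the features come from an Inception-style network):

```python
# Minimal FID between two sets of feature vectors (rows = samples).
import numpy as np

def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """FID between Gaussian fits to two feature sets."""
    mu1, mu2 = feats_a.mean(axis=0), feats_b.mean(axis=0)
    s1 = np.cov(feats_a, rowvar=False)
    s2 = np.cov(feats_b, rowvar=False)
    # Tr((S1 S2)^(1/2)) equals the sum of square roots of the eigenvalues
    # of S1 @ S2, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(s1 @ s2)
    tr_sqrt = np.sqrt(np.maximum(eigvals.real, 0.0)).sum()
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(s1 + s2) - 2.0 * tr_sqrt)
```

Identical feature sets give an FID near zero; shifting one set's mean raises it by the squared shift, which matches the intuition that lower values in the table indicate closer distributions.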

📉 FID Parity Plot

Parity comparison of FID for real↔real vs AI-generated CT across datasets

Comparison of Fréchet Inception Distance (FID) between real↔real and AI-generated CT datasets. Each point represents a clinical dataset (LNDbv4, NSCLC-R, LIDC-IDRI, DLCS-24, Intgmultiomics, LUNA-25) under one of the generative models (MAISI-V2, NoMAISI). The x-axis shows the median FID computed between real datasets, while the y-axis shows the FID of AI-generated data compared to real. The dashed diagonal line denotes parity (y = x), where AI-generated fidelity would match real↔real fidelity.

🖼️ Example Results

Comparison of CT generation from anatomical masks.

  • Left: Input organ/body segmentation mask.
  • Middle: Generated CT slice using MAISI-V2.
  • Right: Generated CT slice using NoMAISI (ours).
  • Yellow boxes highlight lung nodule regions for comparison.

Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks

Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks

Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks

Inference Guide

  1. Project Structure
  2. Configuration Files

Model Weights

Model weights are available upon request. Please email the authors: tushar.ece@duke.edu.

📁 Project Structure

```
NoMAISI/
├── configs/                                # Configuration files
│   ├── config_maisi3d-rflow.json           # Main model configuration
│   ├── infr_env_NoMAISI_DLCSD24_demo.json  # Environment settings
│   └── infr_config_NoMAISI_controlnet.json # ControlNet inference config
├── scripts/                                # Python inference scripts
│   ├── infer_testV2_controlnet.py          # Main inference script
│   ├── infer_controlnet.py                 # ControlNet inference
│   └── utils.py                            # Utility functions
├── models/                                 # Pre-trained model weights
├── data/                                   # Input data directory
├── outputs/                                # Generated results
├── logs/                                   # Execution logs
└── inference.sub                           # SLURM job script
```

⚙️ Configuration Files

1. **Main Model Configuration** (`config_maisi3d-rflow.json`): controls the core diffusion model parameters

  • Model architecture settings
  • Sampling parameters
  • Image dimensions and spacing

2. **Environment Configuration** (`infr_env_NoMAISI_DLCSD24_demo.json`): defines the runtime environment

  • Data paths and directories
  • GPU settings
  • Memory allocation

3. **ControlNet Configuration** (`infr_config_NoMAISI_controlnet.json`): ControlNet-specific settings

  • Conditioning parameters
  • Generation controls
  • Output specifications
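The three config files are passed to the inference script via `-c`, `-e`, and `-t` (see Running Inference below). A minimal sketch of how they might be loaded and merged into one settings dict; the precedence order (model < environment < ControlNet) and the keys in the example are assumptions for illustration, not NoMAISI's actual schema:

```python
# Hypothetical sketch: load the three JSON configs and merge them, with
# later files overriding earlier ones. Key names are illustrative only.
import json
from pathlib import Path

def load_configs(model_cfg: str, env_cfg: str, controlnet_cfg: str) -> dict:
    merged: dict = {}
    for path in (model_cfg, env_cfg, controlnet_cfg):
        merged.update(json.loads(Path(path).read_text()))
    return merged
```

Keeping model, environment, and ControlNet settings in separate files lets the same model configuration be reused across clusters by swapping only the environment file.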

🚀 Running Inference

```bash
cd /path/NoMAISI/

# Create logs directory if it doesn't exist
mkdir -p logs

# Option 1: submit the job to SLURM
sbatch inference.sub

# Option 2: run inference directly
python -m scripts.infer_testV2_controlnet \
    -c ./configs/config_maisi3d-rflow.json \
    -e ./configs/infr_env_NoMAISI_DLCSD24_demo.json \
    -t ./configs/infr_config_NoMAISI_controlnet.json
```

Downstream Tasks:

  • Cancer vs. No-Cancer Classification
  • Nodule Detection (coming soon)
  • Nodule Segmentation (coming soon)


🔬 Downstream Task: Cancer vs. No-Cancer Classification

Cancer/No-Cancer Classification Results

Shown: AUC vs. the percentage of clinical data retained (x-axis: 100%, 50%, 20%, 10%). All curves use additive augmentation: AI-generated nodules are added on top of the retained clinical data, and clinical samples are never replaced. Curves:

  • Clinical (LUNA25) — baseline using only the retained clinical data.
  • Clinical + AI-gen. (n%) — at each point, add AI-generated data equal to the same percentage as the retained clinical fraction.
    Examples: at 50% clinical → +50% AI-gen; 20% → +20%; 10% → +10%.
  • Clinical + AI-gen. (100%) — at each point, add AI-generated data equal to 100% of the full clinical dataset size, regardless of the retained fraction.
    Example: at 10% clinical → +100% AI-gen.
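The two augmentation schedules above can be sketched numerically. `n_full` is the full clinical dataset size (the value 1000 in the example is hypothetical):

```python
# Sketch of the two augmentation schedules: "matched" adds AI-generated
# samples equal to the retained clinical fraction of the full dataset;
# "full" always adds 100% of the full dataset size.
def augmentation_counts(n_full: int, retained_fraction: float) -> dict:
    clinical = round(n_full * retained_fraction)
    return {
        "clinical": clinical,
        "ai_matched": round(n_full * retained_fraction),  # +n% scheme
        "ai_full": n_full,                                # +100% scheme
    }

# Example from the text: retain 10% of a (hypothetical) 1000-scan dataset.
counts = augmentation_counts(1000, 0.10)
# clinical = 100, matched scheme adds 100, full scheme adds 1000
```

Note that under the matched scheme the training set doubles at every retained fraction, while under the full scheme the synthetic share grows as the clinical fraction shrinks.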

Takeaways

  • AI-generated nodules improve data efficiency: at low clinical fractions (50%→10%), Clinical + AI-gen. (n%) typically matches or exceeds the clinical-only AUC.
  • Larger synthetic boosts (100%) can help in some regimes but may underperform the matched n% mix depending on the cohort, so ratio-balanced augmentation is often the safer choice.
  • The trends generalize to external cohorts, indicating usability beyond the development data.

Acknowledgements

We gratefully acknowledge the open-source projects that directly informed this repository: the MAISI tutorial from the Project MONAI tutorials, the broader Project MONAI ecosystem, our related benchmark repo AI in Lung Health – Benchmarking, and our companion toolkits PiNS – Point-driven Nodule Segmentation and CaNA – Context-Aware Nodule Augmentation. We thank these communities and contributors for their exceptional open-source efforts. If you use our models or code, please also consider citing these works (alongside this repository) to acknowledge their contributions.

References