🧠 Deep Reinforcement Learning for Personalized Radiotherapy Beam Orientation

This repository contains the code and experiments for my work on patient-specific Beam Orientation Optimization (BOO) in head-and-neck radiotherapy using Deep Q-Learning (DQN). The agent learns to select clinically meaningful gantry angles directly from voxel-level anatomy, without repeated Monte Carlo dose simulations.

TL;DR: Given CT anatomy and organ masks, the agent predicts 5 high-value beam angles in under a second. Under the surrogate dose model, these angles improve PTV coverage and spare OARs relative to standard equiangular plans.

🔗 Quick Navigation

🔍 Overview & Motivation
📈 Results (100 Patients)
📂 Repository Structure
⚙️ Installation
▶️ Evaluation
🏋️ Training
🧬 Model Summary
🔮 Future Work
📄 Citation
🙏 Acknowledgements

📌 Overview / Problem

Selecting clinically optimal beam orientations is crucial in radiotherapy.
Conventional BOO methods have several drawbacks:

Not personalized to patient anatomy ❌
Computationally infeasible for large search spaces ❌
Insensitive to voxel-level geometry ❌
Dependent on repeated full dose simulations ❌

🚀 Core Idea

We formulate BOO as a sequential decision-making problem and train a Deep Q-Network to:

Extract voxel-level anatomical structure from CT + organ masks
Sequentially choose 5 distinct beam angles
Accumulate a pseudo-physical dose surrogate over timesteps
Optimize a reward that balances:
- PTV coverage (maximize)
- OAR sparing (minimize dose to limit toxicity)

Inference time: <1 second per patient.

📁 Repository Structure

Deep-Reinforcement-Learning-for-Personalized-Radiotherapy-Beam-Orientation-Optimization/
├── configs/
│   └── experiments.json
├── figures/
│   ├── success_cases/      # Best examples
│   ├── typical_cases/      # Typical examples
│   ├── failure_cases/      # Failure cases
│   └── anomaly_cases/      # Cases needing special discussion
├── models/
│   └── best_dqn_model.pt
├── results/
│   ├── summary_results.md
│   └── test_results.csv
├── utils/
│   └── repro.py
├── baselines.py
├── eval_main.py
├── train.py
├── requirements.txt
└── README.md

⚙️ Installation

git clone https://github.com/krishdef7/Deep-Reinforcement-Learning-for-Personalized-Radiotherapy-Beam-Orientation-Optimization.git
cd Deep-Reinforcement-Learning-for-Personalized-Radiotherapy-Beam-Orientation-Optimization
pip install -r requirements.txt

📂 Dataset (OpenKBP)

We use the OpenKBP dataset (head-and-neck):

CT volumes
PTV mask
OAR masks (cord, brainstem, L/R parotids, mandible)

Split:

Train: 200
Validation: 40
Test: 100

🛠 Download OpenKBP separately and update the dataset paths in configs/experiments.json.

▶️ Running Evaluation (Generate Results + Figures)

python eval_main.py

Outputs will include:

results/test_results.csv — per-patient metrics (D95, coverage, OAR doses)
figures/patient_XXX_dose_dqn.png — DQN dose maps overlaid on CT
figures/patient_XXX_dvh_dqn.png — dose–volume histogram plots
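
The per-patient metrics can be read off the dose values inside a structure. The following is a minimal sketch, not the code in eval_main.py. It assumes D95 is the minimum dose received by the hottest 95% of PTV voxels and that coverage is the fraction of PTV voxels at or above a threshold; the function names and the threshold value are illustrative assumptions.

```python
def d95(ptv_doses):
    """Minimum dose received by the hottest 95% of PTV voxels."""
    ranked = sorted(ptv_doses, reverse=True)           # hottest voxel first
    n_keep = max(1, (95 * len(ranked) + 99) // 100)    # ceil(0.95 * n)
    return ranked[n_keep - 1]


def coverage(ptv_doses, threshold):
    """Fraction of PTV voxels receiving at least `threshold` (assumed definition)."""
    return sum(d >= threshold for d in ptv_doses) / len(ptv_doses)


def cumulative_dvh(doses, n_bins=5):
    """(dose level, fraction of voxels at or above that level) pairs."""
    d_max = max(doses)
    levels = [d_max * i / n_bins for i in range(n_bins + 1)]
    return [(lv, sum(d >= lv for d in doses) / len(doses)) for lv in levels]


# Toy PTV with 20 voxels and doses 0.05, 0.10, ..., 1.00 (made-up values).
toy_doses = [i / 20 for i in range(1, 21)]
toy_d95 = d95(toy_doses)
toy_coverage = coverage(toy_doses, threshold=0.5)
toy_dvh = cumulative_dvh(toy_doses)
```

A cumulative DVH starts at 100% of the volume at zero dose and falls as the dose level rises, so D95 is the dose at which the PTV curve crosses 95%.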

🏋️ Training from Scratch

python train.py

Training summary:

Replay buffer capacity: 3000
Batch size: 32
Discount factor: γ = 0.95
ε-greedy exploration: ε decays from 0.90 to 0.10
Target network update: every 5 epochs
Convergence: ~3.5 hours on CPU
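
To show how these hyperparameters fit together, here is a hedged sketch using only the Python standard library. It is not taken from train.py: the linear shape of the ε schedule, the function names, and the transition layout are assumptions made for illustration.

```python
import random
from collections import deque

BUFFER_SIZE = 3000         # replay capacity
BATCH_SIZE = 32
GAMMA = 0.95               # discount factor
EPS_START, EPS_END = 0.90, 0.10
TARGET_UPDATE_EVERY = 5    # epochs between target-network syncs


def epsilon_at(epoch, n_epochs):
    """Exploration rate at `epoch`; a linear decay is assumed here."""
    frac = min(1.0, epoch / max(1, n_epochs - 1))
    return EPS_START + frac * (EPS_END - EPS_START)


def td_target(reward, next_q_values, done):
    """One-step DQN target: r if terminal, else r + gamma * max_a' Q_target(s', a')."""
    if done or not next_q_values:
        return reward
    return reward + GAMMA * max(next_q_values)


def should_sync_target(epoch):
    """True on epochs where the target network copies the online weights."""
    return epoch % TARGET_UPDATE_EVERY == 0


def sample_batch(buffer, rng):
    """Uniformly sample up to BATCH_SIZE stored transitions."""
    return rng.sample(list(buffer), min(BATCH_SIZE, len(buffer)))


# The oldest transitions are dropped automatically once the buffer is full.
replay = deque(maxlen=BUFFER_SIZE)
for t in range(3500):
    replay.append((t, 0, 0.0, t + 1, False))   # (state, action, reward, next_state, done)
batch = sample_batch(replay, random.Random(0))
```

With a terminal-only reward, the target for the final beam is just the reward itself, and earlier steps receive credit through the discounted bootstrap term.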

📈 Results — 100 Patient Evaluation

Method	Coverage	D95
DQN (ours)	0.8059	0.2405
Equiangular	0.6867	0.1207
Heuristic	0.6397	0.0949
RandomMean	0.5883	0.0554

Key Highlights

+11.9 percentage points absolute improvement in PTV coverage over equiangular (0.8059 vs 0.6867)
~2× higher D95 than equiangular (0.2405 vs 0.1207)
<1 second per patient (post-training)
Evaluated on 100 unseen test patients

🧬 Model Summary

State (8 channels):

CT (1 channel)
PTV mask (1 channel)
OAR masks (5 channels)
Accumulated dose surrogate (1 channel)

Actions:

36 discrete gantry angles (0–350° at 10° spacing)
DQN selects 5 sequential non-repeating beams
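
The selection of 5 distinct angles can be sketched as ε-greedy choice restricted to unused actions. This is an illustrative sketch rather than the repository's implementation: `q_fn` is a hypothetical stand-in for the trained network, and filtering the allowed indices is used here in place of whatever masking mechanism the model applies internally.

```python
import random

ANGLES = list(range(0, 360, 10))   # 36 discrete gantry angles, 0 to 350 degrees
N_BEAMS = 5


def select_action(q_values, used, epsilon, rng):
    """Epsilon-greedy choice among angle indices that have not been used yet."""
    allowed = [a for a in range(len(q_values)) if a not in used]
    if rng.random() < epsilon:
        return rng.choice(allowed)                   # explore
    return max(allowed, key=lambda a: q_values[a])   # exploit


def plan_beams(q_fn, epsilon=0.0, seed=0):
    """Pick N_BEAMS distinct angles one at a time.

    `q_fn(used)` is a hypothetical callable returning 36 Q-values given the
    indices chosen so far; in the real agent this would be the network's output.
    """
    rng = random.Random(seed)
    used = []
    for _ in range(N_BEAMS):
        q_values = q_fn(used)
        used.append(select_action(q_values, set(used), epsilon, rng))
    return [ANGLES[a] for a in used]


# Toy Q-function that simply prefers larger angles.
greedy_plan = plan_beams(lambda used: [float(i) for i in range(36)])
```

At inference time ε is 0, so the plan is the sequence of highest-valued unused angles at each step.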

Architecture:

5× Conv layers + BN + ReLU
Bottleneck: 4×4×256
Fully connected head
Masking to prevent repeated beams
Model parameters: ~3.4M
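
One way to arrive at a 4×4×256 bottleneck is to halve the spatial size five times. The sketch below traces shapes under assumed settings: a 128×128 input slice, 3×3 kernels with stride 2 and padding 1, and hypothetical channel widths. Only the 8 input channels and the 4×4×256 bottleneck come from this README, so the real layer settings may differ.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output size of one conv layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed values: only "8 input channels" and "4x4x256" are stated in the README.
INPUT_SIZE = 128
CHANNELS = [8, 16, 32, 64, 128, 256]


def trace_encoder(size=INPUT_SIZE):
    """Return (channels, height, width) after each of the 5 conv layers."""
    shapes = []
    for out_ch in CHANNELS[1:]:
        size = conv_out(size)
        shapes.append((out_ch, size, size))
    return shapes


shapes = trace_encoder()
# Number of features the fully connected head would receive after flattening.
flat_features = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
```

Under these assumptions the head sees 4096 flattened features and outputs one Q-value per gantry angle.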

Dose Surrogate:

Ray-traced geometric field
Gaussian blur → approximate scatter
Accumulate dose per timestep
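
The three steps above can be illustrated with a small pure-Python sketch. It is a simplified stand-in, not the repository's surrogate: it assumes parallel rays conformed to the PTV, ignores depth attenuation, and uses an arbitrary angle convention, blur width, and function names.

```python
import math


def beam_field(ptv_mask, angle_deg):
    """Binary field of pixels lying on parallel rays that pass through the PTV."""
    theta = math.radians(angle_deg)
    dx, dy = math.cos(theta), math.sin(theta)   # beam direction
    h, w = len(ptv_mask), len(ptv_mask[0])

    def offset(x, y):
        # Signed distance from the beam axis, binned to whole pixels.
        return round(-x * dy + y * dx)

    hit = {offset(x, y) for y in range(h) for x in range(w) if ptv_mask[y][x]}
    return [[1.0 if offset(x, y) in hit else 0.0 for x in range(w)]
            for y in range(h)]


def blur(img, sigma=1.0, radius=2):
    """Separable Gaussian blur with zero padding (stand-in for scatter)."""
    k = [math.exp(-i * i / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(k)
    k = [v / total for v in k]

    def blur_rows(a):
        return [[sum(k[j + radius] * row[x + j]
                     for j in range(-radius, radius + 1)
                     if 0 <= x + j < len(row))
                 for x in range(len(row))] for row in a]

    rows = blur_rows(img)                              # blur along x
    cols = blur_rows([list(c) for c in zip(*rows)])    # transpose, blur along y
    return [list(r) for r in zip(*cols)]               # transpose back


def accumulate_dose(ptv_mask, angles):
    """Sum the blurred field of each selected beam, one beam per timestep."""
    h, w = len(ptv_mask), len(ptv_mask[0])
    dose = [[0.0] * w for _ in range(h)]
    for angle in angles:
        field = blur(beam_field(ptv_mask, angle))
        dose = [[d + f for d, f in zip(d_row, f_row)]
                for d_row, f_row in zip(dose, field)]
    return dose


# Toy 9x9 slice with a single PTV pixel at the center and two crossing beams.
mask = [[1 if (x, y) == (4, 4) else 0 for x in range(9)] for y in range(9)]
dose = accumulate_dose(mask, [0, 90])
```

In this toy case the two beams cross at the PTV pixel, so the accumulated dose peaks there and falls off away from the beam paths.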

Reward:

Terminal reward based on:
- ↑ D95 and coverage
- ↓ mean OAR dose

🎨 Qualitative Examples

Located in:

figures/success_cases/
figures/typical_cases/
figures/failure_cases/
figures/anomaly_cases/

In the success cases, high-dose regions remain inside the PTV and spare critical OARs.
The corresponding DVH curves reflect improved target coverage.

📊 Baselines Implemented

All baselines are evaluated under the same surrogate dose model to ensure a fair comparison:

Equiangular beams
Geometry heuristic
Random non-repeating beams (mean)
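
Two of these baselines are simple enough to sketch directly. The code below is an illustration, not the contents of baselines.py: snapping equiangular beams to the 10° action grid and averaging a score over repeated random draws are assumptions, and `score_fn` is a hypothetical callable that scores a plan. The geometry heuristic is not described in this README, so it is left out.

```python
import random

ANGLES = list(range(0, 360, 10))   # same 36-angle grid as the agent
N_BEAMS = 5


def equiangular(n_beams=N_BEAMS, start=0):
    """Evenly spaced beams, snapped to the nearest 10-degree grid angle."""
    ideal = [start + i * 360 / n_beams for i in range(n_beams)]
    return [int(round(a / 10.0)) * 10 % 360 for a in ideal]


def random_plan(rng, n_beams=N_BEAMS):
    """Draw n_beams distinct angles uniformly from the grid."""
    return sorted(rng.sample(ANGLES, n_beams))


def random_mean(score_fn, n_draws=20, seed=0):
    """Average a plan score over several random draws.

    `score_fn(plan)` is hypothetical; it would wrap the surrogate dose metrics.
    """
    rng = random.Random(seed)
    scores = [score_fn(random_plan(rng)) for _ in range(n_draws)]
    return sum(scores) / len(scores)


baseline_angles = equiangular()
```

Because every method shares the same angle grid and dose model, differences in the scores come from the choice of angles alone.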

🧭 Clinical Interpretation

Higher D95 is associated with a higher likelihood of local tumor control
OAR avoidance reduces the risk of severe toxicity
<1 s runtime could enable:
- Adaptive planning
- Online replanning
- Assistive tools for QA workflows

🚧 Limitations (Honest Assessment)

Surrogate dose ≠ true Monte Carlo dose
The current version operates on a single 2D slice
Trained only on head-and-neck geometry
Research prototype, not clinically deployable

🔮 Future Directions

3D DQN / U-Net encoders
GPU-based Monte Carlo integration
Learned neural surrogate physics
Multi-objective RL (Pareto optimal)
Online robustness against anatomical changes
Multi-disease training (lung, pelvis, liver)

📄 Citation

If you use this repository, please cite:

Deep Reinforcement Learning for Personalized Radiotherapy Beam Orientation Optimization.
Krish Garg, IIT Roorkee, 2025.

🙏 Acknowledgements

OpenKBP dataset contributors
IIT Roorkee — Department of Physics (institutional affiliation)
No external funding used
