Official implementation of the paper Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment.
Official project page: https://xinlei55555.github.io/pain-in-3d.github.io/
This codebase supports training on both synthetic 3D pain face datasets and the UNBC-McMaster Shoulder Pain Expression Archive.
This implementation provides tools for automated pain assessment through:
- Reference-guided Vision Transformer (ViTPain) for pain intensity estimation
- Multi-task learning combining PSPI regression and Action Unit prediction
- Support for both synthetic and real-world pain datasets
- Comprehensive evaluation metrics including regression, classification, and correlation measures
*3DPain: controllable 3D pain face synthesis using parametric facial models with AU-based deformations.*

*ViTPain: reference-guided Vision Transformer with a DinoV3 backbone, LoRA adapters, and an AU query head for pain assessment.*
- Visual Overview
- Installation
- Dataset Setup
- Training
- Model Architecture
- Evaluation Metrics
- Project Structure
- Citation
- License
## Installation

```bash
python -m venv pain_env
source pain_env/bin/activate  # On Windows: pain_env\Scripts\activate
pip install -r requirements.txt
```

Key dependencies:

- PyTorch >= 2.0.0
- PyTorch Lightning >= 2.0.0
- Transformers (for ViT models)
- timm (for DinoV3 models)
- wandb (for experiment tracking)
- huggingface-hub (for downloading pretrained weights)
## Dataset Setup

The code expects datasets to be organized in a `datasets/` directory at the project root:

```
PainGeneration_clean/
├── datasets/
│   ├── UNBC-McMaster/            # UNBC-McMaster dataset
│   │   ├── frames_unbc_2020-09-21-05-42-04.hdf5
│   │   ├── annotations_unbc_2020-10-13-22-55-04.hdf5
│   │   └── UNBC_CVFolds_2019-05-16-15-16-36.hdf5
│   └── pain_faces/               # Synthetic 3D pain faces
│       ├── meshes_inpainted/     # RGB images
│       └── annotations/          # JSON annotations
├── data/
├── lib/
├── scripts/
└── ...
```
The UNBC-McMaster Shoulder Pain Expression Archive Dataset should be obtained from the official source and converted to HDF5 format with the following files:
- `frames_unbc_2020-09-21-05-42-04.hdf5` - Face image frames
- `annotations_unbc_2020-10-13-22-55-04.hdf5` - AU annotations and PSPI scores
- `UNBC_CVFolds_2019-05-16-15-16-36.hdf5` - Cross-validation fold splits
The 3D Pain synthetic dataset is available on Hugging Face:
```bash
# Using git-lfs
git lfs install
git clone https://huggingface.co/datasets/SoroushMehraban/3D-Pain datasets/pain_faces

# Or using the Hugging Face datasets library
python -c "from datasets import load_dataset; load_dataset('SoroushMehraban/3D-Pain')"
```

## Training

Training follows a two-stage pipeline: (1) pretrain on the synthetic 3DPain data, then (2) fine-tune on UNBC-McMaster from the pretrained checkpoint.
We provide pretrained weights on Hugging Face that were trained on the 3D-Pain synthetic dataset:
```python
from huggingface_hub import hf_hub_download

# Download the pretrained checkpoint
checkpoint_path = hf_hub_download(
    repo_id="xinlei55555/ViTPain",
    filename="vitpain-epoch=141-val_regression_mae=1.859.ckpt",
    cache_dir="./checkpoints",
)
```

Or download via the command line:
```bash
pip install huggingface-hub
huggingface-cli download xinlei55555/ViTPain \
    vitpain-epoch=141-val_regression_mae=1.859.ckpt \
    --local-dir ./experiment/vitpain_pretrain/checkpoints/
```

Model card: https://huggingface.co/xinlei55555/ViTPain
```bash
# Option A: Download pretrained weights (recommended, see above)

# Option B: Train from scratch on synthetic data
./scripts/train_synthetic_pretrain.sh

# Then: 5-fold cross-validation on UNBC
./scripts/train_unbc_5fold.sh

# Evaluate results
python scripts/evaluate_unbc.py experiment/unbc_5fold_cv
```

Skip this step if you downloaded the pretrained weights above.
Otherwise, pretrain the ViTPain model on synthetic 3D pain faces:
```bash
python train_vitpain.py \
    --data_dir datasets/pain_faces \
    --split_csv data/splits/uniform_data_70_20_10_split.csv \
    --model_size large_dinov3 \
    --batch_size 48 \
    --max_epochs 150 \
    --learning_rate 1e-4 \
    --weight_decay 1e-1 \
    --au_loss_weight 1.0 \
    --pspi_loss_weight 1.0 \
    --lora_rank 8 \
    --lora_alpha 16 \
    --use_neutral_reference \
    --output_dir experiment/vitpain_pretrain
```

The best checkpoint is saved to `experiment/vitpain_pretrain/checkpoints/`.
Run 5-fold CV on UNBC using a pretrained checkpoint (either downloaded or trained from scratch):
```bash
# Using the downloaded Hugging Face checkpoint
python scripts/run_unbc_5fold_cv.py \
    --pretrained_checkpoint ./checkpoints/vitpain-epoch=141-val_regression_mae=1.859.ckpt \
    --data_dir datasets/UNBC-McMaster \
    --model_size large_dinov3 \
    --batch_size 100 \
    --max_epochs 50 \
    --au_loss_weight 0.1 \
    --use_weighted_sampling \
    --use_neutral_reference \
    --lora_rank 8 \
    --lora_alpha 16 \
    --output_dir experiment/unbc_5fold_cv

# Or using the checkpoint trained from scratch in Stage 1
python scripts/run_unbc_5fold_cv.py \
    --pretrained_checkpoint experiment/vitpain_pretrain/checkpoints/best.ckpt \
    --data_dir datasets/UNBC-McMaster \
    --model_size large_dinov3 \
    --batch_size 100 \
    --max_epochs 50 \
    --au_loss_weight 0.1 \
    --use_weighted_sampling \
    --use_neutral_reference \
    --lora_rank 8 \
    --lora_alpha 16 \
    --output_dir experiment/unbc_5fold_cv
```

Train a final model on ALL UNBC data for deployment (using either the downloaded or the trained checkpoint):
```bash
# Using the downloaded Hugging Face checkpoint
python scripts/train_unbc_production.py \
    --synthetic_pretrained_checkpoint ./checkpoints/vitpain-epoch=141-val_regression_mae=1.859.ckpt \
    --data_dir datasets/UNBC-McMaster \
    --batch_size 100 \
    --max_epochs 50 \
    --au_loss_weight 0.1 \
    --use_neutral_reference \
    --use_weighted_sampling \
    --output_dir experiment/unbc_production
```

Evaluate the 5-fold cross-validation results:
```bash
python scripts/evaluate_unbc.py experiment/unbc_5fold_cv
```

With multi-shot inference (averages predictions across N neutral references):
```bash
python scripts/evaluate_unbc.py experiment/unbc_5fold_cv \
    --use_neutral_reference --multi_shot_inference 3
```

Key metrics reported:
- Pearson Correlation: Mean across folds with 95% CI
- AUROC: At PSPI thresholds 1, 2, 3 (mean across folds)
- Train-Calibrated F1: Threshold tuned on the train set, applied to the test set
  - Most realistic evaluation for production use
  - Reported at PSPI thresholds 1, 2, 3 and macro average
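Train-calibrated thresholding is easy to restate in plain Python. The sketch below is illustrative only (the helper names and toy scores are not from this repository): sweep candidate thresholds over the training predictions, keep the F1-maximizing one, and apply it unchanged at test time.

```python
def f1_at_threshold(scores, labels, thr):
    """Binary F1 when predicting 'pain' for scores >= thr."""
    preds = [s >= thr for s in scores]
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum(not p and y for p, y in zip(preds, labels))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def calibrate_threshold(train_scores, train_labels, candidates):
    """Pick the threshold that maximizes F1 on the *training* predictions."""
    return max(candidates, key=lambda t: f1_at_threshold(train_scores, train_labels, t))

# Toy predicted-PSPI scores with binary pain labels (ground truth binarized at PSPI >= 1)
train_scores = [0.2, 0.8, 1.4, 2.9, 0.1, 3.5]
train_labels = [False, False, True, True, False, True]
best_thr = calibrate_threshold(train_scores, train_labels, [0.5, 1.0, 1.5, 2.0])
test_f1 = f1_at_threshold([0.3, 1.7, 2.2], [False, True, True], best_thr)
```

Because the threshold never sees test labels, this mirrors how a deployed system would have to choose its operating point.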
For detailed metrics (test-calibrated F1, uncalibrated F1, combined metrics), use:

```bash
python scripts/evaluate_unbc_verbose.py experiment/unbc_5fold_cv
```

Model options:

- `--model_size`: `small_dinov3`, `base_dinov3`, or `large_dinov3`
- `--use_neutral_reference`: Enable neutral reference images
- `--multi_shot_inference`: Number of neutral references for ensembling (default: 1)
- `--lora_rank`: LoRA rank (default: 8)
- `--lora_alpha`: LoRA alpha (default: 16)

Training options:

- `--batch_size`: Batch size per GPU (default: 48 for synthetic, 100 for UNBC)
- `--max_epochs`: Maximum epochs (default: 150 for synthetic, 50 for UNBC)
- `--learning_rate`: Learning rate (default: 1e-4)
- `--weight_decay`: Weight decay (default: 1e-1)
- `--precision`: 16 or 32 (default: 16)
- `--au_loss_weight`: Weight for the AU prediction loss (default: 1.0)
- `--pspi_loss_weight`: Weight for the PSPI regression loss (default: 1.0)

Data options:

- `--data_dir`: Path to dataset
- `--fold`: CV fold for UNBC (0-4)
- `--split_csv`: Train/val/test split CSV (for synthetic data)
- `--use_weighted_sampling`: Handle class imbalance with weighted sampling

Logging options:

- `--output_dir`: Checkpoint and log directory
- `--wandb_project`: W&B project name
- `--run_name`: Custom run name
```
experiment/
├── vitpain_pretrain/          # Stage 1: Pretrained model
│   └── checkpoints/
│       └── best.ckpt
├── unbc_5fold_cv/             # Stage 2: Cross-validation
│   ├── fold_0/
│   │   └── checkpoints/
│   ├── fold_1/
│   ├── ...
│   └── combined_evaluation_results_corr.txt
└── unbc_production/           # Stage 3: Production model
    └── checkpoints/
        └── best.ckpt
```
## Project Structure

```
PainGeneration_clean/
├── data/                                # Data loaders
│   ├── unbc_loader.py                   # UNBC-McMaster dataset loader
│   ├── pain3d_loader.py                 # 3D synthetic pain face dataset loader
│   └── split_utils.py                   # Split utilities
├── lib/                                 # Library code
│   └── models/                          # Model definitions
│       ├── vitpain.py                   # ViTPain model
│       └── pspi_evaluator_mixin.py      # Evaluation metrics
├── scripts/                             # Training & evaluation scripts
│   ├── train_synthetic_pretrain.sh      # Pretrain on synthetic data
│   ├── train_unbc_5fold.sh              # 5-fold cross-validation
│   ├── train_unbc_production_simple.sh  # Production training
│   ├── run_unbc_5fold_cv.py             # 5-fold CV Python script
│   ├── train_unbc_production.py         # Production training script
│   ├── evaluate_unbc.py                 # Main evaluation (clean output)
│   └── evaluate_unbc_verbose.py         # Verbose evaluation (all metrics)
├── configs/                             # Configuration management
│   └── __init__.py                      # Config dataclasses and parser
├── train_vitpain.py                     # Train on synthetic data
├── train_unbc.py                        # Train on UNBC-McMaster
├── requirements.txt                     # Python dependencies
└── README.md                            # This file
```
## Model Architecture

ViTPain is a reference-guided Vision Transformer designed for pain assessment:
- Backbone: DinoV3 vision transformer (always enabled)
- Fine-tuning: LoRA adapters for efficient training (always enabled)
- AU Query Head: Cross-attention with learnable queries for AU prediction (always enabled)
- Input: Target (pain) face + optional neutral reference face
- Outputs:
  - PSPI score (regression in 0-16)
  - Action Unit intensities (AU4, AU6, AU7, AU9, AU10, AU43)
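The 0-16 output range follows from the standard Prkachin-Solomon definition of PSPI over exactly these six AUs. As a reference formula (not code from this repository):

```python
def pspi(au4, au6, au7, au9, au10, au43):
    """Prkachin-Solomon Pain Intensity from facial Action Units.

    AU4, AU6, AU7, AU9, AU10 are intensities in 0-5;
    AU43 (eyes closed) is binary, 0 or 1.
    """
    return au4 + max(au6, au7) + max(au9, au10) + au43

assert pspi(0, 0, 0, 0, 0, 0) == 0    # neutral face
assert pspi(5, 5, 5, 5, 5, 1) == 16   # maximal expression, hence the 0-16 range
```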
- DinoV3 Backbone: State-of-the-art vision features
- LoRA Fine-tuning: Memory-efficient training with adapters
- AU Query Head: Attention-based AU prediction
- Multi-task Learning: Joint prediction of PSPI and AUs
- Multi-Shot Inference: Ensemble predictions with multiple neutral references (optional)
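Multi-shot inference amounts to averaging one prediction per neutral reference. A minimal sketch, where `predict` is a stand-in for the actual model call (not the repository's API):

```python
def multi_shot_pspi(predict, target_img, neutral_refs):
    """Average one PSPI prediction per neutral reference image."""
    preds = [predict(target_img, ref) for ref in neutral_refs]
    return sum(preds) / len(preds)

# Stand-in model whose prediction varies slightly with the reference
fake_predict = lambda target, ref: 3.0 + 0.1 * ref
score = multi_shot_pspi(fake_predict, "target_face", [0, 1, 2])
```

Averaging over references reduces sensitivity to any single neutral frame being unrepresentative.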
## Evaluation Metrics

The code reports the following core metrics:
- Mean Absolute Error (MAE)
- Pearson Correlation (Corr)
- Binary F1, Precision, Recall (pain vs. no-pain)
- AUROC
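For reference, the regression metrics reduce to standard formulas; a dependency-free sketch (the repository's evaluation scripts use their own implementations):

```python
import math

def mae(preds, targets):
    """Mean absolute error between predictions and ground truth."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

def pearson_corr(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

err = mae([1.0, 2.0, 4.0], [1.0, 3.0, 3.0])   # (0 + 1 + 1) / 3
r = pearson_corr([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear, so r is ~1.0
```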
## Citation

If you use this code, the pretrained weights, or the 3DPain dataset in your research, please cite:

```bibtex
@article{lin2025pain,
  title={Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment},
  author={Lin, Xin Lei and Mehraban, Soroush and Moturu, Abhishek and Taati, Babak},
  journal={arXiv preprint arXiv:2509.16727},
  year={2025}
}
```

## License

This project is licensed under the MIT License.
