Official code for the ICCV 2025 paper "egoPPG: Heart Rate Estimation from Eye Tracking Cameras in Egocentric Systems to Benefit Downstream Vision Tasks".

eth-siplab/egoPPG

egoPPG: Heart Rate Estimation from Eye Tracking Cameras in Egocentric Systems to Benefit Downstream Vision Tasks (ICCV 2025)

Björn Braun, Rayan Armani, Manuel Meier, Max Moebus, Christian Holz

Sensing, Interaction & Perception Lab, Department of Computer Science, ETH Zürich, Switzerland

👓💗 egoPPG

egoPPG is a novel vision task for egocentric systems: recovering a person's cardiac activity to aid downstream vision tasks. Our method, PulseFormer, continuously estimates the person's photoplethysmogram (PPG) from areas around the eyes and fuses motion cues from the headset's inertial measurement unit to track HR values. We demonstrate egoPPG's downstream benefit for a key task on EgoExo4D, an existing egocentric dataset, for which we find that PulseFormer's HR estimates improve proficiency estimation by 14%.

🎥 egoPPG-DB dataset

To train and validate PulseFormer, we collected a dataset of 13+ hours of eye tracking videos from Project Aria and contact-based PPG signals as well as an electrocardiogram (ECG) for ground-truth HR values. Similar to EgoExo4D, 25 participants performed diverse everyday activities such as office work, cooking, dancing, and exercising, which induced significant natural motion and HR variation (44–164 bpm).

To download the egoPPG-DB dataset, which we recorded ourselves for training and evaluating egoPPG, please visit the following link: egoPPG-DB dataset.

You will need to sign a Data Transfer and Use Agreement (DTUA) to agree to our terms of use. Please note that only members of an institution (e.g., a PI or professor) can sign this DTUA. After you have signed it, you will receive a download link via email. The dataset is around 220 GB in size and is available only for non-commercial, academic research purposes.

🔧 Setup

To create the environment that we used for our paper, simply run:

conda env create -f environment.yml
conda activate egoPPG

📁 Code structure

egoPPG task

The following three folders contain all code related to the egoPPG task of predicting HR from eye-tracking videos on the egoPPG-DB and EgoExo4D datasets:

  • configs/: contains all config files to run preprocessing and ML experiments
  • preprocessing/: code to preprocess the egoPPG-DB and EgoExo4D datasets into training and inference data for ML models
  • ml/: code to run ML experiments for HR estimation from eye-tracking videos from egoPPG-DB and EgoExo4D using different models

Downstream task: Proficiency estimation on EgoExo4D

Code will be up soon. The proficiency_estimation/ folder will contain all code related to the downstream proficiency estimation task on EgoExo4D, which uses the TimeSformer architecture and the HR values estimated in the egoPPG task.

⚙️ Preprocessing for egoPPG

Run all scripts from the root folder (egoPPG) due to relative imports.

Preprocessing egoPPG-DB

  1. Adjust the config file in configs/preprocessing/config_preprocessing_egoppg.yaml according to your paths and desired preprocessing parameters (downsampling, upsampling, window size, etc.).
  2. Run the preprocessing script using the config file config_preprocessing_egoppg.yaml:
python -m preprocessing.preprocessing_egoppg --cfg_path configs/preprocessing/config_preprocessing_egoppg.yaml

Preprocessing EgoExo4D

To predict HR on EgoExo4D (for the egoPPG task), you first have to preprocess the EgoExo4D dataset so that you can then run inference with the models trained on egoPPG-DB.

  1. To preprocess the EgoExo4D dataset, update the configs/preprocessing/config_preprocessing_egoexo4d.yaml file according to your paths and desired preprocessing parameters (downsampling, upsampling, window size, etc.).
  2. Run the preprocessing script using the config file config_preprocessing_egoexo4d.yaml:
python -m preprocessing.preprocessing_egoppg --cfg_path configs/preprocessing/config_preprocessing_egoexo4d.yaml

Important:

  • The eye tracking videos of the EgoExo4D dataset are originally recorded at only 10 fps, whereas the egoPPG-DB dataset is recorded at 30 fps.
  • We found that upsampling the EgoExo4D eye-tracking videos to 30 fps using linear interpolation between frames during preprocessing improves the HR estimation performance. Therefore, we recommend the following settings when aiming to predict HR on EgoExo4D:
    • config_preprocessing_egoexo4d.yaml: set upsampling factor to 3 (and downsampling factor to 1)
    • config_preprocessing_egoppg.yaml: set downsampling factor to 3 and upsampling factor to 3. This first downsamples the egoPPG-DB data from 30 fps to 10 fps and then upsamples it back to 30 fps using linear interpolation, matching exactly the preprocessing of the EgoExo4D data. This ensures a smaller domain gap between training and inference data when training on egoPPG-DB and inferring on EgoExo4D.
      • Note: This data will be saved into a folder named Down1_*_Up3. 1 is the effective downsampling factor (downsampling//upsampling).
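
The linear interpolation between frames described above can be sketched as follows. This is illustrative code, not the repository's implementation; the function name and the flat per-pixel list representation are assumptions for the example:

```python
def upsample_frames(frames, factor):
    """Insert `factor - 1` linearly interpolated frames between each pair of
    consecutive frames (e.g., factor=3 turns 10 fps into 30 fps).
    Frames are represented here as flat lists of pixel values."""
    out = []
    for a, b in zip(frames, frames[1:]):
        for k in range(factor):
            t = k / factor
            out.append([(1 - t) * x + t * y for x, y in zip(a, b)])
    out.append(frames[-1])  # last original frame has no successor to blend with
    return out

# Two 2-pixel "frames" become four frames: the originals plus two blends.
frames = [[0.0, 0.0], [3.0, 6.0]]
up = upsample_frames(frames, 3)
```

With downsampling factor 3 followed by upsampling factor 3 on egoPPG-DB, every third original frame is kept and the gaps are refilled by such blends, which mimics the artifacts the model will see on upsampled EgoExo4D data.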

⚡ Training and inference for egoPPG

Run all scripts from the root folder (egoPPG) due to relative imports.

General instructions

To train and evaluate ML models for the egoPPG task, run the ml/main_ml.py script with one of the config files located in configs/ml/. The config files specify which model is used and on which dataset it is trained and evaluated (egoPPG-DB or EgoExo4D). You also have to specify the paths to the preprocessed data in the config files (important: adjust these to the settings chosen during the preprocessing step). For example, to train and evaluate the PulseFormer model on egoPPG-DB for HR estimation, run:

python -m ml.main_ml --cfg_path configs/ml/egoppg_egoppg_PulseFormer.yaml

Most important config parameters

  • TOOLBOX_MODE: specify whether you want to train and test or only test
  • INPUT_SIGNALS/LABEL_SIGNALS: specify which input and label signals to use (e.g., eye videos as input and PPG as label). The names have to match the names used during preprocessing.
  • DATA_PATH: overall path to the folder where you saved the preprocessed data of all datasets during the preprocessing step
  • FILE_PATH: if you want to save small files, such as logs, to a different folder than DATA_PATH. Can be the same as DATA_PATH.
  • CACHED_PATH: specify the path to the dataset on which you want to train and test. The train and test paths can be specified separately if you, e.g., want to train on egoPPG-DB and test on EgoExo4D.
  • MODEL_NAME: specify which model to use (PulseFormer, TS-CAN, DeepPhys, etc.)

Adding a new model

To add a new model, you need to create three new files and make changes in two other files:

  1. Create a new model file in ml/models/, e.g., my_model.py, which contains the architecture of your model. You can use existing model files as a template.
  2. Create a new trainer file in ml/trainer/, e.g., my_model_trainer.py, which contains the training and evaluation logic for your model. You can use existing trainer files as a template.
  3. Create a new config file in configs/ml/, e.g., egoppg_egoppg_my_model.yaml, which specifies the parameters for training and evaluating your model. Naming is usually in the format: trainset_testset_model.yaml.
  4. In ml/trainer/__init__.py, import your new trainer class.
  5. In ml/main_ml.py, add a new condition to instantiate your trainer class in BOTH the train_and_test and the test function.
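
Steps 4 and 5 can be sketched as follows. All names here are hypothetical, and the repository dispatches via explicit conditions inside train_and_test and test in ml/main_ml.py rather than through this simplified builder:

```python
class MyModelTrainer:
    """Stand-in for the trainer class you create in ml/trainer/my_model_trainer.py."""

    def __init__(self, config):
        self.config = config

    def train(self):
        # Real trainers run the training/evaluation loop; this is a placeholder.
        return "training " + self.config["MODEL_NAME"]


def build_trainer(config):
    # Mirrors the new condition you add in BOTH train_and_test and test:
    if config["MODEL_NAME"] == "MyModel":
        return MyModelTrainer(config)
    raise ValueError("Unknown MODEL_NAME: %s" % config["MODEL_NAME"])


trainer = build_trainer({"MODEL_NAME": "MyModel"})
```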

Using a new dataset

To use a new dataset, you need to create a new preprocessing script and a new config file:

  1. Create a new preprocessing script in preprocessing/, e.g., preprocessing_my_dataset.py, which contains the logic to preprocess your dataset. You can use existing preprocessing scripts as a template.
  2. Create a new config file in configs/preprocessing/, e.g., config_preprocessing_my_dataset.yaml, which specifies the parameters for preprocessing your dataset. Naming is usually in the format: config_preprocessing_<dataset>.yaml.
  3. After the preprocessing is done, you can create new config files in configs/ml/ to train and evaluate models on your new dataset.

Other notes

  • The validation set size is set to 10% of the training set. Adjust in ml/ml_helper.py if needed.
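
A minimal sketch of such a 90/10 split (illustrative only; the actual logic lives in ml/ml_helper.py and may differ, e.g., by splitting per participant):

```python
def split_train_val(samples, val_frac=0.1):
    """Hold out the last `val_frac` of the training samples for validation."""
    n_val = max(1, int(len(samples) * val_frac))
    return samples[:-n_val], samples[-n_val:]

# 100 training samples -> 90 for training, 10 for validation.
train_set, val_set = split_train_val(list(range(100)))
```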

📊 Results for egoPPG

For all training and inference, we used one GeForce RTX 4090 GPU.

Seed 0

The following table contains the results reported in our paper using only random seed 0.

| Model | MAE ↓ | RMSE ↓ | MAPE ↓ | r ↑ |
|---|---|---|---|---|
| Yue et al. [Yue et al., 2023] | 29.63 | 32.99 | 37.86 | 0.10 |
| DeepPhys [Chen et al., 2018] | 28.26 | 31.97 | 36.68 | 0.08 |
| TS-CAN [Liu et al., 2020] | 26.32 | 32.39 | 29.13 | 0.11 |
| ContrastPhys+ [Sun et al., 2024] | 19.12 | 24.13 | 22.57 | 0.21 |
| RhythmMamba [Zou et al., 2024] | 15.05 | 19.78 | 17.46 | -0.16 |
| Baseline eyes | 14.60 | 18.18 | 18.37 | 0.20 |
| PhysMamba [Yan et al., 2024] | 13.94 | 16.86 | 17.76 | 0.61 |
| RhythmFormer [Zou et al., 2024] | 13.13 | 17.43 | 14.73 | 0.51 |
| Baseline skin | 12.40 | 15.54 | 15.29 | 0.50 |
| PhysNet [Yu et al., 2019] | 12.09 | 15.43 | 15.14 | 0.66 |
| PhysFormer [Yu et al., 2022] | 10.71 | 13.97 | 12.69 | 0.72 |
| PulseFormer w/o SA (ours) | 10.49 | 13.62 | 12.83 | 0.73 |
| FactorizePhys [Joshi et al., 2024] | 10.07 | 13.43 | 12.36 | 0.67 |
| PulseFormer w/o MITA (ours) | 8.82 | 12.03 | 10.82 | 0.81 |
| 🔥 PulseFormer (ours) | 7.67 | 10.69 | 9.45 | 0.85 |

Table: Results for HR prediction from eye-tracking videos using different models (PulseFormer, PulseFormer w/o SA, PulseFormer w/o MITA, and established rPPG baselines).

PulseFormer w/o MITA refers to the model PhysNetSA in this repository.
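
The metrics in the table can be reproduced from paired HR predictions and ground-truth values as sketched below (illustrative implementation, assuming MAPE is reported in % and r is Pearson's correlation coefficient):

```python
import math

def hr_metrics(pred, true):
    """MAE, RMSE, MAPE (in %) and Pearson's r between predicted and
    ground-truth HR values (both in bpm)."""
    n = len(pred)
    mae = sum(abs(p - t) for p, t in zip(pred, true)) / n
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / n)
    mape = 100 / n * sum(abs(p - t) / t for p, t in zip(pred, true))
    mp, mt = sum(pred) / n, sum(true) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, true))
    r = cov / math.sqrt(sum((p - mp) ** 2 for p in pred)
                        * sum((t - mt) ** 2 for t in true))
    return mae, rmse, mape, r

# Toy example with three HR windows (bpm).
mae, rmse, mape, r = hr_metrics([72.0, 80.0, 95.0], [70.0, 85.0, 90.0])
```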

Multiple seeds

In addition to the results reported in our paper, we also evaluated the performance of PulseFormer and some of the baselines across three seeds to better estimate the mean performance and its variance across different seeds/machines. The mean results and standard deviation (STD) across three seeds are reported below:

| Model | MAE ↓ | RMSE ↓ | MAPE ↓ | r ↑ |
|---|---|---|---|---|
| Baseline eyes | 14.60 | 18.18 | 18.37 | 0.20 |
| FactorizePhys [Joshi et al., 2024] | 13.55 ± 1.13 | 16.64 ± 0.96 | 17.55 ± 1.52 | 0.67 ± 0.03 |
| Baseline skin | 12.40 | 15.54 | 15.29 | 0.50 |
| PhysFormer [Yu et al., 2022] | 11.52 ± 0.45 | 15.53 ± 0.51 | 12.68 ± 0.54 | 0.64 ± 0.03 |
| PhysNet [Yu et al., 2019] | 11.32 ± 0.43 | 14.61 ± 0.43 | 14.22 ± 0.52 | 0.69 ± 0.02 |
| PulseFormer w/o MITA (ours) | 9.68 ± 0.59 | 12.67 ± 0.61 | 12.06 ± 0.91 | 0.80 ± 0.01 |
| 🔥 PulseFormer (ours) | 8.53 ± 0.62 | 11.64 ± 0.70 | 10.49 ± 0.74 | 0.82 ± 0.03 |

Table: Results for HR prediction from eye-tracking videos using different models (averaged across three random seeds).

PulseFormer w/o MITA refers to the model PhysNetSA in this repository.

🎯 Proficiency estimation on EgoExo4D

Code will be up soon. To dos:

  • Explain usage (training and inference)
  • Explain create_split_files: the script that creates the split files for the proficiency estimation task
  • Show number of used/excluded takes

Download EgoExo4D data

The EgoExo4D data has to be downloaded from the official EgoExo4D repository: https://ego-exo4d-data.org/#intro. You need the annotations, the VRS files (for IMU data), the ET videos, and the POV videos (for the downstream proficiency estimation task). Alternatively, you can run PulseFormer without the motion-informed temporal attention (MITA) module; in that case, you do not need the VRS files (IMU data). To get the data needed for proficiency estimation, run the following command with the Ego4D downloader:

egoexo -o PATH_SAVE_FOLDER --parts take_vrs_noimagestream metadata annotations downscaled_takes/448

📜 Citation

If you find our paper, code or dataset useful for your research, please cite our work.

@article{braun2025egoppg,
  title={{egoPPG}: Heart rate estimation from eye tracking cameras in egocentric systems to benefit downstream vision tasks},
  author={Braun, Bj{\"o}rn and Armani, Rayan and Meier, Manuel and Moebus, Max and Holz, Christian},
  journal={arXiv preprint arXiv:2502.20879},
  year={2025}
}

👓🎭 egoEMOTION

Make sure to also check out egoEMOTION, our work on emotion recognition from egocentric vision systems. egoEMOTION includes over 50 hours of recordings from 43 participants doing emotion-elicitation tasks and naturalistic activities while self-reporting their affective state using the Circumplex Model and Mikels’ Wheel as well as their personality via the Big Five model. Each session provides synchronized data from the Project Aria glasses, videos of the participants' faces, nose PPG, and physiological baselines (ECG, EDA, and breathing rate) for reference.

📄 Disclaimer

The structure of the code in this repository is strongly inspired by the rPPG-Toolbox. Make sure to also check it out for other rPPG methods and datasets!
