This repository provides code and documentation for the new release of the WAUC dataset. This release includes the following changes and additions:
- Data has been reformatted into an easy-to-read `csv` format.
- The `csv` files have data separated and classified into sessions (1 to 6) along with baselines (eyes closed/still and movement only).
- A data integrity section has been added, which lists missing sensors and sessions.
- Demographic info, including age and sex, has been provided in the `demographics.csv` file.
- More detail has been provided regarding the data structure and format for easier access and analysis.

A sample dataset for one subject is provided at the following link
If you use this dataset in your research, please cite it as follows:
@article{albuquerque2020wauc,
title={Wauc: a multi-modal database for mental workload assessment under physical activity},
author={Albuquerque, Isabela and Tiwari, Abhishek and Parent, Mark and Cassani, Raymundo and Gagnon, Jean-Fran{\c{c}}ois and Lafond, Daniel and Tremblay, S{\'e}bastien and Falk, Tiago H},
journal={Frontiers in Neuroscience},
volume={14},
pages={549524},
year={2020},
publisher={Frontiers Media SA}
}

There are a few instances of missing sensors and data. These include:
The instances of missing data for BH3 include:
| Subject | Description |
|---|---|
| 1004 | Session #5 missing (high physical activity) |
| 1013 | No baseline markers for session #1 |
| 1019 | Session #3 missing (high physical activity) |
| 1020 | No BH3 data |
| 1032 | Session #2 missing (high physical activity) |
| 1035 | Session #1 missing |
The instances of missing data for E4 include:
| Subject | Description |
|---|---|
| 1001 | No E4 data |
| 1013 | No baseline markers for session #1 |
| 1025 | No E4 data |
| 1028 | No E4 data |
| 1038 | No E4 data |
For the Enobio EEG cap, only the baseline markers for session #1 of Subject 1013 are missing.

Subject 1028 has no ratings or session ground truth info available.
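
When looping over subjects, it can help to encode the gaps listed above so that missing recordings are skipped rather than raising errors. The following is a minimal sketch: the dictionary and function names are hypothetical, and the entries are transcribed by hand from the tables in this section.

```python
# Known gaps, transcribed from the data integrity tables above.
# All names here are illustrative and not part of the dataset.
NO_SENSOR_DATA = {"bh3": {1020}, "e4": {1001, 1025, 1028, 1038}}
MISSING_SESSIONS = {"bh3": {1004: {5}, 1019: {3}, 1032: {2}, 1035: {1}}}
NO_BASELINE_MARKERS = {1013: {1}}  # affects BH3, E4, and Enobio
NO_RATINGS = {1028}


def is_available(sensor, subject_id, session_no):
    """Return False if the tables above list this recording as missing."""
    if subject_id in NO_SENSOR_DATA.get(sensor, set()):
        return False
    missing = MISSING_SESSIONS.get(sensor, {}).get(subject_id, set())
    return session_no not in missing


# Example: BH3 session 5 of subject 1004 is listed as missing.
print(is_available("bh3", 1004, 5))
```

A lookup like this only reflects the gaps documented here; checking that a file actually exists on disk is still advisable.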
This section covers how the dataset is organized and the structure of the csv files.
WAUC_Dataset
├── 1001/
├── 1002/
│ ├── raw/
│ │ ├── {sensor1}_{signal1}.csv
│ │ ├── {sensor1}_{signal2}.csv
│ │ ├── {sensor2}_{signal1}.csv
│ │ ├── {sensor2}_{signal2}.csv
│ │ └── ...
│ ├── processed/
│ └── features/
├── 1003/
├── ...
├── ...
├── 1047/
├── 1048/
├── subjective_ratings_with_labels.csv
└── demographics.csv
Each subject folder has raw, processed, and features subfolders.
For now, only the raw folder is populated. It contains the raw sensor recordings, named {sensor}_{signal}.csv. The sensors and their corresponding signals are shown in the table below:
| Sensor | Signals |
|---|---|
| bh3 | acc, ecg, br |
| e4 | acc, ppg, temp, gsr |
| enobio | eeg |
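
The naming scheme above can be turned into file paths programmatically. The sketch below assumes the dataset root is a local folder called `WAUC_Dataset` and that file names use the lowercase sensor and signal names exactly as they appear in the table; both are assumptions to verify against your copy of the data. The function names are hypothetical.

```python
from pathlib import Path

# Sensor-to-signal mapping taken from the table above.
SENSOR_SIGNALS = {
    "bh3": ["acc", "ecg", "br"],
    "e4": ["acc", "ppg", "temp", "gsr"],
    "enobio": ["eeg"],
}


def raw_signal_paths(dataset_root, subject_id):
    """Return {(sensor, signal): Path} for every expected raw CSV of a subject."""
    raw_dir = Path(dataset_root) / str(subject_id) / "raw"
    return {
        (sensor, signal): raw_dir / f"{sensor}_{signal}.csv"
        for sensor, signals in SENSOR_SIGNALS.items()
        for signal in signals
    }


def missing_raw_files(dataset_root, subject_id):
    """List the (sensor, signal) pairs whose CSV is not present on disk."""
    paths = raw_signal_paths(dataset_root, subject_id)
    return [key for key, path in paths.items() if not path.is_file()]


# Example: expected location of the BH3 ECG file for subject 1002.
print(raw_signal_paths("WAUC_Dataset", 1002)[("bh3", "ecg")])
```

Calling `missing_raw_files` for each subject gives a quick cross-check against the data integrity section.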
The subjective ratings, including NASA-TLX and the perceived exertion scale, along with the ground truth for MATB-II (column: mwl) and physical activity (column: pwl), are provided in subjective_ratings_with_labels.csv.
These values can be mapped to their corresponding physiological information using the session number column (column: session_no).
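
As a rough illustration of this mapping, the sketch below attaches the mwl and pwl labels to signal rows through session_no. It makes several assumptions: the ratings file identifies participants in a column called `subject` (a hypothetical name, so check the actual header), session_no is written identically in both files, and the helper names are illustrative only.

```python
import csv


def ratings_by_session(lines, subject_id, subject_col="subject"):
    """Map session_no -> ratings row for one subject.

    `lines` is an open file or any iterable of CSV lines.
    `subject_col` is a hypothetical column name; adjust it to the
    actual header of subjective_ratings_with_labels.csv.
    """
    return {
        row["session_no"]: row
        for row in csv.DictReader(lines)
        if row[subject_col] == str(subject_id)
    }


def label_rows(signal_rows, ratings):
    """Attach mwl and pwl to signal rows recorded during a main session."""
    labelled = []
    for row in signal_rows:
        rating = ratings.get(row["session_no"])
        # Baseline rows and sessions without ratings are skipped.
        if row["info"] == "session" and rating is not None:
            labelled.append({**row, "mwl": rating["mwl"], "pwl": rating["pwl"]})
    return labelled
```

Values are compared as strings here, so a session written as `1` in one file and `1.0` in the other would not match without normalizing first.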
The demographics.csv file includes age, sex (M or F), weight (kg), and height (cm) along with the activity type (Treadmill or Bike) performed by the participants.
All signal csv files contain the following columns:
| Column | Description |
|---|---|
| unix | Unix time of experiment |
| time | Time (in s) from the start of recording |
| fs | Sampling frequency of the signal |
| info | baseline-1 = eyes closed + still, baseline-2 = physical activity only, session = main recording session |
| session_no | Session number (same as subjective ratings file) |
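
The info and session_no columns make it possible to separate baselines from the main recording within a single file. The following is a minimal sketch using only the Python standard library. It assumes the info values appear literally as `baseline-1`, `baseline-2`, and `session`, and that every row carries a session_no; the function name and the example path are illustrative.

```python
import csv
from collections import defaultdict


def split_by_segment(lines):
    """Group the rows of one signal CSV by (info, session_no).

    `lines` is an open file or any iterable of CSV lines. Each row is
    kept as a dict of strings, so the signal columns are passed through
    untouched whatever they are called.
    """
    segments = defaultdict(list)
    for row in csv.DictReader(lines):
        segments[(row["info"], row["session_no"])].append(row)
    return dict(segments)


# Example usage (the path is illustrative):
# with open("WAUC_Dataset/1002/raw/bh3_ecg.csv", newline="") as handle:
#     segments = split_by_segment(handle)
#     eyes_closed = segments.get(("baseline-1", "1"), [])
```

Grouping on both columns keeps each baseline tied to its session, which matters for subjects such as 1013 where the baseline markers of a session are missing.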