Follow the steps below to set up the environment, process the data, and train the models.
It's recommended to use a virtual environment to manage dependencies and avoid conflicts across projects.
1. Create the virtual environment (replace `venv` with your preferred environment name):

   ```bash
   python3 -m venv venv
   ```

2. Activate the virtual environment:

   - Linux/macOS:

     ```bash
     source venv/bin/activate
     ```

   - Windows:

     ```bash
     .\venv\Scripts\activate
     ```

3. Install project dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Deactivate the virtual environment when you're done:

   ```bash
   deactivate
   ```
📌 Make sure you're using Python 3.7 or higher. You can check your version with:

```bash
python3 --version
```

Download the MARBLE dataset from the official website:
👉 MARBLE Dataset - EveryWareLab
Once downloaded, create a folder named `dataset` in the root directory of the project and place the extracted MARBLE dataset inside. In addition, create a folder named `all_data`, which will be used for data loading. Your folder structure should look like this:
```
Multimodal_Prediction/
├── dataset/
│   └── MARBLE/
│       └── dataset/
│           ├── A1a/
│           ├── A1e/
│           └── ...
├── all_data/
...
```
📝 Note: Make sure the dataset structure matches this layout exactly to avoid path errors during preprocessing.
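To catch path errors before running anything, you can sanity-check the layout with a short Python snippet. This is a minimal sketch, not part of the project; the `check_layout` helper is hypothetical:

```python
from pathlib import Path

def check_layout(root="."):
    """Return a list of expected directories that are missing under root."""
    root = Path(root)
    required = [
        root / "dataset" / "MARBLE" / "dataset",  # extracted MARBLE dataset
        root / "all_data",                        # output folder for data loading
    ]
    return [str(p) for p in required if not p.is_dir()]

if __name__ == "__main__":
    missing = check_layout()
    if missing:
        print("Missing directories:", missing)
    else:
        print("Layout looks good.")
```

Run it from the project root; an empty result means the expected directories are in place.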
Run the preprocessing script from the root directory to process the raw MARBLE data:

```bash
python data_preprocess.py --marble_dataset_path="./dataset/MARBLE/dataset"
```
This will generate synchronized sensor data and natural language summaries used for model training and evaluation.
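If you want to inspect the synchronized output before moving on, a quick peek at the generated CSV can help. This is a standalone sketch using only the standard library; the `peek_csv` helper is hypothetical, and the file name is the one passed to the processing script below:

```python
import csv

def peek_csv(path, n=5):
    """Print the header and the first n data rows of a CSV file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = [row for _, row in zip(range(n), reader)]
    print("columns:", header)
    for row in rows:
        print(row)
    return header, rows

if __name__ == "__main__":
    peek_csv("./all_data/synced_marble_data.csv")
```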
Run the processing script on the synced MARBLE data:

```bash
python data_process.py --synced_marble_data_path="./all_data/synced_marble_data.csv" --save_path="./all_data"
```

This will generate the `.joblib` and `.npy` files necessary for data loading.
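As a final sanity check, you can confirm the generated artifacts load correctly. This is a sketch assuming the artifact names used in the training command and that `joblib` and `numpy` are installed; the `load_artifacts` helper is hypothetical:

```python
from pathlib import Path

import joblib
import numpy as np

def load_artifacts(joblib_file="./all_data/MARBLE_IMU.joblib",
                   embeddings_dir="./all_data"):
    """Load the IMU joblib file and every .npy file in embeddings_dir."""
    imu = joblib.load(joblib_file)
    embeddings = {p.name: np.load(p) for p in Path(embeddings_dir).glob("*.npy")}
    print(type(imu).__name__, "| embedding files:", sorted(embeddings))
    return imu, embeddings
```

If either call raises an error, rerun the processing step before starting training.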
Run the main script for model training and evaluation:

```bash
python main.py --imu_joblib_file="./all_data/MARBLE_IMU.joblib" --embeddings_dir="./all_data/" --sentence_encoder="all-MiniLM-L12-v2" --batch_size=250 --train --evaluate="fusion"
```
This will load the data and train the models.