Multi Users Activities Recognition for Human-Robot Collaboration

This project explores the challenge of gathering data for multi-user interactions in Human-Robot collaboration. By merging data collected from individual users to produce multi-user data, the study aims to simplify dataset creation. Using 3D skeleton poses of activities performed by single users, the project demonstrates the feasibility of training machine learning models, such as LSTM networks and VAEs with ST-GCNs, to recognize the activities of user pairs. The results indicate that this approach achieves performance comparable to training on data recorded from actual pairs of users, offering a promising solution for advancing research in multi-party Human-Robot interaction and collaboration.

Overview

  • config/ includes JSON files storing the model hyperparameter settings
  • csf/ includes scripts for running jobs on the CSF3 cluster
  • data/ includes skeleton data
  • preprocess/ includes code to preprocess the data
  • visualization/ includes code for visualizing the data, along with some example visualizations


Data

The data is obtained from the research by Francesco et al. (2023). Single Data is a recording of one person and Pair Data is a recording of two people. In this research, three tasks are defined:

  • Working: manipulating the tool at chest height
  • Preparing: fetching a new item for the next task from the working table
  • Requesting: raising a hand and holding it up to request the next task

In Single Data, the participant performs all three tasks, and the recording is merged with another (or the same) participant's recordings to generate the 9 label combinations. In Pair Data, two participants perform the 9 combinations in a single recording.

(Figures: Single Data and Pair Data)

Data Pre-processing

Of the 32 joints in the original skeleton, only 11 joints are selected for this research. Two normalizations are applied in the data pre-processing step: Navel-Neck Normalization and Min-Max Normalization.
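The two normalizations can be sketched as follows. This is a minimal illustration, not the repository's code: the joint indices for the navel and neck, and the per-recording scope of the min-max scaling, are assumptions.

```python
import numpy as np

def navel_neck_normalize(skel, navel_idx=0, neck_idx=1):
    """Center each frame on the navel joint and scale by the navel-neck
    distance. The joint indices here are illustrative, not the dataset's."""
    # skel: (frames, joints, 3)
    navel = skel[:, navel_idx:navel_idx + 1, :]                   # (frames, 1, 3)
    neck = skel[:, neck_idx:neck_idx + 1, :]
    scale = np.linalg.norm(neck - navel, axis=-1, keepdims=True)  # (frames, 1, 1)
    return (skel - navel) / scale

def min_max_normalize(skel):
    """Rescale all coordinates of one recording into [0, 1]."""
    lo, hi = skel.min(), skel.max()
    return (skel - lo) / (hi - lo)

skel = np.random.rand(130, 11, 3)   # 130 frames, 11 joints, xyz
norm = min_max_normalize(navel_neck_normalize(skel))
```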

(Figures: raw data and normalized data)

In the case of Single Data, a recording is merged with another (or the same) participant's data to produce the same form as Pair Data.
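A minimal sketch of the merging step, under the assumption that two single-user skeleton sequences are aligned to the shorter recording and stacked along the joint axis to mimic a two-person sample:

```python
import numpy as np

def merge_singles(a, b):
    """Combine two single-user recordings of shape (frames, joints, 3)
    into one pair-style sample. Concatenating along the joint axis and
    truncating to the shorter recording are assumptions, not the repo's code."""
    t = min(len(a), len(b))                         # align lengths
    return np.concatenate([a[:t], b[:t]], axis=1)   # (t, 2*joints, 3)

pair = merge_singles(np.zeros((140, 11, 3)), np.ones((130, 11, 3)))
```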

(Figures: Single Data and the resulting Merged Data)

Then, time windows of 130 skeleton frames are generated and three datasets are obtained:

  • Grouped Dataset: generated from two Single Data recordings of different participants
  • Single Grouped Dataset: generated from two Single Data recordings of the same participant
  • Paired Dataset: generated from Pair Data, serving as the benchmark
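The windowing step above can be sketched as a simple slicing routine. The 130-frame window size comes from the text; the stride (here, non-overlapping windows) is an assumption.

```python
import numpy as np

def time_windows(seq, size=130, stride=130):
    """Slice a recording of shape (frames, joints, 3) into fixed-size
    windows. A non-overlapping stride is an assumption; the README only
    specifies 130-frame windows."""
    return np.stack([seq[s:s + size]
                     for s in range(0, len(seq) - size + 1, stride)])

windows = time_windows(np.zeros((400, 22, 3)))   # 400-frame merged recording
```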

Design of Experiments

Four experiments are conducted to validate the hypothesis using the three datasets:

  • Grouped-Grouped Experiment: validates whether the Grouped Dataset is learnable by a deep learning model
  • Paired-Paired Experiment: provides the benchmark, with the model trained on the Paired Dataset
  • Grouped-Paired Experiment: validates whether the Grouped Dataset can replace the Paired Dataset
  • Single Grouped-Paired Experiment: validates whether the Single Grouped Dataset can replace the Paired Dataset

The first term represents the training dataset and the second term represents the testing dataset. For example, in Grouped-Paired Experiment, the model is trained on Grouped Dataset and tested on Paired Dataset.
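The train/test pairing described above can be written down as a small lookup table. This is a hypothetical sketch; the dataset keys and function names are not from the repository.

```python
# Hypothetical experiment table: first name = training set, second = test set.
EXPERIMENTS = {
    "Grouped-Grouped":       ("grouped", "grouped"),
    "Paired-Paired":         ("paired", "paired"),
    "Grouped-Paired":        ("grouped", "paired"),
    "Single Grouped-Paired": ("single_grouped", "paired"),
}

def run_experiment(name, datasets, train_fn, eval_fn):
    """Train on the first dataset of the pair and evaluate on the second."""
    train_key, test_key = EXPERIMENTS[name]
    model = train_fn(datasets[train_key])
    return eval_fn(model, datasets[test_key])
```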

Implementation of Deep Learning Model

Long Short-Term Memory (LSTM)

The LSTM is trained on all four experiments using supervised learning.
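A minimal PyTorch sketch of such a classifier, assuming 130-frame windows of two merged 11-joint skeletons and 9 output classes. The layer sizes are illustrative; the project's actual hyperparameters live in config/.

```python
import torch
import torch.nn as nn

class PairActivityLSTM(nn.Module):
    """Illustrative LSTM classifier for two-person skeleton windows.
    Hidden size and the single-layer design are assumptions."""
    def __init__(self, n_joints=22, hidden=128, n_classes=9):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_joints * 3, hidden_size=hidden,
                            batch_first=True)
        self.head = nn.Linear(hidden, n_classes)   # 9 = 3 tasks x 3 tasks

    def forward(self, x):            # x: (batch, 130, joints, 3)
        x = x.flatten(2)             # (batch, 130, joints * 3)
        _, (h, _) = self.lstm(x)     # h: (num_layers, batch, hidden)
        return self.head(h[-1])      # logits: (batch, n_classes)

logits = PairActivityLSTM()(torch.zeros(4, 130, 22, 3))
```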

Variational Autoencoder (VAE) with Spatio-Temporal Graph Convolutional Networks (ST-GCN)

The VAE with ST-GCN is trained on the four experiments using transfer learning. The first image shows the architecture of the model trained by semi-supervised learning. The red blocks are then extracted to reuse the knowledge gained from this first training.

The extracted blocks are connected to a dense layer for classification, and this model is trained by supervised learning.
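The transfer step can be sketched in PyTorch as: freeze a pretrained encoder (standing in for the extracted red blocks) and train only a new dense head on top. The stand-in encoder below is a plain linear layer, not an ST-GCN; it only illustrates the freezing and head-attachment pattern.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the pretrained ST-GCN encoder (the "red blocks").
encoder = nn.Sequential(nn.Flatten(), nn.Linear(130 * 22 * 3, 64))

for p in encoder.parameters():   # freeze pretrained weights so only
    p.requires_grad = False      # the new dense head is trained

classifier = nn.Sequential(encoder, nn.Linear(64, 9))  # dense head -> 9 classes
out = classifier(torch.zeros(2, 130, 22, 3))
```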

Results

Result of LSTM

In the LSTM experiments, the dataset generated by the proposed method (Grouped) yields scores similar to the benchmark, indicating that the Grouped Dataset can replace the Paired Dataset. The performance on the Single Grouped Dataset is weaker, but the model still learned something, as it classifies about half of the samples correctly.

Result of VAE with ST-GCN

The VAE with ST-GCN experiments show a similar result, with the Grouped Dataset even outperforming the Paired Dataset. This suggests that the Grouped Dataset is at least as good for training the model as the Paired Dataset. The Single Grouped Dataset still shows unsatisfactory results.

About

Multi-User Activities Recognition for Human-Robot Collaboration: a third-year project at the University of Manchester.
