Video Classification with UCF50

This project implements a video classification pipeline using the UCF50 dataset. It leverages a Long-term Recurrent Convolutional Network (LRCN) model that extracts spatial features from individual video frames via a ResNet backbone and learns temporal dynamics through an LSTM. The project includes scripts for preprocessing, training, and testing the model.

Dataset Preparation

Step 0: Download and Unzip Dataset

Download Dataset:
Download the UCF50 dataset from here. This dataset contains videos of 50 different human action classes.
Unzip and Organize:
Unzip the downloaded dataset. The expected folder structure should be as follows:
```
 - UCF50
     - Action_Class1
     - Action_Class2
     ... ... ... ...
     - Action_Class50
```

Each subdirectory represents a different action class.

Environment Setup

Python Version:
This project requires Python 3.7 or higher.
Dependencies:
Install the required Python packages by running:

pip install -r requirements.txt

Key Libraries

PyTorch
torchvision
OpenCV
scikit-learn
tqdm
numpy
Pillow

Hardware Requirements

A CUDA-enabled GPU is recommended for training. The code automatically detects GPU availability.

Preprocessing and Frame Extraction

Before training, the raw video files must be converted into frame sequences. The preprocessing module includes functions for:

Uniform Frame Sampling

The get_frames function uses OpenCV to sample a fixed number of frames per video.

Saving Frames to Disk

The store_frames function writes the extracted frames as JPEG images.

To run the preprocessing script, call the following in the Command Line:

python preprocess.py

Training the Model

Step 1: Run Training

Configure Training Parameters

The training is managed via a bash script (e.g., train.sh) that calls the main training module.
Important: Update the --frame_dir argument in the script to point to the directory where your preprocessed frame data is stored. You can also adjust other parameters (e.g., number of frames per video, batch size, learning rate) to see how they affect the experiment.

Run the Training Script

Execute the training script from your terminal:

python run.py --frame_dir UCF50 --train_size 0.75 --test_size 0.15 --model_type lrcn --n_classes 50 --fr_per_vid 16 --batch_size 4 --mode 'train' | Out-File train_log.txt

or

bash train.sh

During Training, the Script Will:

Load the frame dataset.
Split the dataset into training, validation, and test sets using stratified sampling.
Apply data augmentation techniques (resizing, random flips, affine transformations).
Create custom PyTorch Datasets and DataLoaders.
Initialize the LRCN model using a specified ResNet backbone.
Set up the loss function, optimizer, and learning rate scheduler.
Run the training loop while tracking loss and accuracy, saving the best model weights.

Testing and Evaluation

Step 2: Run Testing

Configure Testing Parameters:
Update the --ckpt argument in your testing script (e.g., test.sh) to point to the saved best model weights generated during training.
Run the Testing Script:
Execute the testing script from your terminal:

python run.py --ckpt models\best_model_wts.pt --model_type lrcn --n_classes 50 --model_type lrcn --batch_size 4 --mode eval | Out-File test_log.txt

or

bash test.sh

Testing Script Overview

The testing script will:

Load the dataset splits (previously saved during training).
Create a DataLoader for the test set.
Load the trained model checkpoint.
Evaluate the model on the test data by computing overall accuracy, generating classification reports, and optionally producing confusion matrices.

Customization and Hyperparameters

You can modify several parameters to experiment with different settings:

Data Parameters

--frame_dir: Path to your preprocessed frames.
--fr_per_vid: Number of frames to sample per video.

Model Parameters

--model_type: Choose between 'lrcn' (default) or other supported models.
--cnn_backbone: Options include resnet18, resnet34, resnet50, resnet101, or resnet152.
--rnn_hidden_size and --rnn_n_layers: Configure the LSTM network.

Training Parameters

--batch_size, --learning_rate, --n_epochs, and --dropout control the training dynamics.
--train_size and --test_size determine dataset splits.

By tweaking these parameters, you can study their impact on model performance and experiment with different network configurations.

Summary of Steps

Step 0: Dataset Preparation
Download, unzip, and organize the UCF50 dataset into subdirectories by action class. The preprocess.py script will organize the subdirectories for you.
Step 1: Run Training
Execute train.sh after configuring the --frame_dir and other hyperparameters to train the model.
Step 2: Run Testing
Execute test.sh after updating the --ckpt argument to point to the best model checkpoint to evaluate the model.

Happy Training!

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
experiments.pdf		experiments.pdf
models.py		models.py
preprocess.py		preprocess.py
requirements.txt		requirements.txt
run.py		run.py
run_training.py		run_training.py
test.py		test.py
test.sh		test.sh
train.py		train.py
train.sh		train.sh
utils.py		utils.py
video_datasets.py		video_datasets.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Video Classification with UCF50

Table of Contents

Dataset Preparation

Step 0: Download and Unzip Dataset

Environment Setup

Key Libraries

Hardware Requirements

Preprocessing and Frame Extraction

Uniform Frame Sampling

Saving Frames to Disk

Training the Model

Step 1: Run Training

Configure Training Parameters

Run the Training Script

During Training, the Script Will:

Testing and Evaluation

Step 2: Run Testing

Testing Script Overview

Customization and Hyperparameters

Data Parameters

Model Parameters

Training Parameters

Summary of Steps

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

dna-witch/video-classification-lrcn

Folders and files

Latest commit

History

Repository files navigation

Video Classification with UCF50

Table of Contents

Dataset Preparation

Step 0: Download and Unzip Dataset

Environment Setup

Key Libraries

Hardware Requirements

Preprocessing and Frame Extraction

Uniform Frame Sampling

Saving Frames to Disk

Training the Model

Step 1: Run Training

Configure Training Parameters

Run the Training Script

During Training, the Script Will:

Testing and Evaluation

Step 2: Run Testing

Testing Script Overview

Customization and Hyperparameters

Data Parameters

Model Parameters

Training Parameters

Summary of Steps

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages