# Object Detection on BDD100K Dataset
This repository contains the code for analyzing the BDD100K object detection dataset and training a detection model on it. The code is containerized with Docker to ensure it can run on any machine.

## Prerequisites
- Docker installed on your machine. You can download it from the [Docker website](https://www.docker.com/).
- **IMPORTANT**: Make sure the BDD data is stored in a folder named `assignment_data_bdd` in the same directory.
## Setup

- Clone the repository:

  ```bash
  git clone <repository-url>
  cd <repository-directory>
  ```

- Build the Docker image:

  ```bash
  docker build -t bdd-object-detection .
  ```

- Run the Docker container:

  ```bash
  docker run -p 8888:8888 -v $(pwd):/app bdd-object-detection
  ```
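Port `8888` is published, so if the container starts a Jupyter server it should be reachable at `http://localhost:8888`. For orientation, a plausible working-directory layout is sketched below; the internal structure of `assignment_data_bdd` is an assumption and should be adapted to your copy of the dataset:

```text
.
├── assignment_data_bdd/     # BDD100K images and label JSONs (layout assumed)
├── bdd_object_detection/    # data.py, model.py, losses.py, train_model.py
├── result_analysis.ipynb    # result analysis notebook
├── Dockerfile
└── requirements.txt
```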
## Data Flow

The data flow within this project is structured as follows:
- **Dataset**: The BDD100K dataset is expected to be located in the `assignment_data_bdd` folder, structured with images and labels.
- **Data Loading (`data.py`)**:
  - The `BDDObjectDetectionDataset` class handles loading and preprocessing the BDD100K dataset.
  - It supports splitting the data into training and validation sets, specified during dataset initialization.
  - Annotations are loaded from JSON files and cached as Parquet files for faster loading in subsequent runs.
  - The dataset class provides methods to access images and their corresponding bounding box annotations.
- **Data Transformation (`data.py`)**:
  - The `__getitem__` method retrieves an image and its target (bounding boxes and labels).
  - The `custom_collate_fn` function collates batches of data, converting PIL Images to tensors (see the sketch after this list).
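For orientation, a minimal sketch of how these pieces could be wired together; the `BDDObjectDetectionDataset` constructor arguments shown (`data_dir`, `split`) are hypothetical, not the verbatim signature:

```python
from torch.utils.data import DataLoader

from bdd_object_detection.data import BDDObjectDetectionDataset, custom_collate_fn

# Hypothetical arguments: the real constructor may name these differently.
train_dataset = BDDObjectDetectionDataset(
    data_dir="assignment_data_bdd",  # root folder holding images and labels
    split="train",                   # train/val split chosen at initialization
)

# custom_collate_fn handles variable numbers of boxes per image and
# converts PIL Images to tensors, as described above.
train_loader = DataLoader(
    train_dataset,
    batch_size=4,
    shuffle=True,
    collate_fn=custom_collate_fn,
)

# Assuming the collate function returns (images, targets) per batch.
images, targets = next(iter(train_loader))
```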
## Training

The training process is managed by `train_model.py`. Here's a breakdown of the key steps:
- **Data Loading (`train_model.py`)**:
  - The `train_function` initializes the training and validation datasets using `BDDObjectDetectionDataset`.
  - `DataLoader` is used to create iterable data loaders for the training and validation sets, using `custom_collate_fn` to handle batching.
- **Model Definition (`model.py`)**:
  - The `ObjectDetectionModel` class defines the object detection model as a PyTorch Lightning module.
  - It uses a pre-trained Faster R-CNN model with a ResNet-50 backbone, obtained from `torchvision.models`.
  - The `get_pretrained_model` function configures the model, replacing the classifier head with a new one suitable for the BDD100K dataset (10 object classes + background); a sketch follows this list.
  - The backbone's pre-trained weights are frozen during initial training to stabilize training and leverage the pre-trained features.
- **Training Loop (`train_model.py` and `model.py`)**:
  - The `train_function` sets up the training and validation data loaders, the model, and the PyTorch Lightning trainer.
  - **Loss Function (`losses.py`)**: The `ObjectDetectionLoss` class calculates the combined loss (classification and regression) for the object detection task. It uses Focal Loss for classification and Smooth L1 Loss for bounding box regression.
  - **Optimizer (`model.py`)**: The Adam optimizer is used with a learning rate of 1e-3.
  - **Callbacks (`train_model.py`)**:
    - `ModelCheckpoint`: Saves the best model based on validation loss. Checkpoints are saved in the `checkpoints/` directory, and the filename includes the epoch number and validation loss.
    - `EarlyStopping`: Stops training when the validation loss stops improving, with a patience of 10 epochs.
  - **Logging (`train_model.py`)**: TensorBoard is used for logging metrics during training. Logs are stored in the `lightning_logs` directory.
  - The `training_step` and `validation_step` methods in `ObjectDetectionModel` define the training and validation logic, respectively.
- **Running Training (`train_model.py`)**: To start the training, execute the `train_model.py` script:

  ```bash
  python bdd_object_detection/train_model.py
  ```
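For orientation, a minimal sketch of this setup, assuming recent `torchvision` and `pytorch_lightning` versions; the `val_loss` metric name, the `max_epochs` value, and the exact body of `get_pretrained_model` are assumptions based on the description above, not the repository's verbatim code:

```python
import pytorch_lightning as pl
import torchvision
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 11  # 10 BDD100K object classes + background


def get_pretrained_model(num_classes: int = NUM_CLASSES):
    """Faster R-CNN with a ResNet-50 FPN backbone and a replaced classifier head."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    # Freeze the pre-trained backbone for the initial training phase.
    for param in model.backbone.parameters():
        param.requires_grad = False
    return model


callbacks = [
    # Keep the best model by validation loss; filename carries epoch and val_loss.
    ModelCheckpoint(dirpath="checkpoints", monitor="val_loss",
                    filename="{epoch}-{val_loss:.3f}"),
    # Stop when validation loss stops improving for 10 epochs.
    EarlyStopping(monitor="val_loss", patience=10),
]

# Lightning logs to ./lightning_logs with TensorBoard by default;
# view the logs with: tensorboard --logdir lightning_logs
trainer = pl.Trainer(max_epochs=50, callbacks=callbacks)  # max_epochs assumed
# trainer.fit(model, train_loader, val_loader)  # model wraps get_pretrained_model()
```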
## Result Analysis

The `result_analysis.ipynb` notebook provides a comprehensive analysis of the model's performance. Here's a breakdown of the key steps:
- **Data Loading (`result_analysis.ipynb`)**:
  - Prediction and ground truth data are loaded from Parquet files (`bdd100k_val_cache_predictions.parquet` and `bdd100k_val_cache.parquet`, respectively).
  - These files should contain the bounding box predictions and ground truth annotations for the validation set.
- **Metric Calculation (`losses.py` and `result_analysis.ipynb`)**:
  - The `calculate_metrics` function in `losses.py` calculates object detection metrics such as Precision, Recall, and F1-score based on the Intersection over Union (IoU) between predicted and ground truth bounding boxes (see the sketch after this list).
  - The notebook iterates through different score thresholds and IoU thresholds to evaluate the model's performance under various conditions.
  - Metrics are calculated for each category and for all categories combined.
- **Visualization (`result_analysis.ipynb`)**:
  - The notebook generates various plots to visualize the model's performance:
    - Precision-Recall curves: precision vs. recall for different IoU thresholds.
    - F1-score curves: F1-score vs. score threshold for different IoU thresholds.
    - Bar charts: precision, recall, and F1-score compared across categories at a fixed IoU threshold.
    - AP (Average Precision) analysis: AP, AP50, AP75, and AR (Average Recall) metrics, including plots for individual classes and for overall performance.
  - The plots help identify the optimal score threshold for each category and highlight the model's strengths and weaknesses.
- **Max F1 Score Analysis (`result_analysis.ipynb`)**:
  - The notebook identifies the score threshold at which the F1-score is maximized for each category.
  - This allows setting a working point at which the model performs best, balancing precision and recall.
- **Average Precision Analysis (`result_analysis.ipynb`)**:
  - The notebook analyzes the Average Precision (AP) for each class using pre-computed results from a RetinaNet model.
  - It generates bar charts to visualize the AP for different classes and IoU thresholds.
- **Running Analysis (`result_analysis.ipynb`)**: To run the analysis, execute the `result_analysis.ipynb` notebook in a Jupyter environment.
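To make the metric definitions concrete, below is an illustrative, self-contained sketch of IoU-based matching and the max-F1 threshold search; it is not the repository's `calculate_metrics` implementation (whose signature and Parquet schema are not shown in this README), just a toy version of the same idea:

```python
import numpy as np


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def metrics_at(preds, gts, score_thr, iou_thr=0.5):
    """Precision/recall/F1 with greedy one-to-one matching.

    preds: list of (box, score); gts: list of boxes.
    """
    kept = sorted((p for p in preds if p[1] >= score_thr),
                  key=lambda p: p[1], reverse=True)
    matched, tp = set(), 0
    for box, _ in kept:
        candidates = [(iou(box, g), j) for j, g in enumerate(gts) if j not in matched]
        best_iou, best_j = max(candidates, default=(0.0, -1))
        if best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp, fn = len(kept) - tp, len(gts) - tp
    precision = tp / (tp + fp + 1e-9)
    recall = tp / (tp + fn + 1e-9)
    f1 = 2 * precision * recall / (precision + recall + 1e-9)
    return precision, recall, f1


# In the notebook the inputs come from the Parquet caches, e.g.
#   pd.read_parquet("bdd100k_val_cache_predictions.parquet")
# Here, a toy example: two predictions against one ground-truth box.
gts = [(10, 10, 50, 50)]
preds = [((12, 11, 52, 49), 0.9), ((100, 100, 120, 120), 0.4)]

# Sweep score thresholds at IoU 0.5 and pick the max-F1 "working point".
thresholds = np.linspace(0.05, 0.95, 19)
f1s = [metrics_at(preds, gts, t)[2] for t in thresholds]
best = thresholds[int(np.argmax(f1s))]
print(f"max F1 = {max(f1s):.2f} at score threshold {best:.2f}")
```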
- The `Dockerfile` sets up the environment and installs all necessary dependencies specified in `requirements.txt`.
- This `README.md` provides instructions on how to build and run the Docker container, train the model, and analyze the results.