
Object Detection Training Guide

This guide explains how to train a Faster R-CNN object detection model using datasets prepared with the image-classification annotation tool.

Table of Contents

  • Overview
  • Prerequisites
  • Dataset Preparation
  • Training
  • Inference
  • Model Export
  • Configuration
  • Evaluation Metrics
  • Troubleshooting
  • Example Workflow
  • Additional Resources

Overview

This object detection system uses:

  • Model: Faster R-CNN with ResNet50 backbone (trained from scratch)
  • Dataset Format: YOLO format annotations (created by image-classification app)
  • Framework: PyTorch with torchvision
  • Metrics: mAP, IoU, Precision, Recall

Prerequisites

1. Install Dependencies

# For GPU training
pip install -r torch_requirements.txt

# For CPU training
pip install -r torch_requirements_cpu.txt

2. Verify Installation

python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torchvision; print(f'TorchVision: {torchvision.__version__}')"
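Before starting a long training run, it is also worth confirming which device PyTorch will actually use; a minimal check:

```python
import torch

# Select the training device: a CUDA GPU if one is visible, otherwise the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Training device: {device}')
if device.type == 'cuda':
    print(f'GPU: {torch.cuda.get_device_name(0)}')
```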

Dataset Preparation

Using the Image Classification App

  1. Navigate to the image-classification directory:

    cd ../image-classification
  2. Build and run the annotation tool:

    ./build.sh
    ./run_app.sh
  3. Annotate your images:

    • Select "Object Detection" mode
    • Click "Open Folder" and select your images
    • Create labels (e.g., "fireball", "meteor", "satellite")
    • Draw bounding boxes around objects
    • Assign labels to each box
    • Click "Save & Next" to save the annotations and advance to the next image
  4. Output structure:

    annotated_images/
    ├── images/           # Copied original images
    │   ├── image001.jpg
    │   ├── image002.jpg
    │   └── ...
    ├── labels/           # YOLO format annotations
    │   ├── image001.txt
    │   ├── image002.txt
    │   └── ...
    └── classes.txt       # Class names (one per line)
    
  5. YOLO Format (in each .txt file):

    <class_id> <x_center> <y_center> <width> <height>
    

    All coordinates are normalized to the range 0-1, relative to the image width and height.
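For scripting or debugging, a label line can be converted back to pixel coordinates. A minimal sketch (the helper name and the (x1, y1, x2, y2) output convention are illustrative, not part of the training code):

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels."""
    class_id, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    # YOLO stores the box center; the corners lie half a width/height away
    return int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2

# Example: a centered box covering half of a 640x480 image
print(yolo_to_pixels("0 0.5 0.5 0.5 0.5", 640, 480))  # → (0, 160.0, 120.0, 480.0, 360.0)
```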

Move Dataset to Training Directory

# Copy annotated_images to fireball-detector directory
cp -r annotated_images ../fireball-detector/

Training

Basic Training

cd fireball-detector

# Train with default settings
python -m src.train_object_detection

# Or specify the dataset path
python -m src.train_object_detection annotated_images

Advanced Training Options

python -m src.train_object_detection \
    --num_epochs 50 \
    --batch_size 4 \
    --learning_rate 0.001 \
    --save_dir checkpoints

Resume Training

python -m src.train_object_detection \
    --resume_from checkpoints/checkpoint_epoch_20.pth

Training Output

The training script will:

  • Create a checkpoints/ directory
  • Save checkpoints every N epochs (configurable)
  • Save the best model based on validation mAP
  • Print training statistics and validation metrics

Example output:

Epoch 1/50
--------------------------------------------------------------------------------
Epoch [1], Step [10/100], Loss: 2.3456, Time: 5.23s
...
Epoch 1 Training Summary:
  Total Loss: 2.1234
  Classifier Loss: 0.5432
  Box Reg Loss: 0.3210
  Objectness Loss: 0.8765
  RPN Box Reg Loss: 0.3827
  Learning Rate: 0.001000

Running validation...
Validation mAP: 0.4523
AP per class:
  Class 0: 0.4321
  Class 1: 0.4725

New best model saved: checkpoints/faster_rcnn_best_bs4_ne50.pth (mAP: 0.4523)

Inference

Detect Objects in a Single Image

python -m src.detect \
    --model checkpoints/faster_rcnn_best_bs4_ne50.pth \
    --image path/to/test_image.jpg \
    --show

Detect Objects in a Directory

python -m src.detect \
    --model checkpoints/faster_rcnn_best_bs4_ne50.pth \
    --input_dir path/to/test_images/ \
    --output_dir detections/

Adjust Detection Threshold

python -m src.detect \
    --model checkpoints/faster_rcnn_best_bs4_ne50.pth \
    --image test.jpg \
    --score_threshold 0.7

Detection Output

The script will:

  • Print detected objects with confidence scores
  • Save visualizations with bounding boxes and labels
  • Display images if --show flag is used

Example output:

Detections for test_image.jpg:
Found 3 objects
  1. fireball: 0.923 at [120.5, 85.3, 245.7, 198.2]
  2. meteor: 0.856 at [350.1, 120.8, 420.3, 180.5]
  3. satellite: 0.734 at [500.2, 300.1, 550.8, 340.6]
Saved visualization to: detections/detected_test_image.jpg

Model Export

Export to ONNX and TorchScript

python -m src.detect \
    --model checkpoints/faster_rcnn_best_bs4_ne50.pth \
    --export \
    --export_dir exports/

This creates:

  • exports/faster_rcnn_model.onnx - ONNX format
  • exports/faster_rcnn_model.pt - TorchScript format

Use Exported Models

ONNX:

import onnxruntime as ort
import numpy as np

session = ort.InferenceSession('exports/faster_rcnn_model.onnx')
# ONNX Runtime expects a float32 NumPy array; look up the input name
# rather than hard-coding it
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: np.asarray(image_tensor, dtype=np.float32)})

TorchScript:

import torch

model = torch.jit.load('exports/faster_rcnn_model.pt')
model.eval()  # inference mode
outputs = model(image_tensor)

Configuration

All configuration parameters are in src/config.py. Key settings:

Training Hyperparameters

OD_NUM_EPOCHS = 50              # Number of training epochs
OD_BATCH_SIZE = 4               # Batch size (reduce if out of memory)
OD_LEARNING_RATE = 0.001        # Initial learning rate
OD_MOMENTUM = 0.9               # SGD momentum
OD_WEIGHT_DECAY = 0.0005        # Weight decay for regularization

Model Settings

OD_BACKBONE = 'resnet50'        # Backbone: resnet50, resnet101, mobilenet_v3
OD_TRAINABLE_BACKBONE_LAYERS = 3  # Number of trainable layers (0-5)
OD_MIN_SIZE = 800               # Min image size
OD_MAX_SIZE = 1333              # Max image size

Detection Settings

OD_BOX_SCORE_THRESH = 0.05      # Score threshold for predictions
OD_BOX_NMS_THRESH = 0.5         # NMS threshold
OD_BOX_DETECTIONS_PER_IMG = 100 # Max detections per image
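OD_BOX_SCORE_THRESH and OD_BOX_NMS_THRESH work together: low-confidence boxes are dropped first, then near-duplicate boxes are suppressed by IoU. A pure-Python sketch of that filtering logic (illustrative only; the real model uses torchvision's batched implementation):

```python
def iou(a, b):
    """IoU of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def filter_detections(boxes, scores, score_thresh=0.05, nms_thresh=0.5):
    # 1) drop low-confidence boxes; 2) greedy NMS on the rest, highest score first
    keep = []
    candidates = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i], reverse=True)
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) < nms_thresh for j in keep):
            keep.append(i)
    return keep
```

For example, two heavily overlapping boxes keep only the higher-scoring one, while a distant box survives untouched.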

Evaluation Settings

OD_IOU_THRESHOLD = 0.5          # IoU threshold for metrics
OD_VAL_SPLIT = 0.2              # Validation split ratio

Evaluation Metrics

mAP (mean Average Precision)

The primary metric for object detection. Measures how well the model detects and classifies objects.

  • Range: 0.0 to 1.0 (higher is better)
  • Good mAP: > 0.5 for most applications
  • Excellent mAP: > 0.7
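The reported mAP is simply the mean of the per-class AP values printed during validation (using the per-class figures from the example below):

```python
# Per-class AP values as reported in the validation output
ap_per_class = {'fireball': 0.8234, 'meteor': 0.7123, 'satellite': 0.6543}
map_score = sum(ap_per_class.values()) / len(ap_per_class)
print(round(map_score, 4))  # → 0.73
```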

IoU (Intersection over Union)

Measures overlap between predicted and ground truth boxes.

  • Formula: IoU = (Area of Overlap) / (Area of Union)
  • Threshold: Typically 0.5 (configurable)
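The formula translates directly into code; a minimal sketch for axis-aligned (x1, y1, x2, y2) boxes:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned (x1, y1, x2, y2) boxes."""
    # Intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # → 0.14285714285714285 (1/7)
```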

Precision and Recall

  • Precision: What fraction of detections are correct?
  • Recall: What fraction of ground truth objects are detected?
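In terms of counts over a test set, with TP = correct detections, FP = spurious detections, and FN = missed ground-truth objects:

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# 8 correct detections, 2 false alarms, 4 missed objects
print(precision_recall(8, 2, 4))  # → (0.8, 0.6666666666666666)
```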

Per-Class Metrics

The validation output shows metrics for each class:

AP per class:
  Class 0 (fireball): 0.8234
  Class 1 (meteor): 0.7123
  Class 2 (satellite): 0.6543

Troubleshooting

Out of Memory Error

Reduce batch size in config:

OD_BATCH_SIZE = 2  # or even 1

Low mAP

  1. More training epochs: Increase OD_NUM_EPOCHS
  2. More data: Annotate more images
  3. Better annotations: Ensure bounding boxes are accurate
  4. Adjust learning rate: Try OD_LEARNING_RATE = 0.0005
  5. More trainable layers: Increase OD_TRAINABLE_BACKBONE_LAYERS

Slow Training

  1. Use GPU: Ensure CUDA is available
  2. Reduce image size: Decrease OD_MIN_SIZE and OD_MAX_SIZE
  3. Fewer workers: Reduce OD_NUM_WORKERS

No Detections

  1. Lower threshold: Use --score_threshold 0.3 or lower
  2. Check model: Ensure model is trained properly
  3. Verify classes: Ensure classes.txt matches training data

Example Workflow

Complete End-to-End Example

# 1. Prepare dataset with annotation tool
cd image-classification
./run_app.sh
# Annotate images in Object Detection mode

# 2. Copy dataset
cp -r annotated_images ../fireball-detector/

# 3. Train model
cd ../fireball-detector
python -m src.train_object_detection \
    --num_epochs 50 \
    --batch_size 4

# 4. Run inference
python -m src.detect \
    --model checkpoints/faster_rcnn_best_bs4_ne50.pth \
    --input_dir test_images/ \
    --output_dir results/

# 5. Export model
python -m src.detect \
    --model checkpoints/faster_rcnn_best_bs4_ne50.pth \
    --export

Additional Resources

Support

For issues or questions:

  1. Check the troubleshooting section above
  2. Review the configuration in src/config.py
  3. Examine training logs for errors
  4. Verify dataset format and annotations