Algorithms
- Convolutional Neural Network (CNN)
- Core technique for image classification and the backbone of the system; CNNs are well suited to image data because they learn spatial feature hierarchies directly from pixels.
- Architecture (a minimal code sketch follows this sub-list):
- Input layer accepts 32x32x3 RGB images.
- Multiple convolutional layers for hierarchical feature extraction:
- First conv layer has 32 filters of size 3x3 to extract low-level features such as edges, colors, and gradients.
- Second conv layer has 64 filters to extract higher-level features such as shapes and textures.
- Convolution operation captures spatial relationships between pixels in the image.
- Each convolution is applied to the input image (first layer) or to the previous layer's output.
- Max pooling layers to progressively reduce spatial dimensions:
- Max pooling with a 2x2 window is applied after specific conv layers.
- Takes the maximum value in each 2x2 window to downsample feature maps.
- Reduces computation and parameter count, which helps avoid overfitting.
- Fully connected layers for high-level reasoning:
- Flatten layer converts the 2D feature maps into a 1D vector.
- Dense layer with 128 units learns non-linear combinations of the extracted features.
- Output layer with 43 units and softmax activation to predict class probabilities.
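A minimal Keras sketch of the architecture described above. The layer sizes (32 and 64 filters, a 128-unit dense layer, 43-class softmax output) follow the bullets; the padding, activations, and other API details are assumptions rather than the project's exact code.

```python
from tensorflow.keras import layers, models

def build_cnn(input_shape=(32, 32, 3), num_classes=43):
    """Two conv blocks with 2x2 max pooling, then a dense head with a softmax output."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),  # low-level features: edges, colors, gradients
        layers.MaxPooling2D((2, 2)),                                   # downsample feature maps
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),  # higher-level features: shapes, textures
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),                                              # 2D feature maps -> 1D vector
        layers.Dense(128, activation="relu"),                          # non-linear feature combinations
        layers.Dense(num_classes, activation="softmax"),               # 43 class probabilities
    ])
```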
- Training:
- Adam optimizer used for efficient training with adaptive learning rates for each parameter.
- Categorical cross-entropy loss function to quantify difference between target and predicted class probabilities.
- Accuracy metric on test set used to evaluate real-world performance of the model.
- Training stopped after 5 epochs via early stopping to prevent overfitting (see the training sketch below).
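A training sketch under the same assumptions. `x_train`, `y_train`, `x_test`, and `y_test` are placeholder arrays of preprocessed images and one-hot labels; the patience, batch size, and validation split are illustrative values, not the project's exact settings.

```python
from tensorflow.keras.callbacks import EarlyStopping

model = build_cnn()
model.compile(optimizer="adam",                     # adaptive per-parameter learning rates
              loss="categorical_crossentropy",      # gap between target and predicted class probabilities
              metrics=["accuracy"])

# Halt training once validation loss stops improving (training ended around epoch 5 here).
early_stop = EarlyStopping(monitor="val_loss", patience=2, restore_best_weights=True)

history = model.fit(x_train, y_train,
                    validation_split=0.2,
                    epochs=30,
                    batch_size=64,
                    callbacks=[early_stop])

test_loss, test_acc = model.evaluate(x_test, y_test)  # accuracy on the unseen test set
```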
- Image Preprocessing
- Resizes all images to 32x32 pixels to standardize input dimensions for the CNN model.
- Scales pixel values to the 0-1 range by dividing by 255 so inputs are on a consistent scale for stable training.
- Augments the dataset with rotations, shifts, and zooms to increase image diversity for better generalization (sketched below).
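A preprocessing and augmentation sketch. OpenCV and Keras' `ImageDataGenerator` are assumed tooling choices, and the augmentation ranges are illustrative.

```python
import cv2
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def preprocess(images):
    """Resize each image to 32x32 and scale pixel values into the 0-1 range."""
    resized = np.array([cv2.resize(img, (32, 32)) for img in images])
    return resized.astype("float32") / 255.0

# Random rotations, shifts, and zooms expand the diversity of the training images.
augmenter = ImageDataGenerator(rotation_range=10,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               zoom_range=0.15)
# train_batches = augmenter.flow(x_train, y_train, batch_size=64)
```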
- Transfer Learning
- Leverages knowledge from a CNN like InceptionV3 pretrained on large datasets like ImageNet.
- Allows model to converge faster with fewer training samples and epochs.
- Provides much better weight initialization for the new task than random initialization (see the transfer-learning sketch below).
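A transfer-learning sketch with a frozen InceptionV3 base. Keras' InceptionV3 expects inputs of at least 75x75, so sign images would be resized up for this path; the 96x96 input size and the dense head are assumptions.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

base = InceptionV3(weights="imagenet", include_top=False, input_shape=(96, 96, 3))
base.trainable = False  # keep the pretrained ImageNet weights fixed

transfer_model = models.Sequential([
    base,                                     # pretrained feature extractor
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(43, activation="softmax"),   # traffic-sign classes
])
```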
- Object Detection
- Scans the input image for potential traffic sign regions using a sliding window approach.
- Extracts sign candidates using techniques such as Haar cascades or HOG feature matching.
- Applies the trained CNN to the extracted regions for accurate classification (a sliding-window sketch follows).
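A simplified sliding-window sketch; the window size, stride, and confidence threshold are assumptions, and a real pipeline would add Haar/HOG candidate filtering and non-maximum suppression on top of it. It reuses the hypothetical `preprocess` helper from the preprocessing sketch.

```python
import numpy as np

def sliding_window_detect(image, model, win=64, stride=32, threshold=0.9):
    """Classify each fixed-size crop of the image and keep confident sign detections."""
    detections = []
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            crop = preprocess([image[y:y + win, x:x + win]])  # resize to 32x32, scale to 0-1
            probs = model.predict(crop, verbose=0)[0]
            cls, conf = int(np.argmax(probs)), float(np.max(probs))
            if conf >= threshold:
                detections.append((x, y, win, win, cls, conf))
    return detections
```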
- Optimization Algorithms
- Adam optimizer for efficient training with per-parameter adaptive learning rates.
- Dropout regularization randomly sets a fraction of layer activations to zero during training, improving generalization.
- Early stopping halts training when the validation loss stops improving across epochs (see the sketch below).
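The three optimization pieces expressed as Keras objects; the learning rate, dropout rate, and patience values are assumptions.

```python
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

optimizer = Adam(learning_rate=1e-3)   # adapts the step size per parameter during training

# Placed before the dense head; zeroes a random half of the activations at each training step.
dropout = layers.Dropout(0.5)

# Stops training once validation loss has not improved for two consecutive epochs.
early_stop = EarlyStopping(monitor="val_loss", patience=2, restore_best_weights=True)
```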
- Performance Metrics
- Accuracy, precision, and recall are used to quantitatively evaluate the model's real-world performance.
- A target accuracy of 97% on the unseen test set is the threshold for acceptable performance (evaluation sketched below).
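An evaluation sketch using scikit-learn (an assumed tooling choice); `x_test` and `y_test` are the placeholder test images and one-hot labels from the training sketch.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report

y_true = np.argmax(y_test, axis=1)
y_pred = np.argmax(model.predict(x_test), axis=1)

print("Test accuracy:", accuracy_score(y_true, y_pred))   # compared against the 97% target
print(classification_report(y_true, y_pred, digits=3))    # per-class precision, recall, F1
```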