Build a Traffic Sign Recognition Project
The goals / steps of this project are the following:
- Load the data set (see below for links to the project data set)
- Explore, summarize and visualize the data set
- Design, train and test a model architecture
- Use the model to make predictions on new images
- Analyze the softmax probabilities of the new images
- Summarize the results with a written report
I used the numpy library to calculate summary statistics of the traffic signs data set:
- The size of the training set is 34799
- The size of the validation set is 4410
- The size of the test set is 12630
- The shape of a traffic sign image is (32, 32, 3)
- The number of unique classes/labels in the data set is 43
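For reference, here is a minimal sketch of these calculations, assuming the pickled splits have already been loaded into the hypothetical variables `X_train`, `y_train`, `X_valid`, and `X_test`:

```python
import numpy as np

# X_train, y_train, X_valid, X_test are hypothetical names for the loaded splits.
n_train = X_train.shape[0]           # 34799
n_validation = X_valid.shape[0]      # 4410
n_test = X_test.shape[0]             # 12630
image_shape = X_train[0].shape       # (32, 32, 3)
n_classes = np.unique(y_train).size  # 43

print(f"Training set size:   {n_train}")
print(f"Validation set size: {n_validation}")
print(f"Test set size:       {n_test}")
print(f"Image shape:         {image_shape}")
print(f"Number of classes:   {n_classes}")
```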
A visualization of the frequency of samples per class in the training dataset can be seen below.
Some samples from the dataset can be seen below.
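A per-class frequency chart like the one referenced above can be produced with a short matplotlib sketch (reusing the assumed `y_train` array):

```python
import matplotlib.pyplot as plt
import numpy as np

# Count how many training samples fall into each of the 43 classes.
classes, counts = np.unique(y_train, return_counts=True)

plt.bar(classes, counts)
plt.xlabel("Class label")
plt.ylabel("Number of training samples")
plt.title("Samples per class in the training set")
plt.show()
```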
As a first step, I decided to convert the images to grayscale because it reduces the number of channels, which in turn reduces the training time.
Here is an example of a traffic sign image after grayscaling.
I also normalized the pixel values by subtracting 125 and then dividing by 125, which constrains them to roughly the range -1 to +1. This helps prevent the gradients from blowing up during backpropagation.
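A minimal sketch of both preprocessing steps follows; the channel-average grayscale conversion is an assumption (the project may have used a weighted conversion such as OpenCV's instead), and `X_train` is the hypothetical variable name from above:

```python
import numpy as np

def preprocess(images):
    """Grayscale and normalize a batch of RGB images of shape (N, 32, 32, 3)."""
    # Simple channel average; a weighted luminosity conversion
    # (e.g. cv2.cvtColor with COLOR_RGB2GRAY) would also work here.
    gray = images.mean(axis=3, keepdims=True)
    # Shift and scale pixel values from [0, 255] to roughly [-1, +1].
    return (gray - 125.0) / 125.0

X_train_pre = preprocess(X_train)  # -> shape (N, 32, 32, 1)
```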
My final model consisted of the following layers:
Layer | Description |
---|---|
Input | 32x32x1 Grayscale image |
Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x6 |
RELU | |
Convolution 5x5 | 1x1 stride, valid padding, outputs 24x24x12 |
RELU | |
Max pooling | 2x2 stride, outputs 12x12x12 |
Convolution 5x5 | 1x1 stride, valid padding, outputs 8x8x24 |
RELU | |
Convolution 5x5 | 1x1 stride, valid padding, outputs 4x4x24 |
RELU | |
Max pooling | 2x2 stride, outputs 2x2x24 |
Flatten | |
Dense | outputs 240 |
RELU | |
Dropout | Keep_prob 0.7 |
Dense | outputs 240 |
RELU | |
Dropout | Keep_prob 0.7 |
Dense | outputs 120 |
RELU | |
Dropout | Keep_prob 0.7 |
Dense | outputs 43 |
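For concreteness, here is a minimal `tf.keras` sketch of this architecture; the layer sizes follow the table above, though the original project may have been written against a different TensorFlow API:

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(32, 32, 1)),                           # grayscale input
    layers.Conv2D(6, 5, padding="valid", activation="relu"),   # -> 28x28x6
    layers.Conv2D(12, 5, padding="valid", activation="relu"),  # -> 24x24x12
    layers.MaxPooling2D(pool_size=2),                          # -> 12x12x12
    layers.Conv2D(24, 5, padding="valid", activation="relu"),  # -> 8x8x24
    layers.Conv2D(24, 5, padding="valid", activation="relu"),  # -> 4x4x24
    layers.MaxPooling2D(pool_size=2),                          # -> 2x2x24
    layers.Flatten(),                                          # -> 96
    layers.Dense(240, activation="relu"),
    layers.Dropout(0.3),  # Keras takes the drop rate: keep_prob 0.7 == rate 0.3
    layers.Dense(240, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(120, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(43),     # raw logits for the 43 classes
])
```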
To train the model, I used the Adam optimizer with a learning rate of 0.001, which I ran for a total of 25 epochs with a batch size of 128. As this is a multi-class classification problem, I used cross-entropy as my loss function.
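Under the same `tf.keras` assumption, the training setup would look roughly like this:

```python
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    # from_logits=True because the final Dense layer outputs raw logits.
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

model.fit(
    X_train_pre, y_train,
    epochs=25,
    batch_size=128,
    validation_data=(preprocess(X_valid), y_valid),
)
```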
My final model results were:
- training set accuracy of 99.0%
- validation set accuracy of 95.1%
- test set accuracy of 92.3%
Initially I tried the vanilla LeNet-5, but the model could not overfit even a small subset of the data, which indicated that a more complex network was required. After increasing the number of convolution and dense layers, the training accuracy approached 99% while the validation accuracy lagged behind, which indicated overfitting. Hence I added Dropout layers after all the hidden Dense layers with a dropout probability of 0.3 (keep probability 0.7). This increased the validation accuracy drastically.
Here are seven German traffic signs that I found on the web:
Here are the results of the prediction:
Image | Prediction |
---|---|
Speed Limit (20km/h) | Speed Limit (20km/h) |
Wild Animals Crossing | Wild Animals Crossing |
Keep Right | Keep Right |
No Passing | Children Crossing |
Road Work | Road Work |
Stop | Stop |
Yield | Yield |
The model was able to correctly guess 6 of the 7 traffic signs, which gives an accuracy of 85.7%. These images were taken randomly from the internet.
For the fourth image, the model is relatively sure that this is a Children Crossing sign (probability of 0.84), but the image actually contains a No Passing sign. The top five softmax probabilities were:
Probability | Prediction |
---|---|
.843 | Children Crossing |
.107 | Right-of-way at the next intersection |
.003 | Vehicles over 3.5 metric tons prohibited |
.0005 | Slippery road |
.0004 | No Passing |
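Top-k probabilities like these can be read out with a short sketch, continuing the hypothetical `tf.keras` model from above; `new_images` stands in for the seven web images:

```python
import tensorflow as tf

# new_images is a hypothetical (7, 32, 32, 3) array of the web images.
logits = model.predict(preprocess(new_images))
probs = tf.nn.softmax(logits, axis=1)
top5 = tf.math.top_k(probs, k=5)

for image_idx in range(len(new_images)):
    print(f"Image {image_idx}:")
    for p, cls in zip(top5.values[image_idx], top5.indices[image_idx]):
        print(f"  class {int(cls)}: {float(p):.4f}")
```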