The objective was to identify and benchmark several advanced convolutional neural network (CNN) architectures for classifying the full ASL alphabet, and then to embed the most accurate model in a live recognition pipeline driven by a webcam and MediaPipe hand tracking. The study trained and fine-tuned a suite of pre-trained models (VGG16, ResNet50, InceptionV3, DenseNet201, and MobileNetV2) alongside a custom CNN on a balanced ASL dataset of 27 handshape classes.
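As a rough illustration of how hand tracking can feed a classifier, the sketch below turns normalized hand landmarks into a padded, square pixel crop box. MediaPipe Hands reports landmark `x` and `y` values normalized to the image width and height; everything else here (the function name `hand_crop_box`, the `pad` ratio, and the idea of cropping before classification) is an assumption for illustration, not necessarily what the notebook does.

```python
def hand_crop_box(landmarks, frame_w, frame_h, pad=0.2):
    """Return (x1, y1, x2, y2) pixel coordinates of a square box around a hand.

    landmarks: iterable of (x, y) pairs normalized to [0, 1] (assumed input format).
    pad: extra margin on each side, as a fraction of the box side (hypothetical default).
    """
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]

    # Convert the normalized extent of the hand to pixel coordinates.
    x_min, x_max = min(xs) * frame_w, max(xs) * frame_w
    y_min, y_max = min(ys) * frame_h, max(ys) * frame_h

    # Use the longer edge so the crop is square, then add padding on both sides.
    side = max(x_max - x_min, y_max - y_min) * (1 + 2 * pad)
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2

    # Clamp the box to the frame so slicing the image never goes out of bounds.
    x1 = max(0, int(cx - side / 2))
    y1 = max(0, int(cy - side / 2))
    x2 = min(frame_w, int(cx + side / 2))
    y2 = min(frame_h, int(cy + side / 2))
    return x1, y1, x2, y2


# Two made-up landmark points on a 640x480 frame.
box = hand_crop_box([(0.4, 0.4), (0.6, 0.6)], frame_w=640, frame_h=480)
print(box)
```

With an OpenCV frame, the resulting box could be applied as `frame[y1:y2, x1:x2]` before resizing the crop to the model's input size.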
The dataset used in this project is publicly available on Kaggle:
👉 Kaggle Dataset Link
- VGG16
- ResNet50
- DenseNet201
- InceptionV3
- MobileNetV2
- Custom CNN
The pretrained model files (.h5, .pkl) are hosted on Kaggle:
👉 Model Download Link
Configuration files are available in the config/ folder.
Install the required libraries listed in the requirements.txt file.
The class index .json files are also available.
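The sketch below shows one way such an index file could be used to turn a model's output vector back into a letter. It assumes the JSON maps class names to integer indices, which is the shape of the `class_indices` dictionary that Keras directory iterators expose; the actual file layout in this repository may differ, and the helper names and sample values are hypothetical.

```python
import json


def load_index_to_label(json_text):
    """Invert a {"label": index} mapping into {index: "label"}."""
    class_indices = json.loads(json_text)
    return {int(idx): label for label, idx in class_indices.items()}


def decode_prediction(probabilities, index_to_label):
    """Return the (label, probability) pair with the highest probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return index_to_label[best], probabilities[best]


# Made-up three-class example; the real files would cover all 27 classes.
sample_json = '{"A": 0, "B": 1, "C": 2}'
index_to_label = load_index_to_label(sample_json)
label, score = decode_prediction([0.1, 0.7, 0.2], index_to_label)
print(label, score)
```

In practice the JSON text would be read from one of the index files, and `probabilities` would be a single row of the model's softmax output converted to a list.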
- Validation Accuracy: 93.18%
- Test Accuracy: 98.77%
Visual results and the confusion matrix are included in the notebook.
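For readers who want to relate a confusion matrix to an accuracy figure, here is a minimal sketch in plain Python. It assumes rows are true classes and columns are predicted classes; the 3x3 matrix is a made-up toy example, not data from this project, and the function names are illustrative.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_recall(cm):
    """Recall for each true class: diagonal count divided by the row total."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(cm)]


# Toy matrix: rows are true classes, columns are predicted classes.
toy_cm = [
    [9, 1, 0],
    [0, 10, 0],
    [2, 0, 8],
]
accuracy = overall_accuracy(toy_cm)
recalls = per_class_recall(toy_cm)
print(accuracy, recalls)
```

Per-class recall is useful alongside the overall figure because visually similar handshapes can be confused with each other even when the aggregate accuracy is high.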
- Python, TensorFlow, Keras, OpenCV
- Matplotlib, Seaborn
- Jupyter Notebook
Siddhant Bahadkar
MSc Data Analytics, National College of Ireland
LinkedIn | GitHub