RashmiJK/PGP-AIML-IndustrialSafety-CNN

Computer Vision Essentials with Image Processing and Convolutional Neural Networks

Project Groundwork

This project builds on several foundational computer vision and deep learning concepts to create and optimize an image classification model. These concepts are essential for improving model efficiency, accuracy, and the ability to generalize to new, unseen images.

  • Image Preprocessing prepares raw images for analysis. Common steps include resizing images to a consistent shape, normalizing pixel values, and optionally converting to grayscale. Normalization ensures that pixel intensities are scaled appropriately, helping the model train more effectively and converge faster.

  • Feature Extraction involves identifying meaningful patterns or structures in images. In traditional computer vision, algorithms like SIFT or HOG were widely used. In deep learning–based workflows, convolutional layers within CNNs automatically learn hierarchical features, from simple edges to complex object shapes.

  • Data Augmentation enhances model generalization by creating realistic variations of existing images. Techniques such as rotation, flipping, zooming, shifting, and brightness adjustment simulate real-world scenarios, increasing dataset diversity without additional data collection.

  • Convolutional Neural Networks (CNNs) are specialized deep learning architectures designed for image data. CNNs capture spatial hierarchies and local patterns using convolutional operations, making them highly effective for image classification tasks.

  • Pooling Layers reduce the spatial dimensions of feature maps, down-sampling information while retaining key features. Max pooling is commonly used; it keeps the largest value in each region. This reduces computation and makes the model more tolerant of small translations in the input.

  • Fully Connected Layers in a CNN act as the classifier. After convolution and pooling extract feature representations, fully connected layers interpret these features and produce predictions for the target classes.

  • Transfer Learning leverages pre-trained models as a starting point for a related task. This is especially valuable with limited data, as it allows the use of models trained on large datasets (e.g., ImageNet) to provide strong initial feature extraction.

  • Pre-trained Models such as VGG-16 come with learned feature representations from massive datasets. A portion of this project utilizes VGG-16 by reusing its rich feature maps and fine-tuning it for the specific task of helmet classification. This accelerates training and often results in higher accuracy compared to training from scratch.
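To make a few of the concepts above concrete, here is a minimal NumPy sketch of pixel normalization, grayscale conversion, a horizontal-flip augmentation, and 2x2 max pooling. The function names are illustrative and not taken from the project notebook, and the grayscale step simply averages the three channels, which is a simplification of the weighted conversions most libraries use.

```python
import numpy as np


def normalize(image):
    """Scale uint8 pixel values from [0, 255] to floats in [0, 1]."""
    return image.astype(np.float32) / 255.0


def to_grayscale(image):
    """Convert an (H, W, 3) image to (H, W) by averaging the channels.

    Assumption: a plain channel mean, not a luminance-weighted conversion.
    """
    return image.mean(axis=-1)


def horizontal_flip(image):
    """Mirror the image left to right (a simple augmentation)."""
    return image[:, ::-1]


def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a 2-D feature map.

    Odd trailing rows or columns are dropped before pooling.
    """
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[: h2 * 2, : w2 * 2]
    # Split each axis into (block index, position within block) and take
    # the maximum over the two within-block axes.
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))


# Example on a small random "image" (hypothetical data, not the project dataset).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

gray = to_grayscale(normalize(image))   # shape (4, 4), values in [0, 1]
flipped = horizontal_flip(gray)         # same shape, columns reversed
pooled = max_pool_2x2(gray)             # shape (2, 2)
print(gray.shape, flipped.shape, pooled.shape)
```

In a real pipeline these steps are normally handled by the deep learning framework's own preprocessing and pooling layers; the sketch only shows what each operation does to the array.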

Project Objective

Computer vision enables automated understanding of visual data and is widely used in quality inspection, surveillance, and safety monitoring. In industrial environments, these technologies help enforce safety measures by automatically analyzing worker activities and detecting unsafe conditions. This project focuses on building an image classification model that identifies whether a worker is wearing a safety helmet or not, forming the core of an automated monitoring system aimed at improving workplace safety and compliance.

Data Description

The dataset contains hundreds of images of workers in real industrial settings, divided into two classes: with helmet and without helmet. A detailed data overview is included in the accompanying notebook. The images vary in lighting, environment, camera angle, and worker activity, providing the diversity needed for robust and realistic model training.

Environment Setup

  • Google Colab is recommended for this notebook. It offers a ready-to-use environment that avoids the time-consuming and error-prone setup involved in local installations.

  • To speed up training, set the runtime to use the T4 GPU. In Google Colab, follow these steps:

    • From the menu bar at the top of the page, select Runtime.
    • In the dropdown menu that appears, select Change runtime type.
    • In the "Runtime type" dropdown within the dialog box, select T4 GPU.
    • Click the Save button.
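After changing the runtime, it can help to confirm that a GPU is actually visible. The following is a minimal sketch that queries the `nvidia-smi` command-line tool; the helper name `gpu_name` is hypothetical, and the sketch assumes `nvidia-smi` is on the PATH whenever a GPU runtime is active (as it typically is in Colab).

```python
import shutil
import subprocess


def gpu_name():
    """Return the first GPU name reported by nvidia-smi, or None if unavailable."""
    # Without the nvidia-smi tool there is no NVIDIA GPU runtime to query.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    names = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return names[0] if names else None


name = gpu_name()
if name is None:
    print("No GPU detected; check the runtime type.")
else:
    print(f"GPU detected: {name}")
```

On a T4 runtime the reported name should contain "T4". If no GPU is detected, repeat the steps above and reconnect the runtime.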

About

This repository contains educational material (PDF) and a practical notebook for a computer vision application in workplace safety (HelmNet).
