This project uses facial recognition to identify and track individuals across multiple video streams. The system is built with PyTorch and leverages pre-trained models for face detection and recognition.
- Face detection using FastMTCNN, a fast variant of MTCNN (Multi-task Cascaded Convolutional Networks)
- Face recognition using a fine-tuned InceptionResnetV1 model
- Real-time tracking across multiple video streams
- Data augmentation for improved model performance
- Support for both image and video processing
- Python 3.x
- PyTorch
- OpenCV
- facenet-pytorch
- NumPy
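The dependencies above can be installed from PyPI (package names below are the usual PyPI distributions; torchvision is included for the data-augmentation transforms; pin versions as needed):

```shell
pip install torch torchvision opencv-python facenet-pytorch numpy
```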
The project consists of several key components:
- Data Preparation:
  - The system uses a dataset of facial images organized in directories by person.
  - Data augmentation techniques are applied to enhance the training dataset.
- Model Training:
  - An InceptionResnetV1 model is fine-tuned on the prepared dataset.
  - The training process includes both training and validation phases.
- Face Detection and Recognition:
  - FastMTCNN is used for face detection in images and video frames.
  - The trained InceptionResnetV1 model is used for face recognition.
- Video Processing:
  - The system can process multiple video streams.
  - Detected faces are tracked and labeled in real-time.
- Output Generation:
  - Processed frames are compiled into an output video with labeled faces.
To use the system:

1. Prepare your dataset in a directory structure where each subdirectory represents a person and contains their images.
2. Run the training script to fine-tune the InceptionResnetV1 model on your dataset.
3. Use the trained model to process images or videos for face detection and recognition.
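The dataset layout described in the first step might look like this (directory and file names are illustrative):

```
data/
├── alice/
│   ├── 001.jpg
│   └── 002.jpg
└── bob/
    ├── 001.jpg
    └── 002.jpg
```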
Key functions:
- collate_fn: Custom function for data loading
- create_video_from_frames: Generates output video from processed frames
- Face Detection: FastMTCNN
- Face Recognition: Fine-tuned InceptionResnetV1
- Training: Uses CrossEntropyLoss and Adam optimizer
https://drive.google.com/file/d/1TtfFJIqW03E5JM2v8KxJ0txt6Pc0oAeB/view?usp=drive_link