
A GUI application for training face recognition models using InsightFace embeddings and an SVM classifier, with automatic face verification and real-time training progress.
- 🔍 Automatic Face Verification - Validates that each dataset image contains a single face
- 🧠 Multiple Embedding Models - Support for Buffalo L/S/SC and Antelope V2 models
- 🎯 SVM Classification - High-performance linear SVM for face recognition
- 📄 JSON Metadata Export - Training information saved for future reference

- Python 3.8 or higher
- CUDA-compatible GPU (recommended) or CPU
- Clone the repository

      git clone https://github.com/AldyRevigustian/InsightFace-Trainer.git
      cd InsightFace-Trainer

- Install dependencies

      pip install -r requirements.txt

- Run the application

      python main.py

Organize your dataset in the following folder structure:

    InsightFace-Trainer/
    └── Dataset/
        ├── Person_1/
        │   ├── image1.jpg
        │   ├── image2.png
        │   └── image3.jpeg
        ├── Person_2/
        │   ├── photo1.jpg
        │   └── photo2.png
        └── Person_N/
            ├── face1.jpg
            └── face2.jpg

- Folder Names: Use person names or IDs as folder names
- Image Formats: Supports JPG, JPEG, PNG, BMP, TIFF, WEBP, AVIF
- Image Quality: Use clear, well-lit face images
- Face Count: Each image should contain exactly one face
- Image Size: No strict requirement, but 224x224 pixels or larger is recommended
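
The layout and format rules above can be checked from a script before opening the GUI. The following is a minimal sketch, not the application's own code: `summarize_dataset` and `SUPPORTED_EXTENSIONS` are hypothetical names, and the extension set is simply the list of formats given above, matched case-insensitively (an assumption).

```python
from pathlib import Path

# Extensions taken from the supported formats listed above.
# Case-insensitive matching is an assumption, not documented behavior.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".webp", ".avif"}


def summarize_dataset(dataset_dir):
    """Return {person_folder_name: image_count} for each subfolder of dataset_dir."""
    root = Path(dataset_dir)
    summary = {}
    for person_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        images = [
            f for f in person_dir.iterdir()
            if f.is_file() and f.suffix.lower() in SUPPORTED_EXTENSIONS
        ]
        summary[person_dir.name] = len(images)
    return summary
```

Calling `summarize_dataset("Dataset")` returns one entry per person folder, which makes empty or very small folders easy to spot before training.
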
| Image Count | Accuracy | Training Time | Use Case |
|---|---|---|---|
| 10-100 | Good | Fast | Testing/Development |
| 100-300 | High | Medium | Production Use |
| 300-1000 | Very High | Slow | High Accuracy Required |
| 1000+ | Excellent | Very Slow | Research/Critical Applications |
For reference, you can check out this example dataset:
- tripleS Member Dataset: Kaggle Dataset - tripleS Member face dataset
- tripleS Member Recognition: HuggingFace Model - Trained to classify tripleS members using this application
Images are automatically moved to the Invalid folder in any of the following cases:
- No face detected: Image doesn't contain any recognizable face
- Multiple faces detected: Image contains more than one face
- Corrupted files: Unable to load or process the image
- Poor quality: Face is too blurry, dark, or unclear for detection
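
The rules above amount to a small decision function. The sketch below is one way to express them and is not taken from the application's source: `check_image` and `move_to_invalid` are hypothetical helpers, and `load_image` and `detect_faces` are caller-supplied stand-ins for the image reader and the face detector. Keeping the person subfolder inside the Invalid folder is also an assumption.

```python
import shutil
from pathlib import Path


def check_image(path, load_image, detect_faces):
    """Return None if the image is usable, otherwise a short reason string.

    load_image and detect_faces are hypothetical callables supplied by the
    caller; they stand in for the image reader and the face detector.
    """
    try:
        image = load_image(path)
    except Exception:
        return "corrupted file"
    if image is None:  # some readers return None instead of raising
        return "corrupted file"
    faces = detect_faces(image)
    if len(faces) == 0:
        # Per the list above, faces too blurry or dark for detection land here too.
        return "no face detected"
    if len(faces) > 1:
        return "multiple faces detected"
    return None


def move_to_invalid(path, dataset_dir, invalid_dir):
    """Move path under invalid_dir, keeping its person subfolder (assumed layout)."""
    path = Path(path)
    target = Path(invalid_dir) / path.relative_to(dataset_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(target))
    return target
```

Injecting the detector keeps the decision logic testable without loading a model; any detector that returns one item per detected face can be passed as `detect_faces`.
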
- Create a `Dataset` folder
- Create subfolders for each person
- Add face images to respective person folders
- Dataset Folder: Browse and select your dataset directory
- Embedding Model: Choose from available InsightFace models:
  - Buffalo L ⭐ (275 MB) - Recommended for production
  - Buffalo S (122 MB) - Faster processing
  - Buffalo SC (14.3 MB) - Lightest option
  - Antelope V2 (344 MB) - Highest accuracy, slowest processing
- Output Settings: Set model output folder and filename
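
For scripted use outside the GUI, the model choice above corresponds to an InsightFace model pack name. The sketch below is an illustration under assumptions: `MODEL_PACKS` and `load_face_analyzer` are hypothetical names, `buffalo_l` is the identifier that appears in this project's training metadata, and the other three identifiers are the pack names InsightFace uses. How the application itself maps its menu entries is not documented here.

```python
# Display names from the list above mapped to InsightFace model pack names.
# The mapping is an assumption; only "buffalo_l" is confirmed by this README.
MODEL_PACKS = {
    "Buffalo L": "buffalo_l",
    "Buffalo S": "buffalo_s",
    "Buffalo SC": "buffalo_sc",
    "Antelope V2": "antelopev2",
}


def load_face_analyzer(display_name, use_gpu=False, det_size=(640, 640)):
    """Build an InsightFace FaceAnalysis object for the chosen model."""
    # Imported lazily because insightface is a heavy optional dependency.
    from insightface.app import FaceAnalysis

    if use_gpu:
        providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    else:
        providers = ["CPUExecutionProvider"]
    analyzer = FaceAnalysis(name=MODEL_PACKS[display_name], providers=providers)
    # ctx_id=0 selects the first GPU; a negative value selects the CPU.
    analyzer.prepare(ctx_id=0 if use_gpu else -1, det_size=det_size)
    return analyzer
```

With an analyzer in hand, `analyzer.get(image)` on a BGR image array returns the detected faces, each carrying an `embedding` vector.
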
- Click "🔍 Verify Faces" to validate your dataset
- The application will:
  - Check each image for face detection
  - Move invalid images to the `Invalid` folder (no face or multiple faces detected)
  - Provide detailed verification results
  - Update dataset statistics
- Important: Only images with exactly one face will be used for training
- After successful verification, click "🚀 Start Training"
- Monitor real-time progress in the Training Progress section
- View detailed logs in the Logs tab
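
Conceptually, the training step fits a linear SVM on one embedding vector per verified image. The following is a minimal sketch of that idea with scikit-learn, not the application's actual training code: `train_face_classifier` is a hypothetical name and default SVM hyperparameters are assumed. The saved `(classifier, label_encoder)` tuple matches the way this README loads the model.

```python
import joblib
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC


def train_face_classifier(embeddings, names, output_path):
    """Fit a linear SVM on face embeddings and save (classifier, label_encoder).

    embeddings: array-like of shape (n_samples, n_features)
    names: one person name per embedding, in the same order
    """
    X = np.asarray(embeddings, dtype=np.float32)
    label_encoder = LabelEncoder()
    y = label_encoder.fit_transform(names)  # person names -> integer labels
    classifier = SVC(kernel="linear")  # default C; an assumption for this sketch
    classifier.fit(X, y)
    joblib.dump((classifier, label_encoder), output_path)
    return classifier, label_encoder
```

To classify a new face, pass its embedding as a single-row array: `label_encoder.inverse_transform(classifier.predict(embedding.reshape(1, -1)))` returns the predicted person name.
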
- Trained model: `Model/face_model.pkl`
- Training info: `Model/face_model_names.json`
- Invalid images: `Invalid/` folder for review
| Model | Size | Speed | Accuracy | Use Case |
|---|---|---|---|---|
| Buffalo L ⭐ | 275 MB | Medium | High | Production |
| Buffalo S | 122 MB | Fast | Good | Real-time apps |
| Buffalo SC | 14.3 MB | Very Fast | Basic | Resource-limited |
| Antelope V2 | 344 MB | Slow | Highest | Research/Quality |
- Skip Verification: Use if you've already verified your dataset
- Custom Model Path: Load your own embedding model
- Output Folder: Customize where models are saved
- GPU/CPU Selection: Automatic detection with fallback
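
How the application detects the GPU is not described here. One plausible approach, sketched below as an assumption, is to ask ONNX Runtime (the inference backend InsightFace uses) which execution providers are available and fall back to the CPU otherwise. `pick_providers` and `detect_providers` are hypothetical names.

```python
def pick_providers(available):
    """Prefer CUDA when it is available, otherwise fall back to the CPU provider."""
    if "CUDAExecutionProvider" in available:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def detect_providers():
    """Query ONNX Runtime if it is installed; assume CPU-only when it is not."""
    try:
        import onnxruntime
    except ImportError:
        return ["CPUExecutionProvider"]
    return pick_providers(onnxruntime.get_available_providers())
```

Keeping the CPU provider last in the list means inference still runs if the CUDA provider fails to initialize.
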
- GPU Usage: Ensure CUDA is properly installed for faster processing
- Memory: Close other applications during training
- Dataset Size: Start with smaller datasets for testing
- Image Quality: Use high-quality, well-lit face images
- Optimal Dataset: 15-30 images per person give the best balance between accuracy and training time
- Face Position: Center the face in the image for better detection
- Lighting: Ensure good, even lighting across the face
- `face_model.pkl` - Trained SVM classifier with label encoder
- `face_model_names.json` - Training metadata including:

      {
        "model_file": "face_model.pkl",
        "training_date": "2025-01-15 14:30:25",
        "total_persons": 4,
        "total_samples": 156,
        "person_names": ["Person_1", "Person_2", "Person_3", "Person_4"],
        "embedding_model": "buffalo_l",
        "dataset_path": "Dataset"
      }

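
A metadata file with the fields listed above can be produced with the standard library alone. This is a sketch rather than the application's exporter: `write_training_info` is a hypothetical helper, and sorting `person_names` alphabetically is an assumption.

```python
import json
from collections import Counter
from datetime import datetime
from pathlib import Path


def write_training_info(labels, model_file, embedding_model, dataset_path, output_path):
    """Write training metadata as JSON and return the dictionary that was written.

    labels: list with one person name per training sample
    """
    counts = Counter(labels)
    info = {
        "model_file": model_file,
        "training_date": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "total_persons": len(counts),
        "total_samples": len(labels),
        "person_names": sorted(counts),  # alphabetical order is an assumption
        "embedding_model": embedding_model,
        "dataset_path": dataset_path,
    }
    Path(output_path).write_text(json.dumps(info, indent=2), encoding="utf-8")
    return info
```

Deriving `total_persons` and `total_samples` from the label list keeps the counts consistent with the samples that were actually used for training.
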
Load and use your trained model:

    import joblib
    import json

    # Load model
    svm_classifier, label_encoder = joblib.load('Model/face_model.pkl')

    # Load training info
    with open('Model/face_model_names.json', 'r') as f:
        training_info = json.load(f)

    print(f"Model trained on {training_info['total_persons']} persons")
    print(f"Person names: {training_info['person_names']}")

- InsightFace Repository: https://github.com/deepinsight/insightface
- Example Dataset: tripleS Member Dataset on Kaggle
- Pre-trained Model: tripleS Member Recognition on HuggingFace