This repository contains our exploration and development journey in building an OCR and face verification pipeline for ID card processing. The goal was to extract text from ID cards and verify whether the holder's face matches the face on the ID.
While the final implementation shifted from our initial plan, this README documents the process, decisions, and lessons learned throughout development.
Our initial idea was to:
- Perform OCR on ID cards using a Python-based script.
- Extract relevant information such as name, ID number, and birth date.
- Perform face verification to ensure the cardholder matches the face on the ID.
During development, we experimented with various tools, frameworks, and models before ultimately deciding not to use Tesseract OCR or DeepFace in the final version. This README captures that journey.
- We initially explored Google Tesseract, a widely used OCR engine.
- We used the `pytesseract` wrapper for integration in Python (see the sketch after this list).
- Outcome: the text extraction quality for ID cards was below our expectations.
- After consulting with peers experienced in Machine Learning, we realized:
- Tesseract struggles with low-resolution ID cards.
- OCR performance is sensitive to lighting, fonts, and ID card layouts.
- For production-quality OCR, a custom-trained OCR model would be more reliable.
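The sketch below shows the kind of `pytesseract` integration we experimented with. It is a minimal illustration, not our final pipeline: the file name, the preprocessing steps, and the `--psm 6` page segmentation mode are assumptions, and it presumes the Tesseract binary is installed and on your PATH.

```python
# Minimal OCR sketch using pytesseract (illustrative file name and parameters).
import cv2
import pytesseract

def extract_id_text(image_path: str) -> str:
    """Run Tesseract OCR on an ID card image after basic preprocessing."""
    image = cv2.imread(image_path)
    # Grayscale + Otsu thresholding tends to help Tesseract on card scans
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # --psm 6 assumes a single uniform block of text; other modes may fit better
    return pytesseract.image_to_string(binary, config="--psm 6")

if __name__ == "__main__":
    print(extract_id_text("id_card_sample.jpg"))  # hypothetical sample image
```

Even with this kind of preprocessing, results on real ID cards fell short, which is what motivated the conclusions above.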
Although time constraints prevented us from building a custom OCR model, we plan to:
- Train a specialized OCR model tailored to ID card layouts.
- Explore modern deep learning OCR frameworks (e.g., CRNN, TrOCR, PaddleOCR).
We explored DeepFace, an easy-to-use Python library for face verification.
- Can be installed directly with `pip install deepface`.
- Supports multiple models: VGG-Face, FaceNet, ArcFace, etc.
- Automatically downloads model weights when switching models.
Through testing, we found:
- FaceNet performed significantly better on ID card photos (see the sketch after this list).
- VGG-Face (DeepFace's default model) was inconsistent for ID card images.
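The comparison we ran looked roughly like the sketch below. It is a minimal example, assuming two local image files; the paths are placeholders, and how you act on the returned distance/threshold is up to the caller.

```python
# Minimal face verification sketch with DeepFace (illustrative paths).
from deepface import DeepFace

def verify_holder(id_photo_path: str, selfie_path: str, model_name: str = "Facenet") -> dict:
    """Compare the face on the ID card against a live photo of the holder."""
    # DeepFace.verify returns a dict including "verified" and "distance"
    return DeepFace.verify(
        img1_path=id_photo_path,
        img2_path=selfie_path,
        model_name=model_name,  # "Facenet" worked better for us than the default "VGG-Face"
    )

if __name__ == "__main__":
    result = verify_holder("id_card_face.jpg", "selfie.jpg")  # hypothetical inputs
    print(result["verified"], result["distance"])
```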
Despite promising results, we ultimately chose not to use DeepFace due to:
- Time limitations for fine-tuning verification thresholds
- Model inconsistency with certain ID formats
- Need for a more tailored and robust face matching solution
In the final version of the project, we did not use:
- Tesseract OCR
- DeepFace
Instead, the implemented solution focuses on the aspects that aligned best with our time constraints and performance requirements.
This documentation remains as a reference for our experimentation journey and as a guide for future development.
If you plan to explore the original tools we tested:
- Install Google Tesseract (OS-specific installers)
- Add Tesseract executable to your system PATH
- Install the Python wrappers: `pip install pytesseract` and `pip install deepface` (a quick environment check is sketched after this list).
- DeepFace model weights will be downloaded automatically when a model is selected.
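The snippet below is a hedged sanity check for that setup: it only confirms that `pytesseract` can reach the Tesseract binary and that DeepFace imports; the Windows path in the comment is just an example.

```python
# Quick environment check for the original tooling (Tesseract + DeepFace).
import pytesseract

try:
    print("Tesseract version:", pytesseract.get_tesseract_version())
except pytesseract.TesseractNotFoundError:
    # If Tesseract is installed but not on PATH, point pytesseract at the binary, e.g.:
    # pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    print("Tesseract binary not found on PATH")

try:
    from deepface import DeepFace  # weights download lazily on first use
    print("DeepFace import OK")
except ImportError:
    print("DeepFace is not installed (pip install deepface)")
```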
- Build a dedicated OCR model specialized for Indonesian ID cards
- Implement a lightweight face verification model optimized for real-time inference
- Improve the preprocessing pipeline for image enhancement (see the sketch after this list)
- Add automated confidence scoring for ID verification
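As a rough illustration of the planned preprocessing improvements, the sketch below upscales, denoises, and contrast-equalizes a low-resolution card image before OCR. It assumes OpenCV is available, and the specific parameters are untuned placeholders.

```python
# Preprocessing sketch for low-resolution ID card captures (untuned parameters).
import cv2

def enhance_id_image(image_path: str):
    """Upscale, denoise, and boost contrast on a low-resolution ID card scan."""
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Upscale low-resolution captures before OCR
    upscaled = cv2.resize(gray, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
    # Remove sensor noise while preserving character edges
    denoised = cv2.fastNlMeansDenoising(upscaled, None, h=10)
    # Equalize uneven lighting across the card
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(denoised)
```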
This project taught us important lessons about evaluating off-the-shelf OCR and face verification tools against real ID card data before committing to them. While we pivoted from our initial plan, the lessons learned will directly inform future iterations of this product.
References:
- https://towardsdatascience.com/googles-tesseract-ocr-how-good-is-it-on-documents-d71d4bf7640
- https://pypi.org/project/deepface/
- https://arxiv.org/pdf/2101.05214.pdf
- https://medium.com/analytics-vidhya/optical-character-recognition-using-tensorflow-533061285dd3