A practical authentication workflow that can combine:
- Face verification (InsightFace embeddings)
- Speaker verification (SpeechBrain speaker embeddings)
- Optional offline speech-to-text (STT) for capturing a name using Vosk
This repository is published in a privacy-safe way: the project structure is complete, but any sensitive datasets/logs/identities were removed or replaced with placeholders.
We typically run the complete system using:
```bash
python run_system.py --stt_name --vosk_model vosk-model-small-en-us-0.15 --name_seconds 6 --cooldown 10 --pc_ui_lang en
```

| Flag | Description |
| --- | --- |
| `--stt_name` | Enables offline speech-to-text for capturing/confirming a spoken name. |
| `--vosk_model <folder>` | Path (or folder name) of the Vosk model directory. |
| `--name_seconds 6` | How many seconds to listen for the name. |
| `--cooldown 10` | Cooldown time between attempts (helps avoid repeated triggers). |
| `--pc_ui_lang en` | UI / prompts language on the PC side. |
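For reference, a minimal sketch of how these flags could be declared with argparse; the defaults mirror the example command above and are illustrative, not necessarily what `run_system.py` actually uses:

```python
import argparse

parser = argparse.ArgumentParser(description="Run the full authentication system")
parser.add_argument("--stt_name", action="store_true",
                    help="Enable offline STT for capturing/confirming a spoken name")
parser.add_argument("--vosk_model", default="vosk-model-small-en-us-0.15",
                    help="Path (or folder name) of the Vosk model directory")
parser.add_argument("--name_seconds", type=int, default=6,
                    help="How many seconds to listen for the name")
parser.add_argument("--cooldown", type=int, default=10,
                    help="Cooldown (seconds) between attempts")
parser.add_argument("--pc_ui_lang", default="en", choices=["en", "ar"],
                    help="UI / prompts language on the PC side")
args = parser.parse_args()
```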
Key features:

- ✅ Multi-factor verification (Face + Voice) with fusion
- ✅ Offline STT option (Vosk) for name capture
- ✅ Privacy-safe dataset/DB templates included (no real user data)
- ✅ Prompt assets for PC UI language selection (AR/EN)
- ✅ Modular scripts for building databases and running the pipeline
This repo intentionally excludes any private or identifying data.
What you will notice (and why it’s normal):
- `dataset/` exists as structure only (no images/audio are shipped).
- `db/teachers.json` and `db/pending.json` are included as templates/placeholders to show the expected schema.
- `logs/attempts.jsonl` is empty.
Why are some files “empty”? Because publishing real images, recordings, teacher identities, or attempt logs would violate privacy. This repository is meant to be safe to share while still being understandable and runnable once you provide your own data.
```
.
├─ run_system.py              # Main runner (full system)
├─ main.py                    # FastAPI server (if used by your flow)
├─ pc_client.py               # PC-side logic (STT + UI prompts)
├─ verify_fusion.py           # Fusion logic (face + voice)
├─ face_model_insightface.py  # Face embeddings using InsightFace
├─ voice_model.py             # Speaker embeddings using SpeechBrain
├─ db/
│  ├─ teachers.json           # Template (placeholder)
│  └─ pending.json            # Template (placeholder)
├─ dataset/                   # Template only (no private media)
├─ logs/
│  └─ attempts.jsonl          # Empty placeholder
└─ assets/prompts/            # Audio/text prompts (AR/EN)
```
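Since `verify_fusion.py` combines the two modalities, here is a minimal sketch of score-level fusion. It assumes both models yield embeddings compared by cosine similarity; the weight and threshold are illustrative placeholders, not this repo's tuned values:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def fuse_and_decide(face_sim: float, voice_sim: float,
                    face_weight: float = 0.6, threshold: float = 0.5) -> bool:
    """Weighted score-level fusion: accept only if the combined score passes the threshold."""
    fused = face_weight * face_sim + (1.0 - face_weight) * voice_sim
    return fused >= threshold
```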
- Python 3.9+ (recommended)
- `numpy`, `requests`, `tqdm`
- `opencv-python`
- `torch`, `torchaudio`
- `speechbrain`, `insightface`
- `fastapi`, `uvicorn` (if you use `main.py`; see the sketch below)
- Optional: `vosk` (only needed if you use `--stt_name`)
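If your flow uses `main.py`, FastAPI and uvicorn form the server layer. Here is a minimal hypothetical sketch of such a server; the endpoint name, payload fields, weights, and threshold are illustrative assumptions, not this repo's actual API:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class VerifyRequest(BaseModel):
    teacher_id: str     # hypothetical field names
    face_score: float   # similarity from the face model
    voice_score: float  # similarity from the voice model

@app.post("/verify")    # hypothetical endpoint
def verify(req: VerifyRequest):
    fused = 0.6 * req.face_score + 0.4 * req.voice_score  # illustrative weights
    return {"teacher_id": req.teacher_id, "accepted": fused >= 0.5}
```

Serve it with `uvicorn main:app --reload` during development.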
Set up a virtual environment and install the dependencies:

```bash
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate

pip install -U pip
pip install numpy requests tqdm opencv-python
pip install torch torchaudio
pip install speechbrain insightface
pip install fastapi uvicorn
# Optional (only if using STT):
pip install vosk
```

Note: Torch/torchaudio installation can vary by OS/GPU. If you face install issues, prefer CPU-only builds or follow the official PyTorch installation instructions.
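For example, a CPU-only install typically uses the official CPU wheel index (check pytorch.org for the current command for your platform):

```bash
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
```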
A folder like `vosk-model-small-en-us-0.15` is an external pre-trained model downloaded from a third-party source. To avoid licensing/ownership issues (and reduce repo size), the model is not committed here.
- Download any compatible Vosk model (English or other language).
- Place the model folder inside the project directory, for example:
```
Authentication-System/
└─ vosk-model-small-en-us-0.15/
```
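To sanity-check a downloaded model before wiring it into the system, here is a minimal offline transcription sketch with the `vosk` package (the WAV path is hypothetical; Vosk expects mono 16-bit PCM audio):

```python
import json
import wave

from vosk import KaldiRecognizer, Model

model = Model("vosk-model-small-en-us-0.15")   # folder downloaded above

with wave.open("sample.wav", "rb") as wf:      # hypothetical mono 16-bit PCM file
    rec = KaldiRecognizer(model, wf.getframerate())
    while True:
        data = wf.readframes(4000)
        if not data:
            break
        rec.AcceptWaveform(data)

print(json.loads(rec.FinalResult())["text"])   # the recognized text
```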
Downloaded a different model (for example, a larger or non-English one)? No problem: just pass its name/path via `--vosk_model`:
```bash
python run_system.py --stt_name --vosk_model vosk-model-small-en-us-0.22 --name_seconds 6 --cooldown 10 --pc_ui_lang en
```

Alternatively, you can change the default in the code where the argument is defined (search for `--vosk_model` in `run_system.py`).
From the project root:
```bash
python run_system.py --stt_name --vosk_model vosk-model-small-en-us-0.15 --name_seconds 6 --cooldown 10 --pc_ui_lang en
```

- Run without STT (if supported by your setup):

```bash
python run_system.py --vosk_model vosk-model-small-en-us-0.15 --cooldown 10 --pc_ui_lang en
```

- Switch UI language (if you have prompts for Arabic):

```bash
python run_system.py --stt_name --vosk_model vosk-model-small-en-us-0.15 --name_seconds 6 --cooldown 10 --pc_ui_lang ar
```

To actually enroll/verify identities you must provide your own data:
- Add your own images/audio into the expected `dataset/` structure.
- Fill your local teacher/user list (CSV/JSON templates) with your own IDs.
- Generate embeddings/databases according to the scripts you use in your workflow (see the sketch below).
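As a rough illustration of the embedding step, the following sketch extracts a face embedding with InsightFace and a speaker embedding with SpeechBrain. The pretrained model names and file paths are assumptions for the example, not necessarily what this repo's scripts use:

```python
import cv2
import torchaudio
from insightface.app import FaceAnalysis
from speechbrain.pretrained import EncoderClassifier  # speechbrain.inference in SpeechBrain >= 1.0

# Face embedding (InsightFace's stock detection + recognition pipeline)
face_app = FaceAnalysis(providers=["CPUExecutionProvider"])
face_app.prepare(ctx_id=0, det_size=(640, 640))
img = cv2.imread("dataset/teacher_01/face.jpg")        # hypothetical path
faces = face_app.get(img)
face_embedding = faces[0].embedding                    # embedding of the first detected face

# Speaker embedding (assumed ECAPA-TDNN model from the SpeechBrain hub)
encoder = EncoderClassifier.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb")
signal, sample_rate = torchaudio.load("dataset/teacher_01/voice.wav")  # hypothetical path
voice_embedding = encoder.encode_batch(signal).squeeze()               # speaker embedding vector
```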
- The system can’t find the Vosk model: make sure the folder exists and the path matches the `--vosk_model` value.
- Empty dataset / missing identities: this repo ships without private media by design. Add your own data locally.
- Repeated triggers / too many attempts: increase `--cooldown` to reduce back-to-back attempts.
This repository contains project code and placeholder templates only. Third-party models (e.g., Vosk) are governed by their original licenses and must be obtained separately.
If you build on this project, feel free to open an issue or submit a pull request.