Automated voter registration data extraction system processing 27,000+ scanned images across 54 polling stations for Hon. Dan Musinguzi Nabaasa's campaign in Kabale Municipality, Uganda.
The Voter Data OCR Extractor is a fully automated data extraction pipeline built to digitize physical voter registration records. The system processes scanned images of voter registers, extracts structured data using AI-powered OCR, and outputs clean CSV files ready for campaign analysis and voter outreach.
Manual digitization of 27,000 voter records would require:
- Estimated time: 450+ hours of manual data entry
- Error rate: 5-10% human transcription errors
- Cost: Significant labor costs and delays
Our solution: a fully automated pipeline that processes thousands of records with consistent accuracy.
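The manual-entry estimate can be sanity-checked with a little arithmetic. The sketch below uses only the figures quoted in this README (27,000 records, 450 hours, a 5-10% error rate); the variable names are illustrative.

```python
# Back-of-the-envelope check of the manual-entry estimate quoted in this README.
RECORDS = 27_000
MANUAL_HOURS = 450
ERROR_RATE_RANGE = (0.05, 0.10)  # 5-10% human transcription errors

# Pace implied by the estimate: total minutes divided by number of records.
minutes_per_record = MANUAL_HOURS * 60 / RECORDS

# Number of records that would contain an error at each end of the range.
expected_errors = tuple(round(RECORDS * rate) for rate in ERROR_RATE_RANGE)

print(f"Implied manual pace: {minutes_per_record:.1f} min per record")
print(f"Expected erroneous records: {expected_errors[0]:,} to {expected_errors[1]:,}")
```

In other words, the 450-hour figure corresponds to roughly one minute of typing per record, and the quoted error rate would leave somewhere between 1,350 and 2,700 records needing correction.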
- AI-Powered OCR: LLMWhisperer API for intelligent text extraction from scanned documents
- Cloud Integration: Seamless integration with cloud storage for image input
- Real-time Processing: n8n workflow automation triggers on new file uploads
- Dual Storage: Outputs to both Supabase database and CSV files
- Status Tracking: Google Sheets integration for real-time processing status
- Batch Processing: Handles multiple images simultaneously
- Error Handling: Automatic retry logic and failure notifications
- Scalable: Processes from single images to thousands without modification
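This README does not describe how the retry logic and failure notifications are implemented. As a rough illustration of the general pattern only, here is a minimal retry-with-backoff sketch; `with_retries`, its parameters, and the `on_failure` hook are all hypothetical names, not part of the actual workflow.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    on_failure: Callable[[Exception], None] = lambda exc: None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying with exponential backoff.

    On the final failed attempt, call on_failure (a notification hook)
    and re-raise the original exception.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts:
                on_failure(exc)
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

A caller would wrap a single OCR request in `with_retries` and pass a function that posts an alert as `on_failure`. The `sleep` parameter exists only so the delay can be replaced in tests.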
    ┌─────────────────┐
    │ Scanned Images  │
    │ (Cloud Storage) │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │  n8n Workflow   │
    │  Orchestrator   │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │  LLMWhisperer   │
    │     OCR API     │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ Data Processing │
    │  & Validation   │
    └────────┬────────┘
             │
        ┌────┴────┐
        ▼         ▼
    ┌────────┐ ┌──────────────┐
    │Supabase│ │Google Sheets │
    │Database│ │Status Logger │
    └───┬────┘ └──────────────┘
        │
        ▼
    ┌─────────┐
    │CSV Files│
    └─────────┘

| Component | Technology | Purpose |
|---|---|---|
| Automation | n8n | Workflow orchestration and integration |
| OCR Engine | LLMWhisperer | AI-powered text extraction from images |
| Database | Supabase | Structured data storage and querying |
| Logging | Google Sheets | Real-time status tracking and monitoring |
| Storage | Cloud Storage | Image hosting and file management |
| Output | CSV | Portable data format for analysis tools |
- Images Processed: 27,000+
- Polling Stations: 54
- Average Processing Time: 3-5 seconds per image
- Accuracy Rate: 95%+ (validated against sample manual entries)
- Automation Level: 100% (zero manual intervention required)
- Time Saved: 450+ hours vs manual entry
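The accuracy figure is described as validated against sample manual entries, but the method is not given. One plausible way to compute such a figure is a field-by-field comparison, sketched below. The function name and the exact-match rule (case-insensitive, whitespace-trimmed) are assumptions made for illustration.

```python
def field_accuracy(ocr_rows, manual_rows, fields):
    """Return the share of compared fields where OCR matches the manual entry.

    Rows are paired by position, so both lists must describe the same
    records in the same order.
    """
    total = 0
    matches = 0
    for ocr, manual in zip(ocr_rows, manual_rows):
        for field in fields:
            total += 1
            # Compare case-insensitively and ignore surrounding whitespace.
            if ocr.get(field, "").strip().lower() == manual.get(field, "").strip().lower():
                matches += 1
    return matches / total if total else 0.0


manual = [{"name": "John Doe Mukasa", "national_id": "CM12345678901234"}]
ocr = [{"name": "john doe mukasa ", "national_id": "CM12345678901234"}]
print(field_accuracy(ocr, manual, ["name", "national_id"]))
```

A stricter check could count a whole record as wrong if any field differs, which would give a lower figure than the field-level rate.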
Monitors cloud storage for new image uploads and initiates processing pipeline.
Trigger configuration:

    {
      "method": "POST",
      "path": "voter-upload",
      "responseMode": "onReceived"
    }

Fetches the uploaded image file from cloud storage for processing.
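To make the trigger concrete, the sketch below builds (but does not send) the kind of POST request an uploader could use to start the workflow. It assumes n8n's default production webhook prefix `/webhook/` in front of the `voter-upload` path; the host `n8n.example.com` and the payload fields `image_url` and `polling_station` are hypothetical, since the README does not specify the request body.

```python
import json
from urllib.request import Request


def build_upload_notification(base_url: str, image_url: str, polling_station: str) -> Request:
    """Build the POST request that would notify the workflow of a new image."""
    # Hypothetical payload shape; the real workflow may expect other fields.
    payload = {"image_url": image_url, "polling_station": polling_station}
    return Request(
        url=f"{base_url.rstrip('/')}/webhook/voter-upload",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_upload_notification(
    "https://n8n.example.com",
    "https://storage.url/image.jpg",
    "Station 01",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it. With `responseMode` set to `onReceived`, n8n replies as soon as the request arrives rather than waiting for the workflow to finish.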
Sends images to LLMWhisperer API for intelligent text extraction.
API request:

    {
      "endpoint": "https://api.llmwhisperer.com/v1/extract",
      "model": "advanced-ocr-v2",
      "language": "en",
      "output_format": "structured_json"
    }

Converts raw OCR output to structured CSV format with field validation.
Extracted Fields:
- Voter Name
- National ID Number
- Polling Station
- Village/Parish
- Registration Date
- Additional Demographics
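The conversion and validation step could look roughly like the sketch below. It is not the workflow's actual code: the validation rules, the national ID pattern (inferred only from the sample rows in this README), and the `needs_review` status value are assumptions made for illustration.

```python
import csv
import io
import re
from datetime import date

CSV_FIELDS = ["voter_name", "national_id", "polling_station", "village", "registration_date"]
# Assumed ID pattern: two letters followed by 14 letters or digits.
NATIONAL_ID_RE = re.compile(r"^[A-Z]{2}[0-9A-Z]{14}$")


def validate(record: dict) -> list[str]:
    """Return a list of validation problems (empty when the record is clean)."""
    problems = []
    if not record.get("voter_name", "").strip():
        problems.append("missing voter_name")
    if not NATIONAL_ID_RE.match(record.get("national_id", "")):
        problems.append("malformed national_id")
    try:
        date.fromisoformat(record.get("registration_date", ""))
    except ValueError:
        problems.append("invalid registration_date")
    return problems


def to_csv(records: list[dict]) -> str:
    """Serialize records as CSV, flagging invalid ones in the status column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_FIELDS + ["status"], lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = {field: record.get(field, "") for field in CSV_FIELDS}
        row["status"] = "needs_review" if validate(record) else "processed"
        writer.writerow(row)
    return buffer.getvalue()
```

Keeping invalid rows in the output with a review status, rather than dropping them, means no scanned entry is silently lost.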
Inserts structured records into Supabase for querying and analysis.

    -- Database Schema
    CREATE TABLE voter_records (
      id UUID PRIMARY KEY,
      name TEXT NOT NULL,
      national_id TEXT UNIQUE,
      polling_station TEXT,
      village TEXT,
      registration_date DATE,
      image_url TEXT,
      processed_at TIMESTAMP DEFAULT NOW(),
      status TEXT
    );

Updates Google Sheets with processing status, links, and timestamps.
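One consequence of the `UNIQUE` constraint on `national_id` in the schema is that a voter scanned twice cannot be stored twice. The sketch below demonstrates that behavior using Python's built-in `sqlite3` as a local stand-in for Supabase (which runs PostgreSQL). The table is trimmed to four columns and `insert_voter` is a hypothetical helper.

```python
import sqlite3

# In-memory SQLite database standing in for the Supabase table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE voter_records (
        id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        national_id TEXT UNIQUE,
        polling_station TEXT
    )
    """
)


def insert_voter(conn, record_id, name, national_id, polling_station):
    """Insert one record; return False if a constraint rejects it.

    A repeated national_id is the expected cause, but a repeated id or
    a missing name would be rejected in the same way.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO voter_records (id, name, national_id, polling_station) "
                "VALUES (?, ?, ?, ?)",
                (record_id, name, national_id, polling_station),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(insert_voter(conn, "1", "John Doe Mukasa", "CM12345678901234", "Station 01"))
print(insert_voter(conn, "2", "John Doe Mukasa", "CM12345678901234", "Station 01"))
```

The second call is rejected because the national ID already exists. A production workflow would need to decide whether such rows are skipped, logged, or used to update the existing record.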
CSV output:

    voter_name,national_id,polling_station,village,registration_date,status
    John Doe Mukasa,CM12345678901234,Station 01,Kiyanja,2023-08-15,processed
    Jane Mary Akello,CM98765432109876,Station 01,Kiyanja,2023-08-15,processed

Database record (JSON):

    {
      "id": "uuid-here",
      "name": "John Doe Mukasa",
      "national_id": "CM12345678901234",
      "polling_station": "Station 01",
      "village": "Kiyanja",
      "registration_date": "2023-08-15",
      "image_url": "https://storage.url/image.jpg",
      "processed_at": "2024-12-25T10:30:00Z",
      "status": "processed"
    }

- Campaign Planning: Identify voter distribution across polling stations
- Targeted Outreach: Generate contact lists for specific villages/parishes
- Data Analysis: Analyze voter registration patterns and demographics
- Database Modernization: Convert physical records to searchable digital format
- Compliance: Maintain accurate voter registration records
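As an illustration of the planning and outreach use cases, the sketch below reads the CSV output and derives a per-station count and a per-village name list. The column names come from the sample CSV in this README; the function names are hypothetical.

```python
import csv
import io
from collections import Counter


def voters_per_station(csv_text: str) -> Counter:
    """Count voter rows for each polling station."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["polling_station"] for row in reader)


def voters_in_village(csv_text: str, village: str) -> list[str]:
    """Return the names of voters registered in the given village."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["voter_name"] for row in reader if row["village"] == village]


sample = """voter_name,national_id,polling_station,village,registration_date,status
John Doe Mukasa,CM12345678901234,Station 01,Kiyanja,2023-08-15,processed
Jane Mary Akello,CM98765432109876,Station 01,Kiyanja,2023-08-15,processed
"""

print(voters_per_station(sample))
print(voters_in_village(sample, "Kiyanja"))
```

For the full dataset the same queries could be run in Supabase with a `GROUP BY polling_station`; the CSV route is convenient when working offline.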
- Data Encryption: All data encrypted in transit and at rest
- Access Control: Role-based access to voter information
- Audit Logging: Complete tracking of all data access and modifications
- GDPR Compliance: Adheres to data protection best practices
- Secure Storage: Images and data stored in secure cloud infrastructure
| Metric | Value |
|---|---|
| Processing Speed | 3-5 seconds per image |
| Concurrent Processing | Up to 10 images simultaneously |
| Daily Throughput | 5,000+ images/day |
| Error Rate | <5% |
| System Uptime | 99.5% |
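The figures in this table can be related to the 27,000-image workload with simple arithmetic. The sketch below uses only numbers quoted in this README. The ideal-case estimate assumes all ten concurrent slots stay busy and ignores upload time and API rate limits, so it is a lower bound rather than a measured result.

```python
import math

IMAGES = 27_000
DAILY_THROUGHPUT = 5_000    # images per day, from the table
SECONDS_PER_IMAGE = (3, 5)  # processing time range, from the table
CONCURRENCY = 10            # images processed simultaneously, from the table

# Whole days needed at the stated daily throughput.
days_at_stated_rate = math.ceil(IMAGES / DAILY_THROUGHPUT)

# Pure processing time in hours if every concurrent slot stayed busy.
ideal_hours = tuple(IMAGES * s / CONCURRENCY / 3600 for s in SECONDS_PER_IMAGE)

print(f"Days at stated throughput: {days_at_stated_rate}")
print(f"Ideal processing time: {ideal_hours[0]:.2f} to {ideal_hours[1]:.2f} hours")
```

The gap between the two results suggests the stated daily throughput is a conservative figure that leaves room for uploads, retries, and idle periods.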
- Multi-language Support: Extend OCR to handle documents in Runyankole/Rukiga
- Image Quality Enhancement: Pre-process low-quality scans before OCR
- Duplicate Detection: Automatic identification of duplicate voter records
- Web Dashboard: Real-time monitoring interface for processing status
- Mobile App Integration: Direct image capture and upload from mobile devices
- Advanced Analytics: Built-in demographic analysis and reporting
- API Endpoints: RESTful API for external system integration
This is a private project for Hon. Dan Musinguzi Nabaasa's campaign. For inquiries about similar implementations, please contact the project maintainer.
MIT License - see LICENSE file for details
Cephas Nzaana (Otaremwa Turihaihi)
- Campaign Manager & IT Specialist
- Hon. Dan Musinguzi Nabaasa's MP Campaign
- Kabale Municipality, Uganda
- LLMWhisperer: For providing powerful OCR API capabilities
- n8n Community: For extensive automation documentation and support
- Campaign Team: For providing requirements and validation feedback
Project Timeline: October 2025 - December 2025
Status: Active Production
Next Deployment: Continuous updates based on campaign needs
Built with ❤️ for democratic participation in Kabale Municipality