Production-ready REST API for Optical Character Recognition (OCR) with support for the Yoruba, Igbo, and Hausa languages and Naira symbol (₦) recognition.
- Multi-language OCR: Supports English, Yoruba, Igbo, and Hausa
- Naira Symbol Recognition: Accurately detects and counts ₦ symbols
- Image Preprocessing: Automatic greyscale conversion, DPI normalization, and contrast enhancement
- Worker Pool: Efficient Tesseract.js worker pool management with auto-scaling
- Production Ready: Rate limiting, error handling, logging, health checks, and metrics
- Docker Support: Multi-stage Docker build with language pack caching
- Prometheus Metrics: Built-in metrics endpoint for monitoring
- Fully Tested: Unit and integration tests with Jest
- Node.js 20+ with TypeScript (strict mode)
- Express.js - Web framework
- Tesseract.js - OCR engine
- Sharp - Image processing
- Pino - Structured logging
- Jest - Testing framework
- Docker - Containerization
- Node.js 20 or higher
- npm or yarn
- Docker (optional, for containerized deployment)
- Clone the repository:
git clone <repository-url>
cd ocr-api
- Install dependencies:
npm install
- Create a `.env` file from `.env.example`:
cp .env.example .env
- Download Tesseract.js language packs:
mkdir -p langs
cd langs
wget https://tessdata.projectnaptha.com/4.0.0_fast/eng.traineddata.gz
wget https://tessdata.projectnaptha.com/4.0.0_fast/yor.traineddata.gz
wget https://tessdata.projectnaptha.com/4.0.0_fast/ibo.traineddata.gz
wget https://tessdata.projectnaptha.com/4.0.0_fast/hau.traineddata.gz
gunzip *.gz
cd ..
- Build the project:
npm run build
- Start the server:
npm start
Or for development with hot-reload:
npm run dev
The API will be available at http://localhost:3000
- Build the Docker image:
npm run docker:build
- Start the container:
npm run docker:up
The Dockerfile automatically downloads language packs during the build process.
| Variable | Description | Default |
|---|---|---|
| `NODE_ENV` | Environment (development/production) | development |
| `PORT` | Server port | 3000 |
| `WORKER_POOL_SIZE` | Number of Tesseract workers | 4 |
| `MAX_IMAGE_SIZE` | Maximum image size in bytes | 5242880 (5 MB) |
| `RATE_LIMIT_WINDOW_MS` | Rate limit window in milliseconds | 60000 (1 minute) |
| `RATE_LIMIT_MAX` | Maximum requests per window | 60 |
| `LOG_LEVEL` | Logging level (debug/info/warn/error) | info |
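As a rough illustration of how these variables and their defaults fit together, here is a minimal sketch of a config loader. The names `AppConfig`, `intFromEnv`, and `loadConfig` are hypothetical and not taken from the project; only the variable names and defaults come from the table above.

```typescript
// Hypothetical config loader; only the variable names and defaults
// are taken from the table above.
interface AppConfig {
  nodeEnv: string;
  port: number;
  workerPoolSize: number;
  maxImageSize: number;
  rateLimitWindowMs: number;
  rateLimitMax: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back to the documented default when
// the variable is unset, empty, or not a valid positive integer.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Env): AppConfig {
  return {
    nodeEnv: env["NODE_ENV"] ?? "development",
    port: intFromEnv(env, "PORT", 3000),
    workerPoolSize: intFromEnv(env, "WORKER_POOL_SIZE", 4),
    maxImageSize: intFromEnv(env, "MAX_IMAGE_SIZE", 5 * 1024 * 1024),
    rateLimitWindowMs: intFromEnv(env, "RATE_LIMIT_WINDOW_MS", 60000),
    rateLimitMax: intFromEnv(env, "RATE_LIMIT_MAX", 60),
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.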
Perform OCR on an uploaded image (`/api/v1/ocr`).
Request:
- Method: `POST`
- Content-Type: `multipart/form-data`
- Body:
  - `image` (required): Image file (PNG, JPG, JPEG), max 5 MB
  - `language` (optional): Language hint for better accuracy
    - Single language: `eng`, `yor`, `ibo`, `hau`
    - Multiple languages: `eng+yor`, `yor+ibo`, etc.
    - If omitted, uses all available languages
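The language hint format described above (single codes, or several joined with `+`) can be validated with a small helper. This is a minimal sketch, not the project's actual validation code; `parseLanguageHint` and `isLanguageCode` are hypothetical names.

```typescript
// Language codes listed in this README; the helpers are hypothetical.
const SUPPORTED_LANGUAGES = ["eng", "yor", "ibo", "hau"] as const;
type LanguageCode = (typeof SUPPORTED_LANGUAGES)[number];

function isLanguageCode(value: string): value is LanguageCode {
  return (SUPPORTED_LANGUAGES as readonly string[]).indexOf(value) !== -1;
}

// Turn a hint such as "eng+yor" into a validated, de-duplicated list.
// An omitted hint falls back to all supported languages.
// Returns null when any part of the hint is not a supported code.
function parseLanguageHint(hint?: string): LanguageCode[] | null {
  if (hint === undefined || hint.trim() === "") {
    return SUPPORTED_LANGUAGES.slice();
  }
  const result: LanguageCode[] = [];
  for (const part of hint.split("+")) {
    const code = part.trim().toLowerCase();
    if (!isLanguageCode(code)) return null;
    if (result.indexOf(code) === -1) result.push(code);
  }
  return result;
}
```

One caveat: many query-string parsers decode a literal `+` as a space, so a hint sent as `?language=eng+yor` may reach the server as `eng yor`. Whether this API compensates for that is not stated here.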
Response (200 OK):
{
  "success": true,
  "data": {
    "text": "₦5 000 for garri",
    "confidence": 92.3,
    "language": "yor",
    "nairaCount": 1
  }
}
Best Practices for High Confidence:
- Specify language: Use `?language=yor` for Yoruba text, `?language=hau` for Hausa, etc.
- Single language is better: `eng` gives higher confidence than `eng+yor+ibo+hau`
- Image quality: Use high-resolution images (minimum 300 DPI), clear text, good lighting
- Avoid compression: PNG is better than heavily compressed JPEG
Error Responses:
- `400` - Bad Request (no image, invalid image)
- `413` - Payload Too Large (file exceeds size limit)
- `415` - Unsupported Media Type (invalid file type)
- `429` - Too Many Requests (rate limit exceeded)
- `500` - Internal Server Error
Example with cURL:
# Basic OCR (auto-detect language)
curl -X POST http://localhost:3000/api/v1/ocr \
  -F "image=@receipt.jpg"
# OCR with language hint (recommended for better accuracy)
curl -X POST "http://localhost:3000/api/v1/ocr?language=yor" \
  -F "image=@receipt.jpg"
# OCR with multiple languages
curl -X POST "http://localhost:3000/api/v1/ocr?language=eng+yor" \
  -F "image=@receipt.jpg"
Example with JavaScript (fetch):
const formData = new FormData();
formData.append('image', fileInput.files[0]);
// With language hint for better accuracy
const language = 'yor'; // or 'eng', 'ibo', 'hau', 'eng+yor', etc.
const response = await fetch(`http://localhost:3000/api/v1/ocr?language=${language}`, {
method: 'POST',
body: formData
});
const result = await response.json();
console.log(result);
// { success: true, data: { text: "...", confidence: 95.2, language: "yor", nairaCount: 3 } }
Health check endpoint (`/health`).
Response (200 OK):
{
  "status": "ok",
  "uptime": 1234.56,
  "version": "1.0.0",
  "workers": {
    "total": 4,
    "inUse": 1,
    "available": 3
  }
}
Prometheus-format metrics endpoint (`/metrics`).
Response (200 OK):
# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total 42
# HELP tesseract_workers_total Total number of Tesseract workers
# TYPE tesseract_workers_total gauge
tesseract_workers_total 4
...
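For readers unfamiliar with the output above, each metric in the Prometheus text format is a `# HELP` line, a `# TYPE` line, and a `name value` sample. The sketch below renders that layout by hand purely for illustration. `Metric` and `renderMetrics` are hypothetical names, and the service itself may well rely on a metrics library instead.

```typescript
type MetricType = "counter" | "gauge";

interface Metric {
  name: string;
  help: string;
  type: MetricType;
  value: number;
}

// Render metrics in the Prometheus text exposition layout:
// a HELP line, a TYPE line, then "<name> <value>" for each metric.
function renderMetrics(metrics: Metric[]): string {
  const lines: string[] = [];
  for (const m of metrics) {
    lines.push(`# HELP ${m.name} ${m.help}`);
    lines.push(`# TYPE ${m.name} ${m.type}`);
    lines.push(`${m.name} ${m.value}`);
  }
  return lines.join("\n") + "\n";
}

const exampleMetricsOutput = renderMetrics([
  { name: "http_requests_total", help: "Total number of HTTP requests", type: "counter", value: 42 },
  { name: "tesseract_workers_total", help: "Total number of Tesseract workers", type: "gauge", value: 4 },
]);
```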
Run all tests:
npm test
Run tests in watch mode:
npm run test:watch
Run tests with coverage:
npm test -- --coverage
- `tests/nairaCounter.test.ts` - Unit tests for Naira symbol counting
- `tests/ocr.test.ts` - Integration tests for OCR API endpoints
| Script | Description |
|---|---|
| `npm run dev` | Start development server with hot-reload |
| `npm run build` | Compile TypeScript to JavaScript |
| `npm start` | Start production server |
| `npm test` | Run tests with coverage |
| `npm run lint` | Run ESLint |
| `npm run lint:fix` | Fix ESLint errors |
| `npm run format` | Format code with Prettier |
| `npm run docker:build` | Build Docker image |
| `npm run docker:up` | Start Docker container |
| `npm run docker:down` | Stop Docker container |
- TypeScript: Strict mode enabled
- ESLint: Configured with TypeScript rules
- Prettier: Code formatting
- Jest: Unit and integration testing
src/
├── config/         # Configuration (logger, Tesseract)
├── controllers/    # Request handlers
├── middleware/     # Express middleware (error handling, validation, rate limiting)
├── routes/         # API routes
├── services/       # Business logic (OCR service)
├── utils/          # Utilities (image preprocessing, Naira counter)
├── app.ts          # Express app setup
└── server.ts       # Server entry point
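The Naira counter listed under `utils/` is what produces the `nairaCount` field in OCR responses. Its source is not shown in this README, so the following is only a minimal sketch of the idea, with a hypothetical function name, assuming it counts occurrences of the Naira sign (U+20A6) in the recognized text.

```typescript
// Hypothetical counter: counts every occurrence of the Naira sign (U+20A6).
const NAIRA_SIGN = "\u20A6";

function countNairaSymbols(text: string): number {
  // Splitting on the sign yields one more piece than there are signs.
  return text.split(NAIRA_SIGN).length - 1;
}
```

For the sample response text `₦5 000 for garri` this returns 1, matching the `nairaCount` shown in the example response. How the project treats look-alike characters that OCR might produce instead of the sign is not documented here.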
The API automatically preprocesses images for optimal OCR accuracy:
- Greyscale conversion - Reduces noise and improves text recognition
- DPI normalization - Resizes to 300 DPI equivalent
- Contrast normalization - Enhances text visibility
- Sharpening - Improves edge detection
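To make the first three steps concrete, here is a library-free sketch of the arithmetic behind them, operating on raw pixel bytes. The project performs these steps with Sharp, so this is not its code. The function names, the Rec. 601 luma weights, and the min-max stretch are all assumptions chosen for illustration.

```typescript
// Convert interleaved RGB bytes to one luma byte per pixel.
// Assumption: Rec. 601 weights; the library the project uses may differ.
function toGreyscale(rgb: Uint8Array): Uint8Array {
  const grey = new Uint8Array(Math.floor(rgb.length / 3));
  for (let i = 0; i < grey.length; i++) {
    const r = rgb[i * 3];
    const g = rgb[i * 3 + 1];
    const b = rgb[i * 3 + 2];
    grey[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
  }
  return grey;
}

// Stretch values linearly so the darkest pixel becomes 0 and the brightest 255.
function stretchContrast(grey: Uint8Array): Uint8Array {
  let min = 255;
  let max = 0;
  for (let i = 0; i < grey.length; i++) {
    if (grey[i] < min) min = grey[i];
    if (grey[i] > max) max = grey[i];
  }
  const out = new Uint8Array(grey.length);
  if (max === min) {
    out.set(grey); // flat image: nothing to stretch
    return out;
  }
  for (let i = 0; i < grey.length; i++) {
    out[i] = Math.round(((grey[i] - min) / (max - min)) * 255);
  }
  return out;
}

// Scale factor that brings an image captured at `sourceDpi` to the target DPI.
function scaleForTargetDpi(sourceDpi: number, targetDpi: number = 300): number {
  return targetDpi / sourceDpi;
}
```

For example, an image captured at 150 DPI would be enlarged by a factor of 2 to reach the 300 DPI equivalent.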
- Configurable pool size (default: 4 workers)
- Automatic worker initialization
- Idle timeout (30 seconds) with auto-reinitialization
- Request queuing when all workers are busy
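The queuing behaviour in the list above can be sketched as a small generic pool. This is an illustrative sketch rather than the project's implementation: `WorkerPool` and its methods are hypothetical, `W` stands in for a Tesseract.js worker, and the idle timeout with re-initialization is left out.

```typescript
// Hypothetical fixed-size pool: hands out idle workers and queues callers
// in FIFO order when every worker is busy.
class WorkerPool<W> {
  private idle: W[];
  private waiting: Array<(worker: W) => void> = [];
  private readonly total: number;

  constructor(workers: W[]) {
    this.idle = workers.slice();
    this.total = workers.length;
  }

  // Resolve immediately if a worker is idle; otherwise wait in the queue.
  acquire(): Promise<W> {
    const worker = this.idle.pop();
    if (worker !== undefined) return Promise.resolve(worker);
    return new Promise<W>((resolve) => {
      this.waiting.push(resolve);
    });
  }

  // Hand the worker straight to the next waiter, or mark it idle again.
  release(worker: W): void {
    const next = this.waiting.shift();
    if (next !== undefined) next(worker);
    else this.idle.push(worker);
  }

  // Run a task with a pooled worker and always return the worker afterwards.
  async run<T>(task: (worker: W) => Promise<T>): Promise<T> {
    const worker = await this.acquire();
    try {
      return await task(worker);
    } finally {
      this.release(worker);
    }
  }

  // total/inUse/available mirror the health response; `queued` is an extra.
  stats(): { total: number; inUse: number; available: number; queued: number } {
    return {
      total: this.total,
      inUse: this.total - this.idle.length,
      available: this.idle.length,
      queued: this.waiting.length,
    };
  }
}
```

A request handler would wrap each OCR call in `pool.run(...)`, so that a worker is returned to the pool even when recognition throws.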
- Default: 60 requests per minute per IP address
- Configurable via environment variables
- Returns `429 Too Many Requests` when exceeded
- Includes rate limit headers in responses
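The limits above can be illustrated with a simple in-memory, fixed-window counter keyed by IP address. The README does not say which algorithm or middleware the API uses, so the fixed-window approach and the name `FixedWindowRateLimiter` are assumptions made for this sketch.

```typescript
interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetAt: number; // epoch milliseconds when the current window ends
}

// Hypothetical in-memory fixed-window limiter keyed by client IP.
class FixedWindowRateLimiter {
  private windows = new Map<string, { start: number; count: number }>();

  constructor(
    private readonly windowMs: number = 60000,
    private readonly max: number = 60,
  ) {}

  // `now` is injectable so the behaviour can be tested without real clocks.
  check(ip: string, now: number = Date.now()): RateLimitResult {
    let entry = this.windows.get(ip);
    if (entry === undefined || now - entry.start >= this.windowMs) {
      entry = { start: now, count: 0 }; // start a fresh window
      this.windows.set(ip, entry);
    }
    const resetAt = entry.start + this.windowMs;
    if (entry.count >= this.max) {
      return { allowed: false, remaining: 0, resetAt }; // caller answers 429
    }
    entry.count += 1;
    return { allowed: true, remaining: this.max - entry.count, resetAt };
  }
}
```

Values like `remaining` and `resetAt` are the kind of information usually surfaced in rate limit headers. The exact header names this API sends are not listed here.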
Monitor service health via /health endpoint.
Prometheus-format metrics available at /metrics:
- HTTP request count and duration
- Tesseract worker pool statistics
- Request duration percentiles (p50, p95)
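As a reminder of what p50 and p95 mean, the sketch below computes a nearest-rank percentile over a list of request durations. How the service derives its own percentile metrics is not documented here, so treat `percentile` as a hypothetical helper for interpreting the numbers, not as the service's method.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Hypothetical request durations in milliseconds.
const exampleDurationsMs = [100, 20, 50, 10, 90, 30, 70, 40, 60, 80];
const exampleP50 = percentile(exampleDurationsMs, 50);
const exampleP95 = percentile(exampleDurationsMs, 95);
```

A p95 far above the p50 usually points to a slow tail, for example requests that waited in the worker queue.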
If you see errors about missing language packs:
- Ensure the `langs/` directory exists in the project root
- Verify language pack files are present:
  - `eng.traineddata`
  - `yor.traineddata`
  - `ibo.traineddata`
  - `hau.traineddata`
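The check above can be automated with a few lines. This is a minimal sketch with a hypothetical helper name. It takes the list of file names found in `langs/` and reports which of the four expected packs are absent.

```typescript
// Language packs this README expects to find in langs/.
const REQUIRED_PACKS = ["eng", "yor", "ibo", "hau"];

// Given the file names found in langs/, list the packs that are missing.
// Files that are still compressed (.gz) do not count as present.
function findMissingPacks(fileNames: string[]): string[] {
  return REQUIRED_PACKS
    .map((lang) => `${lang}.traineddata`)
    .filter((file) => fileNames.indexOf(file) === -1);
}
```

In Node.js the input could come from `fs.readdirSync("langs")`. A pack that shows up only as `.traineddata.gz` suggests the `gunzip` step from the installation instructions was skipped.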
For best results:
- Specify the language: Use `?language=yor` for Yoruba text instead of auto-detect
- Use single language when possible: `eng` is more accurate than `eng+yor+ibo+hau`
- Image quality matters:
  - Use high-resolution images (minimum 300 DPI)
  - Ensure clear, well-lit images
  - Avoid heavily compressed JPEGs (use PNG when possible)
  - Ensure text is not rotated or skewed
- For receipts: Text should be horizontal and clearly visible
- Diacritics: The API now supports Yoruba/Igbo/Hausa diacritics (ọ, ṣ, ụ, etc.) for better accuracy
- Increase `WORKER_POOL_SIZE` for higher concurrency
- Monitor worker stats via the `/health` endpoint
- Check logs for worker initialization errors
MIT
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
For issues and questions, please open an issue on GitHub.