RapidOCR RunPod CPU Serverless

Ultra-fast OCR deployment on RunPod CPU instances with pre-bundled PP-OCRv5 models.

Features

  • CPU-optimized: Runs on cheap CPU instances ($0.10-0.20/hour)
  • Latest PP-OCRv5: Better accuracy and speed than v4
  • Tiny bundled models: Only 21MB total (included in repo!)
  • Instant cold starts: <2 seconds (models pre-bundled)
  • Multi-threaded: OpenMP/MKL optimizations for CPU performance
  • Fast inference: ~1-1.5 seconds per page on 8-core CPU
  • 90+ languages: Supports English, Chinese, Japanese, Korean, etc.
  • No downloads: Models bundled in Docker image

Deployment Options

This repo supports two deployment modes:

  1. Serverless (Queue) - For async job processing with RunPod's queue system
  2. Load Balancer (HTTP) - For synchronous HTTP requests with auto-scaling

Option 1: Serverless (Queue) Deployment

1. Build on RunPod

# Login to RunPod
runpod login

# Build image (uses RunPod's fast build servers)
runpod build \
  --repo https://github.com/YOUR_USERNAME/rapidocr-runpod-cpu.git \
  --branch main \
  --dockerfile Dockerfile \
  --tag latest \
  --public

2. Create Serverless Endpoint

  1. Go to https://www.runpod.io/console/serverless
  2. Click "New Endpoint"
  3. Select your built image
  4. Choose CPU instance: 8-16 vCPU recommended
  5. Set Active Workers = 1 (or more for scaling)
  6. Deploy!

3. Test Serverless Endpoint

# Set environment variables
export RUNPOD_API_KEY="your_api_key"
export RUNPOD_ENDPOINT_ID="your_endpoint_id"

# Install dependencies
pip install requests pillow pdf2image

# Run batch OCR
python3 batch_ocr.py sample.pdf --max-workers 5
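
batch_ocr.py handles the fan-out for you; as a rough illustration of the same idea, the sketch below submits pages to the queue endpoint concurrently. It assumes the two environment variables above are set and that pages are already PNG-encoded (e.g. via pdf2image):

import base64
import os
from concurrent.futures import ThreadPoolExecutor

import requests

API_KEY = os.environ["RUNPOD_API_KEY"]
ENDPOINT_ID = os.environ["RUNPOD_ENDPOINT_ID"]
RUN_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/run"

def submit_page(png_bytes: bytes) -> str:
    """Queue one page for OCR and return the RunPod job ID."""
    payload = {"input": {"images": [base64.b64encode(png_bytes).decode()]}}
    r = requests.post(RUN_URL, json=payload,
                      headers={"Authorization": f"Bearer {API_KEY}"}, timeout=30)
    r.raise_for_status()
    return r.json()["id"]

def submit_all(pages: list[bytes], max_workers: int = 5) -> list[str]:
    """Submit all pages concurrently, mirroring the --max-workers flag."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(submit_page, pages))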

Option 2: Load Balancer (HTTP) Deployment

1. Build Load Balancer Image

# Build with Load Balancer Dockerfile
runpod build \
  --repo https://github.com/YOUR_USERNAME/rapidocr-runpod-cpu.git \
  --branch main \
  --dockerfile Dockerfile.loadbalancer \
  --tag loadbalancer \
  --public

2. Create Load Balancer Endpoint

  1. Go to https://www.runpod.io/console/serverless
  2. Click "New Endpoint"
  3. Select your built image (with loadbalancer tag)
  4. Choose CPU instance: 8-16 vCPU recommended
  5. Endpoint Type: Load Balancer (not Queue!)
  6. Set Active Workers = 1 (or more for auto-scaling)
  7. Deploy!

3. Test Load Balancer Endpoint

# Set endpoint URL
export RAPIDOCR_ENDPOINT_URL="https://your-endpoint-id.runpod.net"

# Install dependencies
pip install requests pillow pdf2image

# Run test
python3 test_loadbalancer.py sample.pdf
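
test_loadbalancer.py wraps the request details; if you want to hit the endpoint by hand, a minimal sketch follows. The /ocr route and payload shape here are assumptions for illustration only -- the real route is whatever the HTTP server in Dockerfile.loadbalancer exposes, so check that script first.

import base64
import os

import requests

url = os.environ["RAPIDOCR_ENDPOINT_URL"]

with open("page.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

# NOTE: "/ocr" is a placeholder route; use the path defined by the
# load-balancer server bundled in this repo.
resp = requests.post(f"{url}/ocr", json={"images": [img_b64]}, timeout=60)
resp.raise_for_status()
print(resp.json())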

API Usage

Single Image

import base64
import os

import requests

RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]

# Read image
with open("image.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

# Call API (replace YOUR_ENDPOINT with your endpoint ID)
response = requests.post(
    "https://api.runpod.ai/v2/YOUR_ENDPOINT/run",
    headers={"Authorization": f"Bearer {RUNPOD_API_KEY}"},
    json={"input": {"images": [img_b64]}}
)

result = response.json()
print(result)
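
/run only queues the job and returns a job ID; the OCR output has to be fetched from RunPod's status endpoint once the worker finishes (or use the /runsync route for small, synchronous requests). Continuing from the snippet above:

import time

job_id = result["id"]
status_url = f"https://api.runpod.ai/v2/YOUR_ENDPOINT/status/{job_id}"

# Poll until the queued job finishes; "output" holds the handler's response
while True:
    status = requests.get(
        status_url,
        headers={"Authorization": f"Bearer {RUNPOD_API_KEY}"},
    ).json()
    if status["status"] in ("COMPLETED", "FAILED"):
        break
    time.sleep(1)

print(status.get("output"))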

Response Format

{
  "success": true,
  "results": [
    {
      "text_lines": [
        {
          "text": "Hello World",
          "confidence": 0.98,
          "bbox": {"x": 10, "y": 20, "width": 100, "height": 30},
          "polygon": [[10, 20], [110, 20], [110, 50], [10, 50]]
        }
      ],
      "image_index": 0,
      "total_lines": 1
    }
  ]
}
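
To turn a response into plain text, join the text_lines per image. A small helper assuming the schema above:

def extract_text(response: dict) -> list[str]:
    """Return one plain-text string per input image, in input order."""
    pages = []
    for page in response.get("results", []):
        lines = [line["text"] for line in page.get("text_lines", [])]
        pages.append("\n".join(lines))
    return pages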

Performance

CPU Instance Recommendations

vCPUs   RAM    Cost/hr   Pages/sec   Use Case
4       8GB    ~$0.10    0.3-0.5     Light load
8       16GB   ~$0.15    0.5-0.8     Recommended
16      32GB   ~$0.25    0.8-1.2     Heavy load
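
The cost-per-page numbers follow directly from throughput and hourly price; for example, at the conservative end of the 8 vCPU tier:

# Rough cost estimate for the recommended 8 vCPU tier
cost_per_hour = 0.15   # ~$/hr
pages_per_sec = 0.5    # conservative end of the 0.5-0.8 range

hours_per_1000_pages = (1000 / pages_per_sec) / 3600      # ~0.56 h
print(f"~${cost_per_hour * hours_per_1000_pages:.2f} per 1000 pages")  # ~$0.08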

Benchmark (8 vCPU)

  • Single page: ~1.5-2s
  • 100 pages: ~3-4 minutes
  • Cold start: <3 seconds (models pre-cached)

Cost Comparison

Solution            Hardware         Cost/1000 pages   Speed
RapidOCR CPU        8 vCPU           $0.08             1.5s/page
Surya               A100 80GB GPU    $3.20             0.5s/page
Google Vision API   Cloud            $1.50             1s/page

Models

Pre-bundled PP-OCRv5 models (included in /models/):

  • Detection: PP-OCRv5 mobile (4.6MB)
  • Recognition: PP-OCRv5 mobile (16MB)
  • Classification: Mobile v2.0 (0.5MB) - disabled for speed

Total: 21MB (bundled in Docker image, no download needed!)
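
To exercise the bundled models locally (outside the worker), something along these lines should work with rapidocr_onnxruntime. The model filenames below are placeholders and keyword names vary between RapidOCR releases, so treat this as a sketch and check this repo's handler for the exact configuration it uses:

from rapidocr_onnxruntime import RapidOCR

# Placeholder paths -- point these at the files under /models/ in the image
engine = RapidOCR(
    det_model_path="/models/PP-OCRv5_mobile_det.onnx",
    rec_model_path="/models/PP-OCRv5_mobile_rec.onnx",
)

result, elapse = engine("image.png")  # result: list of (box, text, score)
print(result)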

Deployment Notes

Environment Variables

Set in Dockerfile (already configured):

OMP_NUM_THREADS=16       # OpenMP threads
MKL_NUM_THREADS=16       # Intel MKL threads
OPENBLAS_NUM_THREADS=16  # OpenBLAS threads
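
If you override these (e.g. to match a different vCPU count), they typically need to be in place before the numerical libraries spin up their thread pools. A sketch of setting them from Python ahead of importing the OCR stack:

import os

# Match thread counts to the instance's vCPU count *before* importing
# onnxruntime/numpy, which generally pick these up at initialization
vcpus = str(os.cpu_count() or 8)
for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
    os.environ.setdefault(var, vcpus)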

Scaling

  • Horizontal: Increase Active Workers for concurrent requests
  • Vertical: Use larger CPU instances (16-32 vCPUs)
  • Auto-scale: Set min/max workers in RunPod UI

Troubleshooting

Models not pre-cached?

Check build logs for "Pre-downloading RapidOCR models..." step.

Slow performance?

  • Increase vCPUs (8+ recommended)
  • Check OMP_NUM_THREADS matches vCPU count
  • Verify models loaded (check worker logs)

Out of memory?

  • Use a smaller instance or reduce max_workers
  • Process images in smaller batches (see the sketch below)
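
One simple way to bound memory is to cap how many images go into each request; the per-batch payload is the same as in the API Usage section above:

def batches(images: list, batch_size: int = 4):
    """Split the page list into small groups so each request stays small."""
    for i in range(0, len(images), batch_size):
        yield images[i:i + batch_size]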

License

MIT

Credits
