OCR for scanned PDFs #10

Open

Labels

enhancement, help wanted

Add OCR-based text extraction for scanned PDFs using pdf2image and pytesseract. Detect which pages are scanned (image-only pages with no usable text layer) and extract their text with OCR.

- Implement scanned page detection
- Integrate OCR extraction
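
A rough sketch of how the two tasks could fit together, as a starting point rather than the intended design. It assumes the project already produces per-page text from some existing extractor. The names `looks_scanned`, `ocr_page`, `extract_with_ocr_fallback`, and the `MIN_TEXT_CHARS` threshold are hypothetical and not part of the codebase. The only library calls used are `pdf2image.convert_from_path` and `pytesseract.image_to_string`, which need the Poppler and Tesseract binaries installed.

```python
from typing import List, Optional

# Hypothetical threshold: pages with fewer visible characters than this
# are treated as scanned. The right value depends on the documents.
MIN_TEXT_CHARS = 20


def looks_scanned(page_text: Optional[str], min_chars: int = MIN_TEXT_CHARS) -> bool:
    """Heuristic: a page with almost no extractable text is probably an image."""
    if not page_text:
        return True
    visible = "".join(page_text.split())  # drop all whitespace
    return len(visible) < min_chars


def ocr_page(pdf_path: str, page_number: int, dpi: int = 300, lang: str = "eng") -> str:
    """Render one page (1-based) to an image and run Tesseract on it."""
    # Imported lazily so detection can be used without the OCR dependencies.
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path(
        pdf_path, dpi=dpi, first_page=page_number, last_page=page_number
    )
    return "\n".join(pytesseract.image_to_string(img, lang=lang) for img in images)


def extract_with_ocr_fallback(pdf_path: str, page_texts: List[Optional[str]]) -> List[str]:
    """Keep existing text where present; fall back to OCR for scanned pages."""
    results = []
    for number, text in enumerate(page_texts, start=1):
        if looks_scanned(text):
            results.append(ocr_page(pdf_path, number))
        else:
            results.append(text)
    return results
```

A character-count threshold is only one possible detection signal. It can misclassify pages that are nearly blank or that mix a small text layer with a large scanned image, so it may need tuning or a second check.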
