
Tesseract OCR Integration #10

@jsnapoli1

Description

Integrate the Tesseract OCR engine into the daemon so that it consumes preprocessed frames and outputs the detected text as strings. The integration includes per-word confidence thresholding and handling of partial or noisy results.

Definition of Done

[ ] Tesseract C API (TessBaseAPI) initialized with English language data
[ ] Accepts a preprocessed image buffer and returns extracted text as a string
[ ] Confidence score is retrieved per-word; words below a configurable threshold are discarded
[ ] Handles edge cases: blank page (no text detected), very small text, mixed fonts
[ ] Tested against at least 10 sample images with ground truth; accuracy results documented
[ ] Average OCR processing time per frame documented (target: under 3 seconds on RPi4)
[ ] Source committed as ocr.c / ocr.h in the app repo
[ ] Memory is properly freed after each OCR pass (no leaks confirmed via manual review or valgrind)
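
The per-word thresholding item above could be sketched roughly as follows. This is a minimal sketch, not the planned ocr.c: the `ocr_word_t` struct and the `ocr_filter_words` function are hypothetical names introduced only for illustration, and the block deliberately leaves out the Tesseract calls so that it compiles on its own. It assumes confidences are on the 0 to 100 scale that Tesseract reports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical word record (not part of Tesseract). In a real pass each
 * entry would be filled from the engine's word-level results. */
typedef struct {
    const char *text;   /* UTF-8 word text */
    float confidence;   /* assumed 0..100 scale */
} ocr_word_t;

/* Joins every word whose confidence is >= min_conf into out, separated by
 * single spaces. Returns the number of words kept, or -1 if the arguments
 * are invalid or out is too small. A blank page (count == 0) yields an
 * empty string and a return value of 0. */
int ocr_filter_words(const ocr_word_t *words, size_t count, float min_conf,
                     char *out, size_t out_size)
{
    if (out == NULL || out_size == 0 || (words == NULL && count > 0))
        return -1;

    size_t used = 0;
    int kept = 0;
    out[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        const char *text = words[i].text;

        /* Skip empty entries and anything below the configured threshold. */
        if (text == NULL || text[0] == '\0' || words[i].confidence < min_conf)
            continue;

        size_t len = strlen(text);
        size_t sep = (kept > 0) ? 1 : 0;

        /* Need room for separator, word, and the terminating NUL. */
        if (used + sep + len + 1 > out_size) {
            out[0] = '\0';
            return -1;
        }
        if (sep)
            out[used++] = ' ';
        memcpy(out + used, text, len);
        used += len;
        out[used] = '\0';
        kept++;
    }

    assert(used < out_size);  /* invariant: the NUL terminator always fits */
    return kept;
}
```

In the actual integration, the word and confidence pairs would presumably come from the Tesseract C API result iterator (`TessBaseAPIGetIterator`, then `TessResultIteratorGetUTF8Text` and `TessResultIteratorConfidence` at the `RIL_WORD` level), with each returned string released via `TessDeleteText`. Keeping the filter separate from the engine calls is one possible design choice: it lets the threshold logic and the blank-page case be unit tested without language data or sample images.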
