Data Files Contributions
ameera3 edited this page Jul 10, 2019 · 16 revisions
This page lists repositories with Tesseract 4 compatible tessdata (for `--oem 1`, the LSTM engine) contributed by the Tesseract community.
Such tessdata contributions should ideally document everything needed to reproduce the training process (fonts, images, ground truth, texts, scripts, documentation, ...).
| Language Code | Language | Data File | Contributor | Info |
|---|---|---|---|---|
| khmLimon | Khmer | best | OpenInstituteCambodia/phyrumsk | PR in tessdata_best |
| cop | Coptic | best | shreeshrii/tessdata_coptic | tesseract-ocr forum post |
| jpn_vert | Japanese Vertical | best | zodiac3539/jpn_vert | tesseract-ocr forum post |
| ocrb_plus | MRZ | best | shreeshrii/tessdata_ocrb | tesseract-ocr forum post |
| jav_java | Aksara Jawa | best | Shreeshrii/tessdata_jav_java | tesseract-ocr forum post |
| mrz | MRZ | best | DoubangoTelecom/tesseractMRZ | tesseract-ocr forum post |
| dot_matrix | MRZ | best | ameera3/OCR_Expiration_Date | tesseract-ocr forum post |
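As a rough illustration of how one of the contributed files might be used, the sketch below builds a Tesseract command line that selects the LSTM engine with `--oem 1`. It assumes the `.traineddata` file has already been downloaded into a local folder; the folder `./tessdata`, the image `page.png`, and the helper `build_tesseract_command` are hypothetical names chosen for this example, not part of any listed repository.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang, tessdata_dir):
    """Return the argument list for an LSTM-only (--oem 1) Tesseract run.

    Hypothetical helper for illustration; it only assembles CLI arguments.
    """
    return [
        "tesseract",
        str(image),                            # input image
        str(output_base),                      # output base name (Tesseract adds .txt)
        "--tessdata-dir", str(tessdata_dir),   # folder holding <lang>.traineddata
        "-l", lang,                            # language code from the table, e.g. "cop"
        "--oem", "1",                          # LSTM engine only
    ]


if __name__ == "__main__":
    tessdata_dir = Path("./tessdata")  # assumed download location
    image = Path("page.png")           # assumed input image
    cmd = build_tesseract_command(image, "page", "cop", tessdata_dir)
    print(" ".join(cmd))

    # Run only if Tesseract and the assumed files are actually present.
    have_files = image.exists() and (tessdata_dir / "cop.traineddata").exists()
    if shutil.which("tesseract") and have_files:
        subprocess.run(cmd, check=True)
```

Pointing `--tessdata-dir` at a separate folder keeps a contributed model apart from the system-wide tessdata, which can make it easier to compare against the official files.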
Use the template below for adding new files.
| Lang_Code | Language | best | User_Repo | tesseract-ocr forum post |
|---|---|---|---|---|
Old wiki - no longer maintained. The pages were moved, see the new documentation.
All pages were moved to tesseract-ocr/tessdoc.
The latest documentation is available at https://tesseract-ocr.github.io/.