Open
Description
English traineddata file does not contain the '±' character?
Environment
Tesseract Version: 5.00
Downloaded from: https://github.com/UB-Mannheim/tesseract/wiki
Platform: Windows 10 64bit
I am trying to run OCR with the English traineddata file found at:
https://tesseract-ocr.github.io/tessdoc/Data-Files
I notice the character is not included here either:
https://github.com/tesseract-ocr/langdata_lstm/blob/main/eng/eng.unicharset
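For anyone who wants to repeat that check, here is a minimal Python sketch. It assumes the usual unicharset layout, where the first line is the entry count and every later line starts with the symbol followed by space-separated properties. The function name `unicharset_symbols` and the sample text are made up for illustration and are not taken from the real `eng.unicharset`.

```python
def unicharset_symbols(text):
    """Return the set of symbols listed in unicharset-formatted text.

    Assumption: line 1 is the entry count, and each later line begins
    with the symbol, followed by a space and its properties.
    """
    lines = text.split("\n")
    symbols = set()
    for line in lines[1:]:  # skip the count on the first line
        if line.strip():
            # The symbol is everything before the first space.
            symbols.add(line.split(" ", 1)[0])
    return symbols


# Made-up sample in the assumed layout (properties are placeholders).
sample = "3\nNULL 0\nA 5\n+ 0\n"

symbols = unicharset_symbols(sample)
has_plus_minus = "±" in symbols
print("'±' present:", has_plus_minus)
```

To run it against the real file, read a local copy of `eng.unicharset` with `open(path, encoding="utf-8").read()` and pass the result to the function.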
Are there any plans to add it? Are there any language files that contain this character or can successfully OCR it?
Many thanks to whoever can assist here. I am attaching the file I used to test this character: https://github.com/tesseract-ocr/langdata_lstm/files/9870674/Special.Symbols.pdf