Howdy,
I'm currently working on an OCR-PDF solution for visually impaired people.
https://github.com/chrys87/ocrpdf
It's still at an early stage, but while working on it I noticed that set_page_seg_mode is not respected correctly.
Here is a multicolumn layout as an example:
https://crivatec.de/page/uploads/images/ocrTransformed1.png
I tried the following:
tess = tesseract("/usr/share", self._languageCode)  # language is "deu" in this case
tess.set_page_seg_mode(tesserwrap.PageSegMode.PSM_AUTO)
tess.set_variable('tessedit_pageseg_mode', '3')  # also tried this
print(tess.get_page_seg_mode())  # prints 3
self._OCRText[Page_p] = tess.ocr_image(self._modifiedImg[Page_p])  # the image above as a Pillow image
The mode seems to be set correctly, because the print gives the expected value, but the OCR still appears to run with PSM_SINGLE_BLOCK (6),
so the columns are not recognized.
If I run tesseract from the command line:
tesseract ocrTransformed1.png ocrTransformed1 -l deu -psm 3
it works great. The result is much better: the correct PSM is used and the columns are recognized.
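In case it helps with reproducing this, here is a rough sketch of how the command-line comparison could be scripted: it runs the tesseract CLI once with PSM 3 and once with PSM 6 so the two outputs can be compared against what the wrapper returns. The helper names (build_tesseract_cmd, ocr_with_psm) and the output file naming are made up for illustration. The default "-psm" flag is the one I used above; newer tesseract releases expect "--psm" instead, so the flag is a parameter.

```python
import subprocess
from pathlib import Path


def build_tesseract_cmd(image, out_base, lang="deu", psm=3, psm_flag="-psm"):
    """Return the argv list for one tesseract CLI run (hypothetical helper)."""
    return ["tesseract", str(image), str(out_base), "-l", lang, psm_flag, str(psm)]


def ocr_with_psm(image, psm, lang="deu", psm_flag="-psm"):
    """Run the CLI with the given page segmentation mode and return the text."""
    # tesseract appends ".txt" to the output base name itself.
    out_base = Path(image).with_suffix("").name + f"_psm{psm}"
    cmd = build_tesseract_cmd(image, out_base, lang, psm, psm_flag)
    subprocess.run(cmd, check=True)
    return Path(out_base + ".txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    # Requires the tesseract binary and the example image in the working directory.
    auto_text = ocr_with_psm("ocrTransformed1.png", 3)   # PSM_AUTO
    block_text = ocr_with_psm("ocrTransformed1.png", 6)  # PSM_SINGLE_BLOCK
    print("CLI outputs differ between PSM 3 and 6:", auto_text != block_text)
```

If the text returned by tess.ocr_image matches the PSM 6 output rather than the PSM 3 one, that would support the suspicion that the wrapper falls back to single-block mode.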
Could you take a look into this?
I'm running a current Arch Linux with the latest tesseract, pillow and python.
It seems that a similar problem existed years ago in tesseract itself:
https://code.google.com/p/tesseract-ocr/issues/detail?id=394
By the way, I really enjoy your Python API. Damn cool stuff :).