PaddleOCR analyzes Arabic Left-to-Right instead of Right-to-Left - How to process RTL languages correctly? #14971
Replies: 2 comments
-
PaddleOCR currently does not fully support automatic handling of reading order for Right-to-Left (RTL) languages like Arabic. While the model can accurately recognize Arabic characters, the OCR output is returned in a Left-to-Right (LTR) sequence, which leads to reversed word and sentence structures. As of PaddleOCR v2.10, there is no built-in parameter such as ocr_order_method, nor an automatic RTL layout analyzer, that handles RTL script flow natively. This is a known limitation for languages like Arabic, Persian, and Hebrew.

📌 Workaround: Post-process the output to reverse the order of text

You can manually adjust the text direction using Python after obtaining the OCR results.
Here is an example post-processing snippet:

# Example post-processing for RTL (Arabic)
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='ar')
img_path = 'image5.jpg'
# Take the first page; fall back to an empty list if nothing was detected
result = ocr.ocr(img_path, cls=True)[0] or []

# Group boxes into lines based on y-coordinate (with some threshold, in pixels)
lines = {}
threshold = 10
for box in result:
    position = box[0]   # four [x, y] corner points
    text = box[1][0]    # recognized text (box[1][1] is the score)
    y_center = (position[0][1] + position[2][1]) / 2
    added = False
    for key in lines:
        if abs(y_center - key) < threshold:
            lines[key].append((position, text))
            added = True
            break
    if not added:
        lines[y_center] = [(position, text)]

# Sort lines top to bottom
sorted_lines = sorted(lines.items(), key=lambda x: x[0])
for y, line_items in sorted_lines:
    # Sort boxes right to left based on x-coordinate of the rightmost point
    sorted_line = sorted(line_items, key=lambda x: -max(p[0] for p in x[0]))
    rtl_line = ' '.join([text for _, text in sorted_line])
    print(rtl_line)

📝 Notes:
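The grouping and sorting steps above can also be wrapped in a small function and tried on hand-made boxes, without running OCR at all. The following is only a sketch: the names rtl_lines and y_threshold are hypothetical, and it assumes each detection has the same [box, (text, score)] shape that the snippet above reads from ocr.ocr. The threshold is in pixels, so a suitable value depends on image resolution and font size.

```python
def rtl_lines(detections, y_threshold=10.0):
    """Group detections into rows by vertical centre, then join each row right to left.

    `detections` is assumed to be a list of [box, (text, score)] items, where
    `box` is four [x, y] corner points (the shape used in the snippet above).
    """
    rows = []  # each entry: [reference_y, [(right_edge_x, text), ...]]
    # Visit boxes top to bottom so the first box of a row sets its reference y
    for box, (text, _score) in sorted(detections, key=lambda d: min(p[1] for p in d[0])):
        ys = [p[1] for p in box]
        y_center = (min(ys) + max(ys)) / 2
        right_edge = max(p[0] for p in box)
        for row in rows:
            if abs(y_center - row[0]) < y_threshold:
                row[1].append((right_edge, text))
                break
        else:
            rows.append([y_center, [(right_edge, text)]])
    rows.sort(key=lambda row: row[0])
    # Within a row, the box with the largest right edge is read first
    return [
        " ".join(text for _, text in sorted(items, key=lambda item: -item[0]))
        for _, items in rows
    ]


# Hand-made sample: two words on the first row (listed left box first), one on the second
sample = [
    [[[50, 12], [150, 12], [150, 42], [50, 42]], ("بالعالم", 0.98)],
    [[[200, 10], [300, 10], [300, 40], [200, 40]], ("مرحبا", 0.99)],
    [[[100, 60], [300, 60], [300, 90], [100, 90]], ("شكرا", 0.97)],
]
for line in rtl_lines(sample):
    print(line)
```

This only reorders whole boxes. It does not change the character order inside each recognized string, and a document that mixes RTL and LTR lines would need extra handling.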
🔮 Future Support: While PaddleOCR supports multilingual OCR, including Arabic recognition, it currently lacks native document layout analysis or direction correction for RTL scripts. No existing Issues or Discussions appear to reference this particular RTL text order feature directly, so consider opening a new Feature Request in the PaddleOCR GitHub Issues for improved RTL language support.

Response generated by 🤖 feifei-bot | chatgpt-4o-latest
-
How should real data be annotated when the reading order is right to left? The cropped images used for training do not match the labels.
-
🐛 Bug (Problem description)
I am using PaddleOCR with the Arabic language model (lang='ar') to perform OCR on Arabic images. While PaddleOCR correctly recognizes the Arabic characters, it processes the text in a Left-to-Right (LTR) order, which is incorrect for Arabic, a Right-to-Left (RTL) language. This results in the words and sentences being in reverse order.
I have reviewed the paddleocr --help output to see if there are any options to explicitly set the text direction or handle RTL languages like Arabic.
My question is:
Is there a specific option in PaddleOCR, possibly using ocr_order_method or another parameter, to correctly handle Right-to-Left languages like Arabic and ensure the output text is in the correct RTL order?
If there isn't a built-in option, what are the recommended workarounds to post-process the OCR output to reorder the text correctly for RTL languages in Python?
Any guidance or solutions on how to get PaddleOCR to output Arabic text in the correct Right-to-Left order would be greatly appreciated.
🏃‍♂️ Environment (Runtime environment)
OS: Windows 11
Environment: PyCharm
Python: 3.8
PaddleOCR: 2.10
🌰 Minimal Reproducible Example (minimal demo that reproduces the problem)
from paddleocr import PaddleOCR, draw_ocr

ocr = PaddleOCR(use_angle_cls=True, lang='Ar')
img_path = 'image5.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# draw result
from PIL import Image

result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./doc/fonts/arabic.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')