Replies: 5 comments
-
Hi! 👋 I just ran into exactly the same issue when trying to use lang="ru" with PaddleOCR: instead of proper Cyrillic letters, I'm getting Latin look-alike characters in the output (e.g., "TMMH POCCNMCKOM" instead of "ГИМН РОССИЙСКОЙ"). I've followed all the setup instructions carefully and tested with several images, but the problem persists. Could a developer please take a look or advise if there's any known fix for this? 🙏 Thanks in advance!
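A quick way to confirm this symptom on your own output is to count which Unicode scripts the recognized letters belong to. The sketch below uses only the Python standard library; the function names are hypothetical helpers, not part of PaddleOCR, and the sample strings are the ones quoted above.

```python
import unicodedata


def script_counts(text: str) -> dict[str, int]:
    """Count alphabetic characters whose Unicode name starts with CYRILLIC or LATIN."""
    counts = {"CYRILLIC": 0, "LATIN": 0}
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, punctuation
        name = unicodedata.name(ch, "")
        for script in counts:
            if name.startswith(script):
                counts[script] += 1
    return counts


def looks_like_wrong_charset(text: str) -> bool:
    """True when the text contains Latin letters but no Cyrillic ones at all."""
    counts = script_counts(text)
    return counts["LATIN"] > 0 and counts["CYRILLIC"] == 0


if __name__ == "__main__":
    for sample in ("TMMH POCCNMCKOM", "ГИМН РОССИЙСКОЙ"):
        print(sample, script_counts(sample), looks_like_wrong_charset(sample))
```

If the text you expected to be Russian comes back with zero Cyrillic letters, the recognition model is most likely using a character dictionary without Cyrillic, rather than misreading individual letters.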
-
I'm also posting this here just in case. I've started using this tool in a small project, and so far everything is going very well. However, I've come across a similar issue with Cyrillic characters. Is there a solution available? Are there any special settings needed for it to work properly with Russian? I would appreciate any help.
-
I think the model simply isn't good enough. All the other non-CJK, non-English models are equally broken.
-
It may not be that complex. In my investigation, I found that using the Chinese model to detect text in other languages, such as Bangla, can yield good results even when no detection model is publicly available for that language. For your question, you simply need to create a custom dataset and fine-tune the model. 🤔🤔🤔
-
Hello, Russian is already supported. Please refer to the documentation at https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.en.md and use the model eslav_PP-OCRv5_mobile_rec.
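For anyone unsure how to select that model, here is a minimal sketch of one way it might be passed to the PaddleOCR 3.x pipeline. The keyword names (`text_recognition_model_name`, the `use_*` flags), the `predict` method, and the `rec_texts` result key are assumptions based on the 3.x pipeline interface, so please check them against the linked documentation. `build_pipeline_kwargs` and `recognize` are hypothetical helper names.

```python
def build_pipeline_kwargs(rec_model: str = "eslav_PP-OCRv5_mobile_rec") -> dict:
    """Keyword arguments for the PaddleOCR constructor (names assumed from the 3.x docs)."""
    return {
        # Recognition model named in the reply above.
        "text_recognition_model_name": rec_model,
        # Optional preprocessing stages switched off to keep the example small.
        "use_doc_orientation_classify": False,
        "use_doc_unwarping": False,
        "use_textline_orientation": False,
    }


def recognize(image_path: str) -> list[str]:
    """Run OCR on one image and return the recognized text lines."""
    # Imported lazily so the helper above can be inspected without paddleocr installed.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(**build_pipeline_kwargs())
    texts: list[str] = []
    for res in ocr.predict(image_path):
        # Assumption: each result behaves like a dict with a "rec_texts" list.
        texts.extend(res["rec_texts"])
    return texts
```

Calling `recognize("page.png")` (a placeholder path) should then return Cyrillic strings if the model is picked up correctly.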
-
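GitHub Issue Template (Chinese Version)
Title:
🚨 PaddleOCR Cannot Correctly Recognize Cyrillic (Russian) Text: Output Is Transliterated
Description:
Hello! Thank you for developing such an excellent tool as PaddleOCR. 🇨🇳🤝🇷🇺
I ran into a problem when using PaddleOCR to recognize Russian (Cyrillic) text. The model runs, but the output is a Latin-letter transliteration rather than Cyrillic characters.
How to Reproduce
Input code:
Actual output:
Expected output:
Approaches Tried
lang="ru" parameter is set
Questions
Environment
Ubuntu 22.04
3.10.12
3.0.0
3.0.0
Sincere thanks for your help! The technical work of Chinese developers is admirable, and we Russian developers greatly value our collaboration with you! 🚀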
GitHub Issue Template (English Version)
Title:
🚨 PaddleOCR Fails to Recognize Cyrillic (Russian) — Output Shows Transliteration
Description:
Hello! First, thank you for this amazing PaddleOCR tool. 🇨🇳🤝🇷🇺
I'm encountering an issue with Russian (Cyrillic) text recognition. The model runs but outputs transliterated Latin letters instead of proper Cyrillic characters.
How to Reproduce
Input Code:
Actual Output:
Expected Output:
Troubleshooting Attempted
lang="ru" parameter is set
Questions
Environment
Ubuntu 22.04
3.10.12
3.0.0
3.0.0
Thank you sincerely for your help! Chinese developers' work is truly admirable, and we Russian developers deeply value this collaboration. 🚀