Hi @Reiji777! I'm Dosu and I'm helping the MinerU team.
Yes, MinerU supports extracting table text directly from text-based PDFs without using OCR. The problem you are seeing is because … Solution: set
"backend": "hybrid-auto-engine",
"parse_method": "txt", # force direct text extraction, no OCR
"formula_enable": False,
"table_enable": True,
Reference: related discussion
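To make the suggested change concrete, here is a minimal Python sketch that takes the parameters from the question and switches only `parse_method` from `"auto"` to `"txt"`. The helper name `force_text_extraction` is hypothetical, and how the resulting dictionary is passed to MinerU (CLI, HTTP API, or Python call) is not shown in this thread, so that step is deliberately left out.

```python
from typing import Any, Dict


def force_text_extraction(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with parse_method set to "txt".

    Hypothetical helper for illustration only; it is not part of MinerU.
    """
    updated = dict(params)            # copy so the caller's dict is untouched
    updated["parse_method"] = "txt"   # value suggested in the reply above
    return updated


# Parameters as posted in the question.
original = {
    "backend": "hybrid-auto-engine",
    "parse_method": "auto",
    "formula_enable": False,
    "table_enable": True,
}

updated = force_text_extraction(original)

# Show which keys differ between the two configurations.
changed = {
    key: (original[key], updated[key])
    for key in original
    if original[key] != updated[key]
}
print(changed)  # {'parse_method': ('auto', 'txt')}
```

Only `parse_method` differs between the two configurations; `backend`, `formula_enable`, and `table_enable` stay as posted.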
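The parameters I am currently using are:
"backend": "hybrid-auto-engine",
"parse_method": "auto",
"formula_enable": False,
"table_enable": True,
I found that the text in the extracted tables comes from OCR rather than being extracted directly from the PDF.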