For parseable (text-based) PDF files, can the text be extracted directly from the PDF using bbox coordinates instead of running OCR? #12578
Unanswered
SopmmmodII
asked this question in
Q&A
Replies: 3 comments
-
Same question here.
-
Also waiting for an answer.
-
Yes, this is possible. See the blog post "Extracting text lines and their coordinates from PDFs with pdfplumber and pdfminer.six".
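To make the suggestion concrete, here is a minimal sketch of that approach using pdfplumber. It assumes the layout-analysis bbox is given as (x0, y0, x1, y1) in pixels of a rendered page image with a top-left origin, and that the page's own coordinate system starts at (0, 0); neither is confirmed in this thread. The helper names `scale_bbox` and `extract_region_text` are made up for illustration.

```python
def scale_bbox(bbox_px, image_size, page_size):
    """Map an (x0, y0, x1, y1) box from image pixels to PDF points.

    Assumes both coordinate systems have their origin at the top-left,
    which is how pdfplumber reports positions. The result is clamped to
    the page so a slightly oversized box does not fall outside it.
    """
    img_w, img_h = image_size
    page_w, page_h = page_size
    sx, sy = page_w / img_w, page_h / img_h
    x0, y0, x1, y1 = bbox_px
    return (
        max(0.0, x0 * sx),
        max(0.0, y0 * sy),
        min(page_w, x1 * sx),
        min(page_h, y1 * sy),
    )


def extract_region_text(pdf_path, page_index, bbox_px, image_size):
    """Read the text inside one layout region straight from the PDF (no OCR)."""
    import pdfplumber  # third-party: pip install pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_index]
        bbox = scale_bbox(bbox_px, image_size, (page.width, page.height))
        # crop() restricts the page to the box; extract_text() then reads
        # only the characters embedded in that area of the PDF.
        return page.crop(bbox).extract_text() or ""
```

A call might look like `extract_region_text("paper.pdf", 0, (120, 340, 980, 410), (1240, 1754))`, where the file name, box, and image size are placeholder values. This only works when the PDF carries a real text layer; for scanned pages the result will be empty and OCR is still needed.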
-
Please provide the following information to quickly locate the problem
We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no): yes
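I need to extract the organizational structure of a non-scanned PDF article and reconstruct it as a tree. Using OCR may increase processing time and introduce recognition errors. Is it possible to take the bbox coordinates from the layout analysis results and use them to read the content directly from the original PDF file?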