Using the content of the PDF text layer
#16202
Replies: 1 comment
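Hello, this requires modifying the PP-StructureV3 code. In addition to the text that would otherwise come from OCR, you also need to supply the position of each text box. You can make and use this change in PaddleX. The inference code for this pipeline is located here: https://github.com/PaddlePaddle/PaddleX/blob/a8ec0bdec40d4108859ebda48e079d2fdcfb5b82/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py#L890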
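As a rough illustration of the "text plus box position" requirement, the sketch below converts words taken from a PDF text layer into OCR-style items with pixel coordinates. Everything in it is an assumption made for illustration: the function names `has_text_layer` and `text_layer_to_ocr_items`, the output keys `text`, `box` and `score`, and the input format of `(x0, y0, x1, y1, text)` tuples in PDF points with a top-left origin (similar to the leading fields of what PyMuPDF's `page.get_text("words")` returns). It does not show the PP-StructureV3 change itself, which still has to be made in the PaddleX pipeline code linked above.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical input format: one word from the PDF text layer,
# (x0, y0, x1, y1, text), coordinates in PDF points, top-left origin.
Word = Tuple[float, float, float, float, str]

POINTS_PER_INCH = 72.0  # PDF user space uses 72 points per inch


def has_text_layer(words: Sequence[Word]) -> bool:
    """Return True if at least one non-blank word was extracted."""
    return any(word[4].strip() for word in words)


def text_layer_to_ocr_items(words: Sequence[Word], dpi: int = 200) -> List[Dict]:
    """Scale text-layer words to the pixel grid of a page rendered at `dpi`.

    Each item carries the text and a 4-point box (clockwise from top-left),
    which is the kind of information an OCR step would otherwise supply.
    The dict layout is an assumption, not the PaddleX data structure.
    """
    scale = dpi / POINTS_PER_INCH
    items: List[Dict] = []
    for x0, y0, x1, y1, text in words:
        if not text.strip():
            continue  # skip whitespace-only entries
        left, top, right, bottom = (round(v * scale) for v in (x0, y0, x1, y1))
        items.append(
            {
                "text": text,
                "box": [[left, top], [right, top], [right, bottom], [left, bottom]],
                "score": 1.0,  # text-layer content is taken as exact
            }
        )
    return items
```

The `dpi` value has to match the resolution at which the page image is rendered for layout detection; otherwise the boxes will not line up with the detected regions. A page for which `has_text_layer` returns False would fall back to ordinary OCR.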
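When using PP-StructureV3 for structured text extraction from a PDF file, if the PDF already has a text layer at a given location, is there a way to use the text-layer content directly instead of the OCR output, so that recognition errors are avoided as far as possible?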