PaddleOCR/main/en/version3.x/module_usage/layout_detection #15691
Replies: 2 comments 8 replies
When using Layout Detection on a PDF, how can one extract the text inside each layout box without having to create another model object and call predict again?

I use the following to get the layout, and it works fine:

    from paddleocr import LayoutDetection

    # "large" keeps the largest outer box and removes inner overlapping boxes
    model = LayoutDetection(model_name="PP-DocLayout_plus-L", layout_merge_bboxes_mode="large")
    layout_result = model.predict(pdf_path, batch_size=1, layout_nms=True)

But I would like to apply different processing depending on the label of each detected box.
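For the per-label part of the question, here is a minimal sketch of how the detected boxes could be grouped by label before each group is handled differently. It assumes each detected box is available as a plain dict with `label`, `score`, and `coordinate` (`[x1, y1, x2, y2]`) keys; that shape is an assumption about the result data, and `group_boxes_by_label` and `min_score` are hypothetical names introduced only for illustration. As far as I can tell, layout detection only yields boxes, so reading the text inside a box would still need a separate recognition step on the cropped region.

```python
from collections import defaultdict


def group_boxes_by_label(boxes, min_score=0.5):
    """Group box coordinates by label, skipping low-confidence detections.

    `boxes` is assumed to be a list of dicts with "label", "score" and
    "coordinate" keys (an assumed shape, not a confirmed API).
    """
    grouped = defaultdict(list)
    for box in boxes:
        if box["score"] >= min_score:
            grouped[box["label"]].append(box["coordinate"])
    return dict(grouped)


# Hand-written sample data standing in for real detection output.
sample_boxes = [
    {"label": "text", "score": 0.98, "coordinate": [10, 20, 200, 60]},
    {"label": "table", "score": 0.91, "coordinate": [10, 80, 200, 300]},
    {"label": "text", "score": 0.30, "coordinate": [10, 320, 200, 340]},
]

grouped = group_boxes_by_label(sample_boxes)
for label, coords in grouped.items():
    # Replace this print with whatever per-label processing is needed.
    print(label, coords)
```

Each group can then be routed to its own handler, for example cropping `text` regions for recognition and sending `table` regions to a table parser.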
Hi, I’m using the LayoutDetection module with the model PP-DocLayout_plus-L to perform layout detection on resumes. The problem I’m facing is that the model generates multiple bounding boxes that are very close to each other, especially in sections like Experience and Education. For example, in the Experience section, the model creates one bounding box for every single line, but what I want is a single bounding box for the entire section.
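One possible workaround is to merge nearby boxes in a post-processing step. The sketch below is not a PaddleOCR feature. It assumes each box is a dict with `label` and `coordinate` (`[x1, y1, x2, y2]`) keys, and `merge_nearby_boxes` and `max_gap` are hypothetical names. It joins same-label boxes that overlap horizontally and whose vertical gap is at most `max_gap` pixels.

```python
def merge_nearby_boxes(boxes, max_gap=15):
    """Merge same-label boxes that overlap horizontally and are vertically close."""
    merged = []
    # Visit boxes top to bottom so each one can extend a region found above it.
    for box in sorted(boxes, key=lambda b: b["coordinate"][1]):
        x1, y1, x2, y2 = box["coordinate"]
        for region in merged:
            rx1, ry1, rx2, ry2 = region["coordinate"]
            same_label = region["label"] == box["label"]
            close_vertically = y1 - ry2 <= max_gap
            overlaps_horizontally = x1 <= rx2 and rx1 <= x2
            if same_label and close_vertically and overlaps_horizontally:
                # Grow the existing region to cover the new box.
                region["coordinate"] = [
                    min(rx1, x1), min(ry1, y1), max(rx2, x2), max(ry2, y2)
                ]
                break
        else:
            # No nearby region found: start a new one.
            merged.append({"label": box["label"], "coordinate": [x1, y1, x2, y2]})
    return merged


# Hand-written sample: three adjacent lines and one distant line.
line_boxes = [
    {"label": "text", "coordinate": [50, 100, 500, 120]},
    {"label": "text", "coordinate": [50, 125, 480, 145]},
    {"label": "text", "coordinate": [50, 150, 510, 170]},
    {"label": "text", "coordinate": [50, 400, 500, 420]},
]
sections = merge_nearby_boxes(line_boxes)
```

The right `max_gap` depends on the page resolution and line spacing, so it would need tuning per document type. A gap that is too large could merge two neighbouring sections into one.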
PaddleOCR/main/en/version3.x/module_usage/layout_detection
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/main/en/version3.x/module_usage/layout_detection.html