PaddleOCR Layout Coordinate Mismatch #15957
Hi everyone, I’m working with PaddleOCR’s layout module to extract bounding boxes for different elements in a PDF file. I’ve tried two approaches:
In both cases, the predicted bounding boxes look correct when I visualize them immediately after inference: the boxes align as expected in the output image. I want to process these coordinates programmatically to group nearby elements in the same column (e.g., merging text elements that logically belong together).

The issue: when I manually draw these predicted boxes on the PDF page or on the rendered image, using the coordinates returned by the model without any modification, they no longer align with the actual content. The boxes appear offset or incorrectly scaled.

Below is the code I use to render the coordinates on the image extracted from the PDF:

```python
def visualize_text_boxes_from_json_image(json_path, image_path, save_path=None):
    ...
```
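For the "group nearby elements in the same column" step, here is a minimal, hedged sketch of one way to do it once the coordinates are correct. The function name and the `(x1, y1, x2, y2)` box convention are my assumptions for illustration, not part of PaddleOCR's API:

```python
# Illustrative sketch: greedily assign boxes to columns whose horizontal
# span they overlap. Boxes are assumed to be (x1, y1, x2, y2) tuples in
# pixel coordinates of the same image.

def group_boxes_into_columns(boxes, overlap_ratio=0.5):
    """Group boxes by horizontal overlap; returns a list of columns,
    each a list of boxes, ordered left to right."""
    columns = []  # each column: {"x1": ..., "x2": ..., "boxes": [...]}
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):  # top-to-bottom
        x1, _, x2, _ = box
        placed = False
        for col in columns:
            # horizontal overlap between this box and the column's span
            overlap = min(x2, col["x2"]) - max(x1, col["x1"])
            if overlap > overlap_ratio * min(x2 - x1, col["x2"] - col["x1"]):
                col["boxes"].append(box)
                col["x1"] = min(col["x1"], x1)  # widen the column span
                col["x2"] = max(col["x2"], x2)
                placed = True
                break
        if not placed:
            columns.append({"x1": x1, "x2": x2, "boxes": [box]})
    return [col["boxes"] for col in sorted(columns, key=lambda c: c["x1"])]

# Two narrow boxes stacked on the left, one box on the right:
boxes = [(10, 10, 100, 40), (12, 50, 98, 80), (200, 10, 300, 40)]
print(group_boxes_into_columns(boxes))
# -> [[(10, 10, 100, 40), (12, 50, 98, 80)], [(200, 10, 300, 40)]]
```

The `overlap_ratio` threshold is a tunable guess; real multi-column layouts may need a smarter clustering, but this shows the shape of the approach.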
Below is the inference code:

```python
from paddleocr import LayoutDetection

model = LayoutDetection(model_name="PP-DocLayout_plus-L")
```
Replies: 1 comment 4 replies
For anyone who runs into this issue: the image gets scaled to a different width and height during inference, which is why the bounding boxes are off even though the immediate visualization looks correct. Look at the image size in the output of the predict function, compare it with the width and height of your original image, and scale the bounding boxes accordingly before drawing them on your image.
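The rescaling described above can be sketched as follows. The function name is illustrative, and how you obtain the inference-time size depends on your predict output (inspect the saved JSON or the result object's image), so treat those as assumptions:

```python
# Illustrative sketch: map boxes from the model's inference resolution
# back to the original image's resolution before drawing them.

def rescale_boxes(boxes, inference_size, original_size):
    """boxes: list of (x1, y1, x2, y2) in the inference image's pixel space.
    inference_size / original_size: (width, height) tuples."""
    inf_w, inf_h = inference_size
    orig_w, orig_h = original_size
    sx = orig_w / inf_w  # horizontal scale factor
    sy = orig_h / inf_h  # vertical scale factor
    return [(x1 * sx, y1 * sy, x2 * sx, y2 * sy) for x1, y1, x2, y2 in boxes]

# Example: the model ran at 800x1000 but the rendered PDF page is 1600x2000.
print(rescale_boxes([(100, 200, 300, 400)], (800, 1000), (1600, 2000)))
# -> [(200.0, 400.0, 600.0, 800.0)]
```

Note that width and height may be scaled by different factors, so compute and apply the two factors separately rather than assuming a single uniform scale.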