Hi and thanks for your excellent work on the PaddleOCR library.
I’m using PPStructureV3 (PaddleOCR v3.0) as part of a pipeline to process scanned documents that contain a mix of text and tables. In general, the pipeline performs very well. However, I’ve encountered a specific limitation:
When processing pages that contain slightly skewed tables (e.g., due to minor scanning-angle errors), the layout reconstruction fails to extract the table structure correctly. This occurs even with the relevant pipeline options enabled.
I have confirmed that the text boxes themselves are detected reasonably well, but when the entire page (or its content) is slightly rotated (~3–7 degrees), the resulting table structure is often completely broken or missed altogether.
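Since the text boxes are detected well, one workaround I have been experimenting with is estimating the page skew directly from those boxes. A minimal sketch (the corner ordering is an assumption about the detector's output and may need adjusting):

```python
import numpy as np

def estimate_skew_deg(boxes):
    """Estimate global page skew in degrees from detected text boxes.

    Assumes each box is a 4x2 array of corner points ordered
    top-left, top-right, bottom-right, bottom-left (a common
    detector convention; adjust for your detector). The median
    angle of the top edges gives a robust skew estimate.
    """
    angles = []
    for box in boxes:
        box = np.asarray(box, dtype=float)
        dx, dy = box[1] - box[0]  # vector along the box's top edge
        angles.append(np.degrees(np.arctan2(dy, dx)))
    return float(np.median(angles))
```

In image coordinates (y pointing down), a positive result means the text lines tilt downward to the right; rotating the page by the negative of this angle should deskew it before re-running the pipeline.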
My main questions are:
1. Is there any internal skew correction in the pipeline beyond the document-level orientation classifier (use_doc_orientation_classify)?
2. Is the layout detection model (e.g., RT-DETR-L) expected to be rotation-invariant to minor skew, or does it require explicitly deskewed input?
3. Are there any preprocessing steps (rotation, warping, etc.) you would recommend before running PPStructureV3, to improve layout recognition on slightly rotated scans?
4. Would fine-tuning the layout model on slightly skewed data improve its robustness, or is this already accounted for in the pretraining dataset?
I am happy to contribute or to test suggested workarounds if needed.
Thanks again for your time and for sharing such a powerful tool.