[Question]Simplified OCR Output without Tags/Details in PaddleOCR #12319
Unanswered
mriamnobody asked this question in Q&A
Replies: 0 comments
System Environment: Windows 11 22631.3235
PaddleOCR: v2.7.1
Is it possible to get the OCR output without the tags/details, i.e. only the recognized text?
I would like plain text instead of this format:
{"type": "table", "bbox": [136, 351, 2048, 845], "res": {"cell_bbox": [[19.227834701538086, 4.0235595703125, 613.1092529296875, 4.159567356109619, 621.1354370117188, 109.49986267089844, 19.038808822631836, 111.1724624633789], [97.65766906738281, 56.754940032958984, 1715.3077392578125, 55.68955993652344, 1713.501708984375, 165.17510986328125, 96.48912811279297, 167.88694763183594], [72.8014907836914, 161.61546325683594, 1803.4071044921875, 160.20396423339844, 1800.0126953125, 311.4237060546875, 70.69056701660156, 312.72314453125], [57.616302490234375, 263.7750549316406, 1813.93359375, 261.6246032714844, 1812.409912109375, 429.18951416015625, 56.18895721435547, 429.3775329589844], [59.09648895263672, 327.2485656738281, 1732.986083984375, 322.6684265136719, 1730.808349609375, 452.80755615234375, 56.49951171875, 453.76806640625]], "html": "<html><body><table><tbody><tr><td>Gel electrophoresis</td></tr><tr><td>Cloning libraries</td></tr><tr><td>Restriction enzyme mapping PCR</td></tr><tr><td></td></tr><tr><td>Nucleic Acid Hybridization DNA Microarrays</td></tr></tbody></table></body></html>"}, "img_idx": 0}
{"type": "reference", "bbox": [212, 353, 1005, 842], "res": [{"text": "Gel electrophoresis", "confidence": 0.9871773719787598, "text_region": [[279.0, 359.0], [764.0, 363.0], [764.0, 408.0], [278.0, 404.0]]}, {"text": "Cloning libraries", "confidence": 0.9837406277656555, "text_region": [[283.0, 444.0], [698.0, 444.0], [698.0, 493.0], [283.0, 493.0]]}, {"text": "Restriction enzyme mapping", "confidence": 0.999187171459198, "text_region": [[279.0, 529.0], [1003.0, 537.0], [1003.0, 585.0], [278.0, 578.0]]}, {"text": "PCR", "confidence": 0.9987562298774719, "text_region": [[278.0, 616.0], [382.0, 616.0], [382.0, 662.0], [278.0, 662.0]]}, {"text": "Nucleic Acid Hybridization", "confidence": 0.9799649119377136, "text_region": [[281.0, 701.0], [953.0, 705.0], [952.0, 749.0], [280.0, 746.0]]}, {"text": "DNA Microarrays", "confidence": 0.9897088408470154, "text_region": [[280.0, 787.0], [714.0, 795.0], [713.0, 837.0], [279.0, 829.0]]}], "img_idx": 0}
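To make the desired result concrete, here is a rough sketch of the kind of post-processing that would reduce records like the ones above to text only. It is not a PaddleOCR feature and calls no PaddleOCR API: it assumes each record has already been loaded into a Python dict with the same keys as shown ("res" holding either a list of items with "text", or a dict with "html"). The names `plain_text` and `_CellText` are made up for illustration.

```python
from html.parser import HTMLParser


class _CellText(HTMLParser):
    """Collects the text of each <td>/<th> cell from a table's HTML string."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def plain_text(regions):
    """Return only the recognized text from a list of region dicts.

    Assumes the record layout shown in the question; bounding boxes,
    confidences, and other details are dropped.
    """
    lines = []
    for region in regions:
        res = region.get("res")
        if isinstance(res, dict) and "html" in res:
            # Table-style record: the text lives inside the HTML cells.
            parser = _CellText()
            parser.feed(res["html"])
            lines.extend(c.strip() for c in parser.cells if c.strip())
        elif isinstance(res, list):
            # Text-style record: each item carries its own "text" field.
            lines.extend(item["text"] for item in res if item.get("text"))
    return lines


# Shortened stand-in for the records above (coordinates omitted).
sample = [
    {"type": "reference", "res": [
        {"text": "Gel electrophoresis", "confidence": 0.98},
        {"text": "PCR", "confidence": 0.99},
    ]},
]
print("\n".join(plain_text(sample)))
```

Something along these lines, either built in or as a documented option, is what I am looking for.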