Question about the gt.txt content generated by PPOCRLabel after table annotation #12711
Replies: 7 comments
-
You can adjust it manually. If something is wrong, just edit gt.txt directly.
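For hand edits like this, a small script can be less error-prone than a text editor. The following is a minimal sketch, assuming each line of gt.txt holds one JSON record shaped like the sample in the question (`filename`, `html.structure.tokens`, `html.cells`), optionally preceded by an image path and a tab. The helper names `parse_gt_line` and `dump_gt_line` and the sample record are hypothetical, not part of PPOCRLabel.

```python
import json


def parse_gt_line(line):
    """Split one gt.txt line into (prefix, record).

    Assumption: the JSON record starts at the first '{'; anything before it
    (for example 'image_path<TAB>') is kept verbatim as the prefix.
    """
    start = line.index("{")
    return line[:start], json.loads(line[start:])


def dump_gt_line(prefix, record):
    """Serialize a record back to one line, keeping Chinese text readable."""
    return prefix + json.dumps(record, ensure_ascii=False)


# Hypothetical record: one cell wrongly marked as spanning 4 columns.
record = {
    "filename": "example.jpg",
    "html": {
        "structure": {"tokens": ["<tr>", "<td", ' colspan="4"', ">", "</td>", "</tr>"]},
        "cells": [{"tokens": ["项", "目"], "bbox": [[0, 0], [10, 0], [10, 10], [0, 10]]}],
    },
}
line = "example.jpg\t" + json.dumps(record, ensure_ascii=False)

prefix, parsed = parse_gt_line(line)
# Manual fix: replace the span attribute token in place.
tokens = parsed["html"]["structure"]["tokens"]
tokens[tokens.index(' colspan="4"')] = ' colspan="2"'
fixed_line = dump_gt_line(prefix, parsed)
```

Writing `fixed_line` back in place of the original line keeps the rest of gt.txt untouched. If the record also carries a `gt` string, it would need the same correction.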
-
My gt.txt is empty. What should I do in that case?
-
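PPOCRLabel generates the HTML by exporting from an xls file, so you need to fix the structure in the xls. It also cannot handle tables that contain empty cells.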
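One way to spot records affected by the empty-cell limitation is to compare the number of cells declared in the structure with the number of annotated cells. This is only a sketch, assuming the record layout shown in the question, where a cell opens with either a plain `<td>` token or a `<td` token followed by attribute tokens. The function names and the sample record are hypothetical.

```python
def count_structure_cells(structure_tokens):
    """Count cell openings: a plain '<td>' or the '<td' that starts an attributed cell."""
    return sum(1 for tok in structure_tokens if tok in ("<td>", "<td"))


def check_record(record):
    """Return (cells declared in the structure, cells actually annotated)."""
    html = record["html"]
    return count_structure_cells(html["structure"]["tokens"]), len(html["cells"])


# Hypothetical row with two cells in the structure but only one annotated cell,
# which is what a dropped empty cell would look like.
sample = {
    "filename": "example.jpg",
    "html": {
        "structure": {
            "tokens": ["<tr>", "<td>", "</td>", "<td", ' colspan="2"', ">", "</td>", "</tr>"]
        },
        "cells": [{"tokens": ["a"], "bbox": [[0, 0], [10, 0], [10, 10], [0, 10]]}],
    },
}
n_structure, n_cells = check_record(sample)
mismatch = n_structure != n_cells
```

A mismatch does not say which cell is missing, only that the record deserves a manual look.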
-
Also, if the cells are annotated in the wrong order, the converted result will come out scrambled.
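If the annotation order is in doubt, the cells can be re-sorted into reading order from their boxes before conversion. Below is a rough sketch, assuming each cell has a `bbox` of four `[x, y]` points as in the question's sample. The function `reading_order` and the `row_tol` threshold are illustrative assumptions, and cells with `rowspan` may still need manual review because they belong to more than one row.

```python
def reading_order(cells, row_tol=10):
    """Sort cells top-to-bottom, then left-to-right within each row.

    Cells whose top edge lies within `row_tol` pixels of the first cell
    of the current row are treated as belonging to that row.
    """
    def top_left(cell):
        xs = [p[0] for p in cell["bbox"]]
        ys = [p[1] for p in cell["bbox"]]
        return min(xs), min(ys)

    rows, current, row_y = [], [], None
    for cell in sorted(cells, key=lambda c: top_left(c)[1]):
        y = top_left(cell)[1]
        if row_y is not None and abs(y - row_y) > row_tol:
            rows.append(current)
            current, row_y = [], None
        if row_y is None:
            row_y = y
        current.append(cell)
    if current:
        rows.append(current)

    return [c for row in rows for c in sorted(row, key=lambda c: top_left(c)[0])]


# Four boxes taken from the question's sample, deliberately out of order.
cells = [
    {"tokens": ["广", "东", "省", "教", "育", "厅"], "bbox": [[375, 75], [480, 75], [480, 94], [375, 94]]},
    {"tokens": ["项", "目", "名", "称"], "bbox": [[50, 31], [124, 31], [124, 54], [50, 54]]},
    {"tokens": ["省", "级", "业", "务", "主", "管", "部", "门"], "bbox": [[14, 75], [158, 75], [158, 94], [14, 94]]},
    {"tokens": ["教", "育", "发", "展"], "bbox": [[471, 32], [897, 32], [897, 54], [471, 54]]},
]
ordered_text = ["".join(c["tokens"]) for c in reading_order(cells)]
```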
-
PPOCRLabel has moved, so for any follow-up questions please ask over at PPOCRLabel!
-
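Hello, I am using PPOCRLabel for table annotation. After exporting, the annotation results are saved in gt.txt, but when I restore the table from the "gt" field, it does not match the table in the original image. Details below:
The annotated table image is: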

The exported table annotation result is:
<td colspan="4">教育发展专项资金(职业教育“扩容、提质、强服务”)<td colspan="2">广东省教育厅<td colspan="2">专项资金<td colspan="4">(2019)年-(2021)年<td rowspan="2">资金需求<td colspan="3">18, 000, 000, 000. 00<td colspan="3">6, 000, 000, 000. 00{"filename": "zh_val_1_part1_copy.jpg", "html": {"structure": {"tokens": ["", "", "", "", "<td", " colspan="4"", ">", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "<td", " colspan="4"", ">", "", "", "", "<td", " rowspan="2"", ">", "", "", "", "<td", " colspan="3"", ">", "", "", "", "", "", "<td", " colspan="3"", ">", "", "", ""]}, "cells": [{"tokens": ["项", "目", "名", "称"], "bbox": [[50, 31], [124, 31], [124, 54], [50, 54]]}, {"tokens": ["教", "育", "发", "展", "专", "项", "资", "金", "(", "职", "业", "教", "育", "“", "扩", "容", "、", "提", "质", "、", "强", "服", "务", "”", ")"], "bbox": [[471, 32], [897, 32], [897, 54], [471, 54]]}, {"tokens": ["省", "级", "业", "务", "主", "管", "部", "门"], "bbox": [[14, 75], [158, 75], [158, 94], [14, 94]]}, {"tokens": ["广", "东", "省", "教", "育", "厅"], "bbox": [[375, 75], [480, 75], [480, 94], [375, 94]]}, {"tokens": ["申", "报", "单", "位"], "bbox": [[791, 73], [862, 73], [862, 97], [791, 97]]}, {"tokens": ["广", "东", "省", "教", "育", "厅"], "bbox": [[1027, 77], [1131, 77], [1131, 97], [1027, 97]]}, {"tokens": ["项", "目", "申", "报", "属", "性"], "bbox": [[32, 98], [139, 98], [139, 117], [32, 117]]}, {"tokens": ["新", "增", "安", "排"], "bbox": [[256, 97], [330, 97], [330, 120], [256, 120]]}, {"tokens": ["项", "目", "类", "型"], "bbox": [[518, 98], [589, 98], [589, 117], [518, 117]]}, {"tokens": ["专", "项", "资", "金"], "bbox": [[905, 97], [978, 97], [978, 123], [905, 123]]}, {"tokens": ["项", "目", "实", "施", "周", "期"], "bbox": [[34, 121], [139, 121], [139, 141], [34, 141]]}, {"tokens": ["(", "2", "0", "1", "9", ")", "年", "-", "(", "2", "0", "2", "1", ")", "年"], "bbox": [[598, 124], [779, 124], [779, 143], [598, 143]]}, {"tokens": ["资", "金", "需", "求"], "bbox": [[50, 146], [122, 
146], [122, 187], [50, 187]]}, {"tokens": ["总", "金", "额"], "bbox": [[265, 146], [321, 146], [321, 166], [265, 166]]}, {"tokens": ["1", "8", ",", " ", "0", "0", "0", ",", " ", "0", "0", "0", ",", " ", "0", "0", "0", ".", " ", "0", "0"], "bbox": [[731, 148], [881, 148], [881, 168], [731, 168]]}, {"tokens": ["其", "中", ":", "2", "0", "2", "0", "年", "金", "额"], "bbox": [[218, 167], [366, 167], [366, 193], [218, 193]]}, {"tokens": ["6", ",", " ", "0", "0", "0", ",", " ", "0", "0", "0", ",", " ", "0", "0", "0", ".", " ", "0", "0"], "bbox": [[734, 172], [876, 172], [876, 191], [734, 191]]}]}, "gt": "
Restoring the "gt" content with a function I wrote myself gives the following:

As you can see, the merged cells did not take effect, and some text is repeated several times. How can this be resolved? Thanks~
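For comparison with a custom restore function, the table can also be rebuilt from `html.structure.tokens` and `html.cells` instead of the `gt` string. The sketch below assumes the PubTabNet-style convention that the sample record appears to follow: each cell's text goes right after its opening tag closes, that is, after a plain `<td>` token or after the `>` token that ends a `<td colspan=...` sequence. The function name `rebuild_html` and the sample tokens are hypothetical, and this is not necessarily how PPOCRLabel itself produces `gt`.

```python
def rebuild_html(structure_tokens, cells):
    """Join structure tokens, inserting each cell's text after its opening tag.

    Span attributes are separate tokens (for example ' colspan="4"'), so they
    are carried into the output unchanged and merged cells are preserved.
    """
    out = []
    cell_iter = iter(cells)
    for tok in structure_tokens:
        out.append(tok)
        if tok in ("<td>", ">"):
            cell = next(cell_iter, None)
            if cell is not None:
                out.append("".join(cell["tokens"]))
    return "".join(out)


# Hypothetical one-row table: a normal cell followed by a cell spanning 4 columns.
structure = [
    "<table>", "<tr>",
    "<td>", "</td>",
    "<td", ' colspan="4"', ">", "</td>",
    "</tr>", "</table>",
]
cells = [
    {"tokens": ["项", "目", "名", "称"]},
    {"tokens": ["教", "育"]},
]
html = rebuild_html(structure, cells)
```

Because the text is consumed strictly in order, this only lines up when `cells` has one entry per cell in the structure, in the same order. If merged cells still render with repeated text, checking whether the span attributes and the cell count in the record are correct is a reasonable first step.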