Replies: 8 comments
-
I am fine-tuning from a pretrained model.
-
Please check whether the data annotation is correct: every table cell should have a corresponding box annotation.
-
Something like what I have drawn here; in Excel this is two columns merged into one cell.
-
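Each annotated box must correspond to exactly one cell in Excel. As long as they match one-to-one, so that the number of cells equals the number of boxes, the annotation is correct.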
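A minimal sketch of that one-to-one check, assuming PubTabNet-style label records in which a cell opens with a `<td>`, `<td`, or merged `<td></td>` structure token and the boxes are stored under `html.cells`. The helper name `count_cells_and_boxes` and the toy records are hypothetical and not part of PaddleOCR.

```python
# Structure tokens assumed to open a table cell (PubTabNet-style labels).
CELL_OPENERS = {"<td>", "<td", "<td></td>"}


def count_cells_and_boxes(record):
    """Return (cells declared in the structure, entries in the cells list)."""
    tokens = record["html"]["structure"]["tokens"]
    n_structure_cells = sum(1 for token in tokens if token in CELL_OPENERS)
    n_boxes = len(record["html"]["cells"])
    return n_structure_cells, n_boxes


box = {"tokens": ["a"], "bbox": [[0, 0], [10, 0], [10, 10], [0, 10]]}

# One row: a cell merged across two columns, then a normal cell -> 2 cells.
row_tokens = ["<tr>", "<td", ' colspan="2"', ">", "</td>", "<td>", "</td>", "</tr>"]

good = {"html": {"structure": {"tokens": row_tokens}, "cells": [box, box]}}
bad = {"html": {"structure": {"tokens": row_tokens}, "cells": [box]}}

for name, record in (("good", good), ("bad", bad)):
    n_cells, n_boxes = count_cells_and_boxes(record)
    status = "ok" if n_cells == n_boxes else "MISMATCH"
    print(name, n_cells, n_boxes, status)
```

A merged cell still counts once: the `colspan` attribute token does not open a new cell, so a cell spanning two columns needs one box, not two.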
-
Please provide the following information to quickly locate the problem
motmetrics 1.4.0
networkx 3.1
numpy 1.23.5
opencv-contrib-python 4.6.0.66
opencv-python 4.5.5.64
openpyxl 3.1.2
opt-einsum 3.3.0
packaging 23.2
paddle-bfloat 0.1.7
paddlepaddle-gpu 2.4.2.post117
pandas 2.0.3
Pillow 9.5.0
pip 23.3.1
Polygon3 3.0.9.1
premailer 3.10.0
protobuf 3.20.0
psutil 5.9.7
pyclipper 1.3.0.post5
pycocotools 2.0.7
pycryptodome 3.20.0
PyMuPDF 1.20.2
pyparsing 3.1.1
python-dateutil 2.8.2
pytz 2023.3.post1
PyWavelets 1.4.1
PyYAML 6.0.1
rapidfuzz 3.6.1
python3 tools/train.py -c ch_ppstructure_mobile_v2.0_SLANet_train/config.yml
[2024/02/05 03:14:50] ppocr ERROR: When parsing line {"filename": "bztable926.jpg", "html": {"structure": {"tokens": ["", "", "", "", "<td", " colspan="2"", ">", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="3"", ">", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "<td", " rowspan="1"", ">", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="2"", ">", "", "<td", " colspan="5"", ">", "", "", "", "", "", "<td", " colspan="2"", ">", "", "<td", " colspan="5"", ">", "", "", "", "", "", "<td", " colspan="2"", ">", "", "<td", " colspan="4"", ">", "", "", "", "", "", "<td", " rowspan="3"", ">", "", "<td", " colspan="2"", ">", "", "", "", "<td", " colspan="3"", ">", "", "", "", "", "", "", "", "", "", "", "", "<td", " colspan="3"", ">", "", "", "", 
"", "", "<td", " colspan="2"", ">", "", "<td", " colspan="5"", ">", "", "", ""]}, "cells": [{"tokens": ["工", "程", "名", "称"], "bbox": [[48, 29], [198, 29], [198, 72], [48, 72]]}, {"tokens": [], "bbox": [[294, 29], [403, 29], [403, 70], [294, 70]]}, {"tokens": ["结", "构", "类", "型"], "bbox": [[464, 29], [613, 29], [613, 71], [464, 71]]}, {"tokens": [], "bbox": [[684, 29], [825, 29], [825, 70], [684, 70]]}, {"tokens": ["层", "数"], "bbox": [[914, 26], [995, 26], [995, 72], [914, 72]]}, {"tokens": [], "bbox": [[1105, 33], [1194, 33], [1194, 63], [1105, 63]]}, {"tokens": ["施", "工", "单", "位"], "bbox": [[49, 89], [197, 89], [197, 130], [49, 130]]}, {"tokens": [], "bbox": [[284, 94], [403, 94], [403, 131], [284, 131]]}, {"tokens": ["技", "术", "负", "责", "人"], "bbox": [[446, 89], [632, 89], [632, 130], [446, 130]]}, {"tokens": [], "bbox": [[692, 92], [808, 92], [808, 126], [692, 126]]}, {"tokens": ["质", "量", "负", "责", "人"], "bbox": [[867, 89], [1047, 89], [1047, 130], [867, 130]]}, {"tokens": [], "bbox": [[1099, 94], [1207, 94], [1207, 124], [1099, 124]]}, {"tokens": ["序", "号"], "bbox": [[29, 171], [120, 171], [120, 217], [29, 217]]}, {"tokens": ["子", "分", "部", "(", "分", "项", ")", "工", "程", "名", "称"], "bbox": [[132, 173], [571, 173], [571, 212], [132, 212]]}, {"tokens": ["(", "粉", "托", ")", "数"], "bbox": [[594, 145], [849, 145], [849, 237], [594, 237]]}, {"tokens": ["1", "4", "7", "4", "7"], "bbox": [[881, 145], [1049, 145], [1049, 239], [881, 239]]}, {"tokens": ["验", "收", "意", "见"], "bbox": [[1092, 171], [1236, 171], [1236, 213], [1092, 213]]}, {"tokens": ["1"], "bbox": [[54, 641], [83, 641], [83, 678], [54, 678]]}, {"tokens": [], "bbox": [[151, 261], [203, 261], [203, 298], [151, 298]]}, {"tokens": [], "bbox": [[295, 266], [481, 266], [481, 294], [295, 294]]}, {"tokens": [], "bbox": [[649, 268], [773, 268], [773, 300], [649, 300]]}, {"tokens": [], "bbox": [[916, 268], [1010, 268], [1010, 289], [916, 289]]}, {"tokens": [], "bbox": [[1125, 268], [1158, 268], [1158, 283], [1125, 
283]]}, {"tokens": [], "bbox": [[164, 329], [207, 329], [207, 353], [164, 353]]}, {"tokens": [], "bbox": [[301, 329], [453, 329], [453, 361], [301, 361]]}, {"tokens": [], "bbox": [[681, 329], [775, 329], [775, 350], [681, 350]]}, {"tokens": [], "bbox": [[934, 322], [977, 322], [977, 340], [934, 340]]}, {"tokens": [], "bbox": [[1105, 322], [1194, 322], [1194, 346], [1105, 346]]}, {"tokens": [], "bbox": [[134, 385], [197, 385], [197, 407], [134, 407]]}, {"tokens": [], "bbox": [[327, 372], [455, 372], [455, 405], [327, 405]]}, {"tokens": [], "bbox": [[673, 377], [794, 377], [794, 407], [673, 407]]}, {"tokens": [], "bbox": [[908, 377], [973, 377], [973, 405], [908, 405]]}, {"tokens": [], "bbox": [[1110, 372], [1186, 372], [1186, 403], [1110, 403]]}, {"tokens": [], "bbox": [[138, 437], [197, 437], [197, 461], [138, 461]]}, {"tokens": [], "bbox": [[310, 437], [445, 437], [445, 466], [310, 466]]}, {"tokens": [], "bbox": [[608, 440], [782, 440], [782, 468], [608, 468]]}, {"tokens": [], "bbox": [[895, 433], [1032, 433], [1032, 466], [895, 466]]}, {"tokens": [], "bbox": [[1090, 439], [1182, 439], [1182, 466], [1090, 466]]}, {"tokens": [], "bbox": [[157, 494], [199, 494], [199, 533], [157, 533]]}, {"tokens": [], "bbox": [[334, 496], [473, 496], [473, 535], [334, 535]]}, {"tokens": [], "bbox": [[651, 498], [790, 498], [790, 522], [651, 522]]}, {"tokens": [], "bbox": [[901, 498], [975, 498], [975, 522], [901, 522]]}, {"tokens": [], "bbox": [[1090, 494], [1160, 494], [1160, 520], [1090, 520]]}, {"tokens": [], "bbox": [[142, 557], [223, 557], [223, 590], [142, 590]]}, {"tokens": [], "bbox": [[349, 559], [458, 559], [458, 579], [349, 579]]}, {"tokens": [], "bbox": [[658, 553], [801, 553], [801, 587], [658, 587]]}, {"tokens": [], "bbox": [[901, 557], [1045, 557], [1045, 590], [901, 590]]}, {"tokens": [], "bbox": [[1110, 555], [1201, 555], [1201, 581], [1110, 581]]}, {"tokens": [], "bbox": [[140, 618], [186, 618], [186, 646], [140, 646]]}, {"tokens": [], "bbox": [[332, 611], [484, 
611], [484, 646], [332, 646]]}, {"tokens": [], "bbox": [[670, 613], [758, 613], [758, 633], [670, 633]]}, {"tokens": [], "bbox": [[899, 602], [971, 602], [971, 640], [899, 640]]}, {"tokens": [], "bbox": [[1118, 618], [1216, 618], [1216, 646], [1118, 646]]}, {"tokens": [], "bbox": [[149, 674], [220, 674], [220, 707], [149, 707]]}, {"tokens": [], "bbox": [[323, 676], [477, 676], [477, 705], [323, 705]]}, {"tokens": [], "bbox": [[631, 674], [760, 674], [760, 705], [631, 705]]}, {"tokens": [], "bbox": [[908, 666], [994, 666], [994, 698], [908, 698]]}, {"tokens": [], "bbox": [[1107, 672], [1207, 672], [1207, 700], [1107, 700]]}, {"tokens": [], "bbox": [[144, 737], [220, 737], [220, 763], [144, 763]]}, {"tokens": [], "bbox": [[316, 731], [462, 731], [462, 766], [316, 766]]}, {"tokens": [], "bbox": [[620, 727], [731, 727], [731, 763], [620, 763]]}, {"tokens": [], "bbox": [[899, 727], [997, 727], [997, 763], [899, 763]]}, {"tokens": [], "bbox": [[1092, 739], [1197, 739], [1197, 759], [1092, 759]]}, {"tokens": [], "bbox": [[153, 798], [216, 798], [216, 829], [153, 829]]}, {"tokens": [], "bbox": [[345, 794], [473, 794], [473, 822], [345, 822]]}, {"tokens": [], "bbox": [[658, 794], [801, 794], [801, 824], [658, 824]]}, {"tokens": [], "bbox": [[905, 789], [984, 789], [984, 818], [905, 818]]}, {"tokens": [], "bbox": [[1092, 789], [1203, 789], [1203, 815], [1092, 815]]}, {"tokens": [], "bbox": [[145, 850], [216, 850], [216, 877], [145, 877]]}, {"tokens": [], "bbox": [[295, 846], [464, 846], [464, 883], [295, 883]]}, {"tokens": [], "bbox": [[636, 842], [747, 842], [747, 885], [636, 885]]}, {"tokens": [], "bbox": [[903, 848], [1007, 848], [1007, 872], [903, 872]]}, {"tokens": [], "bbox": [[1099, 846], [1192, 846], [1192, 872], [1099, 872]]}, {"tokens": [], "bbox": [[149, 911], [208, 911], [208, 944], [149, 944]]}, {"tokens": [], "bbox": [[351, 915], [473, 915], [473, 950], [351, 950]]}, {"tokens": [], "bbox": [[653, 907], [808, 907], [808, 940], [653, 940]]}, {"tokens": [], 
"bbox": [[894, 902], [1010, 902], [1010, 935], [894, 935]]}, {"tokens": [], "bbox": [[1101, 905], [1186, 905], [1186, 937], [1101, 937]]}, {"tokens": [], "bbox": [[147, 968], [194, 968], [194, 994], [147, 994]]}, {"tokens": [], "bbox": [[334, 972], [529, 972], [529, 1000], [334, 1000]]}, {"tokens": [], "bbox": [[636, 965], [762, 965], [762, 1005], [636, 1005]]}, {"tokens": [], "bbox": [[890, 961], [1014, 961], [1014, 985], [890, 985]]}, {"tokens": [], "bbox": [[1118, 970], [1205, 970], [1205, 1005], [1118, 1005]]}, {"tokens": [], "bbox": [[147, 1020], [199, 1020], [199, 1048], [147, 1048]]}, {"tokens": [], "bbox": [[331, 1020], [492, 1020], [492, 1053], [331, 1053]]}, {"tokens": [], "bbox": [[651, 1027], [829, 1027], [829, 1050], [651, 1050]]}, {"tokens": [], "bbox": [[877, 1018], [1027, 1018], [1027, 1059], [877, 1059]]}, {"tokens": [], "bbox": [[1092, 1029], [1214, 1029], [1214, 1061], [1092, 1061]]}, {"tokens": ["2"], "bbox": [[56, 1079], [83, 1079], [83, 1119], [56, 1119]]}, {"tokens": ["质", "量", "控", "制", "资", "料"], "bbox": [[178, 1079], [391, 1079], [391, 1115], [178, 1115]]}, {"tokens": [], "bbox": [[608, 1090], [788, 1090], [788, 1120], [608, 1120]]}, {"tokens": [], "bbox": [[1144, 1090], [1220, 1090], [1220, 1116], [1144, 1116]]}, {"tokens": ["3"], "bbox": [[56, 1161], [83, 1161], [83, 1203], [56, 1203]]}, {"tokens": ["安", "全", "检", "测", "力", "报", "验"], "bbox": [[160, 1137], [414, 1137], [414, 1233], [160, 1233]]}, {"tokens": [], "bbox": [[657, 1163], [820, 1163], [820, 1205], [657, 1205]]}, {"tokens": [], "bbox": [[1121, 1163], [1207, 1163], [1207, 1224], [1121, 1224]]}, {"tokens": ["4"], "bbox": [[54, 1248], [85, 1248], [85, 1286], [54, 1286]]}, {"tokens": ["观", "感", "质", "量", "验", "收"], "bbox": [[174, 1241], [393, 1241], [393, 1286], [174, 1286]]}, {"tokens": [], "bbox": [[647, 1248], [907, 1248], [907, 1281], [647, 1281]]}, {"tokens": [], "bbox": [[1108, 1252], [1225, 1252], [1225, 1277], [1108, 1277]]}, {"tokens": ["单", "收"], "bbox": [[27, 1378], 
[121, 1378], [121, 1489], [27, 1489]]}, {"tokens": ["施", "工", "单", "位"], "bbox": [[208, 1302], [357, 1302], [357, 1346], [208, 1346]]}, {"tokens": ["项", "目", "经", "理"], "bbox": [[501, 1299], [651, 1299], [651, 1346], [501, 1346]]}, {"tokens": [], "bbox": [[744, 1311], [807, 1311], [807, 1333], [744, 1333]]}, {"tokens": ["年", "月", "日"], "bbox": [[981, 1299], [1225, 1299], [1225, 1350], [981, 1350]]}, {"tokens": ["设", "计", "单", "位"], "bbox": [[210, 1362], [357, 1362], [357, 1404], [210, 1404]]}, {"tokens": ["项", "目", "负", "责", "人"], "bbox": [[486, 1362], [669, 1362], [669, 1404], [486, 1404]]}, {"tokens": [], "bbox": [[736, 1365], [820, 1365], [820, 1400], [736, 1400]]}, {"tokens": ["年", "月", "日"], "bbox": [[981, 1357], [1218, 1357], [1218, 1398], [981, 1398]]}, {"tokens": ["监", "理", "(", "建", "设", ")", "单", "位"], "bbox": [[140, 1475], [426, 1475], [426, 1510], [140, 1510]]}, {"tokens": [], "bbox": [[584, 1472], [1023, 1472], [1023, 1509], [584, 1509]]}]}, "gt": "<td colspan="2"><td colspan="2"><td colspan="2"><td colspan="3">工程名称<td colspan="2"><td rowspan="1">层数<td colspan="2">施工单位<td colspan="2">序号<td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="2"><td colspan="5"><td colspan="2">安全检测力报验<td colspan="5"><td colspan="2">4<td colspan="4">观感质量验收<td rowspan="3"><td colspan="2">单收<td colspan="3">项目经理<td colspan="3"><td colspan="2">监理(建设)单位<td colspan="5">
  File "/home/zgc/PaddleOCR-release-2.6/ppocr/data/pubtab_dataset.py", line 119, in __getitem__
    outs = transform(data, self.ops)
  File "/home/zgc/PaddleOCR-release-2.6/ppocr/data/imaug/__init__.py", line 56, in transform
    data = op(data)
  File "/home/zgc/PaddleOCR-release-2.6/ppocr/data/imaug/label_ops.py", line 725, in __call__
    if 'bbox' in cells[bbox_idx] and len(cells[bbox_idx][
IndexError: list index out of range
We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no):
yes
Please try not to include images in the issue.
My yml config is:
Global:
  use_gpu: true
  epoch_num: 400
  log_smooth_window: 20
  print_batch_step: 20
  save_model_dir: ./output/SLANet_en_merge_newdict_one
  save_epoch_step: 400
  eval_batch_step:
  cal_metric_during_train: true
  pretrained_model: ./ch_ppstructure_mobile_v2.0_SLANet_train/best_accuracy.pdparams
  checkpoints:
  save_inference_dir: ./output/SLANet_en_merge_newdict_one/infer
  use_visualdl: false
  infer_img: doc/table/table.jpg
  character_dict_path: ppocr/utils/dict/table_structure_dict_ch.txt
  character_type: en
  max_text_length: 1000
  box_format: xyxyxyxy
  infer_mode: false
  use_sync_bn: true
  save_res_path: output/infer
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  clip_norm: 5.0
  lr:
    learning_rate: 0.00025
  regularizer:
    name: L2
    factor: 0.0
Architecture:
  model_type: table
  algorithm: SLANet
  Backbone:
    name: PPLCNet
    scale: 1.0
    pretrained: true
    use_ssld: true
  Neck:
    name: CSPPAN
    out_channels: 96
  Head:
    name: SLAHead
    hidden_size: 256
    max_text_length: 1000
    loc_reg_num: 8
Loss:
  name: SLALoss
  structure_weight: 1.0
  loc_weight: 2.0
  loc_loss: smooth_l1
PostProcess:
  name: TableLabelDecode
  merge_no_span_structure: true
Metric:
  name: TableMetric
  main_indicator: acc
  compute_bbox_metric: false
  loc_reg_num: 8
  box_format: xyxyxyxy
  del_thead_tbody: false
Train:
  dataset:
    name: PubTabDataSet
    data_dir: ./tabel_one_train/train_image
    label_file_list:
    ratio_list:
    transforms:
      img_mode: BGR
      channel_first: false
      learn_empty_box: false
      merge_no_span_structure: true
      replace_empty_cell_token: false
      loc_reg_num: 8
      max_text_length: 1000
      box_format: xyxyxyxy
      max_len: 488
      scale: 1./255.
      mean:
      std:
      order: hwc
      size:
      keep_keys:
  loader:
    shuffle: true
    batch_size_per_card: 1
    drop_last: true
    num_workers: 1
    use_shared_memory: false
Eval:
  dataset:
    name: PubTabDataSet
    data_dir: ./tabel_one_train/val_image/
    label_file_list:
    transforms:
      img_mode: BGR
      channel_first: false
      learn_empty_box: false
      merge_no_span_structure: true
      replace_empty_cell_token: false
      loc_reg_num: 8
      max_text_length: 1000
      box_format: xyxyxyxy
      max_len: 488
      scale: 1./255.
      mean:
      std:
      order: hwc
      size:
      keep_keys:
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1
    num_workers: 1
    use_shared_memory: false
profiler_options: null
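Original image:
I have tried many images and they all hit this error. Training on PubTabNet works without problems. The failing data was annotated and exported with PPOCRLabel.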
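The `IndexError` on `cells[bbox_idx]` in the traceback is what one would expect if the structure tokens declare more cells than the `cells` list holds. The following is a simplified illustration of that failure mode, not the actual PaddleOCR code; `pair_cells_with_boxes` and the set of cell-opening tokens are assumptions made for the example.

```python
# Structure tokens assumed to open a table cell (PubTabNet-style labels).
CELL_OPENERS = {"<td>", "<td", "<td></td>"}


def pair_cells_with_boxes(structure_tokens, cells):
    """Walk the structure and take the next entry of `cells` for every cell."""
    bbox_idx = 0
    pairs = []
    for token in structure_tokens:
        if token in CELL_OPENERS:
            cell = cells[bbox_idx]  # raises IndexError once the boxes run out
            pairs.append((bbox_idx, cell.get("bbox")))
            bbox_idx += 1
    return pairs


# Two cells in the structure, but only one annotated box.
tokens = ["<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>"]
one_box = [{"tokens": ["a"], "bbox": [[0, 0], [10, 0], [10, 10], [0, 10]]}]

try:
    pair_cells_with_boxes(tokens, one_box)
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: list index out of range
```

If this is the cause, the fix is in the annotation rather than the config: every cell in the structure, including empty ones, needs its own box.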