Replies: 1 comment
Could you share a sample of your data so I can locate the problem?
Paddle version: 3.1.0
Config: /PaddleOCR-main/configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml
Dataset:
1) It contains over 20,000 images.
2) Each image is 320 pixels wide by 32 pixels high.
3) Each image shows a black character on a white background and contains exactly one Chinese character.
4) Image file names follow a naming convention such as char_4E00.
Pretrained model: PP-OCRv5_server_rec_pretrained.pdparams
The yml file, which I modified to fit this dataset, is as follows:
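To make the naming convention concrete, here is a minimal sketch of how label lines and a character dictionary could be derived from such file names. It assumes that the part after `char_` is a hexadecimal Unicode code point and that the images use a `.png` extension; both are assumptions, and the helper names `label_from_filename`, `build_label_lines`, and `build_dictionary` are hypothetical, not part of PaddleOCR.

```python
from pathlib import Path


def label_from_filename(filename: str) -> str:
    """Return the character encoded in a name such as 'char_4E00.png'.

    Assumption: the text after 'char_' is a hexadecimal Unicode code point.
    """
    stem = Path(filename).stem  # e.g. 'char_4E00'
    prefix, _, hex_code = stem.partition("_")
    if prefix != "char" or not hex_code:
        raise ValueError(f"unexpected file name: {filename!r}")
    return chr(int(hex_code, 16))


def build_label_lines(filenames):
    """Build 'image_path<TAB>label' lines, one per image, sorted by file name."""
    return [f"{name}\t{label_from_filename(name)}" for name in sorted(filenames)]


def build_dictionary(filenames):
    """Return the distinct characters in sorted order, one per dictionary line."""
    return sorted({label_from_filename(name) for name in filenames})


if __name__ == "__main__":
    sample = ["char_4E01.png", "char_4E00.png"]  # hypothetical file names
    print("\n".join(build_label_lines(sample)))
    print("\n".join(build_dictionary(sample)))
```

The label lines would go into files such as train_list.txt and val_list.txt, and the dictionary lines into ch_dict.txt.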
```
Global:
  model_name: PP-OCRv5_server_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 75
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/ppocrv5_ch
  save_epoch_step: 1
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  calc_epoch_interval: 1
  pretrained_model: /home/aistudio/work/PaddleOCR-main/model/PP-OCRv5_server_rec_pretrained.pdparams
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: /home/aistudio/work/PaddleOCR-main/proc-data/train_data/ch_dict.txt
  max_text_length: 1
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv5.txt
  d2s_train_image_shape: [3, 32, 320]
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0005
    warmup_epoch: 1
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_HGNet
  Transform:
  Backbone:
    name: PPHGNetV2_B4
    text_rec: True
  Head:
    name: CTCHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 120
            depth: 2
            hidden_dims: 120
            kernel_size: [1, 3]
            use_guide: True
          Head:
            fc_decay: 0.00001
Loss:
  name: CTCLoss
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: /home/aistudio/work/PaddleOCR-main/proc-data/train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - /home/aistudio/work/PaddleOCR-main/proc-data/train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - KeepKeys:
        keep_keys:
        - image
        - label
  sampler:
    name: MultiScaleSampler
    scales: [[320, 32]]
    first_bs: &bs 128
    fix_bs: false
    divided_factor: [8, 16] # w, h
    is_training: True
  loader:
    shuffle: true
    batch_size_per_card: 256
    drop_last: true
    num_workers: 16
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/work/PaddleOCR-main/proc-data/train_data/
    label_file_list:
    - /home/aistudio/work/PaddleOCR-main/proc-data/train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecResizeImg:
        image_shape: [3, 32, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 512
    num_workers: 4
```
Training with this configuration file fails with the following error:
```
Traceback (most recent call last):
  File "/home/aistudio/work/PaddleOCR-main/tools/train.py", line 272, in <module>
    main(config, device, logger, vdl_writer, seed)
  File "/home/aistudio/work/PaddleOCR-main/tools/train.py", line 225, in main
    program.train(
  File "/home/aistudio/work/PaddleOCR-main/tools/program.py", line 356, in train
    preds = model(images, data=batch[1:])
  File "/opt/conda/envs/pure-paddle/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1571, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/work/PaddleOCR-main/ppocr/modeling/architectures/base_model.py", line 99, in forward
    x = self.head(x, targets=data)
  File "/opt/conda/envs/pure-paddle/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1571, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/work/PaddleOCR-main/ppocr/modeling/heads/rec_ctc_head.py", line 79, in forward
    predicts = self.fc(x)
  File "/opt/conda/envs/pure-paddle/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1571, in __call__
    return self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/pure-paddle/lib/python3.10/site-packages/paddle/nn/layer/common.py", line 223, in forward
    out = F.linear(
  File "/opt/conda/envs/pure-paddle/lib/python3.10/site-packages/paddle/nn/functional/common.py", line 2310, in linear
    return _C_ops.linear(x, weight, bias)
ValueError: (InvalidArgument) Input(Y) has error dim. Y'dims[0] must be equal to 40, but received Y'dims[0] is 2048.
  [Hint: Expected y_dims[y_ndim - 2] == K, but received y_dims[y_ndim - 2]:2048 != K:40.] (at ../paddle/phi/kernels/impl/matmul_kernel_impl.h:332)
  [operator < linear > error]
```
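The message says that the last dimension of the tensor reaching `self.fc` is 40, while the first dimension of the linear layer's weight is 2048. The sketch below only mimics that shape rule in plain Python; it does not call Paddle. The 4D feature shape `(batch, 2048, 1, 40)` is an assumption chosen to match the two numbers in the message, `num_classes` is a hypothetical value, and the helper names are made up for illustration.

```python
def linear_output_shape(x_shape, weight_shape):
    """Shape rule of y = x @ W: the last dim of x must equal the first dim of W."""
    k = x_shape[-1]
    if weight_shape[0] != k:
        raise ValueError(
            f"weight dims[0] must be equal to {k}, but received {weight_shape[0]}"
        )
    return tuple(x_shape[:-1]) + (weight_shape[1],)


def to_sequence_layout(feature_shape):
    """Map a (N, C, 1, W) feature map shape to the (N, W, C) sequence layout."""
    n, c, h, w = feature_shape
    if h != 1:
        raise ValueError(f"expected height 1, got {h}")
    return (n, w, c)


if __name__ == "__main__":
    num_classes = 1000                  # hypothetical dictionary size
    weight = (2048, num_classes)        # weight shape implied by the error message
    feature_map = (128, 2048, 1, 40)    # assumed 4D shape reaching the linear layer

    try:
        linear_output_shape(feature_map, weight)
    except ValueError as err:
        print("4D feature map:", err)

    sequence = to_sequence_layout(feature_map)
    print("sequence layout:", linear_output_shape(sequence, weight))
```

Under these assumptions, the linear layer only accepts the input once the channel dimension (2048) is last, which suggests checking whether the tensor is converted to a sequence layout before it reaches the fully connected layer.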
I have run out of ideas for what to change, and I would appreciate any suggestions from the community.