关于自己训练PaddleOCR识别模型后,模型的泛化能力下降的问题求助 #15148
Unanswered
wsybb252237
asked this question in
Q&A
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
自己在进行手写字的训练的时候,使用的是自己数据集(大概4256张图片),使用的预训练模型是“ch_PP-OCRv4_rec_server_infer”,但是训练完以后,在自己的数据集上识别效果不错,但是在其他的汉字的识别上效果较差,这些字是预训练模型应该能够识别出来的字,我想请教一下究竟是什么原因导致的,难道自己的数据集微调后,模型在其他数据上的泛化能力就会变差吗,希望能够得到回复,万分感谢!下面是我的参数:
Global:
debug: false
use_gpu: true
epoch_num: 100
log_smooth_window: 20
print_batch_step: 10
save_model_dir: ./output/rec_ppocr_v4_hgnet_server
save_epoch_step: 10
eval_batch_step: [0, 200]
cal_metric_during_train: true
pretrained_model: /PaddleOCR_server/pretrained/ch_PP-OCRv4_rec_server_train/best_accuracy.pdparams
checkpoints:
save_inference_dir:
use_visualdl: false
infer_img: doc/imgs_words/ch/word_1.jpg
character_dict_path: ppocr/utils/ppocr_keys_v1.txt
max_text_length: &max_text_length 25
infer_mode: false
use_space_char: true
distributed: true
save_res_path: ./output/rec/predicts_ppocrv3.txt
d2s_train_image_shape: [3, 48, 320]
Optimizer:
name: Adam
beta1: 0.9
beta2: 0.999
lr:
name: Piecewise
learning_rate: [0.00002, 0.000005]
regularizer:
name: L2
factor: 3.0e-05
Architecture:
model_type: rec
algorithm: SVTR_HGNet
Transform:
Backbone:
name: PPHGNet_small
Head:
name: MultiHead
head_list:
- CTCHead:
Neck:
name: svtr
dims: 120
depth: 2
hidden_dims: 120
kernel_size: [1, 3]
use_guide: True
Head:
fc_decay: 0.00001
- NRTRHead:
nrtr_dim: 384
max_text_length: *max_text_length
Loss:
name: MultiLoss
loss_config_list:
- CTCLoss:
- NRTRLoss:
PostProcess:
name: CTCLabelDecode
Metric:
name: RecMetric
main_indicator: acc
Train:
dataset:
name: MultiScaleDataSet
ds_width: false
data_dir: ./train_data/
ext_op_transform_idx: 1
label_file_list:
- ./train_data/rec/train.txt
transforms:
- DecodeImage:
img_mode: BGR
channel_first: false
-去除 RecConAug 增广
- RecConAug:
prob: 0.5
ext_data_num: 2
image_shape: [48, 320, 3]
max_text_length: *max_text_length
sampler:
name: MultiScaleSampler
scales: [[320, 32], [320, 48], [320, 64]]
first_bs: &bs 64
fix_bs: false
divided_factor: [8, 16] # w, h
is_training: True
loader:
shuffle: true
batch_size_per_card: 8
drop_last: true
num_workers: 8
Eval:
dataset:
name: SimpleDataSet
data_dir: ./train_data
label_file_list:
- ./train_data/rec/val.txt
transforms:
- DecodeImage:
img_mode: BGR
channel_first: false
- MultiLabelEncode:
gtc_encode: NRTRLabelEncode
- RecResizeImg:
image_shape: [3, 48, 320]
- KeepKeys:
keep_keys:
- image
- label_ctc
- label_gtc
- length
- valid_ratio
loader:
shuffle: false
drop_last: false
batch_size_per_card: 8
num_workers: 4
Beta Was this translation helpful? Give feedback.
All reactions