Forgetting after fine-tuning the rec (text recognition) model
#17575
Have you solved this?
System: Windows 10 22H2 Pro
Python: 3.13.5
CUDA: 12.9
paddlepaddle version: 3.1.0
Repository version: 3.0.0
Model: PP-OCRv5
GPU: NVIDIA RTX 3090, 24 GB VRAM, single card
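I am fine-tuning from the official pretrained model. After the first fine-tuning run, a set of characters was recognized correctly. I then added new data, different from the first batch, and trained a second time. After the second run, the characters that were recognized correctly after the first run are no longer recognized correctly. Which parameters should I adjust? The config file is below. Each time I add new data, I restart training from the official pretrained model; I do not train incrementally on top of my previous fine-tuned model. Figure 1 (图1.bmp) is the recognition result after the first run and is correct. Figure 2 (图2.bmp) is the result after the second run and is wrong.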
Global:
  model_name: PP-OCRv5_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 60
  log_smooth_window: 20
  print_batch_step: 1
  save_model_dir: G:\PaddleOCR-3.3.0\output\PP-OCRv5_mobile\PP-OCRv5_mobile_rec
  save_epoch_step: 1
  eval_batch_step: [0, 323]
  cal_metric_during_train: true
  pretrained_model: G:\PaddleOCR-3.3.0\pretrained_model\PP-OCRv5_mobile\PP-OCRv5_mobile_rec_pretrained.pdparams
  checkpoints:
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ./ppocr/utils/dict/ppocrv5_dict.txt
  max_text_length: &max_text_length 60
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv5.txt
  d2s_train_image_shape: [3, 48, 320]
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.00001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: PPLCNetV3
    scale: 0.95
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 120
            depth: 2
            hidden_dims: 120
            kernel_size: [1, 3]
            use_guide: True
          Head:
            fc_decay: 0.00001
      - NRTRHead:
          nrtr_dim: 384
          max_text_length: *max_text_length
Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - NRTRLoss:
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: G:\PaddleOCR-3.3.0\train_data\rec
    ext_op_transform_idx: 1
    label_file_list:
      - G:\PaddleOCR-3.3.0\train_data\rec\train.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - RecConAug:
          prob: 0.5
          ext_data_num: 2
          image_shape: [48, 320, 3]
          max_text_length: *max_text_length
      - RecAug:
      - MultiLabelEncode:
          gtc_encode: NRTRLabelEncode
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_gtc
            - length
            - valid_ratio
  sampler:
    name: MultiScaleSampler
    scales: [[320, 32], [320, 48], [320, 64]]
    first_bs: &bs 96
    fix_bs: false
    divided_factor: [8, 16] # w, h
    is_training: True
  loader:
    shuffle: true
    batch_size_per_card: *bs
    drop_last: true
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: G:\PaddleOCR-3.3.0\train_data\rec
    label_file_list:
      - G:\PaddleOCR-3.3.0\train_data\rec\val.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - MultiLabelEncode:
          gtc_encode: NRTRLabelEncode
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_gtc
            - length
            - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 96
    num_workers: 4
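Since the question is which parameters to adjust, it may help to see what the `lr` block implies over a whole run. The following is a generic sketch of linear warmup followed by cosine decay, using the values from the config (`learning_rate: 0.00001`, `warmup_epoch: 5`, `epoch_num: 60`). It is not PaddleOCR's scheduler code, and the exact curve PaddleOCR produces may differ. The function name and `steps_per_epoch` are placeholders, since the number of iterations per epoch depends on the dataset size.

```python
import math


def warmup_cosine_lr(step, steps_per_epoch, base_lr=1e-5, epochs=60, warmup_epochs=5):
    """Learning rate at a given global step for a generic warmup + cosine schedule.

    Assumption: linear warmup from 0 to base_lr over `warmup_epochs`, then
    cosine decay to 0 over the remaining steps. PaddleOCR's own `Cosine`
    scheduler may differ in detail.
    """
    warmup_steps = warmup_epochs * steps_per_epoch
    total_steps = epochs * steps_per_epoch
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

Calling it with a placeholder such as `steps_per_epoch=100` at a few steps (start, end of warmup, midpoint, last step) gives a rough picture of how small the updates are at each stage of the 60 epochs.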
图1.bmp
图2.bmp
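One thing the description leaves open is whether the `train.txt` used for the second run still contains the first-round samples. Below is a minimal sketch for checking that, assuming the two label files are kept separately and use the usual PaddleOCR recognition format of one `image_path<TAB>text` pair per line. The function names and file names are hypothetical.

```python
from collections import Counter
from pathlib import Path


def read_labels(label_file):
    """Return the transcription of every sample in a label file.

    Each line is expected to look like "<image path>\t<text>".
    """
    texts = []
    for line in Path(label_file).read_text(encoding="utf-8").splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        _, text = line.split("\t", 1)
        texts.append(text)
    return texts


def char_counts(texts):
    """Count how often each character appears across all transcriptions."""
    counts = Counter()
    for text in texts:
        counts.update(text)
    return counts


def coverage_report(round1_file, round2_file):
    """Return the characters that occur in round 1 but never in round 2."""
    c1 = char_counts(read_labels(round1_file))
    c2 = char_counts(read_labels(round2_file))
    return sorted(ch for ch in c1 if ch not in c2)
```

For example, `coverage_report("train_round1.txt", "train_round2.txt")` (hypothetical file names) lists the characters the second run never saw. If that list contains the characters that stopped being recognized, the training data for the second run is a likelier cause than any single hyperparameter.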