-
But in ch_PP-OCRv4_rec_distillation.yml, how exactly should the freeze_params: true parameter be filled in?
-
Also: what should be filled in inside ch_PP-OCRv4_rec_distillation.yml?
-
Your issue is that the fine-tuned model works on the new training data but no longer works on the old data. This is usually related to overfitting or to model parameters not being frozen correctly. Below are some possible causes and corresponding adjustment suggestions: 🔍 Possible causes:
🛠 Recommended adjustments:
🔗 Related references: although your question is mainly about … if you have other test cases, you can provide more detailed logs and we can analyze the tuning strategy further! 🚀 Response generated by 🤖 feifei-bot | chatgpt-4o-latest
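A minimal sketch of the kind of change being discussed here, assuming the intent is to keep the Teacher branch fixed while only the Student is fine-tuned; which branch to freeze, and the comments, are illustrative assumptions rather than a confirmed fix from this thread:

Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:            # weights come from Global.pretrained_model in the setup further down
      freeze_params: true    # keep every Teacher parameter fixed during fine-tuning
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      # Transform/Backbone/Head stay exactly as in the full config quoted later in this thread
    Student:
      pretrained:
      freeze_params: false   # the Student keeps learning on the new data
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      # Transform/Backbone/Head unchanged as well

When freeze_params is true, the DistillationModel wrapper marks that sub-model's parameters as non-trainable, so a frozen Teacher stays anchored to the pretrained weights and the DML/feature-distance losses keep pulling the Student toward the original behaviour. Whether this alone prevents the regression on old data also depends on the learning rate and on what data is used for fine-tuning.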
-
@GreatV Also: what should be filled in inside ch_PP-OCRv4_rec_distillation.yml?
-
🔎 Search before asking
🐛 Bug (问题描述)
According to finetune.md: it is recommended to choose the PP-OCRv3 model (config file: ch_PP-OCRv3_rec_distillation.yml, pretrained model: ch_PP-OCRv3_rec_train.tar) for fine-tuning, since its accuracy and generalization are the best among the pretrained models currently provided. I have a question about how one parameter should be filled in:

Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:

Here freeze_params is false, but in ch_PP-OCRv3_rec_distillation.yml the same parameter is freeze_params: true. How exactly should this parameter be set? The model I fine-tuned by following finetune.md works well on the fine-tuning data, but seems to no longer recognize data it could handle before. Do I need to configure frozen parameters somewhere for this to work, and if so, how should it be configured?
🏃♂️ Environment (运行环境)
ppocr release/2.10.0
Command executed: python tools/train.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml
🌰 Minimal Reproducible Example (最小可复现问题的Demo)
ppocr release/2.10.0
Command executed: python tools/train.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml
Parameter configuration:
Global:
  debug: false
  use_gpu: true
  epoch_num: 10
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v3_distillation1
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model: ./pretrained_models/ch_PP-OCRv3_rec_train/best_accuracy.pdparams
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_distillation.txt
  d2s_train_image_shape: [3, 48, -1]
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0001
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0
Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
        last_pool_kernel_size: [2, 2]
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length
    Student:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
        last_pool_kernel_size: [2, 2]
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length
Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDMLLoss:
      weight: 1.0
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: ctc
      name: dml_ctc
  - DistillationDMLLoss:
      weight: 0.5
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: sar
      name: dml_sar
  - DistillationDistanceLoss:
      weight: 1.0
      mode: "l2"
      model_name_pairs:
      - ["Student", "Teacher"]
      key: backbone_out
  - DistillationCTCLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True
  - DistillationSARLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True
PostProcess:
  name: DistillationCTCLabelDecode
  model_name: ["Student", "Teacher"]
  key: head_out
  multi_head: True
Metric:
  name: DistillationMetric
  base_metric_name: RecMetric
  main_indicator: acc
  key: "Student"
  ignore_space: False
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./HW_Chinese/
    ext_op_transform_idx: 1
    label_file_list:
    - ./HW_Chinese/train.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 32
    drop_last: true
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./HW_Chinese/
    label_file_list:
    - ./HW_Chinese/test.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 32
    num_workers: 4
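For the symptom described in this issue (good results on the new handwriting data, regression on text that was recognized before), a commonly suggested complement to freeze_params is to mix the original general-purpose recognition data back into training and reduce the learning rate. A hedged sketch against the Train and Optimizer sections above; the ./train_data/general_rec_train.txt label file, the sampling ratios, and the 0.00005 learning rate are illustrative assumptions, not values used in this thread:

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./HW_Chinese/
    ext_op_transform_idx: 1
    label_file_list:
    - ./HW_Chinese/train.txt              # new fine-tuning data
    - ./train_data/general_rec_train.txt  # hypothetical: original/general data
    ratio_list: [1.0, 1.0]                # fraction of each label file to sample; lower a value to down-weight it
    # transforms and loader unchanged from the config above

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.00005                # smaller than the 0.0001 above, to limit drift from the pretrained weights
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0

Note that data_dir is prepended to every image path listed in the label files, so the image paths in any extra label file have to resolve relative to the same data_dir.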