-
But in ch_PP-OCRv4_rec_distillation.yml, how exactly should the freeze_params: true parameter be filled in?
-
Also: what should be filled in inside ch_PP-OCRv4_rec_distillation.yml?
-
Your issue is that the fine-tuned model works on the new training data but no longer works on the old data. This is usually related to overfitting or to model parameters not being frozen correctly. Below are some possible causes and corresponding adjustment suggestions: 🔍 Possible causes:
🛠 Recommended adjustments:
🔗 Related references: although your question is mainly about … if you have other test cases, you can provide more detailed logs and we can analyze the tuning strategy further! 🚀 Response generated by 🤖 feifei-bot | chatgpt-4o-latest
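A minimal sketch of the kind of change being discussed here, assuming the intent is to keep the Teacher branch fixed while only the Student is fine-tuned; which branch to freeze, and the comments, are illustrative assumptions rather than a confirmed fix from this thread:

Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:            # weights come from Global.pretrained_model in the setup further down
      freeze_params: true    # keep every Teacher parameter fixed during fine-tuning
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      # Transform/Backbone/Head stay exactly as in the full config quoted later in this thread
    Student:
      pretrained:
      freeze_params: false   # the Student keeps learning on the new data
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      # Transform/Backbone/Head unchanged as well

When freeze_params is true, the DistillationModel wrapper marks that sub-model's parameters as non-trainable, so a frozen Teacher stays anchored to the pretrained weights and the DML/feature-distance losses keep pulling the Student toward the original behaviour. Whether this alone prevents the regression on old data also depends on the learning rate and on what data is used for fine-tuning.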
-
@GreatV Also: what should be filled in inside ch_PP-OCRv4_rec_distillation.yml?
-
🔎 Search before asking
🐛 Bug (问题描述)
According to finetune.md: it is recommended to choose the PP-OCRv3 model (config file: ch_PP-OCRv3_rec_distillation.yml, pretrained model: ch_PP-OCRv3_rec_train.tar) for fine-tuning, since its accuracy and generalization are the best among the pretrained models currently provided. I have a question about how one parameter should be filled in:

Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:

Here freeze_params is false, but in ch_PP-OCRv3_rec_distillation.yml the same parameter is freeze_params: true. How exactly should this parameter be set? The model I fine-tuned by following finetune.md works well on the fine-tuning data, but seems to no longer recognize data it could handle before. Do I need to configure frozen parameters somewhere for this to work, and if so, how should it be configured?
🏃♂️ Environment (运行环境)
ppocr release/2.10.0
Command executed: python tools/train.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml
🌰 Minimal Reproducible Example (最小可复现问题的Demo)
ppocr release/2.10.0
Command executed: python tools/train.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml
Parameter configuration:
Global:
  debug: false
  use_gpu: true
  epoch_num: 10
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v3_distillation1
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model: ./pretrained_models/ch_PP-OCRv3_rec_train/best_accuracy.pdparams
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_distillation.txt
  d2s_train_image_shape: [3, 48, -1]
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0001
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0
Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
        last_pool_kernel_size: [2, 2]
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length
    Student:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
        last_pool_kernel_size: [2, 2]
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length
Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDMLLoss:
      weight: 1.0
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: ctc
      name: dml_ctc
  - DistillationDMLLoss:
      weight: 0.5
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: sar
      name: dml_sar
  - DistillationDistanceLoss:
      weight: 1.0
      mode: "l2"
      model_name_pairs:
      - ["Student", "Teacher"]
      key: backbone_out
  - DistillationCTCLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True
  - DistillationSARLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True
PostProcess:
  name: DistillationCTCLabelDecode
  model_name: ["Student", "Teacher"]
  key: head_out
  multi_head: True
Metric:
  name: DistillationMetric
  base_metric_name: RecMetric
  main_indicator: acc
  key: "Student"
  ignore_space: False
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./HW_Chinese/
    ext_op_transform_idx: 1
    label_file_list:
    - ./HW_Chinese/train.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 32
    drop_last: true
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./HW_Chinese/
    label_file_list:
    - ./HW_Chinese/test.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 32
    num_workers: 4
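For the symptom described in this issue (good results on the new handwriting data, regression on text that was recognized before), a commonly suggested complement to freeze_params is to mix the original general-purpose recognition data back into training and reduce the learning rate. A hedged sketch against the Train and Optimizer sections above; the ./train_data/general_rec_train.txt label file, the sampling ratios, and the 0.00005 learning rate are illustrative assumptions, not values used in this thread:

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./HW_Chinese/
    ext_op_transform_idx: 1
    label_file_list:
    - ./HW_Chinese/train.txt              # new fine-tuning data
    - ./train_data/general_rec_train.txt  # hypothetical: original/general data
    ratio_list: [1.0, 1.0]                # fraction of each label file to sample; lower a value to down-weight it
    # transforms and loader unchanged from the config above

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.00005                # smaller than the 0.0001 above, to limit drift from the pretrained weights
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0

Note that data_dir is prepended to every image path listed in the label files, so the image paths in any extra label file have to resolve relative to the same data_dir.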