Replies: 9 comments 10 replies
-
You haven't told us anything. How are we supposed to help?
-
@Zomcxj Sorry about that. Here is my config file, which I modified from ch_PP-OCRv3_rec_distillation.yml:
Global:
  debug: false
  use_gpu: true
  epoch_num: 200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v3_distillation
  save_epoch_step: 10
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model: Model/ch_PP-OCRv3_rec_train/best_accuracy.pdparams
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/en_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_distillation.txt
  d2s_train_image_shape: [3, 48, -1]
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Piecewise
    decay_epochs : [700]
    values : [0.0005, 0.00005]
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
        last_pool_kernel_size: [2, 2]
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length
    Student:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
        last_pool_kernel_size: [2, 2]
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length
Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDMLLoss:
      weight: 1.0
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: ctc
      name: dml_ctc
  - DistillationDMLLoss:
      weight: 0.5
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: sar
      name: dml_sar
  - DistillationDistanceLoss:
      weight: 1.0
      mode: "l2"
      model_name_pairs:
      - ["Student", "Teacher"]
      key: backbone_out
  - DistillationCTCLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True
  - DistillationSARLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True
PostProcess:
  name: DistillationCTCLabelDecode
  model_name: ["Student", "Teacher"]
  key: head_out
  multi_head: True
Metric:
  name: DistillationMetric
  base_metric_name: RecMetric
  main_indicator: acc
  key: "Student"
  ignore_space: False
Train:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - train_data/train.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
        max_text_length: *max_text_length
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 8
    drop_last: true
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/
    label_file_list:
    - train_data/val.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 8
    num_workers: 4
-
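@Zomcxj The problem now is this: after converting the trained model to an inference model, running tools/export_model.py from the command line recognizes text correctly. However, I want to run it from a Python script through the paddleocr package, so I followed the modification steps in the official documentation: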
-
@Zomcxj Correction to the above: it is tools/predict_rec.py, run from the command line, that recognizes text correctly. export_model.py is the model conversion script.
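For anyone landing here with the same setup, a minimal sketch of how the exported recognition model might be passed to the paddleocr package. It assumes a paddleocr 2.x install, where the `PaddleOCR` constructor accepts `rec_model_dir`, `rec_char_dict_path`, `rec_image_shape` and `use_space_char`. The inference directory path and the helper names `build_rec_kwargs` and `recognize` are hypothetical. The idea is only that the dictionary, space-character flag and image shape should match the training config above, since a mismatch there is a common reason for good results from the script but bad results from the package.

```python
def build_rec_kwargs(inference_dir, dict_path,
                     image_shape=(3, 48, 320), use_space_char=True):
    """Collect recognition settings that should mirror the training YAML.

    Hypothetical helper: the keys are paddleocr 2.x argument names,
    the values come from the config posted in this thread.
    """
    return {
        "rec_model_dir": inference_dir,        # exported inference model folder
        "rec_char_dict_path": dict_path,       # same dict as character_dict_path
        "rec_image_shape": ", ".join(str(v) for v in image_shape),
        "use_space_char": use_space_char,      # same as Global.use_space_char
        "use_angle_cls": False,                # no angle classifier in this sketch
    }


def recognize(image_path, **kwargs):
    """Run recognition only (no detection) on an already cropped text image."""
    # Imported lazily so the helper above works without paddleocr installed.
    from paddleocr import PaddleOCR

    engine = PaddleOCR(**kwargs)
    return engine.ocr(image_path, det=False, cls=False)


# Example values; the inference directory below is an assumed path.
rec_kwargs = build_rec_kwargs(
    "./inference/rec_ppocr_v3_distillation/Student",
    "ppocr/utils/en_dict.txt",
)
print(rec_kwargs)
```

Calling `recognize("doc/imgs_words/ch/word_1.jpg", **rec_kwargs)` would then run the custom model. If the package still falls back to its default dictionary, the output will look like garbage even though the weights are fine.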
-
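Has this been solved? My trained text detection model works fine when I run it from the training checkpoint, but after converting it to an inference model it no longer detects text positions. The text recognition model behaves correctly both before and after conversion; only the detection model is affected.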
-
@cyj02132654 No, not yet.
-
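I solved it on my side. It was caused by a mismatch in image size. In the inference script predict_det.py, around line 334, just before the image is passed to the model, resize the image without distortion to the size used during training and testing. After that the results match what I saw during training. I have not yet tried passing images directly to the OCR library.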
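A small sketch of what "resize without distortion" could look like: scale by a single factor so the aspect ratio is preserved, then pad to the target size. The function names and the 640x640 target are assumptions for illustration only, and the actual pixel work (for example `cv2.resize` followed by padding) is left out so the arithmetic stays visible.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Return the scale, resized size and padding for an undistorted resize."""
    # One scale for both axes keeps the aspect ratio unchanged.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, max(1, round(src_w * scale)))
    new_h = min(dst_h, max(1, round(src_h * scale)))
    # Split the leftover space evenly between the two sides.
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    pad_right = dst_w - new_w - pad_left
    pad_bottom = dst_h - new_h - pad_top
    return scale, (new_w, new_h), (pad_left, pad_top, pad_right, pad_bottom)


def to_original(x, y, scale, pad_left, pad_top):
    """Map a point predicted on the padded image back to the source image."""
    return (x - pad_left) / scale, (y - pad_top) / scale


# Hypothetical example: a 1280x720 photo fed to a model used at 640x640.
scale, size, pads = letterbox_geometry(1280, 720, 640, 640)
print(scale, size, pads)
print(to_original(320, 320, scale, pads[0], pads[1]))
```

Detected box coordinates come out in the padded image's frame, so they need the `to_original` step before being drawn on or cropped from the source image.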
-
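Are the training images all the same size? I don't see a size in the det yml. The rec yml does specify one, but rec also resizes during inference. Could you explain a little more?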
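On the rec side, a rough sketch of why the size in the yml and the resize at inference can disagree. This is my reading of the usual behaviour, not code copied from the repository, so please check it against your PaddleOCR version: with `image_shape: [3, 48, 320]` the training transform resizes to height 48 and caps the width at 320, while the inference script may widen the canvas for long text lines. The function names are hypothetical.

```python
import math


def train_resize_widths(src_w, src_h, img_h=48, img_w=320):
    """Assumed training-time behaviour: fixed canvas, width capped at img_w."""
    ratio = src_w / float(src_h)
    resized_w = min(img_w, int(math.ceil(img_h * ratio)))
    return resized_w, img_w  # (content width, padded canvas width)


def infer_resize_widths(src_w, src_h, img_h=48, img_w=320):
    """Assumed inference-time behaviour: canvas grows with the aspect ratio."""
    ratio = src_w / float(src_h)
    max_wh_ratio = max(img_w / float(img_h), ratio)
    canvas_w = int(img_h * max_wh_ratio)
    resized_w = min(canvas_w, int(math.ceil(img_h * ratio)))
    return resized_w, canvas_w


# A long 1000x50 text line: squeezed during training, kept wide at inference.
print(train_resize_widths(1000, 50))
print(infer_resize_widths(1000, 50))
```

If this matches your version, very long lines are seen squeezed during training but stretched out at inference, which is one way the two can give different results.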
-
If this is still unresolved, take a look at the RapidOCR project. It converts the models to ONNX and runs inference on them.
-
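My trained text recognition model gives very good results when I test it, but after exporting the model the predictions are very poor. Any help would be appreciated.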