Replies: 3 comments
-
Solved. Specifying the dictionary at inference time is enough:
python tools/infer/predict_rec.py --image_dir='E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/images/img_0000001.jpg' --rec_model_dir='./inference/rec_svtrv2_ch' --rec_char_dict_path='E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/labels_.txt'
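For anyone wondering why the dictionary matters here, below is a minimal sketch, not PaddleOCR's actual code. It assumes the CTC lookup table is built as a blank at index 0, then the dictionary characters in file order, then a space when use_space_char is true, which is my understanding of how CTCLabelDecode behaves. The helper names and the two toy dictionaries are hypothetical. The same model output decoded against a different table yields different characters, which is what garbled output looks like.

```python
def build_ctc_table(dict_chars, use_space_char=True):
    """Index 0 is the CTC blank; dictionary characters follow in file order."""
    table = ["<blank>"] + list(dict_chars)
    if use_space_char:
        table.append(" ")
    return table


def ctc_greedy_decode(indices, table):
    """Collapse repeated indices, drop blanks (index 0), map the rest to characters."""
    out = []
    prev = None
    for idx in indices:
        if idx != prev and idx != 0:
            out.append(table[idx])
        prev = idx
    return "".join(out)


# Toy dictionaries (hypothetical): one stands in for the dictionary used in
# training, the other for a different dictionary picked up at inference time.
train_table = build_ctc_table("中文识别")
other_table = build_ctc_table("一二三四")

# Per-timestep argmax indices, as a recognizer's CTC head might emit them.
model_output = [0, 1, 1, 0, 2, 3, 3, 4, 0]

right = ctc_greedy_decode(model_output, train_table)
wrong = ctc_greedy_decode(model_output, other_table)
print(right)
print(wrong)
```

The indices are identical in both calls; only the table differs. Passing --rec_char_dict_path makes inference use the same table as training.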
-
Which version of PaddleOCR is needed to train this?
-
What environment does this model need in order to run? I get all kinds of errors and cannot get it running. Is there a tutorial?
-
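I am using the SVTRv2 recognition architecture together with the SVTRv2 pretrained model.
The training data is synthetic, 1 million images.
labels_.txt does not use the character set that ships with PaddleOCR; it was regenerated from the characters in the training text. I trained for five epochs; the third was the best, with 98% accuracy.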
The configuration file is as follows:

Global:
  debug: false
  use_gpu: true
  epoch_num: 20 # 200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_svtrv2_ch
  save_epoch_step: 1
  eval_batch_step: [0, 500]
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/openatom_rec_svtrv2_ch_train/best_accuracy
  checkpoints: ./output/rec_svtrv2_ch/best_accuracy
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/
  character_dict_path: E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/labels_.txt #ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_svrtv2.txt

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  epsilon: 1.e-8
  weight_decay: 0.05
  no_weight_decay_name: norm
  one_dim_param_no_weight_decay: True
  lr:
    name: Cosine
    learning_rate: 0.001 # 8gpus 192bs
    warmup_epoch: 5

Architecture:
  model_type: rec
  algorithm: SVTR_HGNet
  Transform:
  Backbone:
    name: SVTRv2
    use_pos_embed: False
    dims: [128, 256, 384]
    depths: [6, 6, 6]
    num_heads: [4, 8, 12]
    mixer: [['Conv','Conv','Conv','Conv','Conv','Conv'],['Conv','Conv','Global','Global','Global','Global'],['Global','Global','Global','Global','Global','Global']]
    local_k: [[5, 5], [5, 5], [-1, -1]]
    sub_k: [[2, 1], [2, 1], [-1, -1]]
    last_stage: False
    use_pool: True
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 256
            depth: 2
            hidden_dims: 256
            kernel_size: [1, 3]
            use_guide: True
          Head:
            fc_decay: 0.00001
      - NRTRHead:
          nrtr_dim: 384
          max_text_length: *max_text_length
          num_decoder_layers: 2

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - NRTRLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/images
    ext_op_transform_idx: 1
    label_file_list:
    - E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecAug:
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  sampler:
    name: MultiScaleSampler
    scales: [[320, 32], [320, 48], [320, 64]]
    first_bs: &bs 192
    fix_bs: false
    divided_factor: [8, 16] # w, h
    is_training: True
  loader:
    shuffle: true
    batch_size_per_card: *bs
    drop_last: true
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/images
    label_file_list:
    - E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
I converted the trained model to an inference model with this command:
python tools/export_model.py -c configs/rec/SVTRv2/rec_svtrv2_ch.yml
The final recognition output is garbled.
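The post mentions regenerating labels_.txt from the characters in the training text. A minimal sketch of that step follows, assuming the label files use the tab-separated `image_path<TAB>text` layout and that the dictionary holds one character per line. The helper names and the toy file contents are hypothetical, and the script actually used for this dataset may differ. It also includes a quick check for label characters that are missing from a given dictionary.

```python
from pathlib import Path
import tempfile


def read_labels(label_file):
    """Yield the text part of each `image_path<TAB>text` line."""
    with open(label_file, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if "\t" not in line:
                continue  # skip malformed or empty lines
            _, text = line.split("\t", 1)
            yield text


def build_char_dict(label_files):
    """Collect every distinct character that appears in the labels."""
    chars = set()
    for label_file in label_files:
        for text in read_labels(label_file):
            chars.update(text)
    # Assumption: the space is left out because use_space_char: true adds it.
    chars.discard(" ")
    return sorted(chars)


def missing_chars(label_file, dict_file):
    """Return label characters that the dictionary file does not contain."""
    with open(dict_file, encoding="utf-8") as f:
        known = {line.rstrip("\r\n") for line in f}
    known.add(" ")
    return {ch for text in read_labels(label_file) for ch in text} - known


# Toy data standing in for train_list.txt, val_list.txt and labels_.txt.
with tempfile.TemporaryDirectory() as tmp:
    train = Path(tmp) / "train_list.txt"
    val = Path(tmp) / "val_list.txt"
    dict_path = Path(tmp) / "labels_.txt"
    train.write_text("img_1.jpg\t你好 世界\nimg_2.jpg\t你们\n", encoding="utf-8")
    val.write_text("img_3.jpg\t世界和平\n", encoding="utf-8")

    chars = build_char_dict([train])
    dict_path.write_text("\n".join(chars) + "\n", encoding="utf-8")
    uncovered = missing_chars(val, dict_path)

print(len(chars), sorted(uncovered))
```

Whatever file this produces has to be the one passed both as character_dict_path during training and as --rec_char_dict_path at inference; the line order in the file defines the class indices.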