CTCLoss = 0 During Finetuning #15781
Unanswered
milicaaaa20 asked this question in Q&A
Hi everyone,
I'm training a custom OCR model in PaddleOCR using the en_PP-OCRv4_mobile_rec pretrained model with the SVTR_LCNet architecture. Everything was working fine until I increased max_text_length from 25 to 50. Since then, CTCLoss is reported as 0.000000 from the first logged step and stays there for the entire training run, regardless of the number of epochs, while NRTRLoss stays around 4.0.
Here is the output I'm getting:
[2025/06/19 10:06:07] ppocr INFO: epoch: [1/10], global_step: 50, lr: 0.000000, acc: 0.000000, norm_edit_dis: 0.045464, CTCLoss: 0.000000, NRTRLoss: 4.016458, loss: 4.016458, avg_reader_cost: 0.00386 s, avg_batch_cost: 0.24973 s, avg_samples: 1.0, ips: 4.00435 samples/s, eta: 3 days, 17:59:12, max_mem_reserved: 763 MB, max_mem_allocated: 687 MB
[2025/06/19 10:06:19] ppocr INFO: epoch: [1/10], global_step: 100, lr: 0.000000, acc: 0.000000, norm_edit_dis: 0.068191, CTCLoss: 0.000000, NRTRLoss: 4.017907, loss: 4.017907, avg_reader_cost: 0.00263 s, avg_batch_cost: 0.22968 s, avg_samples: 1.0, ips: 4.35381 samples/s, eta: 3 days, 14:22:19, max_mem_reserved: 763 MB, max_mem_allocated: 687 MB
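One thing that may be worth ruling out when a CTC loss is stuck at exactly zero is whether the labels can be aligned to the model's output sequence at all. CTC needs at least one time step per character, plus one extra blank between every pair of identical adjacent characters. The sketch below is only a sanity check under assumptions: the function names are made up for illustration, the `time_steps` value is hypothetical (the real sequence length depends on the backbone and the input width and is not stated in this post), and whether Paddle reports infeasible samples as a loss of 0 is not confirmed here.

```python
def min_ctc_steps(label: str) -> int:
    """Minimum number of time steps CTC needs to emit `label`.

    CTC must insert a blank between two identical adjacent characters,
    so the minimum is len(label) plus the number of adjacent repeats.
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def is_ctc_feasible(label: str, time_steps: int) -> bool:
    """True if `label` can be aligned to a sequence of `time_steps` frames."""
    return min_ctc_steps(label) <= time_steps


if __name__ == "__main__":
    # Hypothetical value: replace with the actual length of the CTC head's
    # output sequence for your input width.
    time_steps = 40
    for text in ["hello", "a" * 25, "x" * 50]:
        print(len(text), min_ctc_steps(text), is_ctc_feasible(text, time_steps))
```

If many training labels come out as infeasible for the real sequence length, the longer max_text_length would be a plausible suspect; if none do, the cause is likely elsewhere.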
I would really appreciate help from anyone who has experienced this issue or has suggestions on how to resolve it.
Here is my full config:
Global:
  model_name: en_PP-OCRv4_mobile_rec
  debug: false
  use_gpu: true
  epoch_num: 100
  log_smooth_window: 20
  print_batch_step: 50
  save_model_dir: ./output/rec_ppocr_v4
  save_epoch_step: 10
  eval_batch_step:
    - 0
    - 2000
  cal_metric_during_train: true
  pretrained_model: ./pretrain_models/en_PP-OCRv4_mobile_rec_pretrained.pdparams
  checkpoints: null
  save_inference_dir: ./output/inference
  use_visualdl: true
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ./ppocr/utils/dict/custom_dict.txt
  max_text_length: 50
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv4.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0005
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform: null
  Backbone:
    name: PPLCNetV3
    scale: 0.95
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 120
            depth: 2
            hidden_dims: 120
            kernel_size:
              - 1
              - 3
            use_guide: true
          Head:
            fc_decay: 1.0e-05
      - NRTRHead:
          nrtr_dim: 384
          max_text_length: 50
Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss: null
    - NRTRLoss: null
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: false
Train:
  dataset:
    name: SimpleDataSet
    ds_width: false
    data_dir: ./data/first_train/train/
    ext_op_transform_idx: 1
    label_file_list:
      - ./data/first_train/train.tsv
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - MultiLabelEncode:
          gtc_encode: NRTRLabelEncode
      - RecResizeImg:
          image_shape:
            - 3
            - 48
            - 640
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_gtc
            - length
            - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 64
    drop_last: true
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./data/first_train/test/
    label_file_list:
      - ./data/first_train/test.tsv
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - MultiLabelEncode:
          gtc_encode: NRTRLabelEncode
      - RecResizeImg:
          image_shape:
            - 3
            - 48
            - 640
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_gtc
            - length
            - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 64
    num_workers: 4
profiler_options: null
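Since the config points at a custom dictionary and a tab-separated label file, a quick scan of the labels against both `max_text_length` and the dictionary can help narrow things down. The following is a minimal sketch, not PaddleOCR's own code: `load_charset` and `scan_labels` are hypothetical helper names, the label file is assumed to hold one `image_path<TAB>label` pair per line, the dictionary is assumed to hold one character per line, and the exact thresholds used by the real label encoders may differ slightly.

```python
from collections import Counter
from pathlib import Path


def load_charset(dict_lines, use_space_char=True):
    """Build the set of known characters from dictionary lines (one char per line)."""
    charset = {line.rstrip("\r\n") for line in dict_lines}
    charset.discard("")  # ignore blank lines
    if use_space_char:
        charset.add(" ")
    return charset


def scan_labels(lines, charset, max_text_length):
    """Count labels that are usable, too long, malformed, or have no known characters."""
    stats = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        parts = line.split("\t", 1)
        if len(parts) != 2:
            stats["malformed"] += 1
            continue
        label = parts[1]
        if len(label) > max_text_length:
            stats["too_long"] += 1
        elif not any(ch in charset for ch in label):
            stats["empty_after_dict_filter"] += 1
        else:
            stats["ok"] += 1
    return stats


if __name__ == "__main__":
    # Paths and max_text_length are taken from the config above.
    label_file = Path("./data/first_train/train.tsv")
    dict_file = Path("./ppocr/utils/dict/custom_dict.txt")
    if label_file.exists() and dict_file.exists():
        charset = load_charset(dict_file.read_text(encoding="utf-8").splitlines())
        with label_file.open(encoding="utf-8") as f:
            print(dict(scan_labels(f, charset, max_text_length=50)))
    else:
        print("label file or dictionary not found; adjust the paths")
```

A large `too_long` or `empty_after_dict_filter` count would suggest a data or dictionary problem rather than a model problem.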