Want to finetune ppocr_v5 with handwritten dataset #16128
Delipan-regami asked this question in Q&A
Hello, it seems the issue is caused by an incompatibility between your Paddle version and your GPU environment. You can try upgrading Paddle to version 3.1 or above; see the Paddle Installation Guide on the official Paddle website.
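Before reinstalling, it may help to confirm which Paddle build is actually active in the notebook. The following is only a sketch and not part of the reply above: `parse_version` and `meets_minimum` are hypothetical helper names, and the `3.1` threshold is simply the version suggested in the reply. `paddle.utils.run_check()` is Paddle's own installation self-test, and it is only called here if Paddle can be imported.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn '2.6.1' or '3.1.0.post1' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at suffixes such as 'post1'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "3.1") -> bool:
    """Hypothetical helper: True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that (3, 1) compares equal to (3, 1, 0).
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    try:
        import paddle
    except ImportError:
        print("paddle is not installed in this environment")
    else:
        print("paddle version:", paddle.__version__)
        print("built with CUDA:", paddle.device.is_compiled_with_cuda())
        if not meets_minimum(paddle.__version__):
            print("older than 3.1; consider upgrading as suggested above")
        # Paddle's built-in self-test for the current install.
        paddle.utils.run_check()
```

With the `paddlepaddle-gpu==2.6.1` install from the question, `meets_minimum("2.6.1")` returns `False`, which matches the suggestion to upgrade.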
How can I do that? I have tried with their official config file, but I'm facing the following error:
Traceback (most recent call last):
File "/content/PaddleOCR/tools/train.py", line 272, in <module>
main(config, device, logger, vdl_writer, seed)
File "/content/PaddleOCR/tools/train.py", line 225, in main
program.train(
File "/content/PaddleOCR/tools/program.py", line 356, in train
preds = model(images, data=batch[1:])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/content/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 99, in forward
x = self.head(x, targets=data)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/content/PaddleOCR/ppocr/modeling/heads/rec_multi_head.py", line 151, in forward
gtc_out = self.gtc_head(self.before_gtc(x), targets[1:])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/content/PaddleOCR/ppocr/modeling/heads/rec_nrtr_head.py", line 155, in forward
return self.forward_train(src, tgt)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/content/PaddleOCR/ppocr/modeling/heads/rec_nrtr_head.py", line 135, in forward_train
tgt = decoder_layer(tgt, memory, self_mask=tgt_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/content/PaddleOCR/ppocr/modeling/heads/rec_nrtr_head.py", line 480, in forward
tgt1 = self.self_attn(tgt, attn_mask=self_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/content/PaddleOCR/ppocr/modeling/heads/rec_nrtr_head.py", line 404, in forward
self.qkv(query)
File "/usr/local/lib/python3.11/dist-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/paddle/nn/layer/common.py", line 185, in forward
out = F.linear(
^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/paddle/nn/functional/common.py", line 1960, in linear
return _C_ops.linear(x, weight, bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: (External) CUBLAS error(13).
[Hint: 'CUBLAS_STATUS_EXECUTION_FAILED'. The GPU program failed to execute. This is often caused by a launch failure of the kernel on the GPU, which can be caused by multiple reasons. To correct: check that the hardware, an appropriate version of the driver, and the cuBLAS library are correctly installed. ] (at /paddle/paddle/phi/kernels/funcs/blas/blas_impl.cu.h:41)
[operator < linear > error]
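The last frame in the traceback is a plain `F.linear` call, so one way to tell an environment problem apart from a PaddleOCR or data problem is to run the same kind of op on its own. This is a minimal sketch under assumptions, not a confirmed diagnosis: `gpu_linear_smoke_test` is a hypothetical helper name and the tensor sizes are arbitrary. If it also fails, the install or driver is the more likely culprit; if it passes, the cuBLAS error may be a side effect of an earlier problem in the training run.

```python
def gpu_linear_smoke_test(batch: int = 4, in_dim: int = 8, out_dim: int = 16) -> str:
    """Run one small linear op on the GPU and report what happened."""
    try:
        import paddle
        import paddle.nn.functional as F
    except ImportError:
        return "skipped: paddle is not installed"

    if not paddle.device.is_compiled_with_cuda():
        return "skipped: this paddle build has no CUDA support"

    try:
        paddle.set_device("gpu")
        x = paddle.randn([batch, in_dim])
        # paddle's F.linear expects weight shaped [in_features, out_features].
        weight = paddle.randn([in_dim, out_dim])
        bias = paddle.zeros([out_dim])
        out = F.linear(x, weight, bias)
        # Copy back to the host so any kernel failure surfaces here.
        out.numpy()
    except Exception as exc:
        return f"failed: {type(exc).__name__}: {exc}"
    return f"ok: output shape {list(out.shape)}"


if __name__ == "__main__":
    print(gpu_linear_smoke_test())
```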
This is the config I used:
Global:
  model_name: PP-OCRv5_server_rec
  debug: false
  use_gpu: true
  epoch_num: 5
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/PP-OCRv5_server_rec
  save_epoch_step: 1
  eval_batch_step: [0, 2000] # Consider reducing for quicker debugging feedback
  cal_metric_during_train: true
  calc_epoch_interval: 1
  # CRITICAL CHECK 1: Ensure this path is correct and the file exists
  pretrained_model: /content/PaddleOCR/pretrain_models/ppocrv5_rec/PP-OCRv5_server_rec_pretrained.pdparams
  save_inference_dir: /content/drive/MyDrive/recognition
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  # CRITICAL CHECK 2: Ensure this path is correct and the file exists
  character_dict_path: ./ppocr/utils/dict/ppocrv5_dict.txt
  max_text_length: &max_text_length 25 # CRITICAL CHECK 3: Ensure no label in your data exceeds this length
  infer_mode: false
  use_space_char: true # Adds space to the dictionary. Make sure your dict file accounts for this or doesn't have an extra space.
  distributed: false # Set to false for single GPU training. Your log showed distributed=False initially.
  save_res_path: ./output/rec/predicts_ppocrv5.txt
  d2s_train_image_shape: [3, 48, 320]
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0005 # Might need adjustment based on dataset size/validation performance
    warmup_epoch: 1
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_HGNet
  Transform:
  Backbone:
    name: PPHGNetV2_B4
    text_rec: True
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 120
            depth: 2
            hidden_dims: 120
            kernel_size: [1, 3]
            use_guide: True
          Head:
            fc_decay: 0.00001
      - NRTRHead:
          nrtr_dim: 384
          max_text_length: *max_text_length
Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - NRTRLoss:
PostProcess:
  name: CTCLabelDecode # Used for final prediction/inference, usually CTC for this architecture
Metric:
  name: RecMetric
  main_indicator: acc
Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: /content/data/processed_data/images # CRITICAL CHECK 4: Ensure this path is correct
    ext_op_transform_idx: 1
    # CRITICAL CHECK 5: Ensure this path is correct and the file exists and is formatted correctly (path\tlabel)
    label_file_list:
      - /content/train_list_fixed.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - RecAug: # Data augmentation - might be okay, but disable temporarily if debugging data issues
      - MultiLabelEncode:
          gtc_encode: NRTRLabelEncode # This uses character_dict_path from Global
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_gtc
            - length
            - valid_ratio
  sampler:
    name: MultiScaleSampler
    scales: [[320, 32], [320, 48], [320, 64]]
    first_bs: &bs 128 # Reduce if you encounter OOM (Out-Of-Memory) errors
    fix_bs: false
    divided_factor: [8, 16]
    is_training: True
  loader:
    shuffle: true
    batch_size_per_card: *bs
    drop_last: true
    num_workers: 4 # Reduce if you encounter data loading issues
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /content/data/processed_data/images # CRITICAL CHECK 4: Ensure this path is correct
    # CRITICAL CHECK 5: Often, a separate validation list is used here
    label_file_list:
      - /content/train_list_fixed.txt # <-- POTENTIAL ISSUE: Using training list for eval? Consider a separate val list.
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - MultiLabelEncode:
          gtc_encode: NRTRLabelEncode # This also uses character_dict_path from Global
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_gtc
            - length
            - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128 # Reduce if needed
    num_workers: 2 # Reduce if needed
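The "CRITICAL CHECK" comments in the config (label length, `path\tlabel` format, dictionary coverage) can be verified with a short script before training. This is a rough sketch using only the standard library: `load_charset` and `check_label_file` are hypothetical helpers, not PaddleOCR functions, and the sketch assumes one character per line in the dictionary file and exactly one tab per line in the label file. It only reports mismatches; it does not show how PaddleOCR itself handles them.

```python
def load_charset(dict_path: str, use_space_char: bool = True) -> set:
    """Read the dictionary file, assuming one character per line."""
    chars = set()
    with open(dict_path, encoding="utf-8") as f:
        for line in f:
            chars.add(line.rstrip("\r\n"))
    chars.discard("")  # ignore blank lines
    if use_space_char:
        chars.add(" ")  # mirrors use_space_char: true in the config
    return chars


def check_label_file(label_path: str, charset: set, max_text_length: int = 25) -> list:
    """Return (line_number, problem) tuples for lines that fail a check."""
    problems = []
    with open(label_path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue
            parts = line.split("\t")
            if len(parts) != 2:
                problems.append((lineno, "expected exactly one tab between path and label"))
                continue
            label = parts[1]
            if not label:
                problems.append((lineno, "empty label"))
            if len(label) > max_text_length:
                problems.append(
                    (lineno, f"label has {len(label)} characters, limit is {max_text_length}")
                )
            unknown = sorted(set(label) - charset)
            if unknown:
                problems.append((lineno, f"characters not in dictionary: {unknown}"))
    return problems
```

With the paths from the config, this would be called as `check_label_file("/content/train_list_fixed.txt", load_charset("./ppocr/utils/dict/ppocrv5_dict.txt"))`; an empty list means none of the three checks flagged anything.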
I installed Paddle and downloaded the pretrained model with:
!pip install paddlepaddle-gpu==2.6.1 -f https://www.paddlepaddle.org.cn/whl/mkl/avx/stable.html
!wget https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv5_server_rec_pretrained.pdparams