PaddleOCR fails to correctly detect German diacritics (ä, ö, ü, ß) in text recognition #16427

sayinmehmet47 · 2025-03-14T15:44:05Z


🔎 Search before asking

I have searched the PaddleOCR Docs and found no similar bug report.
I have searched the PaddleOCR Issues and found no similar bug report.
I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem Description)

I'm using PaddleOCR to detect German text from images, but I've noticed that it consistently fails to correctly recognize German diacritics. For example:

"Spaß" is detected as "SpafS" or "SpaR"
"Zähne" is detected as "Zahne"
"frühstücken" is detected as "fruhstucken"
"Frühstück" is detected as "Fruhstuck"
"nächsten" is detected as "nachsten"
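To quantify failures like the ones listed above, it can help to separate "diacritic lost or changed" errors from other misreads. The following is a minimal sketch using only the Python standard library; the helper names `strip_diacritics` and `classify_error` are hypothetical and not part of PaddleOCR.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks (ä -> a). ß has no decomposition and is kept."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_error(expected: str, recognized: str) -> str:
    """Label one expected/recognized pair (hypothetical helper)."""
    expected = unicodedata.normalize("NFC", expected)
    recognized = unicodedata.normalize("NFC", recognized)
    if expected == recognized:
        return "correct"
    # Same letters once diacritics are ignored: only the marks differ.
    if strip_diacritics(expected) == strip_diacritics(recognized):
        return "diacritic_mismatch"
    return "other"


# Pairs taken from the report above.
PAIRS = [
    ("Spaß", "SpaR"),
    ("Zähne", "Zahne"),
    ("frühstücken", "fruhstucken"),
    ("Frühstück", "Fruhstuck"),
    ("nächsten", "nachsten"),
]

if __name__ == "__main__":
    for expected, recognized in PAIRS:
        print(expected, recognized, classify_error(expected, recognized))
```

Note that ß misreads such as "SpaR" fall into the "other" bucket, because ß is a separate letter rather than a base letter plus a combining mark.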

My current setup:

Current output: Florian steht jeden Tag um sechs Uhr auf. Zuerst wascht er sein Gesicht und putzt sich die Zahne. Dann geht er nach unten, um zu fruhstücken. Nach dem Fruhstuck zieht er sich an und geht zur Schule.

I've already tried:

Using lang='german' parameter
Using different image resolutions
Pre-processing the images to improve contrast

Is there a way to improve the recognition of German special characters with PaddleOCR? Do I need to fine-tune the model with a specialized German dataset? Are there any specific parameters or pre-processing techniques that might help?

PaddleOCR version: 2.9.1

I tried both the latin and german models, but neither detects the diacritics correctly. I then tried to fine-tune the latin model with more umlaut examples: I created 5,000 synthetic images and 100 real images. After training with this YAML config, I saw that the model overfit even though it reached 98% accuracy.


Global:
  debug: false
  use_gpu: false
  epoch_num: 10
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_latin_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 150]
  cal_metric_during_train: true
  pretrained_model: ./pretrain_models/latin_PP-OCRv3_rec_train/best_accuracy
  checkpoints:
  save_inference_dir: ./output/v3_latin_mobile/inference
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/latin_dict.txt
  max_text_length: &max_text_length 50
  infer_mode: false
  use_space_char: true
  distributed: false
  save_res_path: ./output/rec/predicts_ppocrv3_latin.txt
  freeze_params:
    - "backbone"

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0005
    warmup_epoch: 3
  regularizer:
    name: L2
    factor: 3.0e-05


Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:  
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 64
    drop_last: true
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 64
    num_workers: 8
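The config above reads both the training and validation lists from the same data directory. If the validation list is drawn mostly from the 5,000 synthetic images, a high accuracy may not say much about real scans. That is an assumption about this setup, not something the report confirms. A minimal sketch of a split that validates only on held-out real images follows; `split_labels` and `to_label_lines` are hypothetical helpers, and the output uses the tab-separated "path, then text" layout of `SimpleDataSet` label files.

```python
import random


def split_labels(synthetic, real, real_val_fraction=0.5, seed=0):
    """Split (image_path, text) pairs into train and val sets.

    The validation set contains only held-out real samples, so the
    evaluation metric is not inflated by synthetic look-alikes.
    """
    rng = random.Random(seed)
    real = list(real)
    rng.shuffle(real)
    n_val = int(len(real) * real_val_fraction)
    val = real[:n_val]
    train = list(synthetic) + real[n_val:]
    rng.shuffle(train)
    return train, val


def to_label_lines(pairs):
    """Format pairs as 'path<TAB>text' lines for a label file."""
    return [f"{path}\t{text}" for path, text in pairs]
```

The resulting lines could be written to `train_list.txt` and `val_list.txt` with `encoding="utf-8"`, so that umlauts and ß in the labels are stored correctly.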

🏃‍♂️ Environment (Runtime Environment)

Hardware:
Model: Mac mini (Mac16,10)
Processor: Apple Silicon (M-series)
Memory: 24 GB
Software:
OS: macOS 15.3.1 (darwin 24.3.0)
Python: 3.12.9
Shell: fish (/opt/homebrew/bin/fish)
Dependencies:
paddleocr: 2.10.0
paddlepaddle: 3.0.0b0 (CPU version)
opencv-python: 4.6.0.66
opencv-contrib-python: 4.11.0.86

🌰 Minimal Reproducible Example

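from paddleocr import PaddleOCR  # PaddleOCR 2.x API


def process_image_ocr(image):
    """
    Process an image through OCR and return the results.
    Args:
        image: numpy array of the image
    Returns:
        results: list of OCR results
    """
    # enhance_image is a pre-processing helper defined elsewhere (not shown here).
    enhanced = enhance_image(image)
    # Note: constructing PaddleOCR on every call reloads the models each time.
    ocr = PaddleOCR(use_angle_cls=True, lang='latin', show_log=False)
    results = ocr.ocr(enhanced, cls=True)
    # One entry per input image; the entry can be None when no text is found.
    return results[0] if results and results[0] else []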
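To compare the output of `process_image_ocr` against the expected German text, the recognized strings first have to be pulled out of the result structure. The sketch below assumes the PaddleOCR 2.x layout, where each entry for an image is `[box, (text, score)]`; `extract_lines` is a hypothetical helper name.

```python
def extract_lines(page_result, min_score=0.0):
    """Collect recognized text lines from one image's OCR result.

    page_result: list of [box, (text, score)] entries (assumed
    PaddleOCR 2.x layout), or None/empty when nothing was detected.
    """
    lines = []
    for entry in page_result or []:
        _box, (text, score) = entry
        if score >= min_score:
            lines.append(text)
    return lines
```

Joining the returned lines with spaces gives a single string that can be checked word by word against the ground truth.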

Answered by leo-q8

Jul 10, 2025


CanadianHusky · 2025-05-27T09:06:34Z


I tested your image with PaddleOCR 3.0.0 and the paddlepaddle 3.0 backend, using the PP-OCRv5_server_rec model, which is larger and more accurate than the default mobile model. The same problem exists: some German letters do not exist in the model.
The letter ü is defined in the PP-OCRv5_server_rec model.
The letters ß and ä do not seem to exist.
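Whether a recognition model can output a given letter at all depends on its character dictionary, so a quick check is to scan the dictionary for the German characters. The sketch below assumes the usual one-character-per-line layout of PaddleOCR dictionary files; `missing_characters` is a hypothetical helper.

```python
def missing_characters(dict_lines, required="äöüÄÖÜß"):
    """Return the required characters that are absent from a character dict.

    dict_lines: iterable of lines, one character per line (assumed layout).
    """
    # Strip only line endings so that a space entry is not lost.
    vocabulary = {line.rstrip("\r\n") for line in dict_lines}
    return [ch for ch in required if ch not in vocabulary]


def missing_from_file(path, required="äöüÄÖÜß"):
    """Same check, reading the dictionary from a UTF-8 text file."""
    with open(path, encoding="utf-8") as handle:
        return missing_characters(handle, required)
```

For the fine-tuning config earlier in this thread, the file to check would be the one named in `character_dict_path`, for example `missing_from_file("ppocr/utils/dict/latin_dict.txt")`. A character missing from the dictionary cannot be learned by fine-tuning unless the dictionary and the output layer are extended as well.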

If anyone is able to train a PP-OCRv5 model with German characters, compatible with version 3.0, it would be much appreciated.

Otherwise the Version 3.0 engine is near perfect. See my post and sample image at #15414 (comment)


CanadianHusky · 2025-05-28T07:26:51Z


I have posted German test data in the hope that someone who knows how to train the models can use it for German characters. See #15457

The OCR engine is frighteningly accurate for English. It would be a shame not to support other languages with the latest models.


dariofinardi · 2025-06-08T15:43:54Z


Models PP-OCRv5 and v4 don't support chars outside English.
The workaround is to use v3, but it seems that the server version is no longer supported, only the mobile one, which is poor.
I spent weeks fine-tuning v3 to read Italian better, but based on the v3 server version.
The reason is that the charset missing from v4 is too large, and fine-tuning would take too much time on my dual 4090 system.
Now, with version 3.0, it seems that the documentation is even noisier than it was for PaddleOCR 2.x.


CanadianHusky · 2025-06-08T15:51:14Z


> Models PP-OCRv5 and v4 don't support chars outside English.

Please look at the last message and sample in #15457. German letters are supported in the PP-OCRv5 server model, but it requires a certain font and DPI; otherwise German letters are easily missed.


leo-q8 · 2025-07-10T04:01:02Z

Collaborator

@sayinmehmet47 PP-OCRv5 now includes a multilingual text recognition model, which supports training and inference for text recognition in 37 languages, including French, Spanish, Portuguese, Russian... Details


CanadianHusky · 2025-07-10T10:35:00Z


Thank you @leo-q8 !
I tested German input data with latin_PP-OCRv5_mobile_rec and can confirm it detects German diacritics fine, even at 150 dpi input.
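For readers who want to try the same model, here is a rough sketch of selecting `latin_PP-OCRv5_mobile_rec` in PaddleOCR 3.x. It is untested against a specific release: the keyword names (`text_recognition_model_name` and the three `use_*` switches), the `predict` method, and the `rec_texts` result key are assumptions based on the 3.x pipeline API and should be checked against the installed version. `build_ocr_kwargs` and `recognize` are hypothetical helper names.

```python
def build_ocr_kwargs(rec_model="latin_PP-OCRv5_mobile_rec"):
    """Keyword arguments for PaddleOCR 3.x (parameter names are assumptions)."""
    return {
        "text_recognition_model_name": rec_model,
        # Optional document pre-processing stages, switched off for plain scans.
        "use_doc_orientation_classify": False,
        "use_doc_unwarping": False,
        "use_textline_orientation": False,
    }


def recognize(image_path):
    """Run OCR on one image and return the recognized text lines."""
    # Imported lazily so this module loads even without paddleocr installed.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(**build_ocr_kwargs())
    texts = []
    for result in ocr.predict(image_path):
        # Assumed: each result is dict-like with a "rec_texts" list.
        texts.extend(result["rec_texts"])
    return texts
```

Building the keyword arguments in a separate function makes it easy to swap in a server-grade latin model name later, should one be released.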


CanadianHusky · 2025-07-10T10:51:42Z


@leo-q8 can we have a 'server'-grade, high-quality latin inference model as well?
Basically latin_PP-OCRv5_server_rec.
At the moment only the lightweight mobile model exists. It works well on most input, but a server model would be even better.


leo-q8 · 2025-07-10T11:32:35Z

Collaborator

Thank you for your attention! The accuracy of the mobile version is quite good. Feel free to use it and raise an issue if you encounter a bad case. We are also planning to develop a latin_PP-OCRv5_server_rec model. Stay tuned for future updates.

