How to fine-tune PP-OCRv5 for the Japanese language? #15690

brt1s · 2025-06-11T12:37:43Z

I'm interested in fine-tuning PaddleOCR for the Japanese language. I know that Japanese is officially supported, and there's a dedicated recognition model for Japanese in PP-OCRv3. However, I noticed that there isn't a specific Japanese recognition model for PP-OCRv5.

Is this because PP-OCRv5 uses a multilingual recognition model? If so, what exactly does the lang parameter do in PP-OCRv5?

Additionally, if I want to fine-tune PP-OCRv5 for Japanese, should I just use the PP-OCRv5_server_rec.yml configuration together with PP-OCRv5_server_rec_pretrained.pdparams? Or is there a better way to fine-tune the model for improved Japanese text recognition?

Answered by liuhongen1234567

liuhongen1234567 · 2025-06-12T07:44:40Z

liuhongen1234567
Jun 12, 2025
Collaborator

Yes. In real-world scenarios, multiple languages are often mixed together, so PP-OCRv5 integrates several languages into a single recognition model. It currently supports Simplified Chinese, Traditional Chinese, English, and Japanese, and more languages will be added to training in the future. As for the lang parameter in PP-OCRv5: specifying lang="japan" uses the PP-OCRv5 server model by default. If you want to use the Japanese model from PP-OCRv3 instead, you also need to specify ocr_version="PP-OCRv3", as shown below:

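from paddleocr import PaddleOCR

# ocr_version="PP-OCRv3" together with lang="japan" selects the PP-OCRv3 Japanese model.
ocr = PaddleOCR(ocr_version="PP-OCRv3", lang="japan", device="cpu")
tmp = "./general_ocr_002.png"
result = ocr.predict(input=tmp)
for res in result:
    # rec_texts holds the recognized text lines for this image.
    print(res['rec_texts'])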

The languages supported by different versions of PP-OCR can be referenced in the ‘5. Appendix’ section of the PP-OCRv5 documentation.

If you want to fine-tune PP-OCRv5 for Japanese, you only need to use PP-OCRv5_server_rec.yml. The PP-OCRv3 training set consists mainly of natural images, while the PP-OCRv5 training set includes a large amount of Japanese document data. Therefore, PP-OCRv5 is the recommended starting point for fine-tuning.
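For reference, here is a minimal sketch of how the fine-tuning command could be assembled. It assumes you are working inside a checkout of the PaddleOCR repository and that training is launched through tools/train.py with -c for the config and -o for overrides. The config location and the output directory below are assumptions, not values confirmed in this thread, and the dataset paths would still need to be set in the YAML file itself. The helper name build_finetune_command is hypothetical.

```python
import shlex


def build_finetune_command(
    # Assumed location of the config inside a PaddleOCR checkout.
    config_path="configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml",
    # Pretrained weights mentioned in the question.
    pretrained_model="./PP-OCRv5_server_rec_pretrained.pdparams",
    # Hypothetical output directory for the fine-tuned checkpoints.
    save_model_dir="./output/ppocrv5_rec_japanese",
):
    """Return the training command as an argument list (nothing is executed)."""
    overrides = [
        f"Global.pretrained_model={pretrained_model}",
        f"Global.save_model_dir={save_model_dir}",
    ]
    return ["python3", "tools/train.py", "-c", config_path, "-o", *overrides]


def format_command(args):
    """Render the argument list as a single shell-safe string."""
    return shlex.join(args)
```

Calling format_command(build_finetune_command()) gives a string you can paste into a shell, or the list can be passed to subprocess.run once the dataset paths in the config point at your Japanese data.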

3 replies

brt1s Jun 12, 2025
Author

So if I use the code
ocr = PaddleOCR(text_detection_model_name="PP-OCRv5_server_det", text_recognition_model_name="PP-OCRv5_server_rec", lang="japan")
the only difference from not using lang would be the dictionaries, right? Or maybe the preprocessing as well? Also, do I need to change any settings in the original PP-OCRv5_server_rec.yml because Japanese can be written vertically? For example, do I need to change RecResizeImg, d2s_train_image_shape, or scales in sampler?

liuhongen1234567 Jun 12, 2025
Collaborator

Yes. PP-OCRv5 uses one large dictionary that covers Simplified Chinese, Traditional Chinese, English, and Japanese characters.
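Since fine-tuning reuses that shared dictionary, it can be worth checking that every character in your Japanese labels is actually covered by it. The sketch below assumes the usual PaddleOCR conventions of one character per line in the dictionary file and one "image_path<TAB>text" pair per line in the label file. Both file paths are yours to supply, and the function names are hypothetical.

```python
from collections import Counter


def load_charset(dict_path):
    """Read a dictionary file with one character per line into a set."""
    with open(dict_path, encoding="utf-8") as f:
        return {line.rstrip("\r\n") for line in f if line.rstrip("\r\n")}


def missing_characters(label_path, charset):
    """Count label characters that do not appear in the dictionary.

    Assumes each label line is "image_path<TAB>text". Spaces are skipped
    because they are typically handled by a separate config option rather
    than by a dictionary entry (an assumption, check your config).
    """
    missing = Counter()
    with open(label_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\r\n").split("\t", 1)
            if len(parts) != 2:
                continue  # skip malformed lines
            for ch in parts[1]:
                if ch != " " and ch not in charset:
                    missing[ch] += 1
    return missing
```

An empty result suggests the labels fit the existing dictionary. Any characters reported would otherwise be impossible for the model to output, so they would have to be handled before training.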

For vertical Japanese text, it is necessary to convert it to horizontal format through operations such as rotation, rather than modifying the aspect ratio in the configuration file. This is because PP-OCRv5 is trained to recognize text in horizontal format, and changing to a vertical aspect ratio may affect the model’s performance. Below is an example function for converting vertical format to horizontal format.

import cv2
import numpy as np


def get_rotate_crop_image(img: np.ndarray, points: list) -> np.ndarray:
    """
    Crop and rotate the input image based on the given four points to form a perspective-transformed image.

    Args:
        img (np.ndarray): The input image array.
        points (list): A list of four 2D points defining the crop region in the image.

    Returns:
        np.ndarray: The transformed image array.
    """
    assert len(points) == 4, "The 'points' list must contain exactly four points."

    points = np.array(points, dtype=np.float32)

    s = points.sum(axis=1)
    diff = np.diff(points, axis=1).reshape(-1)
    top_left = points[np.argmin(s)]
    bottom_right = points[np.argmax(s)]
    top_right = points[np.argmin(diff)]
    bottom_left = points[np.argmax(diff)]
    ordered_points = np.array([top_left, top_right, bottom_right, bottom_left], dtype=np.float32)

    width_a = np.linalg.norm(bottom_right - bottom_left)
    width_b = np.linalg.norm(top_right - top_left)
    max_width = int(max(width_a, width_b))

    height_a = np.linalg.norm(top_right - bottom_right)
    height_b = np.linalg.norm(top_left - bottom_left)
    max_height = int(max(height_a, height_b))

    dst = np.array([
        [0, 0],
        [max_width - 1, 0],
        [max_width - 1, max_height - 1],
        [0, max_height - 1]
    ], dtype=np.float32)

    M = cv2.getPerspectiveTransform(ordered_points, dst)

    warped = cv2.warpPerspective(
        img,
        M,
        (max_width, max_height),
        flags=cv2.INTER_CUBIC,
        borderMode=cv2.BORDER_REPLICATE
    )

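    # Treat clearly tall crops as vertical text and rotate them
    # 90 degrees counterclockwise so they are laid out horizontally.
    if max_height > max_width * 1.2:
        warped = np.rot90(warped)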

    return warped
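When preparing fine-tuning data, it may also help to know in advance which annotated boxes the function above would rotate, since the training crops presumably need the same treatment as inference crops. The following is a small standard-library sketch of the same corner-ordering and height-versus-width test. The name is_vertical_box is hypothetical, the 1.2 threshold is taken from the function above, and the integer truncation used there is omitted, so results can differ slightly for boxes right at the threshold.

```python
import math


def is_vertical_box(points, ratio=1.2):
    """Return True if a four-point box is taller than `ratio` times its width."""
    if len(points) != 4:
        raise ValueError("expected exactly four (x, y) points")

    # Same ordering heuristic as above: x + y is smallest at the top-left
    # corner and largest at the bottom-right; y - x is smallest at the
    # top-right and largest at the bottom-left.
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = min(points, key=lambda p: p[1] - p[0])
    bottom_left = max(points, key=lambda p: p[1] - p[0])

    width = max(math.dist(bottom_right, bottom_left),
                math.dist(top_right, top_left))
    height = max(math.dist(top_right, bottom_right),
                 math.dist(top_left, bottom_left))
    return height > width * ratio
```

For example, a box 10 pixels wide and 100 pixels tall is flagged as vertical, while a 100 by 10 box or a square is not.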

brt1s Jun 12, 2025
Author

Thank you very much for your help.
