In the configuration:

- EastRandomCropData:
    size:
    - 96
    - 320
    max_tries: 50
    keep_ratio: true

The two values under size give the height and width of the cropped image. In PaddleOCR and similar frameworks, the usual convention is:

  • The first value (96) refers to the height.
  • The second value (320) refers to the width.

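This matches the way image dimensions are usually specified in deep learning frameworks, where array shapes list height before width. Note that OpenCV calls such as cv2.resize take their target size as (width, height), so if the order matters for your data, confirm it against the EastRandomCropData implementation.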
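To see why the order matters, here is a minimal sketch, assuming keep_ratio means the crop is scaled uniformly until it fits inside the target size. The helper fit_scale and the 48 x 200 crop are hypothetical and only for illustration; this is not PaddleOCR's own code.

```python
def fit_scale(crop_hw, target_hw):
    """Largest uniform scale at which a crop still fits inside the target.

    Both arguments are (height, width) tuples. Hypothetical helper,
    not part of PaddleOCR.
    """
    crop_h, crop_w = crop_hw
    target_h, target_w = target_hw
    return min(target_h / crop_h, target_w / crop_w)


size = [96, 320]       # values from the config above
crop_hw = (48, 200)    # made-up crop: 48 px tall, 200 px wide

# Reading size as [height, width], as described above.
scale_hw = fit_scale(crop_hw, target_hw=(size[0], size[1]))
resized_hw = (round(crop_hw[0] * scale_hw), round(crop_hw[1] * scale_hw))

# Reading size as [width, height], for comparison.
scale_wh = fit_scale(crop_hw, target_hw=(size[1], size[0]))
resized_wh = (round(crop_hw[0] * scale_wh), round(crop_hw[1] * scale_wh))

print("size as [h, w]:", scale_hw, resized_hw)
print("size as [w, h]:", scale_wh, resized_wh)
```

The two readings give very different scales for the same crop, so a wrong assumption usually shows up right away as squashed or heavily padded training images.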

For a cross-check, look at other entries in the same configuration file, such as image_shape under RecConAug, which follows a similar height-first pattern:

image_shape:
- 48
- 320
- 3

Here, 48 is the hei…

Answer selected by felixho789