Hello! Thanks a lot for the great work. I am training a multilingual model on synthetic data (one GPU, config file: cyrillic_PP-OCRv3_rec.yml). With the default max_text_length=25 and pictures containing 1-2 words, 100 iterations take about one to two minutes. With pictures containing 5-7 words, 100 iterations take 10 minutes! I then changed max_text_length from 25 to 100, and the training time changed again by 1-2 minutes per 100 iterations. Could you explain why the training time changes so significantly? And what is the best way to set max_text_length if I want to recognize both individual words and entire sentences? Thank you in advance.
@ErshovVE @LDOUBLEV Can you explain how the model can infer more than 25 characters if its max length is set to 25? Is there postprocessing that handles this and is not applied in the training phase?
max_text_length is a length limit.
During training, if the number of characters in a sample exceeds max_text_length, the sample is discarded. Training only includes samples whose character count does not exceed max_text_length.
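To see how many of your samples a given max_text_length would discard, the rule above can be sketched in plain Python. This is only an illustration of the behaviour described in this answer, not PaddleOCR's actual data-loading code; the function name filter_by_max_text_length and the (image_path, label) sample layout are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical sample layout for this sketch: (image_path, label_text).
Sample = Tuple[str, str]


def filter_by_max_text_length(
    samples: List[Sample], max_text_length: int
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples into (kept, dropped) by label length.

    A sample is dropped when its label has more characters than
    max_text_length, mirroring the rule described in the answer above.
    """
    kept: List[Sample] = []
    dropped: List[Sample] = []
    for image_path, label in samples:
        if len(label) > max_text_length:
            dropped.append((image_path, label))
        else:
            kept.append((image_path, label))
    return kept, dropped


# Made-up labels: one short word and one full sentence.
samples = [
    ("img_001.jpg", "hello"),
    ("img_002.jpg", "a fairly long sentence made of several words"),
]

kept, dropped = filter_by_max_text_length(samples, max_text_length=25)
print(f"kept: {len(kept)}, dropped: {len(dropped)}")
```

Running the same check over your own label file with a few candidate values (for example 25, 50, 100) shows how much of a mixed word-and-sentence dataset would actually be used for training at each setting.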