Missing character when saving after using multiple add_redact_annot & insert_htmlbox #3270
Unanswered
Nader-Khalil asked this question in Looking for help
Highly complex post - and apparently not a bug, but a call for help.
Description of the bug
Hello there,

I'm trying to use the library for a PDF translation process.
I managed to get a lot working, but I'm stuck on one problem.
This is an original page of a PDF that contains 9 pages:
This is the result after translation to Arabic:

You can see missing characters in many of the words (I have already pointed to some of them with red arrows).
After some tracing and trials, I built a PDF containing only this page to make the tracing simpler:

To my surprise, the letters appear! 😅 Not all of them: 2 are still missing, but most are now shown correctly.
You can see the correction in the middle squares compared to the previous image.
I need help handling this. Whenever the page contains fewer blocks it goes well, but when the page contains lots of blocks the problem occurs.
I also have another question: the PDF's size increased a lot after saving, to more than 10x the original. Is there something I can do to bring it close to the original size?
Thanks in advance
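On the file size question, here is a minimal sketch of save options that might be worth trying. It is not a confirmed fix for this document. The helper name `save_compact` is made up for illustration; `subset_fonts()` and the `garbage` and `deflate` arguments of `save()` are existing PyMuPDF `Document` features. Depending on the PyMuPDF version, `subset_fonts()` may need the fontTools package to be installed.

```python
def save_compact(doc, path):
    """Save `doc` with options that usually shrink the output file.

    `doc` is assumed to be an open PyMuPDF Document; `save_compact`
    itself is a hypothetical helper, not part of the library.
    """
    # Keep only the glyphs that are actually used from each embedded font.
    doc.subset_fonts()
    # garbage=4 drops unused objects and merges duplicate ones;
    # deflate=True compresses streams that are stored uncompressed.
    doc.save(path, garbage=4, deflate=True)
```

Usage would look like `save_compact(doc, "translated.pdf")`, called in place of a plain `doc.save(...)` at the end of processing. Whether it brings the size close to the original depends on how many fonts the inserted HTML text pulls in.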
How to reproduce the bug
I'm using Python code in these steps:
Python in Google Colab
Define a translate_sent(sent, source, trgt) function:
Translate text from the source language to the target language.
Define a check_table(page, text, bbox) function:
Determine whether the specified text in a bbox is a table by checking alignment.
Define a process_table_text(page, bbox, text, src_lang, trgt_lang, css_style) function:
(This one can be neglected for now; nothing is wrong with it.)
Translate and replace table text on the page within the specified bbox using the provided styles.
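To make the "checking alignment" idea concrete, here is a rough sketch of one possible heuristic. It is not the check_table function from this post, whose code is not shown. The name `looks_like_table`, its parameters, and the rule "at least two left edges shared by every row" are all assumptions for illustration. It expects 5-tuples `(x0, y0, x1, y1, text)`, which are the leading fields of the tuples returned by `page.get_text("words")`.

```python
from collections import defaultdict


def looks_like_table(words, tol=3.0, min_rows=2, min_cols=2):
    """Guess whether words form a table by checking column alignment.

    `words` is an iterable of (x0, y0, x1, y1, text) tuples.
    Hypothetical heuristic: the text counts as a table when at least
    `min_cols` left edges occur in every one of its rows.
    """
    # Group words into rows by their top coordinate, snapped to a grid of size `tol`.
    rows = defaultdict(list)
    for x0, y0, x1, y1, text in words:
        rows[round(y0 / tol)].append(x0)
    if len(rows) < min_rows:
        return False

    # Count in how many rows each snapped left edge occurs.
    edge_rows = defaultdict(int)
    for xs in rows.values():
        for key in {round(x / tol) for x in xs}:
            edge_rows[key] += 1

    # A column is a left edge that is present in every row.
    shared = sum(1 for n in edge_rows.values() if n == len(rows))
    return shared >= min_cols
```

With real PyMuPDF output, the call could be `looks_like_table(w[:5] for w in page.get_text("words", clip=bbox))`. Snapping to a grid is crude (two edges close to a grid boundary can land in different cells), so `tol` would need tuning per document.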
The exact function to handle the page is as follows:
To run it:
PyMuPDF version
1.23.26
Operating system
Other
Python version
3.10