Use Original Font in new PDF #3957
I am trying to extract text from one PDF and write it to a new PDF using the same fonts. Since the fonts are not available on a new_page of the new PDF, I extract the fonts from each page of the old PDF, save them as font files, and insert them into the new PDF. The names of the extracted fonts (e.g. TimesNewRomanPS-BoldItalicMT) differ from the font names in the text blocks (e.g. TimesNewRomanPS-BoldItal), so I use get_close_matches to find the saved font file whose name is closest to the one in the text block. This approach works for most fonts, but for some it inserts square boxes instead of the actual text (please refer to the attached image).
Replies: 1 comment
If these squares ("tofus") appear, then the respective font has no glyph for the provided Unicode.
This may already have been caused during text extraction: if the original font did not return valid Unicodes, the extracted text contains the replacement character � (U+FFFD). You may want to add an appropriate check.
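A minimal sketch of such a check, assuming the text was extracted as the dictionary returned by page.get_text("dict") (blocks, then lines, then spans). The function name find_invalid_spans is made up for illustration and is not part of PyMuPDF:

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, what extraction yields for unmappable glyphs


def find_invalid_spans(text_dict):
    """Return (block_no, line_no, span) for every span containing U+FFFD.

    `text_dict` is assumed to have the layout of page.get_text("dict").
    Hypothetical helper, not a PyMuPDF function.
    """
    bad = []
    for block_no, block in enumerate(text_dict.get("blocks", [])):
        # Image blocks carry no "lines" key, so default to an empty list.
        for line_no, line in enumerate(block.get("lines", [])):
            for span in line.get("spans", []):
                if REPLACEMENT_CHAR in span.get("text", ""):
                    bad.append((block_no, line_no, span))
    return bad
```

Spans reported here cannot be rewritten faithfully with any font, because the original characters were already lost during extraction.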
Also, the round trip glyph -> Unicode -> glyph may not work for all font subsets, even if the extracted font is a valid font (which is not always the case).
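One way to detect this before writing is to ask the extracted font whether it has a glyph for each character of the text. The sketch below is duck-typed: it only assumes a font object with a has_glyph(codepoint) method that returns 0 when no glyph exists, which is how pymupdf.Font.has_glyph behaves. The helper name missing_glyphs and the file name in the comment are invented for illustration:

```python
def missing_glyphs(font, text):
    """Return the sorted characters of `text` for which `font` has no glyph.

    `font` is any object with has_glyph(codepoint) returning 0 (falsy)
    when the glyph is absent, e.g. a pymupdf.Font.
    Whitespace is skipped here; that is a simplifying assumption.
    """
    return sorted(
        {ch for ch in text if not ch.isspace() and not font.has_glyph(ord(ch))}
    )


# Possible usage with PyMuPDF (file name is hypothetical):
#   font = pymupdf.Font(fontfile="extracted_font.ttf")
#   if missing_glyphs(font, span["text"]):
#       ...  # this font would produce tofus; fall back to another font
```

If the list is non-empty, inserting the text with that font will show tofus for exactly those characters.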
Sometimes the only way out is using the original (downloaded) font.