Adding text layer to a scanned PDF #775
-
It would be good to know how to add a text layer to a scanned PDF. Let's consider the following document as an example. Raw JSON response of the Amazon Textract API call:
Replies: 3 comments 2 replies
-
Without really knowing the internals of the attached JSON, let's assume it contains all the required text. Then simply create a TextWriter object and append the text piece by piece:
import fitz  # PyMuPDF
tw = fitz.TextWriter(page.rect)  # needs the intended page's size
# for each text piece (a word, a string, a character, ... anything goes)
tw.append(
    pos,  # the insertion point
    text,  # the text to insert
    font=font,  # a fitz.Font(...) object
    fontsize=fontsize,
)
# ... repeat the above with arbitrary other fonts / font sizes; when done:
# write the whole text writer as hidden (render mode 3) text;
# older PyMuPDF releases name this method writeText
tw.write_text(page, render_mode=3)
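The snippet above leaves `pos`, `text` and `fontsize` open. As a minimal sketch of how they might be derived from the Textract response, the helper below assumes the usual Textract layout: a top-level `Blocks` list whose `WORD` entries carry a `Text` string and a `Geometry.BoundingBox` given as ratios of the page size, measured from the top-left corner. The function name `textract_words`, the sample data and the page size are hypothetical, and using the bottom edge of the box as the baseline and the box height as the font size is only a rough guess.

```python
def textract_words(response, page_width, page_height):
    """Yield (pos, text, fontsize) for every WORD block in a Textract response.

    Assumption: BoundingBox values are ratios of the page size with a
    top-left origin, which matches PyMuPDF's page coordinate system.
    """
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "WORD":
            continue  # skip PAGE, LINE and other block types
        box = block["Geometry"]["BoundingBox"]
        x = box["Left"] * page_width
        # Bottom edge of the box, used here as an approximate baseline.
        y = (box["Top"] + box["Height"]) * page_height
        # Rough guess: treat the box height as the font size.
        fontsize = box["Height"] * page_height
        yield (x, y), block["Text"], fontsize


# Hypothetical, hand-written sample in the shape described above.
sample = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {
            "BlockType": "WORD",
            "Text": "Invoice",
            "Geometry": {
                "BoundingBox": {"Left": 0.1, "Top": 0.2, "Width": 0.15, "Height": 0.02}
            },
        },
    ]
}

# 612 x 792 points is a US Letter page; in practice use page.rect.width / height.
words = list(textract_words(sample, 612, 792))
for pos, text, fontsize in words:
    print(pos, text, fontsize)
```

Each resulting tuple could then be passed to `tw.append(pos, text, font=font, fontsize=fontsize)`. Because the text is written hidden, small errors in baseline or size should mostly affect how the selection highlight lines up with the scanned image, not what gets extracted or searched.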
-
@sjscotti - not quite clear what you mean:
-
This information is generated automatically by MuPDF heuristics when a page is read.