insert_textbox prunes whitespace
#1567
Replies: 3 comments 6 replies
-
Method
-
Thanks for the fast response @JorjMcKie, and for a great tool :)
Ah, trying to use
Edit: I just tried inserting on the page I'm extracting the font from, and there it works fine, so the problem seems to arise from the extraction->insertion operation. I'm on Linux and the document was likely generated on Windows, if that matters.
-
I need to copy spans of text from an existing PDF to a new one, and hoped to rely on insert_textbox, by first identifying the appropriate bounding box using get_text("dict") on the original page and using it as an argument for the insertion command.
My problem is that my text comes with encoded whitespace, and the insert_textbox command seems to prune whitespace characters and also tightens the bounding box. Oddly, this is not the case for insert_text, but there does not seem to be a straightforward way to get the point of insertion needed for that command from the available get_text("dict") span metadata.
In short, is there a way to insert trailing whitespace with insert_textbox?
I'd like this to be a one-to-one copy so that the documents are encoded as similarly as possible. The pruning also causes problems if I re-extract the new document, since whitespace will have to be inferred.
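For reference, a minimal sketch of the extraction-then-insertion workflow described above. It assumes the span dictionaries returned by get_text("dict") carry an "origin" entry (the baseline point of the span's first character), which might serve as the insertion point for insert_text. The helper names span_origins and copy_spans are hypothetical, and "helv" is only a placeholder font name, not the font extracted from the source document.

```python
def span_origins(page_dict):
    """Yield (origin, text, size) for each text span in a get_text("dict") result."""
    for block in page_dict.get("blocks", []):
        # Image blocks carry no "lines" entry, so they contribute nothing here.
        for line in block.get("lines", []):
            for span in line.get("spans", []):
                yield tuple(span["origin"]), span["text"], span["size"]


def copy_spans(src_page, dst_page, fontname="helv"):
    """Re-insert every span of src_page on dst_page at its original origin.

    Hypothetical helper: src_page and dst_page are assumed to be PyMuPDF
    Page objects; fontname is a placeholder for whatever font is wanted.
    """
    count = 0
    for origin, text, size in span_origins(src_page.get_text("dict")):
        # insert_text takes a start point instead of a rectangle; per the
        # report above, it does not prune whitespace the way insert_textbox does.
        dst_page.insert_text(origin, text, fontsize=size, fontname=fontname)
        count += 1
    return count
```

With two open documents, this would be called as copy_spans(src_doc[0], dst_doc[0]). Whether the re-extracted whitespace then matches the original exactly has not been verified here.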