Modifying PDF Page Content Streams In-Place #1770
fergusonjason
started this conversation in
General
Replies: 0 comments
I've asked this on Stack Overflow, but hey, can't hurt to ask here.
I'm trying to modify PDF page content streams in-place, specifically to add MCID identifiers to text blocks. Right now, my approach is rather brute force, but it's early.
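To make the brute-force idea concrete, here is a rough sketch of the reprocessing step alone, operating on an already-decoded content stream held as a string. The name `addMcids`, the default `Span` tag, and the sequential numbering are all assumptions made for illustration and are not taken from any library. It wraps each `BT ... ET` text object in a `BDC ... EMC` marked-content sequence with an `/MCID` entry. It does not parse PDF string literals or inline images, so an ` ET ` inside a string would fool it.

```typescript
// Hypothetical helper (not part of any library): wraps every BT ... ET text
// object in a marked-content sequence that carries a sequential MCID.
// Assumes `content` is a decoded content stream (e.g. read as latin1 so the
// bytes round-trip) and that BT/ET never appear inside string literals.
function addMcids(
  content: string,
  tag: string = "Span",
  startMcid: number = 0
): { content: string; count: number } {
  let mcid = startMcid;
  // BT and ET must be standalone, whitespace-delimited operators.
  const textObject = /(^|\s)BT(?=\s)[\s\S]*?\sET(?=\s|$)/g;
  const out = content.replace(textObject, (match: string, lead: string) => {
    // Keep the leading whitespace outside the wrapper.
    const body = match.slice(lead.length);
    return `${lead}/${tag} <</MCID ${mcid++}>> BDC\n${body}\nEMC`;
  });
  return { content: out, count: mcid - startMcid };
}

// Usage: two text objects receive MCIDs 0 and 1.
const sample = "BT /F1 12 Tf (Hello) Tj ET\nBT (World) Tj ET";
const tagged = addMcids(sample);
console.log(tagged.content);
```

A real implementation would need a proper content-stream tokenizer, but this is enough to exercise the stream replacement end to end.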
I'm using the code below to follow the trail from Catalog -> Pages -> Page -> PageContent (PDFRawStream), grab the stream, reprocess it, and put the result back under the same object reference as the original.
What's happening is that I can build the PDFContext, but when I call PDFDocument.save(), I get a malformed PDF that ONLY contains the objects for the reprocessed streams. EVERY OTHER OBJECT IS GONE in the new PDF.
I'd prefer an in-place replacement over rebuilding the entire PDF object tree, but frankly I'm taking this project on out of spite: the few libraries that do this are commercial and charge more than $1k per seat.
` async reprocessPDF2(input:Uint8Array): Promise {
}`