Uninterpretable content in content stream when a PDF was created with text using PyMuPDF #1394
You are sometimes free to choose between hex and non-hex strings; it depends on the font in use. For some fonts you cannot put the output characters ("ABC") into the contents directly.
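To illustrate the two notations: for a simple single-byte font, the literal string `(ABC)` and the hex string `<414243>` carry the same bytes. The sketch below uses only the Python standard library; the helper names are made up for this example and are not part of PyMuPDF. For fonts with multi-byte encodings, the string bytes are character or glyph codes rather than ASCII, so there is usually no readable form to fall back on.

```python
def to_literal(text: str) -> str:
    """Write ASCII text as a PDF literal string, escaping backslash and parentheses."""
    out = []
    for ch in text:
        if ch in "\\()":
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "(" + "".join(out) + ")"


def to_hex(text: str) -> str:
    """Write the same bytes as a PDF hex string."""
    return "<" + text.encode("ascii").hex().upper() + ">"


def from_hex(hex_string: str) -> bytes:
    """Decode a PDF hex string back to raw bytes."""
    # Whitespace inside a hex string is ignored.
    digits = "".join(hex_string.strip("<>").split())
    if len(digits) % 2:
        # An odd number of digits is padded with a trailing 0.
        digits += "0"
    return bytes.fromhex(digits)


# Two ways of showing the same text with the Tj operator.
print(to_literal("ABC"), "Tj")
print(to_hex("ABC"), "Tj")
```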
Text in a PDF does not have an xref at all, so nothing will be able to provide one.
What are you trying to achieve?
A typical "Discussions" item - no issue.
Content streams are written in PDF's mini-language, which defines the appearance of pages, annotations and some other objects.
The syntax is explained starting on page 643 of https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf.
It runs to dozens of pages, too much to explain here. Think of it as source code in a programming language that you do not know.
On top of that, content streams are usually compressed, so you won't see that source code as plain ASCII.
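As a rough illustration of the compression point: the most common filter for content streams is `FlateDecode`, which is zlib/deflate. The sketch below uses only Python's `zlib` module, and the sample stream is invented for this example rather than taken from a real file.

```python
import zlib

# A made-up, uncompressed content stream: show "ABC" with font /F1 at 12 pt.
raw = b"BT /F1 12 Tf 72 720 Td (ABC) Tj ET"

# Roughly what a PDF writer stores in the file when the filter is /FlateDecode.
stored = zlib.compress(raw)

# What has to happen before the operators become readable again.
decoded = zlib.decompress(stored)

print(stored[:8])               # binary data, not readable source
print(decoded.decode("ascii"))  # the original operators
```

In PyMuPDF, `Page.read_contents()` should hand you the page's content already decompressed, so you rarely need to do this step by hand.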
To give you at least a starting point:
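One possible starting point, added here as an illustration with a made-up stream (it is not the example from the original post): a small Python sketch that annotates the operators in a minimal text-drawing sequence. Content streams use postfix notation, so the operands come first and the operator last.

```python
# Meaning of a few common text operators (see PDF 32000-1:2008).
OPERATORS = {
    "BT": "begin a text object",
    "ET": "end the text object",
    "Tf": "select font and size",
    "Td": "move to the start of the next line, offset by (tx, ty)",
    "Tj": "show a text string",
}


def annotate(stream: str):
    """Group operands with the operator that follows them.

    Naive whitespace tokenizer: fine for this sample, but it would break
    on strings containing spaces, on arrays, or on inline images.
    """
    operands, result = [], []
    for token in stream.split():
        if token in OPERATORS:
            result.append((operands, token, OPERATORS[token]))
            operands = []
        else:
            operands.append(token)
    return result


sample = "BT /F1 12 Tf 72 720 Td (ABC) Tj ET"
for operands, op, meaning in annotate(sample):
    print(" ".join(operands).ljust(10), op, "#", meaning)
```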