Question: about originalPdfDocument.get_page_images(pageNumber, full=True), can't get full images list. #2759
-
Describe the bug (mandatory)
I want to get the full list of images on a page, but get_page_images() does not return all of them. Would you help to resolve it? Thanks in advance.
To Reproduce (mandatory)
1. Use this PDF document: https://github.com/AwesomeYuer/public-misc-share/blob/main/Simple.Rowspan.for.Pdf.Print.pdf
The PDF document was printed from the HTML page below; all images in it are inline base64 images: https://github.com/AwesomeYuer/public-misc-share/blob/main/Simple.Rowspan.for.Pdf.Print.html
2. Run the code:
import fitz  # PyMuPDF
path = "Simple.Rowspan.for.Pdf.Print.pdf"  # local copy of the PDF linked above
pdfDocument = fitz.open(path)
page = pdfDocument[0]  # first page (page numbers are 0-based)
page.wrap_contents()
images = pdfDocument.get_page_images(0, full=True)
for i, item in enumerate(images):
    print(i, item)
Expected behavior (optional)
The full output list should contain 5 items.
Replies: 2 comments 4 replies
-
This is a typical "Discussions" post - transferring to the respective tab.
-
The method page.get_images() only shows which image references are contained in the page's PDF object definition. This may or may not be the same set of images that the page actually displays. That is true whether or not you use the full parameter, which has a completely different meaning. Please consult the documentation for more background.
To get a list of all image display commands that the page actually issues, use page.get_image_info() (please consult the documentation for parameter details). This list also contains duplicate images and images that may not have an xref (inline images). The sequence of the list is the sequence in which the display commands are executed.
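A minimal sketch of the approach described above, not a definitive implementation. It assumes PyMuPDF is installed and importable as fitz, and that page.get_image_info(xrefs=True) reports an xref of 0 for inline images, as the documentation describes. The helper names displayed_images and split_by_xref are hypothetical and only serve this illustration.

```python
def split_by_xref(image_infos):
    """Split display entries into those backed by an image object and inline ones.

    Assumption: each entry is a dict as returned by
    page.get_image_info(xrefs=True), where "xref" is 0 for inline images.
    """
    referenced = [info for info in image_infos if info.get("xref", 0) > 0]
    inline = [info for info in image_infos if info.get("xref", 0) == 0]
    return referenced, inline


def displayed_images(path, page_number=0):
    """Return one dict per image display command on the page, in display order."""
    import fitz  # PyMuPDF; imported lazily so split_by_xref works without it

    doc = fitz.open(path)
    try:
        page = doc[page_number]
        # xrefs=True adds an "xref" key to every entry
        return page.get_image_info(xrefs=True)
    finally:
        doc.close()
```

To apply this to the PDF from the question, call displayed_images(path, 0) and pass the result to split_by_xref. Each entry also carries a "bbox" key with the position of that image on the page. An image that is displayed more than once appears once per display command, so the list can be longer than the one from get_images().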