Replaced all deflated images (of a certain type) with jpeg equivalent #3464
-
I feel this could be an FAQ, but I couldn't find a relevant switch or example. What is the mutool convert / pymupdf equivalent of wanting these with ghostscript's pdfwrite, or combinations of them? I.e. basically replacing zlib-deflated images with their JPEG / CCITT TIFF equivalents.
Replies: 6 comments
-
Actually, downsampling isn't exactly what I want. I want to convert all images from zlib deflate (FlateDecode) to DCTDecode/CCITTFaxDecode at their original resolution. This seems to be a useful and frequent request?
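For reference, here is a rough sketch of how such a conversion might look in PyMuPDF. It is not the maintainers' version and not the script linked later in this thread. The names `is_candidate` and `flate_rgb_to_jpeg` are made up for illustration, and the sketch assumes a PyMuPDF release that provides `Page.replace_image` and the `jpg_quality` argument of `Pixmap.tobytes`. It only handles RGB images without a soft mask and leaves everything else (including any CCITT conversion) untouched.

```python
def is_candidate(filter_name: str, smask_xref: int) -> bool:
    """Hypothetical helper: Flate-compressed image without a soft mask."""
    return filter_name == "FlateDecode" and smask_xref == 0


def flate_rgb_to_jpeg(src_path: str, dst_path: str, quality: int = 95) -> int:
    """Re-encode Flate-compressed RGB images as JPEG at their original size.

    Returns the number of images replaced. Assumes PyMuPDF is installed.
    """
    import fitz  # PyMuPDF; imported here so is_candidate() has no dependency

    doc = fitz.open(src_path)
    replaced = set()
    for page in doc:
        for img in page.get_images(full=True):
            # Item 0 is the xref, item 1 the soft-mask xref, item 8 the filter.
            xref, smask, filter_name = img[0], img[1], img[8]
            if xref in replaced or not is_candidate(filter_name, smask):
                continue
            pix = fitz.Pixmap(doc, xref)
            # Skip anything that is not plain 3-component colour without alpha.
            if pix.colorspace is None or pix.colorspace.n != 3 or pix.alpha:
                continue
            jpeg_bytes = pix.tobytes("jpeg", jpg_quality=quality)
            page.replace_image(xref, stream=jpeg_bytes)
            replaced.add(xref)
    doc.save(dst_path, garbage=3)  # drop the now-unused original streams
    return len(replaced)
```

Calling `flate_rgb_to_jpeg("in.pdf", "out.pdf")` would write a new file and leave the input alone. JPEG is lossy, so it is worth checking the result visually before discarding the original.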
-
We currently have a (possibly meagre) version for that:
-
Thanks for the tips. Here is what I have come up with so far: https://github.com/HinTak/pymupdf-jbig2-extract/blob/main/lossy-convert.py It mostly does what I wanted: it just converts RGB FlateDecode images to JPEG. I have two questions though:
-
Argh, I have my answer to the 2nd question:
-
Argh, I have the answer to my first question too, after extracting the corresponding objects. Upstream mupdf sets progressive=True, sets dpi=96, disables chroma subsampling, and sets quality to 95. Doing all of these in PIL, I can get bitwise identical output. I found all of them in the upstream mupdf code too, except the last one: upstream defaults to 90. Is 95 a pymupdf setting?
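The settings listed above map onto Pillow's JPEG save options roughly as follows. This is a sketch only: the function names are made up for illustration, and nothing here has been checked against any particular mupdf or pymupdf version, so the bitwise-identical result is the poster's observation rather than something this snippet guarantees.

```python
def mupdf_like_jpeg_options(quality: int = 95) -> dict:
    """Hypothetical helper: Pillow save() keywords mirroring the settings above."""
    return {
        "format": "JPEG",
        "quality": quality,   # 95 as reported for pymupdf; upstream mupdf uses 90
        "progressive": True,  # progressive rather than baseline JPEG
        "subsampling": 0,     # 0 means 4:4:4, i.e. no chroma subsampling
        "dpi": (96, 96),      # density written into the JFIF header
    }


def save_like_pixmap(image, path: str, quality: int = 95) -> None:
    """Save a PIL.Image.Image using the options above."""
    image.save(path, **mupdf_like_jpeg_options(quality))


# Usage, assuming Pillow is installed:
#   from PIL import Image
#   save_like_pixmap(Image.open("page.png").convert("RGB"), "page.jpg")
```

Keeping the options in one dictionary makes it easy to switch a single setting (for example `quality=90`) and compare file sizes.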
-
Yes, 95 is a pymupdf setting (line 326 in commit d464133). This concludes everything I want to know for now. I'll add these as comments, but keep the PIL code.
Thanks for the tips. Here is what I have come up with so far: https://github.com/HinTak/pymupdf-jbig2-extract/blob/main/lossy-convert.py
It mostly does what I wanted: it just converts RGB FlateDecode images to JPEG. I have two questions though:
Going via `PIL.Image.save` rather than `fitz.Pixmap.save` results in much better compression by default, even without the `optimize=True` PIL key. (You can roll back a commit or two to see the diff. I actually started off with PIL, then wanted to remove that dependency and found the resulting size to be worse. The public repo has a simplified/reverted history compared to my private repo.) In one file I use for such tests, the original is 400MB. Going via PIL gives a …