LibPDF+LibGfx: Decode Mellor_ACTITC_10.pdf faster #26235

Merged
nico merged 4 commits into SerenityOS:master from LucasChollet:pdf_optimize_part_1
Oct 3, 2025

@LucasChollet
Member

Related to #26082

This is mostly low-hanging fruit, but it still gives a nice performance improvement: mean render time for page 4 of the test PDF drops by about 25%.

From:

Benchmark 1: BuildLagom/bin/pdf --render out.png PDF/Mellor_ACTITC_10.pdf --page 4
  Time (mean ± σ):     630.9 ms ±   9.7 ms    [User: 582.8 ms, System: 47.7 ms]
  Range (min … max):   622.1 ms … 655.9 ms    10 runs

To:

Benchmark 1: BuildLagom/bin/pdf --render out.png PDF/Mellor_ACTITC_10.pdf --page 4
  Time (mean ± σ):     470.4 ms ±   5.6 ms    [User: 420.8 ms, System: 49.4 ms]
  Range (min … max):   462.7 ms … 479.3 ms    10 runs

This avoids the performance cost of calling `set_pixel` (which includes
a bunch of VERIFY checks) that valgrind measured as 5% on:
`Build/lagom/bin/pdf --render out.png PDF/Mellor_ACTITC_10.pdf --page 4`

Makes JBIG2's composite_bilevel_image() drop from 25% to 18% in
profiles.
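
For readers unfamiliar with the pattern, here is a minimal sketch of the general idea: replacing a per-pixel checked setter with direct writes through a row pointer, so the bounds checks run once per row instead of once per pixel. The `Bitmap` class, `scanline()`, `fill_checked()` and `fill_by_scanline()` below are hypothetical stand-ins written for illustration; they are not the LibGfx API and not the code from this PR, and `assert` merely plays the role that VERIFY plays in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bitmap type for illustration only (not Gfx::Bitmap).
class Bitmap {
public:
    Bitmap(size_t width, size_t height)
        : m_width(width)
        , m_height(height)
        , m_pixels(width * height, 0)
    {
    }

    size_t width() const { return m_width; }
    size_t height() const { return m_height; }

    // Checked setter: both coordinates are validated on every call.
    void set_pixel(size_t x, size_t y, uint32_t color)
    {
        assert(x < m_width);
        assert(y < m_height);
        m_pixels[y * m_width + x] = color;
    }

    uint32_t get_pixel(size_t x, size_t y) const
    {
        assert(x < m_width);
        assert(y < m_height);
        return m_pixels[y * m_width + x];
    }

    // Row pointer: the row index is validated once, then the caller
    // writes pixels directly without further checks.
    uint32_t* scanline(size_t y)
    {
        assert(y < m_height);
        return m_pixels.data() + y * m_width;
    }

private:
    size_t m_width;
    size_t m_height;
    std::vector<uint32_t> m_pixels;
};

// Slow path: one checked call per pixel.
void fill_checked(Bitmap& bitmap, uint32_t color)
{
    for (size_t y = 0; y < bitmap.height(); ++y)
        for (size_t x = 0; x < bitmap.width(); ++x)
            bitmap.set_pixel(x, y, color);
}

// Fast path: one check per row, then plain stores in the inner loop.
void fill_by_scanline(Bitmap& bitmap, uint32_t color)
{
    for (size_t y = 0; y < bitmap.height(); ++y) {
        uint32_t* row = bitmap.scanline(y);
        for (size_t x = 0; x < bitmap.width(); ++x)
            row[x] = color;
    }
}
```

Both functions produce the same image; the second one relies on the loop bounds already guaranteeing that `x` is in range, which is what makes dropping the per-pixel checks safe.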
@github-actions github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Sep 30, 2025
Contributor

@Hendiadyoin1 Hendiadyoin1 left a comment

one small nit from my end

@LucasChollet LucasChollet force-pushed the pdf_optimize_part_1 branch 2 times, most recently from d2885c1 to 2d9a4d6 on October 3, 2025 10:38
@LucasChollet
Member Author

I dropped the last commit in favor of #26239.

@nico nico merged commit 1cb425c into SerenityOS:master Oct 3, 2025
12 checks passed
@github-actions github-actions bot removed the 👀 pr-needs-review PR needs review from a maintainer or community member label Oct 3, 2025
@nico
Contributor

nico commented Oct 3, 2025

Thanks!

I haven't profiled it, but the CCITT (and JBIG2) hashtable code is pretty inefficient. You put more efficient code in JPEGLoader (maybe that's even the same type of hashtable as in CCITT and JBIG2?), and Compress::CanonicalCodes is iirc more efficient still. (And https://www.hanshq.net/zip.html is still a bit nicer than CanonicalCodes – see e.g. #25005)

@LucasChollet LucasChollet deleted the pdf_optimize_part_1 branch October 3, 2025 12:59
3 participants