Insert_pdf/save Performance Question #1699
Asked by NilesRath in "Looking for help" (unanswered, 1 reply)
-
It is hard to tell why the duration grows without seeing the code.
Opening a PDF and jumping to an arbitrary page is very fast, so that should not be the cause of the problem. You can confirm this by timing the extraction of only the last 25k pages.
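One way to run that check is sketched below. It is only an illustration, not code from the original post: the function names, file paths and the 25,000-page default are assumptions, and `insert_pdf` is timed separately from `save` so it is visible which of the two dominates.

```python
import time


def last_chunk_range(page_count, chunk_size=25_000):
    """Return the zero-based, inclusive (from_page, to_page) of the last pages."""
    start = max(0, page_count - chunk_size)
    return start, page_count - 1


def time_last_chunk(src_path, out_path, chunk_size=25_000):
    """Copy only the last chunk_size pages and time insert and save separately.

    src_path and out_path are hypothetical file names supplied by the caller.
    """
    import fitz  # PyMuPDF; imported here so the helper above has no dependency

    src = fitz.open(src_path)
    from_page, to_page = last_chunk_range(src.page_count, chunk_size)

    t0 = time.perf_counter()
    out = fitz.open()  # new, empty PDF
    out.insert_pdf(src, from_page=from_page, to_page=to_page)
    t1 = time.perf_counter()
    out.save(out_path)
    t2 = time.perf_counter()

    out.close()
    src.close()
    return {"insert_s": t1 - t0, "save_s": t2 - t1}
```

If this run on its own takes about as long as the first chunk did, the position of the pages in the source file is probably not what slows the loop down.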
-
I had an extremely large PDF with 343,652 pages. I set up a loop that creates 14 new documents, pulling 25k pages per document with the insert_pdf method. doc2 is created and saved 14 times, once per iteration. The time to insert and save grew with each iteration: the first 25k pages took ~25 min, the next ~40 min, the next ~50 min, and so on, with the last few taking close to 2 hours each. If every chunk is 25k pages and the output PDFs are all roughly the same size (~140 MB), can you help me understand why the time grows with each iteration? Is it because the range being pulled is further and further into the source PDF? I'm trying to understand how to improve the speed of this process, as we are going to get more of these large PDFs.
This library is great btw, thank you so much for developing!
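For reference, the loop described above might look roughly like the following. This is a guess at its shape, since the actual code was not posted: the output file pattern, the function names and the choice to open a fresh doc2 in every iteration are all assumptions. Per-iteration timings are recorded so the growth can be attributed to either `insert_pdf` or `save`.

```python
import time


def chunk_ranges(page_count, chunk_size=25_000):
    """Split page_count pages into zero-based, inclusive (from_page, to_page) ranges."""
    return [
        (start, min(start + chunk_size, page_count) - 1)
        for start in range(0, page_count, chunk_size)
    ]


def split_with_timing(src_path, out_pattern="part_{:02d}.pdf", chunk_size=25_000):
    """Write one output PDF per chunk and return the timings per iteration.

    src_path and out_pattern are hypothetical; adjust them to the real files.
    """
    import fitz  # PyMuPDF; imported here so chunk_ranges has no dependency

    src = fitz.open(src_path)
    timings = []
    for i, (from_page, to_page) in enumerate(
        chunk_ranges(src.page_count, chunk_size), start=1
    ):
        t0 = time.perf_counter()
        doc2 = fitz.open()  # assumption: a fresh, empty doc2 for every chunk
        doc2.insert_pdf(src, from_page=from_page, to_page=to_page)
        t1 = time.perf_counter()
        doc2.save(out_pattern.format(i))
        t2 = time.perf_counter()
        doc2.close()
        timings.append(
            {"chunk": i, "pages": to_page - from_page + 1,
             "insert_s": t1 - t0, "save_s": t2 - t1}
        )
    src.close()
    return timings
```

With 343,652 pages and 25,000 pages per chunk, `chunk_ranges` yields 14 ranges, the last one covering the remaining 18,652 pages.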