Why do all my PDF files take almost exactly 5 seconds (occasionally 10) to be processed? #2259
Hi @chris-prezi! I'm Dosu and I'm helping the docling team.

The consistent 5-second (or 10-second) processing time per PDF chunk in docling-serve is not caused by any hardcoded delay or timeout in the codebase. The returned `processing_time` field measures only the core extraction step and excludes I/O, model loading, and network overhead, which is why it reports microseconds even though the wall-clock time is much longer.

Common causes of a fixed delay like this:

- Model loading or initialization on each request, especially if your API parameters differ from the docling-serve defaults, which can prevent effective caching.
- Network or storage latency, such as models being downloaded at runtime instead of being pre-mounted.
- Infrastructure bottlenecks such as thread or worker configuration.

To reduce processing time:

- Pre-download all required models, mount them into the container, and set the `DOCLING_SERVE_ARTIFACTS_PATH` environment variable to that path.
- Align your API parameters with the docling-serve defaults to maximize cache reuse (see the sketch below).
- Tune thread and worker settings (e.g., `DOCLING_NUM_THREADS`) to match your hardware.
- If you're running in a cluster or containerized environment, use persistent storage for models and avoid runtime downloads.
- If you're using async mode, be aware that running multiple Uvicorn workers can cause reliability issues unless you use an external orchestrator like RQ or KFP.
- If you want to match sagemaker/docling performance, check which PDF backend it uses (e.g., pypdfium2 is much faster than the default for some documents) and set the same backend in docling-serve.
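To make the "align your API parameters" and "set the same backend" points concrete, here is a minimal client-side sketch. The endpoint path (`/v1alpha/convert/source`), the option names, and the top-level `processing_time` field reflect docling-serve's request/response shapes as I understand them and may differ between versions, so check your deployment's OpenAPI page (`/docs`) before relying on them.

```python
import time

import requests

# Assumed local deployment; adjust host, port, and API version to match yours.
DOCLING_SERVE_URL = "http://localhost:5001/v1alpha/convert/source"

# Keep the options identical across requests so the server can reuse its cached
# converter instead of re-initializing pipelines/models on every call.
OPTIONS = {
    "to_formats": ["md"],
    "do_ocr": False,
    # pypdfium2 can be much faster than the default parser for some documents.
    "pdf_backend": "pypdfium2",
}


def convert(url: str) -> dict:
    payload = {
        "options": OPTIONS,
        "http_sources": [{"url": url}],
    }
    start = time.perf_counter()
    response = requests.post(DOCLING_SERVE_URL, json=payload, timeout=300)
    response.raise_for_status()
    wall_clock = time.perf_counter() - start
    result = response.json()
    # processing_time covers only the core extraction step; the wall-clock number
    # also includes model loading, queueing, and network transfer.
    print(f"wall clock: {wall_clock:.3f}s, reported: {result.get('processing_time')}")
    return result


if __name__ == "__main__":
    convert("https://arxiv.org/pdf/2206.01062")
```

Comparing the wall-clock figure with the reported `processing_time` in your own logs is the quickest way to see which part of the 5 seconds is extraction and which is setup overhead.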
My service splits larger PDFs into chunks of varying size, then passes them to docling-serve for text extraction. Checking the extraction durations for all chunks in my logs, they are all 5.xxx seconds, or occasionally 10.xxx seconds, which suggests some fixed 5-second process or delay. I'm looking at docling-serve to replace sagemaker/docling, but sagemaker is much faster at processing individual chunks. I'm also not sure what to make of the processing_time returned by docling-serve, which always claims to be a few microseconds.
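For reference, a stripped-down sketch of the split-and-submit loop described above, with per-chunk wall-clock logging. The chunk size, the use of pypdf for splitting, and the `/v1alpha/convert/file` endpoint with a `files` multipart field are illustrative assumptions, not the actual service code.

```python
import io
import time

import requests
from pypdf import PdfReader, PdfWriter

DOCLING_SERVE_URL = "http://localhost:5001/v1alpha/convert/file"  # assumed endpoint


def split_pdf(data: bytes, pages_per_chunk: int = 20) -> list[bytes]:
    """Split a PDF into fixed page ranges (real chunk sizes vary)."""
    reader = PdfReader(io.BytesIO(data))
    chunks = []
    for start in range(0, len(reader.pages), pages_per_chunk):
        writer = PdfWriter()
        for idx in range(start, min(start + pages_per_chunk, len(reader.pages))):
            writer.add_page(reader.pages[idx])
        buf = io.BytesIO()
        writer.write(buf)
        chunks.append(buf.getvalue())
    return chunks


def extract(pdf_bytes: bytes) -> None:
    for i, chunk in enumerate(split_pdf(pdf_bytes)):
        start = time.perf_counter()
        resp = requests.post(
            DOCLING_SERVE_URL,
            files={"files": (f"chunk_{i}.pdf", chunk, "application/pdf")},
            timeout=300,
        )
        resp.raise_for_status()
        # These per-chunk durations are the ones clustering around 5.xxx s
        # (occasionally 10.xxx s), while the returned processing_time is microseconds.
        print(f"chunk {i}: {time.perf_counter() - start:.3f}s")
```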