How to disable OCR and optimize performance in Docling-serve with Docker? #2173

Cyp9715 · 2025-09-02T06:02:03Z

Cyp9715
Sep 2, 2025

Hello everyone,

I'm currently using Docling-serve and while Docling is excellent, the processing speed is quite slow for my use case.

I would like to optimize Docling-serve by:

Using the DoclingParseV2DocumentBackend engine (if there is a faster option, please recommend it)
Disabling OCR (not needed for my workflow)

I can easily configure these settings in Python code, but I'm not sure how to properly set them up in a Docker deployment for Docling-serve.

docker run --name docling-serve \
    --gpus all \
    --restart unless-stopped \
    -d -p 5001:5001 \
    -e DOCLING_SERVE_MAX_SYNC_WAIT=1200 \
    -e DOCLING_SERVE_MAX_DOCUMENT_TIMEOUT=1200 \
    -e DOCLING_SERVE_ENG_LOC_NUM_WORKERS=2 \
    -e OMP_NUM_THREADS=4 \
    -e MKL_NUM_THREADS=4 \
    quay.io/docling-project/docling-serve-cu128

I'd also like to enable multicore processing and GPU acceleration (though I'm not sure the GPU is needed for plain text) to make processing as fast as possible.
Could you tell me how to configure the Docker deployment for this?

Answered by Elettrotecnica

Sep 2, 2025

I have opened an issue open-webui/open-webui#17148


@Cyp9715 · 2025-09-02T06:04:20Z

dosubot[bot]
bot Sep 2, 2025

Hi @Cyp9715! I'm Dosu and I’m helping the docling team.

To use the DoclingParseV2DocumentBackend engine in Docker, set the environment variable DOCLING_SERVE_ENGINE=DoclingParseV2DocumentBackend in your docker run command. This backend does not perform OCR, so OCR will be effectively disabled by using it (source).

For optimal performance, keep these settings in your Docker command:

Set DOCLING_SERVE_ENG_LOC_NUM_WORKERS, OMP_NUM_THREADS, and MKL_NUM_THREADS to match your CPU resources for multicore processing (source).
Use --gpus all and a CUDA-enabled image (like docling-serve-cu128) if you need GPU acceleration, but for plain text processing, GPU is usually not necessary—it's mainly beneficial for OCR and vision tasks (source).

Your Docker command would look like:

docker run --name docling-serve \
    --network=ainet \
    --gpus all \
    --restart unless-stopped \
    -d -p 5001:5001 \
    -e DOCLING_SERVE_ENGINE=DoclingParseV2DocumentBackend \
    -e DOCLING_SERVE_MAX_SYNC_WAIT=1200 \
    -e DOCLING_SERVE_MAX_DOCUMENT_TIMEOUT=1200 \
    -e DOCLING_SERVE_ENG_LOC_NUM_WORKERS=2 \
    -e OMP_NUM_THREADS=4 \
    -e MKL_NUM_THREADS=4 \
    quay.io/docling-project/docling-serve-cu128

If you want to further tune performance, you can increase the worker and thread counts to match your hardware. Disabling enrichment steps and image generation (if not needed) can also help (source). For most plain text workflows, CPU multicore scaling is the main lever for speed.
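As a rough illustration of matching worker and thread counts to the hardware, the sketch below splits the available cores evenly across workers and renders the result as `-e` flags. The even split is an assumed heuristic, not a documented Docling rule, and the helper names `suggest_env` and `as_docker_flags` are hypothetical; only the three environment variable names come from the command above.

```python
import os


def suggest_env(cpu_count=None, workers=2):
    """Split the available cores evenly across workers (assumed heuristic)."""
    cores = cpu_count or os.cpu_count() or 1
    # Never ask for more workers than cores, and always keep at least one.
    workers = max(1, min(workers, cores))
    threads = max(1, cores // workers)
    return {
        "DOCLING_SERVE_ENG_LOC_NUM_WORKERS": str(workers),
        "OMP_NUM_THREADS": str(threads),
        "MKL_NUM_THREADS": str(threads),
    }


def as_docker_flags(env):
    """Render the mapping as `-e NAME=value` arguments for `docker run`."""
    return " ".join(f"-e {name}={value}" for name, value in env.items())
```

On an 8-core machine with two workers this reproduces the values used in the command above (2 workers, 4 threads each). Whether that split is actually the fastest depends on the documents and should be measured.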

To reply, just mention @dosu.


0 replies

Cyp9715 · 2025-09-02T08:22:30Z

Cyp9715
Sep 2, 2025
Author

@dosu
No. In my Python tests, turning the OCR option on roughly doubles the processing time compared with turning it off.

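from docling.backend.docling_parse_v2_backend import DoclingParseV2DocumentBackend
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
pipeline_options.do_ocr = False  # Defaults to True; enabling it roughly doubles the processing time.

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=pipeline_options,
            backend=DoclingParseV2DocumentBackend,
        )
    }
)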
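One way to reproduce this kind of comparison is sketched below. It is a minimal sketch rather than the script used for the measurement above: `time_call` and `convert_pdf` are hypothetical helper names, and the file name in the usage comment is a placeholder. The docling imports are done lazily inside `convert_pdf`, so the timing helper itself has no dependencies.

```python
import time


def time_call(func, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to func."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def convert_pdf(path, do_ocr):
    """Convert one PDF with OCR on or off; requires the docling package."""
    # Imported lazily so time_call can be used without docling installed.
    from docling.backend.docling_parse_v2_backend import DoclingParseV2DocumentBackend
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions()
    options.do_ocr = do_ocr
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=options,
                backend=DoclingParseV2DocumentBackend,
            )
        }
    )
    return converter.convert(path)


# Example usage (placeholder file name):
# for flag in (False, True):
#     _, seconds = time_call(convert_pdf, "sample.pdf", flag)
#     print(f"do_ocr={flag}: {seconds:.1f}s")
```

Each call builds a fresh converter, so the timing includes model loading; for a fairer comparison, run each setting more than once and ignore the first run.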

8 replies

Cyp9715 Sep 2, 2025
Author

@dosu

The problem is that I am using OpenWebUI.
Do you know how to put these options in OpenWebUI and transfer them?

Or do you know how to test with Curl?
Please show an example of the appropriate command.

dosubot[bot] bot Sep 2, 2025

To disable OCR and select the DoclingParseV2DocumentBackend in Docling-serve, you need to set the options per API request. OpenWebUI does not appear to expose these advanced options directly in its UI, so you may not be able to set them there unless it has a custom "advanced options" panel or lets you edit the raw API payload.

You can test these options directly with curl. Here are two examples:

For a document URL:

curl -X POST \
  'http://localhost:5001/v1/convert/source' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "options": {
      "do_ocr": false,
      "pdf_backend": "dlparse_v2"
    },
    "http_sources": [{"url": "https://example.com/document.pdf"}]
  }'

For a local file:

curl -X POST \
  'http://localhost:5001/v1/convert/file' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'pdf_backend=dlparse_v2' \
  -F 'do_ocr=false' \
  -F 'files=@document.pdf;type=application/pdf'

These commands will disable OCR and use the dlparse_v2 backend for PDF processing (reference).
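For anyone who prefers to test from Python rather than curl, the URL request can be sketched with the standard library alone. This is a hedged sketch: the endpoint path and the field names (`options`, `do_ocr`, `pdf_backend`, `http_sources`) are copied from the curl example above and may differ between Docling-serve versions, and the helper names `build_convert_payload` and `convert_from_url` are hypothetical.

```python
import json
import urllib.request


def build_convert_payload(document_url, do_ocr=False, pdf_backend="dlparse_v2"):
    """Build the JSON body used in the curl example for /v1/convert/source."""
    return {
        "options": {"do_ocr": do_ocr, "pdf_backend": pdf_backend},
        "http_sources": [{"url": document_url}],
    }


def convert_from_url(document_url, base_url="http://localhost:5001", timeout=1200):
    """POST the payload to a running docling-serve instance and return the parsed JSON."""
    body = json.dumps(build_convert_payload(document_url)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/convert/source",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example usage (needs a running server on localhost:5001):
# result = convert_from_url("https://example.com/document.pdf")
# print(list(result.keys()))
```

Keeping the payload builder separate from the network call makes it easy to print and inspect exactly which options are being sent.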

If OpenWebUI doesn't let you set these options, using curl or a tool like Postman is the best way to ensure the settings are applied. If OpenWebUI adds support for advanced options in the future, look for fields named do_ocr and pdf_backend.

To reply, just mention @dosu.


Elettrotecnica Sep 2, 2025

@dosu

The problem is that I am using OpenWebUI. Do you know how to put these options in OpenWebUI and transfer them?

Or do you know how to test with Curl? Please show an example of the appropriate command.

Yes, it seems OpenWebUI does not currently let you turn off OCR, set the PDF backend, or even choose the table extraction mode. The relevant code should start here: https://github.com/open-webui/open-webui/blob/2407d9b905978d68619bdce4021e424046ec8df9/backend/open_webui/retrieval/loaders/main.py#L151

This should be improved, IMO: these settings affect Docling performance a lot, and it would be helpful to be able to change them.

Elettrotecnica Sep 2, 2025

I have opened an issue open-webui/open-webui#17148

Answer selected by Cyp9715

Cyp9715 Sep 3, 2025
Author

@Elettrotecnica
Thank you.

Also, while allowing the client-side to handle the on/off behavior of this feature offers great flexibility, I think it would be better to have an option for the server to enforce it.

This is an area for improvement not only in OpenWebUI but also in Docling.
