
VLM OCR processing fails with 500 Internal Server Error from Azure OpenAI #2485

@kiendoantrung

Bug Report

Description

The document processing pipeline fails during the document conversion step due to an unhandled HTTPError. The error occurs when calling the Azure OpenAI GPT-4o API (/chat/completions). The server responds with a 500 Internal Server Error.
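
To isolate whether the failure is specific to docling or reproducible outside it, the call the trace points at can be approximated with a bare request. The sketch below is a rough reconstruction of what `api_image_request()` sends (endpoint, key, prompt, and image path are placeholders; the exact payload docling builds may differ), and on a 500 it prints the error body Azure returns, which usually helps narrow down the cause (content filter, malformed payload, transient capacity issue).

```python
# Hypothetical standalone repro of the failing call (endpoint, key, prompt, and
# image path are placeholders). Mirrors the request shape the stack trace shows:
# a chat/completions call against the Azure OpenAI deployment with a page image.
import base64
import requests

ENDPOINT = "https://<redacted-openai-endpoint>.azure.com"  # placeholder
DEPLOYMENT = "gpt-4o"
API_VERSION = "2024-12-01-preview"
API_KEY = "<api-key>"  # placeholder

url = f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}/chat/completions?api-version={API_VERSION}"

with open("page.png", "rb") as f:  # any rendered page image
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Convert this page to markdown."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
    "max_tokens": 4096,
}

r = requests.post(url, headers={"api-key": API_KEY}, json=payload, timeout=90)
print(r.status_code)
print(r.text)  # on a 500, Azure usually returns an error body that narrows this down
```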


Steps to Reproduce

  1. Trigger the document processing pipeline with a supported input document (a configuration sketch follows this list).
  2. The pipeline reaches the image tagging step using api_image_request().
  3. An exception is raised from the underlying HTTP request to the Azure OpenAI endpoint.
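
The sketch below shows the kind of pipeline configuration that reaches this code path: docling's VLM pipeline pointed at the Azure OpenAI deployment. Import paths and option names follow docling's documented remote-VLM example and may vary between releases; the endpoint, API key, prompt, and parameters are placeholders rather than the exact configuration used here.

```python
# Sketch of a docling VLM-pipeline setup pointing at an Azure OpenAI deployment.
# Option names follow docling's remote-VLM example and may differ slightly
# between releases; endpoint, key, prompt, and params are placeholders.
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions
from docling.datamodel.pipeline_options_vlm_model import ApiVlmOptions, ResponseFormat
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline

pipeline_options = VlmPipelineOptions(enable_remote_services=True)
pipeline_options.vlm_options = ApiVlmOptions(
    url="https://<redacted-openai-endpoint>.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-12-01-preview",
    headers={"api-key": "<api-key>"},   # placeholder credentials
    params={"max_tokens": 4096},        # placeholder request parameters
    prompt="Convert this page to markdown.",
    timeout=90,
    response_format=ResponseFormat.MARKDOWN,
)

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_cls=VlmPipeline,
            pipeline_options=pipeline_options,
        )
    }
)

result = converter.convert("input.pdf")  # steps 2 and 3 happen inside convert()
```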

Stack Trace

Traceback (most recent call last):
  File "..\document_service.py", line 92, in _process_document_background
    chunks = self.doc_processor.process(processing_file_path)
  File "..\processor.py", line 38, in process
    result = self.converter.convert(source_path)
  File ".venv\Lib\site-packages\pydantic\_internal\_validate_call.py", line 39, in wrapper_function
    return wrapper(*args, **kwargs)
  File ".venv\Lib\site-packages\pydantic\_internal\_validate_call.py", line 136, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
  File ".venv\Lib\site-packages\docling\document_converter.py", line 245, in convert
    return next(all_res)
  File ".venv\Lib\site-packages\docling\document_converter.py", line 268, in convert_all
    for conv_res in conv_res_iter:
  File ".venv\Lib\site-packages\docling\document_converter.py", line 340, in _convert
    for item in map(
  File ".venv\Lib\site-packages\docling\document_converter.py", line 387, in _process_document
    conv_res = self._execute_pipeline(in_doc, raises_on_error=raises_on_error)
  File ".venv\Lib\site-packages\docling\document_converter.py", line 410, in _execute_pipeline
    conv_res = pipeline.execute(in_doc, raises_on_error=raises_on_error)
  File ".venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 80, in execute
    raise e
  File ".venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 72, in execute
    conv_res = self._build_document(conv_res)
  File ".venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 270, in _build_document
    raise e
  File ".venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 230, in _build_document
    for p in pipeline_pages:  # Must exhaust!
  File ".venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 195, in _apply_on_pages
    yield from page_batch
  File ".venv\Lib\site-packages\docling\models\api_vlm_model.py", line 101, in __call__
    yield from executor.map(_vlm_request, page_batch)
  File "\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures\_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures\_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
  File "\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures\_base.py", line 456, in result
    return self.__get_result()
  File "\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures\thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
  File ".venv\Lib\site-packages\docling\models\api_vlm_model.py", line 87, in _vlm_request
    page_tags = api_image_request(
  File ".venv\Lib\site-packages\docling\utils\api_image_request.py", line 59, in api_image_request
    r.raise_for_status()
  File ".venv\Lib\site-packages\requests\models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://<redacted-openai-endpoint>.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-12-01-preview

Expected Behavior

The API call should return a valid completion, or the failure should be handled gracefully: a server-side error (5xx) from the Azure OpenAI deployment should be caught and surfaced as a controlled conversion error (or retried) instead of crashing the pipeline with an unhandled exception.
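
As a caller-side stopgap until the pipeline handles this itself, the conversion call can be wrapped and retried on transient 5xx responses. A minimal sketch; `convert_with_retry` is a hypothetical helper, not part of docling:

```python
# Caller-side workaround (sketch): retry the docling conversion when the remote
# VLM call fails with a 5xx, matching the HTTPError seen in the stack trace.
import time

import requests


def convert_with_retry(converter, source_path, attempts=3, backoff_s=5.0):
    """Retry the conversion when the remote VLM endpoint returns a 5xx."""
    for attempt in range(1, attempts + 1):
        try:
            return converter.convert(source_path)
        except requests.exceptions.HTTPError as exc:
            status = exc.response.status_code if exc.response is not None else None
            transient = status is not None and status >= 500
            if not transient or attempt == attempts:
                raise
            time.sleep(backoff_s * attempt)  # simple linear backoff before retrying
```

This only works around transient failures at the call site; the request here is that the pipeline itself surface such responses as a controlled conversion error.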

Actual Behavior

The pipeline crashes with an unhandled HTTPError due to a 500 response from the Azure OpenAI deployment.

Environment

  • Docling version: v2.57.0

  • Python version: 3.12.11

  • Model used: gpt-4o

  • API version: 2024-12-01-preview
