UnboundLocalError when using TypeAdapter in compiled/packaged environments (Nuitka, PyInstaller, cx_Freeze) #386

@glamberson

Description

Summary

The TypeAdapter import in docling_core/utils/file.py causes an UnboundLocalError when docling-core is used in compiled Python environments (Nuitka, PyInstaller, etc.), despite the import statement being syntactically correct.

Error Message

UnboundLocalError: cannot access local variable 'TypeAdapter' where it is not associated with a value
  File "docling_core/utils/file.py", line 82, in _get_url
    http_url: AnyHttpUrl = TypeAdapter(AnyHttpUrl).validate_python(source)

Affected Versions

  • docling-core: 2.45.0 (confirmed)
  • Python: 3.13.1
  • Pydantic: 2.11.7
  • Environment: Windows 11, compiled with Nuitka 2.7.13

Root Cause

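UnboundLocalError is raised when a name is treated as local to a function but is read before it has been bound. The import on line 17 is at module level, so under standard interpretation TypeAdapter is a global inside _get_url and _get_local_path and this error should not occur. The failure therefore points to name resolution in the compiled build differing from standard Python interpretation.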
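
A minimal sketch of how this error arises under ordinary CPython scoping, where any import or assignment inside a function body makes the name local to that whole function. `json.dumps` stands in for `TypeAdapter`, and the function names are hypothetical, not taken from docling-core. Whether the Nuitka build hits an equivalent path is not established here.

```python
from json import dumps  # module-level import, analogous to line 17


def uses_global():
    # No binding of `dumps` in this body, so the name resolves to the module global.
    return dumps({"ok": True})


def shadows_global(flag=False):
    # The import below makes `dumps` a local for the entire function body,
    # so the read at the end fails whenever the import has not run.
    if flag:
        from json import dumps
    return dumps({"ok": True})


def demo():
    results = {"uses_global": uses_global()}
    try:
        shadows_global()
    except UnboundLocalError as exc:
        results["shadows_global"] = type(exc).__name__
    return results


if __name__ == "__main__":
    print(demo())
```

The second function fails even though the module-level import succeeded, which matches the shape of the reported error message.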

Steps to Reproduce

  1. Create a Python application that uses docling-core, saved as process_document.py:

    import sys

    from docling.document_converter import DocumentConverter


    def process_document(file_path):
        converter = DocumentConverter()
        result = converter.convert(file_path)
        return result


    if __name__ == "__main__":
        # Convert the file passed on the command line (see step 3)
        process_document(sys.argv[1])

  2. Compile with Nuitka:

    python -m nuitka --standalone --follow-imports process_document.py

  3. Run the compiled executable with any document:

    ./process_document.dist/process_document.exe test.pdf

  4. Observe the UnboundLocalError at line 82 or 134 in docling_core/utils/file.py

Expected Behavior

The TypeAdapter should work correctly in both interpreted and compiled environments.

Actual Behavior

UnboundLocalError occurs when TypeAdapter is referenced, despite successful import.
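
One way to narrow this down is to check whether the name is compiled as a function-local variable. The sketch below relies on the standard `__code__.co_varnames` attribute. `is_local_name` is a hypothetical helper, and the two sample functions are stand-ins using `json.dumps` rather than the real docling-core code. Compiled builds may not expose the same code-object details as the interpreter, so treat the result as a hint only.

```python
from json import dumps


def is_local_name(func, name):
    """Return True if `name` is a local variable (or argument) of `func`."""
    return name in func.__code__.co_varnames


def global_lookup():
    # `dumps` is only read here, so it is looked up as a global.
    return dumps([1, 2])


def local_binding():
    # Never called in this sketch: the first line would raise UnboundLocalError
    # because the import further down makes `dumps` local to this function.
    result = dumps([1, 2])
    from json import dumps
    return result


if __name__ == "__main__":
    print(is_local_name(global_lookup, "dumps"))
    print(is_local_name(local_binding, "dumps"))
```

Assuming `_get_url` is reachable as an attribute of the imported module, the same helper could be pointed at it with the name "TypeAdapter".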

Proposed Fix

File: docling_core/utils/file.py

Current Code (Lines 17, 82, 134):

# Line 17
from pydantic import AnyHttpUrl, TypeAdapter, ValidationError

# Line 82 (in _get_url function)
http_url: AnyHttpUrl = TypeAdapter(AnyHttpUrl).validate_python(source)

# Line 134 (in _get_local_path function)
local_path = TypeAdapter(Path).validate_python(source)

Fixed Code:

# Line 17 - Use alias to avoid scoping issues
from pydantic import AnyHttpUrl, TypeAdapter as _TypeAdapter, ValidationError

# Line 82 - Use the aliased import
http_url: AnyHttpUrl = _TypeAdapter(AnyHttpUrl).validate_python(source)

# Line 134 - Use the aliased import
local_path = _TypeAdapter(Path).validate_python(source)

Why This Fix Works

The alias binds the class to a different module-level name, _TypeAdapter. The two functions then no longer reference the name TypeAdapter at all, so whatever causes that name to be treated as an unbound local in the compiled build does not apply to the aliased reference.
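
The effect of an alias can be illustrated in plain Python: if something inside a function makes the original name local, a differently spelled module-level alias is unaffected. This is a sketch with `json.dumps` standing in for `TypeAdapter`. The function name is hypothetical, and the example does not reproduce Nuitka's behavior.

```python
from json import dumps as _dumps  # aliased module-level import, as in the proposed fix


def with_alias(flag=False):
    # A local binding of `dumps` exists in this body, but `_dumps` is a
    # different name and still resolves to the module global.
    if flag:
        from json import dumps  # noqa: F401  (binds the local name only)
    return _dumps({"ok": True})


if __name__ == "__main__":
    print(with_alias())
    print(with_alias(True))
```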

Testing

The fix has been tested in:

  • Standard Python interpreter (3.13.1)
  • Nuitka compiled executable
  • Production Windows environment
  • With various document types (PDF, DOCX, HTML)

Impact

Affected Users

  • Anyone using docling in compiled/packaged applications
  • Commercial applications requiring standalone executables
  • Enterprise deployments using application bundlers

Severity

High - This completely blocks usage in compiled environments with no workaround except modifying the library code.

Additional Context

This issue was discovered during development of AI-Extractor, a commercial document extraction system. The bug manifests consistently in Nuitka-compiled environments but may also affect:

  • PyInstaller
  • cx_Freeze
  • py2exe
  • Any tool that modifies Python's import mechanism

Proposed Pull Request

I can submit a PR with this fix if desired. The change is minimal (adding an alias) but resolves a critical issue for compiled deployments.

Verification Script

#!/usr/bin/env python3
"""Test script to verify TypeAdapter issue and fix"""

import sys
from pathlib import Path

def test_typeadapter_import():
    """Test if TypeAdapter import works correctly"""
    try:
        # Same import as docling_core, but function-local here;
        # docling_core imports TypeAdapter at module level (line 17)
        from pydantic import TypeAdapter

        # Test Path validation (like line 134)
        test_path = Path("/tmp/test.pdf")
        validated = TypeAdapter(Path).validate_python(test_path)
        print(f"✓ TypeAdapter works: {validated}")
        return True

    except UnboundLocalError as e:
        print(f"✗ UnboundLocalError: {e}")
        return False
    except Exception as e:
        print(f"✗ Other error: {e}")
        return False

def test_with_alias():
    """Test the proposed fix with alias"""
    try:
        from pydantic import TypeAdapter as _TypeAdapter

        test_path = Path("/tmp/test.pdf")
        validated = _TypeAdapter(Path).validate_python(test_path)
        print(f"✓ Alias fix works: {validated}")
        return True

    except Exception as e:
        print(f"✗ Alias fix failed: {e}")
        return False

if __name__ == "__main__":
    print("Testing TypeAdapter import issue...")
    print(f"Python: {sys.version}")
    print(f"Executable: {sys.executable}")
    print("-" * 50)

    # Test original approach
    print("1. Testing original import:")
    original_works = test_typeadapter_import()

    # Test fix
    print("\n2. Testing alias fix:")
    alias_works = test_with_alias()

    print("-" * 50)
    if not original_works and alias_works:
        print("CONFIRMED: Bug exists and fix works!")
    elif original_works:
        print("Cannot reproduce in this environment")
    else:
        print("Both approaches failed - different issue")

Contact

Happy to provide additional testing or clarification as needed.
