Installation and Stability Report 2025: Fixes for pypdfium2, API crashes, and Transformers segfaults #264

@KylePoe

I have been working on setting up a stable environment for nougat-ocr (v0.1.17) and encountered several breaking issues due to dependency drift and unpinned versions. Below is a summary of the critical errors and the fixes required to get the package running in 2025.

I have compiled these fixes into a standalone setup script/repository (linked under Solution below) for anyone facing similar issues. It creates a new conda environment that is, as far as I can tell, the minimum needed for a working installation. Several choices are specific to the resources available to me rather than fully generic (in particular, I use CUDA 11.8), but I believe the issues below are fairly general.

1. pypdfium2 Compatibility (Critical)

The Issue:
Newer versions of pypdfium2 (v4.0+) strictly require file paths to be strings, but nougat passes pathlib.Path objects.
Symptom:
Immediate crash with TypeError or AttributeError when attempting to load a PDF.
The Fix:
Cast the filepath to a string before passing it to the document loader.

# In nougat/dataset/rasterize.py
# Change:
doc = pdfium.PdfDocument(filepath)
# To:
doc = pdfium.PdfDocument(str(filepath))

2. API Parallel Rendering Crash

The Issue:
When running nougat_api, the rasterize_paper function attempts to use multiprocessing to render pages. This fails when the input is passed as bytes or a temporary file handle, or simply due to context issues within the API server.
Symptom:
The API returns a 500 error with ValueError: Can only render in parallel with file path input.
The Fix:
Disable the parallel execution path in rasterize.py and force the serial rendering loop.
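
A minimal sketch of what the serial rendering loop could look like, not the exact patch applied by the setup script. It assumes the document object behaves like a pypdfium2 v4 PdfDocument (supports len() and integer indexing, and each page exposes render(scale=...) returning a bitmap with to_pil()). The function name render_pages_serially and the default of 96 DPI are illustrative assumptions.

```python
from typing import Any, List


def render_pages_serially(doc: Any, dpi: int = 96) -> List[Any]:
    """Render every page of `doc` one at a time in the current process.

    Assumption: `doc` follows the pypdfium2 v4 PdfDocument interface
    (len(doc), doc[i], page.render(scale=...), bitmap.to_pil()).
    """
    scale = dpi / 72  # PDF user space is defined as 72 units per inch
    images = []
    for index in range(len(doc)):
        page = doc[index]                  # load one page
        bitmap = page.render(scale=scale)  # rasterize it, no worker pool involved
        images.append(bitmap.to_pil())     # convert to a PIL image
    return images
```

With pypdfium2 installed, this would be called as render_pages_serially(pdfium.PdfDocument(str(filepath))). Because nothing is handed to worker processes, the loop does not care whether the document was opened from a path or from bytes, at the cost of rendering speed on long PDFs.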

3. Transformers Segmentation Faults

The Issue:
Recent versions of transformers (likely 4.38+) introduced changes that cause nougat to segfault during the inference generation loop on CUDA.
Symptom:
The process dies silently or segfaults while processing a page.
The Fix:
Pin transformers==4.37.2.
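
Since the process can die without any Python traceback, it may help to confirm where the crash happens before and after pinning. The sketch below uses the standard-library faulthandler module to write tracebacks to a file on a fatal signal such as SIGSEGV. It is a diagnostic aid only, not part of the fix; the helper name enable_crash_log and the log path are hypothetical.

```python
import faulthandler


def enable_crash_log(path: str):
    """Dump Python tracebacks for all threads to `path` on a fatal signal.

    Returns the open file object. The caller must keep it open for the
    lifetime of the process, because faulthandler writes to its descriptor.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling enable_crash_log("nougat_crash.log") at the top of the entry point, before the model is loaded, should leave a traceback pointing at the Python frame that was active when the segfault occurred.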

4. Environment Stability

To ensure reproducibility, I found it necessary to strictly pin the following dependencies:

  • torch==2.5.1
  • torchvision==0.20.1
  • torchaudio==2.5.1
  • pypdfium2==4.0.0
  • fastapi==0.123.10
  • uvicorn==0.38.0
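
To check that an environment actually matches these pins (plus the transformers pin from section 3), something like the following could be run inside it. This is a sketch using only importlib.metadata from the standard library; the names PINS, installed_version, and find_mismatches are illustrative and not part of nougat or the setup script.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Versions taken from this report.
PINS = {
    "transformers": "4.37.2",
    "torch": "2.5.1",
    "torchvision": "0.20.1",
    "torchaudio": "2.5.1",
    "pypdfium2": "4.0.0",
    "fastapi": "0.123.10",
    "uvicorn": "0.38.0",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of `name`, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Map each package whose installed version differs from its pin to what was found."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        # Ignore a local build tag such as "+cu118" when comparing.
        if found is None or found.split("+", 1)[0] != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(PINS).items():
        print(f"{name}: expected {PINS[name]}, found {found}")
```

An empty result from find_mismatches(PINS) means every pinned package is present at the expected version.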

Solution

I have created a repository that automates the creation of a working conda environment with these specific versions and applies the necessary code patches post-install.

https://github.com/KylePoe/nougat-setup-script

Hopefully, this helps others trying to use this excellent tool.

DISCLAIMER: I relied extensively on LLMs to produce these changes. They may include suboptimal fixes (in particular, I am skeptical of the parallel rendering fix). I am only sharing them because, as noted, I could not otherwise get nougat to work, and I thought others might find them useful.
