Make layout deps optional, fix requires-python, lazy-import cv2 by dguido · Pull Request #100 · zai-org/GLM-OCR

dguido · 2026-02-18T03:08:02Z

Summary

Move heavy deps to optional extras: torch, torchvision, transformers, sentencepiece, accelerate, opencv-python, and flask are moved from core dependencies to their existing [layout] and [server] optional extras. OCR-only mode (pip install glmocr) now installs in seconds instead of pulling ~5GB of ML frameworks. Users who need layout detection install with pip install glmocr[layout] or pip install glmocr[all].
Fix requires-python: Bumped from >=3.8 to >=3.10. The core dependency transformers>=5.1.0 already requires Python 3.10+, so the old lower bound caused dependency resolver failures (e.g., uv cannot find a valid solution when it tries to satisfy transformers>=5.1.0 across Python 3.8/3.9).
Lazy-import cv2: Moved import cv2 from module level in image_utils.py into crop_image_region(), and deferred visualization_utils imports in utils/__init__.py via __getattr__. opencv-python is only needed for layout detection (polygon cropping and visualization), but the module-level import made it a hard requirement even in OCR-only mode.
Fix double image preprocessing in pipeline: In the OCR-only path of Pipeline.process(), images were encoded via load_image_to_base64 (with smart_resize), then build_request() decoded and re-encoded them through load_image_to_base64 a second time. Replaced the build_request() call with direct setdefault() for generation parameters.
Simplify [all] extra: Changed from duplicating the full dependency list to referencing glmocr[layout,server].

Motivation

When using glm-ocr in OCR-only mode with an external inference server (e.g., mlx-vlm on Apple Silicon, vLLM, or the MaaS API), there's no need for torch, torchvision, or opencv. The current pyproject.toml makes these mandatory, which means a multi-GB install for a use case that only needs requests and Pillow.

The requires-python >= 3.8 also blocks modern Python package managers (uv) from resolving dependencies, since transformers >= 5.1.0 dropped Python 3.8/3.9 support.

Test plan

pip install . (or uv sync) installs without torch/opencv
pip install ".[layout]" installs the full layout detection stack
pip install ".[all]" installs everything
OCR-only pipeline works without opencv installed
Layout pipeline works with [layout] extra installed
from glmocr.utils import crop_image_region succeeds without opencv (cv2 only imported when function is called)

🤖 Generated with Claude Code

- Move torch, torchvision, transformers, sentencepiece, accelerate, opencv-python, and flask from core dependencies to their existing optional extras ([layout] and [server]). OCR-only mode now installs in seconds instead of pulling ~5GB of ML frameworks. - Bump requires-python from >=3.8 to >=3.10. The core dependency transformers>=5.1.0 already requires Python 3.10+, so the old lower bound caused resolver failures (e.g. uv cannot find a valid solution across Python 3.8/3.9). - Lazy-import cv2 in image_utils.crop_image_region() and defer visualization_utils imports in utils/__init__.py. opencv-python is only needed for layout detection, but the module-level import made it required even in OCR-only mode. - Fix double image preprocessing in Pipeline.process() OCR-only path. Images were encoded via load_image_to_base64 (with smart_resize), then build_request() decoded and re-encoded them through load_image_to_base64 a second time. Replace the build_request() call with direct setdefault() for the generation parameters. - Simplify [all] extra to reference [layout,server] instead of duplicating the dependency list. Update classifiers and black target-version to reflect 3.10-3.13. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make layout deps optional, fix requires-python, lazy-import cv2#100

Make layout deps optional, fix requires-python, lazy-import cv2#100
dguido wants to merge 1 commit intozai-org:mainfrom
dguido:fix/deps-and-imports

dguido commented Feb 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dguido commented Feb 18, 2026

Summary

Motivation

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant