docling-project
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 32 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 3 additions & 3 deletions
diff --git a/docling/cli/main.py b/docling/cli/main.py
Lines changed: 2 additions & 2 deletions
diff --git a/docling/cli/models.py b/docling/cli/models.py
Lines changed: 4 additions & 0 deletions
diff --git a/docling/datamodel/pipeline_options.py b/docling/datamodel/pipeline_options.py
Lines changed: 7 additions & 3 deletions
diff --git a/docling/datamodel/vlm_model_specs.py b/docling/datamodel/vlm_model_specs.py
Lines changed: 4 additions & 3 deletions
diff --git a/docling/models/rapid_ocr_model.py b/docling/models/rapid_ocr_model.py
Lines changed: 40 additions & 25 deletions
diff --git a/docling/utils/model_downloader.py b/docling/utils/model_downloader.py
Lines changed: 22 additions & 0 deletions
diff --git a/docs/examples/batch_convert.py b/docs/examples/batch_convert.py
Lines changed: 45 additions & 1 deletion
@@ -1,3 +1,35 @@
+## [v2.53.0](https://github.com/docling-project/docling/releases/tag/v2.53.0) - 2025-09-17
+
+### Feature
+
+* Add granite-docling model ([#2272](https://github.com/docling-project/docling/issues/2272)) ([`17afb66`](https://github.com/docling-project/docling/commit/17afb664d005168b5a6f12a2df4432076a9329bb))
+* **RapidOcr:** Support generic extra arguments for RapidOcr ([#2266](https://github.com/docling-project/docling/issues/2266)) ([`0e95171`](https://github.com/docling-project/docling/commit/0e95171dd64733ba52f2f0906642be24f6237977))
+
+### Fix
+
+* Handle empty result from RapidOCR to avoid crash ([#2264](https://github.com/docling-project/docling/issues/2264)) ([`609d902`](https://github.com/docling-project/docling/commit/609d902eef157ae68e33faa26d73533ef7a4a749))
+
+### Documentation
+
+* Describe examples ([#2262](https://github.com/docling-project/docling/issues/2262)) ([`ff351fd`](https://github.com/docling-project/docling/commit/ff351fd40c4b635133401e77dea89bec8cd0ca33))
+
+## [v2.52.0](https://github.com/docling-project/docling/releases/tag/v2.52.0) - 2025-09-11
+
+### Feature
+
+* Enrichment steps on all convert pipelines (incl docx, html, etc) ([#2251](https://github.com/docling-project/docling/issues/2251)) ([`2c91234`](https://github.com/docling-project/docling/commit/2c9123419f541feda8cc98c53aeb37288fabcaee))
+
+### Fix
+
+* Add missing features in ThreadedStandardPdfPipeline ([#2252](https://github.com/docling-project/docling/issues/2252)) ([`0700af2`](https://github.com/docling-project/docling/commit/0700af212cce8d90dbe0477dcb06d69370649e97))
+* Address deprecation warnings of dependencies ([#2237](https://github.com/docling-project/docling/issues/2237)) ([`c696549`](https://github.com/docling-project/docling/commit/c6965495a22703d0e35105b5daafcaaf8a8063d6))
+
+### Documentation
+
+* Add an example of RAG with OpenSearch ([#2238](https://github.com/docling-project/docling/issues/2238)) ([`f8cc545`](https://github.com/docling-project/docling/commit/f8cc545bab07e5fdd79bcff7042e9279e18926c6))
+* Add instructions for using Docling with MCP to README ([#2219](https://github.com/docling-project/docling/issues/2219)) ([`e5cd702`](https://github.com/docling-project/docling/commit/e5cd7020bd281aea63519db9a5332dd2dcca54b4))
+* Document VLM support requirement in extraction example ([#2231](https://github.com/docling-project/docling/issues/2231)) ([`55f5f37`](https://github.com/docling-project/docling/commit/55f5f3752f33f5b495cb2af5e6a3aee5d157fad8))
+
 ## [v2.51.0](https://github.com/docling-project/docling/releases/tag/v2.51.0) - 2025-09-05
 
 ### Feature
 
@@ -36,7 +36,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
 * 🔒 Local execution capabilities for sensitive data and air-gapped environments
 * 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
 * 🔍 Extensive OCR support for scanned PDFs and images
-* 👓 Support of several Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
+* 👓 Support of several Visual Language Models ([GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M))
 * 🎙️ Audio support with Automatic Speech Recognition (ASR) models
 * 🔌 Connect to any agent using the [MCP server](https://docling-project.github.io/docling/usage/mcp/)
 * 💻 Simple and convenient CLI
@@ -88,9 +88,9 @@ Docling has a built-in CLI to run conversions.
 docling https://arxiv.org/pdf/2206.01062
 ```
 
-You can also use 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) and other VLMs via Docling CLI:
+You can also use 🥚[GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M) and other VLMs via Docling CLI:
 ```bash
-docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
+docling --pipeline vlm --vlm-model granite_docling https://arxiv.org/pdf/2206.01062
 ```
 This will use MLX acceleration on supported Apple Silicon hardware.
 
 
@@ -336,7 +336,7 @@ def convert(  # noqa: C901
     vlm_model: Annotated[
         VlmModelType,
         typer.Option(..., help="Choose the VLM model to use with PDF or image files."),
-    ] = VlmModelType.SMOLDOCLING,
+    ] = VlmModelType.GRANITEDOCLING,
     asr_model: Annotated[
         AsrModelType,
         typer.Option(..., help="Choose the ASR model to use with audio/video files."),
@@ -695,7 +695,7 @@ def convert(  # noqa: C901
                         pipeline_options.vlm_options = GRANITEDOCLING_MLX
                     except ImportError:
                         _log.warning(
-                            "To run SmolDocling faster, please install mlx-vlm:\n"
+                            "To run GraniteDocling faster, please install mlx-vlm:\n"
                             "pip install mlx-vlm"
                         )
             elif vlm_model == VlmModelType.SMOLDOCLING_VLLM:
 
@@ -33,6 +33,8 @@ class _AvailableModels(str, Enum):
     CODE_FORMULA = "code_formula"
     PICTURE_CLASSIFIER = "picture_classifier"
     SMOLVLM = "smolvlm"
+    GRANITEDOCLING = "granitedocling"
+    GRANITEDOCLING_MLX = "granitedocling_mlx"
     SMOLDOCLING = "smoldocling"
     SMOLDOCLING_MLX = "smoldocling_mlx"
     GRANITE_VISION = "granite_vision"
@@ -108,6 +110,8 @@ def download(
         with_code_formula=_AvailableModels.CODE_FORMULA in to_download,
         with_picture_classifier=_AvailableModels.PICTURE_CLASSIFIER in to_download,
         with_smolvlm=_AvailableModels.SMOLVLM in to_download,
+        with_granitedocling=_AvailableModels.GRANITEDOCLING in to_download,
+        with_granitedocling_mlx=_AvailableModels.GRANITEDOCLING_MLX in to_download,
         with_smoldocling=_AvailableModels.SMOLDOCLING in to_download,
         with_smoldocling_mlx=_AvailableModels.SMOLDOCLING_MLX in to_download,
         with_granite_vision=_AvailableModels.GRANITE_VISION in to_download,
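
The hunk above turns enum membership in `to_download` into one `with_*` boolean per model. A minimal stdlib sketch of that mapping follows. It assumes a hypothetical `AvailableModelsSketch` enum holding only a subset of the values shown in the diff, and a hypothetical `download_flags` helper. Neither is part of Docling.

```python
from enum import Enum
from typing import Dict, Iterable


class AvailableModelsSketch(str, Enum):
    """Hypothetical subset of the CLI enum shown in the diff."""

    SMOLVLM = "smolvlm"
    GRANITEDOCLING = "granitedocling"
    GRANITEDOCLING_MLX = "granitedocling_mlx"
    SMOLDOCLING = "smoldocling"


def download_flags(
    to_download: Iterable[AvailableModelsSketch],
) -> Dict[str, bool]:
    """Map each known model to a `with_<name>` flag (True if selected)."""
    selected = set(to_download)
    return {
        f"with_{model.value}": model in selected
        for model in AvailableModelsSketch
    }


# Looking a member up by value mirrors how a CLI argument string
# would be converted into the enum.
flags = download_flags([AvailableModelsSketch("granitedocling")])
```

Because every flag is derived from the enum, adding a model means adding one enum member and one keyword argument on the downloader, which is the pattern this diff follows for the two GraniteDocling entries.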
 
@@ -12,7 +12,7 @@
 )
 from typing_extensions import deprecated
 
-from docling.datamodel import asr_model_specs
+from docling.datamodel import asr_model_specs, vlm_model_specs
 
 # Import the following for backwards compatibility
 from docling.datamodel.accelerator_options import AcceleratorDevice, AcceleratorOptions
@@ -114,7 +114,11 @@ class RapidOcrOptions(OcrOptions):
     cls_model_path: Optional[str] = None  # same default as rapidocr
     rec_model_path: Optional[str] = None  # same default as rapidocr
     rec_keys_path: Optional[str] = None  # same default as rapidocr
-    rec_font_path: Optional[str] = None  # same default as rapidocr
+    rec_font_path: Optional[str] = None  # Deprecated, please use font_path instead
+    font_path: Optional[str] = None  # same default as rapidocr
+
+    # Dictionary to overwrite or pass-through additional parameters
+    rapidocr_params: Dict[str, Any] = Field(default_factory=dict)
 
     model_config = ConfigDict(
         extra="forbid",
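
The new `rapidocr_params` field is declared with `default_factory=dict`, so each options object gets its own dictionary instead of sharing one mutable default. The sketch below shows the same idea with a stdlib dataclass. `OcrOptionsSketch` is a hypothetical stand-in, not the pydantic `RapidOcrOptions` class, and the keys are taken from the parameter names that appear elsewhere in this diff.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class OcrOptionsSketch:
    """Hypothetical stand-in for RapidOcrOptions; not the Docling class."""

    rec_font_path: Optional[str] = None  # marked deprecated in the diff
    font_path: Optional[str] = None
    # default_factory builds a fresh dict for every instance
    rapidocr_params: Dict[str, Any] = field(default_factory=dict)


a = OcrOptionsSketch()
b = OcrOptionsSketch(rapidocr_params={"Global.text_score": 0.6})

# Mutating one instance's dict must not leak into other instances.
a.rapidocr_params["Det.use_cuda"] = False
```

A plain `= {}` default would be rejected by `dataclasses` for exactly this reason. The factory form makes the per-instance behaviour explicit.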
@@ -286,7 +290,7 @@ class VlmPipelineOptions(PaginatedPipelineOptions):
     )
     # If True, text from backend will be used instead of generated text
     vlm_options: Union[InlineVlmOptions, ApiVlmOptions] = (
-        smoldocling_vlm_conversion_options
+        vlm_model_specs.GRANITEDOCLING_TRANSFORMERS
     )
 
 
 
@@ -20,10 +20,11 @@
 
 # Granite-Docling
 GRANITEDOCLING_TRANSFORMERS = InlineVlmOptions(
-    repo_id="ds4sd/granite-docling-258m-2-9-2025-v2",
+    repo_id="ibm-granite/granite-docling-258M",
     prompt="Convert this page to docling.",
     response_format=ResponseFormat.DOCTAGS,
-    inference_framework=InferenceFramework.MLX,
+    inference_framework=InferenceFramework.TRANSFORMERS,
+    transformers_model_type=TransformersModelType.AUTOMODEL_IMAGETEXTTOTEXT,
     supported_devices=[
         AcceleratorDevice.CPU,
         AcceleratorDevice.CUDA,
@@ -35,7 +36,7 @@
 )
 
 GRANITEDOCLING_MLX = InlineVlmOptions(
-    repo_id="ds4sd/granite-docling-258m-2-9-2025-v2-mlx-bf16",
+    repo_id="ibm-granite/granite-docling-258M-mlx",
     prompt="Convert this page to docling.",
     response_format=ResponseFormat.DOCTAGS,
     inference_framework=InferenceFramework.MLX,
 
@@ -62,32 +62,44 @@ def __init__(
             }
             backend_enum = _ALIASES.get(self.options.backend, EngineType.ONNXRUNTIME)
 
+            params = {
+                # Global settings (these are still correct)
+                "Global.text_score": self.options.text_score,
+                "Global.font_path": self.options.font_path,
+                # "Global.verbose": self.options.print_verbose,
+                # Detection model settings
+                "Det.model_path": self.options.det_model_path,
+                "Det.use_cuda": use_cuda,
+                "Det.use_dml": use_dml,
+                "Det.intra_op_num_threads": intra_op_num_threads,
+                # Classification model settings
+                "Cls.model_path": self.options.cls_model_path,
+                "Cls.use_cuda": use_cuda,
+                "Cls.use_dml": use_dml,
+                "Cls.intra_op_num_threads": intra_op_num_threads,
+                # Recognition model settings
+                "Rec.model_path": self.options.rec_model_path,
+                "Rec.font_path": self.options.rec_font_path,
+                "Rec.keys_path": self.options.rec_keys_path,
+                "Rec.use_cuda": use_cuda,
+                "Rec.use_dml": use_dml,
+                "Rec.intra_op_num_threads": intra_op_num_threads,
+                "Det.engine_type": backend_enum,
+                "Cls.engine_type": backend_enum,
+                "Rec.engine_type": backend_enum,
+            }
+
+            if self.options.rec_font_path is not None:
+                _log.warning(
+                    "The 'rec_font_path' option for RapidOCR is deprecated. Please use 'font_path' instead."
+                )
+            user_params = self.options.rapidocr_params
+            if user_params:
+                _log.debug("Overwriting RapidOCR params with user-provided values.")
+                params.update(user_params)
+
             self.reader = RapidOCR(
-                params={
-                    # Global settings (these are still correct)
-                    "Global.text_score": self.options.text_score,
-                    # "Global.verbose": self.options.print_verbose,
-                    # Detection model settings
-                    "Det.model_path": self.options.det_model_path,
-                    "Det.use_cuda": use_cuda,
-                    "Det.use_dml": use_dml,
-                    "Det.intra_op_num_threads": intra_op_num_threads,
-                    # Classification model settings
-                    "Cls.model_path": self.options.cls_model_path,
-                    "Cls.use_cuda": use_cuda,
-                    "Cls.use_dml": use_dml,
-                    "Cls.intra_op_num_threads": intra_op_num_threads,
-                    # Recognition model settings
-                    "Rec.model_path": self.options.rec_model_path,
-                    "Rec.font_path": self.options.rec_font_path,
-                    "Rec.keys_path": self.options.rec_keys_path,
-                    "Rec.use_cuda": use_cuda,
-                    "Rec.use_dml": use_dml,
-                    "Rec.intra_op_num_threads": intra_op_num_threads,
-                    "Det.engine_type": backend_enum,
-                    "Cls.engine_type": backend_enum,
-                    "Rec.engine_type": backend_enum,
-                }
+                params=params,
             )
 
     def __call__(
@@ -120,6 +132,9 @@ def __call__(
                             use_cls=self.options.use_cls,
                             use_rec=self.options.use_rec,
                         )
+                        if result is None or result.boxes is None:
+                            _log.warning("RapidOCR returned empty result!")
+                            continue
                         result = list(
                             zip(result.boxes.tolist(), result.txts, result.scores)
                         )
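
The two hunks above make two behavioural changes: user-supplied `rapidocr_params` are laid over the computed defaults with `dict.update`, and an empty OCR result is skipped instead of crashing on `result.boxes.tolist()`. The sketch below illustrates both under stated assumptions. `merge_params`, `FakeResult` and `collect_cells` are hypothetical names, and `boxes` is a plain list here rather than the array the real code converts with `.tolist()`.

```python
import logging
from typing import Any, Dict, Iterable, List, Optional, Tuple

_log = logging.getLogger(__name__)


def merge_params(
    defaults: Dict[str, Any], user_params: Dict[str, Any]
) -> Dict[str, Any]:
    """Return defaults overlaid with user-provided values (user wins)."""
    params = dict(defaults)  # copy so the caller's defaults stay untouched
    if user_params:
        _log.debug("Overwriting params with user-provided values.")
        params.update(user_params)
    return params


class FakeResult:
    """Hypothetical stand-in for an OCR result; boxes may be None."""

    def __init__(self, boxes, txts=(), scores=()):
        self.boxes = boxes
        self.txts = txts
        self.scores = scores


def collect_cells(
    results: Iterable[Optional[FakeResult]],
) -> List[Tuple[Any, str, float]]:
    """Gather (box, text, score) triples, skipping empty results."""
    cells: List[Tuple[Any, str, float]] = []
    for result in results:
        if result is None or result.boxes is None:
            _log.warning("OCR returned empty result!")
            continue  # nothing to zip; move on to the next region
        cells.extend(zip(result.boxes, result.txts, result.scores))
    return cells


merged = merge_params(
    {"Global.text_score": 0.5, "Det.use_cuda": False},
    {"Global.text_score": 0.7},
)
cells = collect_cells(
    [None, FakeResult(None), FakeResult([[0, 0, 1, 1]], ["hi"], [0.9])]
)
```

Since the user dictionary is applied last, any key it contains replaces the value derived from the typed options, including keys the typed options never set.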
 
@@ -10,6 +10,8 @@
 )
 from docling.datamodel.settings import settings
 from docling.datamodel.vlm_model_specs import (
+    GRANITEDOCLING_MLX,
+    GRANITEDOCLING_TRANSFORMERS,
     SMOLDOCLING_MLX,
     SMOLDOCLING_TRANSFORMERS,
 )
@@ -34,6 +36,8 @@ def download_models(
     with_code_formula: bool = True,
     with_picture_classifier: bool = True,
     with_smolvlm: bool = False,
+    with_granitedocling: bool = False,
+    with_granitedocling_mlx: bool = False,
     with_smoldocling: bool = False,
     with_smoldocling_mlx: bool = False,
     with_granite_vision: bool = False,
@@ -86,6 +90,24 @@ def download_models(
             progress=progress,
         )
 
+    if with_granitedocling:
+        _log.info("Downloading GraniteDocling model...")
+        download_hf_model(
+            repo_id=GRANITEDOCLING_TRANSFORMERS.repo_id,
+            local_dir=output_dir / GRANITEDOCLING_TRANSFORMERS.repo_cache_folder,
+            force=force,
+            progress=progress,
+        )
+
+    if with_granitedocling_mlx:
+        _log.info("Downloading GraniteDocling MLX model...")
+        download_hf_model(
+            repo_id=GRANITEDOCLING_MLX.repo_id,
+            local_dir=output_dir / GRANITEDOCLING_MLX.repo_cache_folder,
+            force=force,
+            progress=progress,
+        )
+
     if with_smoldocling:
         _log.info("Downloading SmolDocling model...")
         download_hf_model(
 
@@ -1,3 +1,33 @@
+# %% [markdown]
+# Batch convert multiple PDF files and export results in several formats.
+
+# What this example does
+# - Loads a small set of sample PDFs.
+# - Runs the Docling PDF pipeline once per file.
+# - Writes outputs to `scratch/` in multiple formats (JSON, HTML, Markdown, text, doctags, YAML).
+
+# Prerequisites
+# - Install Docling and dependencies as described in the repository README.
+# - Ensure you can import `docling` from your Python environment.
+# <!-- YAML export requires `PyYAML` (`pip install pyyaml`). -->
+
+# Input documents
+# - By default, this example uses a few PDFs from `tests/data/pdf/` in the repo.
+# - If you cloned without test data, or want to use your own files, edit
+#   `input_doc_paths` below to point to PDFs on your machine.
+
+# Output formats (controlled by flags)
+# - `USE_V2 = True` enables the current Docling document exports (recommended).
+# - `USE_LEGACY = False` keeps legacy Deep Search exports disabled.
+#   You can set it to `True` if you need legacy formats for compatibility tests.
+
+# Notes
+# - Set `pipeline_options.generate_page_images = True` to include page images in HTML.
+# - The script logs conversion progress and raises if any documents fail.
+# <!-- This example shows both helper methods like `save_as_*` and lower-level
+#   `export_to_*` + manual file writes; outputs may overlap intentionally. -->
+# %%
+
 import json
 import logging
 import time
@@ -15,6 +45,9 @@
 
 _log = logging.getLogger(__name__)
 
+# Export toggles:
+# - USE_V2 controls modern Docling document exports.
+# - USE_LEGACY enables legacy Deep Search exports for comparison or migration.
 USE_V2 = True
 USE_LEGACY = False
 
@@ -35,6 +68,9 @@ def export_documents(
             doc_filename = conv_res.input.file.stem
 
             if USE_V2:
+                # Recommended modern Docling exports. These helpers mirror the
+                # lower-level "export_to_*" methods used below, but handle
+                # common details like image handling.
                 conv_res.document.save_as_json(
                     output_dir / f"{doc_filename}.json",
                     image_mode=ImageRefMode.PLACEHOLDER,
@@ -121,6 +157,9 @@ def export_documents(
 def main():
     logging.basicConfig(level=logging.INFO)
 
+    # Location of sample PDFs used by this example. If your checkout does not
+    # include test data, change `data_folder` or point `input_doc_paths` to
+    # your own files.
     data_folder = Path(__file__).parent / "../../tests/data"
     input_doc_paths = [
         data_folder / "pdf/2206.01062.pdf",
@@ -139,6 +178,8 @@ def main():
     # settings.debug.visualize_tables = True
     # settings.debug.visualize_cells = True
 
+    # Configure the PDF pipeline. Enabling page image generation improves HTML
+    # previews (embedded images) but adds processing time.
     pipeline_options = PdfPipelineOptions()
     pipeline_options.generate_page_images = True
 
@@ -152,11 +193,14 @@ def main():
 
     start_time = time.time()
 
+    # Convert all inputs. Set `raises_on_error=False` to keep processing other
+    # files even if one fails; errors are summarized after the run.
     conv_results = doc_converter.convert_all(
         input_doc_paths,
         raises_on_error=False,  # to let conversion run through all and examine results at the end
     )
-    success_count, partial_success_count, failure_count = export_documents(
+    # Write outputs to ./scratch and log a summary.
+    _success_count, _partial_success_count, failure_count = export_documents(
         conv_results, output_dir=Path("scratch")
     )