Releases: PaddlePaddle/PaddleX

v3.4.2

13 Feb 06:34

2026.2.13 v3.4.2 released

  • Removed the default PDF page limit from locally deployed PaddleOCR-VL and PaddleOCR-VL-1.5 services.
  • PaddleOCR-VL and PaddleOCR-VL-1.5 services now support configuring the validity period of returned BOS links to enhance security.
  • PaddleOCR-VL and PaddleOCR-VL-1.5 now support inference on AMD GPUs.
  • PaddleOCR-VL-0.9B and PaddleOCR-VL-1.5-0.9B support optimized default vLLM startup parameters when running on Intel GPUs.
  • The VLM client supports graceful shutdown of the asynchronous event loop and automatically cancels remaining requests within the same batch when one request fails, preventing unnecessary compute waste.
  • Added support for configuring the PDF rendering DPI via environment variables.
  • Narrowed the handling of matmul_add_act_fuse_pass so that this operator fusion is skipped only for PP-DocLayoutV3.
  • Fixed a bug in the PaddleOCR-VL series where text boxes inside tables caused content to be parsed more than once.
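
The cancel-on-failure behavior described for the VLM client above can be illustrated with plain asyncio. This is a minimal sketch of the general pattern, not the PaddleX implementation; `run_batch` is a hypothetical helper name introduced only for this example.

```python
import asyncio


async def run_batch(coros):
    """Run coroutines concurrently; on the first failure, cancel the rest.

    Hypothetical helper for illustration only (not a PaddleX API).
    """
    tasks = [asyncio.create_task(c) for c in coros]
    try:
        # gather() re-raises the first exception raised by any task,
        # but it does not cancel the sibling tasks by itself.
        return await asyncio.gather(*tasks)
    except Exception:
        # Cancel whatever is still in flight so no further work is done.
        for task in tasks:
            if not task.done():
                task.cancel()
        # Let the cancellations be processed before re-raising the error.
        await asyncio.gather(*tasks, return_exceptions=True)
        raise
```

Waiting on the cancelled tasks before re-raising means no task is left pending when the event loop shuts down, which is the same concern that a graceful event-loop shutdown addresses.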

Full Changelog: v3.4.1...v3.4.2

v3.4.1

30 Jan 09:07

2026.1.30 v3.4.1 released

  • Fixed an issue in PaddleOCR-VL where charts and stamps might not be recognized after consecutive inference runs.
  • For the genai-vllm-server plugin, added an upper bound on the supported transformers version to prevent compatibility issues during installation.

Full Changelog: v3.4.0...v3.4.1

v3.4.0

29 Jan 11:19
b1bfbc6

2026.1.29 v3.4.0 released

  • Released the PaddleOCR-VL-1.5 complex document parsing solution.

    PaddleOCR-VL-1.5 is the next iteration of the PaddleOCR-VL series. Building on a comprehensive optimization of the core capabilities of version 1.0, the model achieves 94.5% accuracy on the OmniDocBench v1.5 document parsing benchmark, surpassing leading general-purpose large models and document-parsing-specific models.

    PaddleOCR-VL-1.5 adds support for localizing document elements with irregularly shaped bounding boxes, which lets it perform well in real-world conditions such as scanning, skew, warping, screen photography, and complex illumination, where it reports state-of-the-art results across the board. The model also integrates seal recognition and text spotting (text detection and recognition), with key metrics continuing to lead mainstream models.

    You can use it online on the PaddleOCR official website or call the model API.

  • Added support for calling MLX-VLM inference services.

  • The PP-StructureV3 service now supports the prettifyMarkdown and showFormulaNumber parameters, bringing its functionality in line with local inference.

  • Upgraded the PaddleOCR-VL concatenate-pages method to restructure-pages, which can reorganize multi-page results without changing the total number of pages and is more flexible to use.

  • Fixed potential memory issues caused by the non-thread-safe PDF rendering library when it is called from multiple threads.

  • Optimized the parameter validation logic of pipeline services such as General OCR, so that more cases of invalid input parameters return a 422 status code instead of 500.

  • For GenAIClient, implemented graceful exit of the asynchronous event loop to improve system stability and the reliability of resource release.
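
The 422-versus-500 change above matters mostly to client code, because the two codes call for different reactions: a 422 means the request itself must be corrected, while a 5xx may be worth retrying. The sketch below shows one way a client might separate them; `classify_status` and its return labels are hypothetical names for this example, not part of any PaddleX client.

```python
UNPROCESSABLE = 422  # request was well-formed but its parameters were invalid


def classify_status(status_code: int) -> str:
    """Map an HTTP status code to a coarse client-side action.

    Hypothetical helper for illustration; the labels are arbitrary.
    """
    if 200 <= status_code < 300:
        return "ok"
    if status_code == UNPROCESSABLE:
        # Retrying the same payload will fail again; fix the parameters.
        return "fix-request"
    if 500 <= status_code < 600:
        # Server-side failure; a retry with backoff may succeed.
        return "retry"
    return "other"
```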

Full Changelog: v3.3.13...v3.4.0

v3.3.13

12 Jan 11:24

2026.1.12 v3.3.13 released

  • PaddleOCR-VL adds a new concatenate_pages method for concatenating multi-page parsing results, supporting preservation of multi-level heading structures and merging tables that span across pages.
  • GenAIClient now supports specifying a custom model name during construction.
  • PaddleOCR-VL-0.9B supports local inference and allows passing min_pixels and max_pixels parameters on each prediction.
  • @alealexpro100 fixed an issue where the cyrillic_PP-OCRv5_mobile_rec model could not enable high-performance inference under PaddlePaddle 3.1.1 + CUDA 12.
  • @szepeviktor fixed an issue where the width and height were displayed in the wrong order in logs when the image size exceeded the limit for text detection models.
  • PP-StructureV3 now supports running on MetaX GPUs. @metax666 @duqimeng
  • Added support for using the PaddleOCR-VL-0.9B implementation built into vLLM and SGLang (requires newer versions of vLLM and SGLang).
  • Fixed errors in the application packaging documentation and updated sample code to be compatible with the latest APIs.

Full Changelog: v3.3.12...v3.3.13

v3.3.12

22 Dec 02:40

2025.12.17 v3.3.12 released

  • Fixed an issue where headers and footers were missing in the Markdown output of PP-StructureV3.
  • For PaddleOCR-VL-0.9B, optimized the flash attention availability check; the model now falls back to eager attention when flash attention is unavailable (e.g., on Windows systems).
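
As a rough illustration of the fallback described above, an availability check can be as simple as testing the platform and whether the `flash_attn` package can be found. This is a sketch under assumptions, not the check PaddleX performs; `pick_attention_backend`, its parameters, and the returned labels are hypothetical.

```python
import importlib.util
import platform
from typing import Optional


def pick_attention_backend(
    system: Optional[str] = None,
    has_flash_attn: Optional[bool] = None,
) -> str:
    """Return "flash_attention" when it looks usable, otherwise "eager".

    Both arguments exist only so the logic can be exercised without the
    real environment; by default they are detected automatically.
    """
    if system is None:
        system = platform.system()
    if has_flash_attn is None:
        # find_spec() returns None when the package is not installed.
        has_flash_attn = importlib.util.find_spec("flash_attn") is not None
    if system == "Windows" or not has_flash_attn:
        return "eager"
    return "flash_attention"
```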

Full Changelog: v3.3.11...v3.3.12

v3.3.11

09 Dec 12:40

2025.12.9 v3.3.11 released

  • Fixed an issue where memory kept increasing during repeated inference.
  • Added a new parameter markdown_ignore_labels to PP-StructureV3 and PaddleOCR-VL, which controls the element types to be ignored in Markdown output; pipeline outputs now include additional information such as the number of pages and image size.
  • PaddleOCR-VL now supports controlling whether to merge image blocks through the merge_layout_blocks parameter.
  • Added support for skipping the model source check via the environment variable PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK, resolving model loading stalls in offline environments.
  • Fixed an accuracy issue of the RT-DETR-X model when using PIR-TRT inference with batch size > 1.
  • Pre-installed fonts in high-stability deployment images to prevent content loss when rendering PDF pages.
  • Added support for safetensors 0.7.0 and removed instructions in the documentation about installing a specific version.
  • Added support for MetaX GPUs. @metax666
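
For the offline case mentioned above, the environment variable can be set from Python before the library is used. The variable name comes from the release note; the value "True" and the need to set it before importing PaddleX are assumptions, so check the PaddleX documentation for the accepted values.

```python
import os

# Assumption: the variable is read when PaddleX starts loading models, so it
# is set here before any PaddleX import. The value "True" is also an
# assumption; only the variable name is taken from the release note.
os.environ["PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK"] = "True"

# import paddlex  # left commented out so the sketch runs without PaddleX installed
```

Exporting the variable in the shell or the service's container configuration achieves the same effect without touching application code.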

Full Changelog: v3.3.10...v3.3.11

v3.3.10

25 Nov 09:41

2025.11.24 v3.3.10 released

  • Optimized the network implementation of the PaddleOCR-VL-0.9B model, significantly reducing GPU memory usage on devices with Compute Capability ≥ 8.

Full Changelog: v3.3.9...v3.3.10

v3.3.9

25 Nov 09:41

2025.11.10 v3.3.9 released

  • Fixed an issue where PP-DocLayoutV2 exhibited abnormal accuracy on CPU.
  • PaddleOCR-VL-0.9B now supports deployment on DCU devices using the vLLM server mode.

Full Changelog: v3.3.8...v3.3.9

v3.3.8

25 Nov 09:40

2025.11.5 v3.3.8 released

  • Fixed installation bugs in the vLLM and SGLang plugins.

Full Changelog: v3.3.7...v3.3.8

v3.3.7

25 Nov 09:39

2025.11.5 v3.3.7 released

  • PaddleOCR-VL now supports inference on DCU and XPU.
  • Optimized the installation process for the vLLM / SGLang plugins: hardware information is now detected automatically and the matching version of flash-attn is installed, so no manual installation is needed.
  • The high-stability serving solution for General OCR and PP-StructureV3 pipelines now supports handling concurrent requests in a single instance.
  • For high-stability serving, the server now prints the log IDs of each request in a batch when receiving requests, making debugging easier.
  • The PP-StructureV3 and PP-DocTranslation pipelines now support saving results in DOCX and LaTeX formats.
  • Simplified PDF page rendering logic to improve reading performance. @mara004

Full Changelog: v3.3.6...v3.3.7