Releases: PaddlePaddle/PaddleX

v3.3.10

25 Nov 09:41

2025.11.24 v3.3.10 released

  • Optimized the network implementation of the PaddleOCR-VL-0.9B model, significantly reducing GPU memory usage on devices with Compute Capability ≥ 8.

Full Changelog: v3.3.9...v3.3.10

v3.3.9

25 Nov 09:41

2025.11.10 v3.3.9 released

  • Fixed an issue where PP-DocLayoutV2 exhibited abnormal accuracy on CPU.
  • PaddleOCR-VL-0.9B now supports deployment on DCU devices using the vLLM server mode.

Full Changelog: v3.3.8...v3.3.9

v3.3.8

25 Nov 09:40

2025.11.5 v3.3.8 released

  • Fixed installation bugs in the vLLM and SGLang plugins.

Full Changelog: v3.3.7...v3.3.8

v3.3.7

25 Nov 09:39

2025.11.5 v3.3.7 released

  • PaddleOCR-VL now supports inference on DCU and XPU.
  • Optimized the installation process for the vLLM / SGLang plugins: hardware information is detected automatically and the matching flash-attn version is installed, with no manual step required.
  • The high-stability serving solution for the General OCR and PP-StructureV3 pipelines now supports handling concurrent requests in a single instance.
  • For high-stability serving, the server now prints the log ID of each request in a batch when it receives requests, making debugging easier.
  • The PP-StructureV3 and PP-DocTranslation pipelines now support saving results in DOCX and LaTeX formats.
  • Simplified PDF page rendering logic to improve reading performance. @mara004

Full Changelog: v3.3.6...v3.3.7

v3.3.6

29 Oct 07:48

2025.10.28 v3.3.6 released

  • PaddleOCR-VL now supports inference on x86-64 CPUs.
  • Unified the chat template used across the different PaddleOCR-VL inference methods.
  • Fixed an accuracy inconsistency between PaddleOCR-VL inference with the PaddlePaddle framework and inference with vLLM on images containing formulas.
  • Released the Dockerfile for the vLLM inference image: https://github.com/PaddlePaddle/PaddleX/tree/release/3.3/deploy/genai_vllm_server_docker
  • Fixed the issue where, during offline inference, the program still attempted to download models online even when local cached models existed.
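
The offline-inference fix above amounts to a cache-first lookup. The following is a minimal sketch of that idea, not PaddleX's actual implementation: `resolve_model_dir`, the `download` callback, and the "non-empty directory means cached" rule are all hypothetical and exist only for illustration.

```python
from pathlib import Path
from typing import Callable


def resolve_model_dir(
    model_name: str,
    cache_root: Path,
    download: Callable[[str, Path], None],
) -> Path:
    """Return the local directory for a model, downloading only on a cache miss.

    Hypothetical helper: the names and the cache layout are assumptions.
    """
    model_dir = cache_root / model_name
    # A non-empty directory counts as a cache hit, so no network access is attempted.
    if model_dir.is_dir() and any(model_dir.iterdir()):
        return model_dir
    # Cache miss: create the directory and let the caller-supplied downloader fill it.
    model_dir.mkdir(parents=True, exist_ok=True)
    download(model_name, model_dir)
    return model_dir
```

Passing the downloader in as a callback keeps the cache check testable without any network access.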

Full Changelog: v3.3.5...v3.3.6

v3.3.5

23 Oct 15:03

2025.10.23 v3.3.5 released

  • Fixed a weight data type mapping issue; GPUs with compute capability between 7 and 8 are now supported.
  • Fixed model configuration parsing failing when the configuration includes quantization_config.
  • Fixed inference errors on Windows when the path contains directories with Chinese characters.
  • Fixed PaddleOCR-VL models hosted on the AI Studio platform being unusable.
  • Added support for passing the max_new_tokens parameter during PaddleOCR-VL model inference.

Full Changelog: v3.3.4...v3.3.5

v3.3.1

16 Oct 08:01

2025.10.16 v3.3.1 released

  • Fixed issues in the PaddleOCR-VL pipeline, including the missing concatenate_markdown_pages method.

Full Changelog: v3.3.0...v3.3.1

v3.3.0

16 Oct 05:50

2025.10.16 v3.3.0 released

  • Added support for inference and deployment of PaddleOCR-VL and PP-OCRv5 multilingual models.

Full Changelog: v3.2.1...v3.3.0

v3.2.1

29 Aug 11:51
ca49e57

2025.8.29 v3.2.1 released

  • Bug Fixes:
    • Fixed a potential array overflow issue in the formula recognition module of PP-StructureV3 during pipeline inference.
    • Optimized the model download logic: when the model weight file already exists locally, the system will no longer re-download it, ensuring usability in offline environments.
    • Updated the high-stability serving image dependencies to be compatible with PaddleX 3.2.0; fixed path errors in the upload and cleanup scripts for the high-stability serving image, as well as typos in the documentation.
    • Fixed invalid escape sequences in the post-processing stage of the formula recognition model under Python 3.12, and a potential Regular Expression Denial of Service (ReDoS) vulnerability.
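
For readers unfamiliar with the last fix, here is a minimal sketch of both problem classes in plain Python. The patterns are illustrative assumptions; the release notes do not say which expressions in the formula post-processing code were affected. Python 3.12 reports an invalid escape such as "\d" in a normal string literal as a SyntaxWarning (earlier versions used DeprecationWarning), and a raw string avoids it. Nested quantifiers are a common source of ReDoS.

```python
import re
import warnings

# Source text containing "\d" inside a normal (non-raw) string literal.
BAD_SOURCE = 'pattern = "\\d+"'
# The same literal written as a raw string.
GOOD_SOURCE = 'pattern = r"\\d+"'


def escape_warnings(source: str) -> list:
    """Compile source and return any invalid-escape warnings that were raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w for w in caught if "invalid escape sequence" in str(w.message)]


# Nested quantifiers such as (a+)+ can backtrack exponentially on near-miss
# input like "aaaa...b". Illustrative pattern only; not taken from PaddleX.
RISKY = re.compile(r"^(a+)+$")
# A pattern that accepts the same strings without the nested quantifier.
SAFE = re.compile(r"^a+$")

print(len(escape_warnings(BAD_SOURCE)) > 0)   # the non-raw literal triggers a warning
print(len(escape_warnings(GOOD_SOURCE)) > 0)  # the raw literal does not
print(SAFE.match("a" * 40 + "b"))             # rejected quickly
```

The risky pattern is deliberately never run against the near-miss input, since that call may take a very long time.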

Full Changelog: v3.2.0...v3.2.1

v3.2.0

20 Aug 06:33

2025.8.20 v3.2.0 released

  • Deployment Capability Upgrades:

    • Fully supports PaddlePaddle framework versions 3.1.0 and 3.1.1.
    • High-performance inference supports CUDA 12, with backend options including Paddle Inference and ONNX Runtime.
    • High-stability serving solution is fully open-sourced, enabling users to customize Docker images and SDKs as needed.
    • High-stability serving solution supports invocation via manually constructed HTTP requests, allowing client applications to be developed in any programming language.
  • Key Model Additions:

    • Added training, inference, and deployment support for PP-OCRv5 English, Thai, and Greek recognition models. The PP-OCRv5 English model delivers an 11% improvement over the main PP-OCRv5 model in English scenarios, with the Thai model achieving an accuracy of 82.68% and the Greek model 89.28%.
  • Benchmark Enhancements:

    • All pipelines support fine-grained benchmarking, enabling the measurement of end-to-end inference time as well as per-layer and per-module latency data to assist with performance analysis.
    • Added key metrics such as inference latency and memory usage for commonly used configurations on mainstream hardware to the documentation, providing deployment reference for users.
  • Bug Fixes:

    • Fixed an issue where invalid input image file formats could cause recursive calls.
    • Resolved ineffective parameter settings for chart recognition, seal recognition, and document pre-processing in the configuration files for the PP-DocTranslation and PP-StructureV3 pipelines.
    • Fixed an issue where PDF files were not properly closed after inference.
  • Other Updates:

    • Added support for Windows users with NVIDIA 50-series graphics cards; users can install the corresponding PaddlePaddle framework version as per the installation guide.
    • The PP-OCR model series now supports returning coordinates for individual characters.
    • The model_name parameter in PaddlePredictorOption has been moved to PaddleInfer, improving usability.
    • Refactored the official model download logic, with new support for multiple model hosting platforms such as AIStudio and ModelScope.
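
As a rough illustration of what fine-grained benchmarking measures (end-to-end time plus per-module latency), the sketch below times nested stages with a context manager. It is a generic standard-library pattern, not the PaddleX benchmark interface; `StageTimer` and the stage names are hypothetical, and `time.sleep` stands in for real model calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raises.
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        """Return the average duration in seconds for each stage."""
        return {name: self.totals[name] / self.counts[name] for name in self.totals}


timer = StageTimer()
with timer.stage("end_to_end"):
    with timer.stage("detection"):
        time.sleep(0.01)  # placeholder for a detection module
    with timer.stage("recognition"):
        time.sleep(0.02)  # placeholder for a recognition module

averages = timer.report()
for name, seconds in averages.items():
    print(f"{name}: {seconds * 1000:.1f} ms")
```

Because the inner stages run inside the outer one, the end-to-end figure always covers the sum of the per-module figures plus any overhead between them.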

Full Changelog: v3.1.4...v3.2.0