
Commit 615930b

Update README (#3426)
* Update README
* code style
* code style
1 parent 6f11171 commit 615930b

File tree: 8 files changed (+26, −27 lines)


README.md

Lines changed: 4 additions & 5 deletions
@@ -23,13 +23,11 @@ English | [简体中文](README_CN.md)
 </p>

 --------------------------------------------------------------------------------
-# FastDeploy 2.1: Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
+# FastDeploy : Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

 ## News
 **[2025-08] 🔥 Released FastDeploy v2.1:** A brand-new KV Cache scheduling strategy has been introduced, and expanded support for PD separation and CUDA Graph across more models. Enhanced hardware support has been added for platforms like Kunlun and Hygon, along with comprehensive optimizations to improve the performance of both the service and inference engine.

-**[2025-07] 《FastDeploy2.0推理部署实测》专题活动已上线!** 完成文心4.5系列开源模型的推理部署等任务,即可获得骨瓷马克杯等FastDeploy2.0官方周边及丰富奖金!🎁 欢迎大家体验反馈~ 📌[报名地址](https://www.wjx.top/vm/meSsp3L.aspx#) 📌[活动详情](https://github.com/PaddlePaddle/FastDeploy/discussions/2728)
-
 **[2025-07] The FastDeploy 2.0 Inference Deployment Challenge is now live!** Complete the inference deployment task for the ERNIE 4.5 series open-source models to win official FastDeploy 2.0 merch and generous prizes! 🎁 You're welcome to try it out and share your feedback! 📌[Sign up here](https://www.wjx.top/vm/meSsp3L.aspx#) 📌[Event details](https://github.com/PaddlePaddle/FastDeploy/discussions/2728)

 **[2025-06] 🔥 Released FastDeploy v2.0:** Supports inference and deployment for ERNIE 4.5. Furthermore, we open-source an industrial-grade PD disaggregation with context caching, dynamic role switching for effective resource utilization to further enhance inference performance for MoE models.
@@ -52,14 +50,15 @@ English | [简体中文](README_CN.md)

 ## Installation

-FastDeploy supports inference deployment on **NVIDIA GPUs**, **Kunlunxin XPUs**, **Iluvatar GPUs**, **Enflame GCUs**, and other hardware. For detailed installation instructions:
+FastDeploy supports inference deployment on **NVIDIA GPUs**, **Kunlunxin XPUs**, **Iluvatar GPUs**, **Enflame GCUs**, **Hygon DCUs** and other hardware. For detailed installation instructions:

 - [NVIDIA GPU](./docs/get_started/installation/nvidia_gpu.md)
 - [Kunlunxin XPU](./docs/get_started/installation/kunlunxin_xpu.md)
 - [Iluvatar GPU](./docs/get_started/installation/iluvatar_gpu.md)
 - [Enflame GCU](./docs/get_started/installation/Enflame_gcu.md)
+- [Hygon DCU](./docs/get_started/installation/hygon_dcu.md)

-**Note:** We are actively working on expanding hardware support. Additional hardware platforms including Ascend NPU, Hygon DCU, and MetaX GPU are currently under development and testing. Stay tuned for updates!
+**Note:** We are actively working on expanding hardware support. Additional hardware platforms including Ascend NPU and MetaX GPU are currently under development and testing. Stay tuned for updates!

 ## Get Started

README_CN.md

Lines changed: 4 additions & 3 deletions
@@ -23,7 +23,7 @@
 </p>

 --------------------------------------------------------------------------------
-# FastDeploy 2.1:基于飞桨的大语言模型与视觉语言模型推理部署工具包
+# FastDeploy :基于飞桨的大语言模型与视觉语言模型推理部署工具包

 ## 最新活动
 **[2025-08] 🔥 FastDeploy v2.1 全新发布:** 全新的KV Cache调度策略,更多模型支持PD分离和CUDA Graph,昆仑、海光等更多硬件支持增强,全方面优化服务和推理引擎的性能。
@@ -48,14 +48,15 @@

 ## 安装

-FastDeploy 支持在**英伟达(NVIDIA)GPU****昆仑芯(Kunlunxin)XPU****天数(Iluvatar)GPU****燧原(Enflame)GCU** 以及其他硬件上进行推理部署。详细安装说明如下:
+FastDeploy 支持在**英伟达(NVIDIA)GPU****昆仑芯(Kunlunxin)XPU****天数(Iluvatar)GPU****燧原(Enflame)GCU****海光(Hygon)DCU** 以及其他硬件上进行推理部署。详细安装说明如下:

 - [英伟达 GPU](./docs/zh/get_started/installation/nvidia_gpu.md)
 - [昆仑芯 XPU](./docs/zh/get_started/installation/kunlunxin_xpu.md)
 - [天数 CoreX](./docs/zh/get_started/installation/iluvatar_gpu.md)
 - [燧原 S60](./docs/zh/get_started/installation/Enflame_gcu.md)
+- [海光 DCU](./docs/zh/get_started/installation/hygon_dcu.md)

-**注意:** 我们正在积极拓展硬件支持范围。目前,包括昇腾(Ascend)NPU、海光(Hygon)DCU 和摩尔线程(MetaX)GPU 在内的其他硬件平台正在开发测试中。敬请关注更新!
+**注意:** 我们正在积极拓展硬件支持范围。目前,包括昇腾(Ascend)NPU 和 沐曦(MetaX)GPU 在内的其他硬件平台正在开发测试中。敬请关注更新!

 ## 入门指南

dockerfiles/Dockerfile.gpu

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-FROM ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.0.0
+FROM ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.1.0
 ARG PADDLE_VERSION=3.1.1
 ARG FD_VERSION=2.1.0

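The updated base image can be exercised locally by building this Dockerfile; a minimal sketch, assuming Docker is available and the build is run from the repository root (the image tag `fastdeploy-gpu:2.1.0` is illustrative, not part of the commit):

```shell
# Build the GPU image, passing the build args declared in dockerfiles/Dockerfile.gpu.
docker build -f dockerfiles/Dockerfile.gpu \
  --build-arg PADDLE_VERSION=3.1.1 \
  --build-arg FD_VERSION=2.1.0 \
  -t fastdeploy-gpu:2.1.0 .
```
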
docs/get_started/installation/nvidia_gpu.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ The following installation methods are available when your environment meets the
 **Notice**: The pre-built image only supports SM80/90 GPU(e.g. H800/A800),if you are deploying on SM86/89GPU(L40/4090/L20), please reinstall ```fastdpeloy-gpu``` after you create the container.

 ```shell
-docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.0.0
+docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.1.0
 ```

 ## 2. Pre-built Pip Installation

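For a quick check of the retagged image referenced in this doc, something along these lines could be used; a minimal sketch, assuming the NVIDIA Container Toolkit is installed (the container name and port mapping are placeholders, not values taken from the docs):

```shell
# Pull the 2.1.0 pre-built image and start an interactive container with GPU access.
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.1.0
docker run --gpus all -it --rm \
  --name fastdeploy-dev \
  -p 8180:8180 \
  ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.1.0 /bin/bash
```
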
docs/index.md

Lines changed: 4 additions & 4 deletions
@@ -13,12 +13,12 @@

 | Model | Data Type | PD Disaggregation | Chunked Prefill | Prefix Caching | MTP | CUDA Graph | Maximum Context Length |
 |:--- | :------- | :---------- | :-------- | :-------- | :----- | :----- | :----- |
-|ERNIE-4.5-300B-A47B | BF16/WINT4/WINT8/W4A8C8/WINT2/FP8 |✅|✅|✅|✅(WINT4)| WIP |128K |
-|ERNIE-4.5-300B-A47B-Base| BF16/WINT4/WINT8 |✅|✅|✅|✅(WINT4)| WIP | 128K |
+|ERNIE-4.5-300B-A47B | BF16/WINT4/WINT8/W4A8C8/WINT2/FP8 |✅|✅|✅|✅| WIP |128K |
+|ERNIE-4.5-300B-A47B-Base| BF16/WINT4/WINT8 |✅|✅|✅|✅| WIP | 128K |
 |ERNIE-4.5-VL-424B-A47B | BF16/WINT4/WINT8 | WIP |✅| WIP |❌| WIP |128K |
 |ERNIE-4.5-VL-28B-A3B | BF16/WINT4/WINT8 |✅|✅| WIP |❌| WIP |128K |
-|ERNIE-4.5-21B-A3B | BF16/WINT4/WINT8/FP8 |✅|✅|✅| WIP |✅|128K |
-|ERNIE-4.5-21B-A3B-Base | BF16/WINT4/WINT8/FP8 |✅|✅|✅| WIP |✅|128K |
+|ERNIE-4.5-21B-A3B | BF16/WINT4/WINT8/FP8 |✅|✅|✅|✅|✅|128K |
+|ERNIE-4.5-21B-A3B-Base | BF16/WINT4/WINT8/FP8 |✅|✅|✅|✅|✅|128K |
 |ERNIE-4.5-0.3B | BF16/WINT8/FP8 |❌|✅|✅|❌|✅| 128K |

 ## Documentation

docs/zh/get_started/installation/nvidia_gpu.md

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 **注意**: 如下镜像仅支持SM 80/90架构GPU(A800/H800等),如果你是在L20/L40/4090等SM 86/69架构的GPU上部署,请在创建容器后,卸载```fastdeploy-gpu```再重新安装如下文档指定支持86/89架构的`fastdeploy-gpu`包。

 ``` shell
-docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.0.0
+docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.1.0
 ```

 ## 2. 预编译Pip安装

docs/zh/index.md

Lines changed: 4 additions & 4 deletions
@@ -13,12 +13,12 @@

 | Model | Data Type | PD Disaggregation | Chunked Prefill | Prefix Caching | MTP | CUDA Graph | Maximum Context Length |
 |:--- | :------- | :---------- | :-------- | :-------- | :----- | :----- | :----- |
-|ERNIE-4.5-300B-A47B | BF16/WINT4/WINT8/W4A8C8/WINT2/FP8 |✅|✅|✅|✅(WINT4)| WIP |128K |
-|ERNIE-4.5-300B-A47B-Base| BF16/WINT4/WINT8 |✅|✅|✅|✅(WINT4)| WIP | 128K |
+|ERNIE-4.5-300B-A47B | BF16/WINT4/WINT8/W4A8C8/WINT2/FP8 |✅|✅|✅|✅| WIP |128K |
+|ERNIE-4.5-300B-A47B-Base| BF16/WINT4/WINT8 |✅|✅|✅|✅| WIP | 128K |
 |ERNIE-4.5-VL-424B-A47B | BF16/WINT4/WINT8 | WIP |✅| WIP |❌| WIP |128K |
 |ERNIE-4.5-VL-28B-A3B | BF16/WINT4/WINT8 |✅|✅| WIP |❌| WIP |128K |
-|ERNIE-4.5-21B-A3B | BF16/WINT4/WINT8/FP8 |✅|✅|✅| WIP |✅|128K |
-|ERNIE-4.5-21B-A3B-Base | BF16/WINT4/WINT8/FP8 |✅|✅|✅| WIP |✅|128K |
+|ERNIE-4.5-21B-A3B | BF16/WINT4/WINT8/FP8 |✅|✅|✅|✅|✅|128K |
+|ERNIE-4.5-21B-A3B-Base | BF16/WINT4/WINT8/FP8 |✅|✅|✅|✅|✅|128K |
 |ERNIE-4.5-0.3B | BF16/WINT8/FP8 |❌|✅|✅|❌|✅| 128K |

 ## 文档说明

mkdocs.yml

Lines changed: 7 additions & 8 deletions
@@ -1,4 +1,4 @@
-site_name: 'FastDeploy 2.0: Large Language Model Deployement'
+site_name: 'FastDeploy : Large Language Model Deployement'
 repo_url: https://github.com/PaddlePaddle/FastDeploy
 repo_name: FastDeploy

@@ -34,14 +34,14 @@ plugins:
 - locale: en
   default: true
   name: English
-  site_name: 'FastDeploy 2.0: Large Language Model Deployement'
+  site_name: 'FastDeploy: Large Language Model Deployement'
   build: true
 - locale: zh
   name: 简体中文
   site_name: 飞桨大语言模型推理部署工具包
-  link: /zh/
+  link: /./zh/
   nav_translations:
-    FastDeploy 2.0: FastDeploy 2.0
+    FastDeploy: FastDeploy
     Quick Start: 快速入门
     Installation: 安装
     Nvidia GPU: 英伟达 GPU
@@ -58,7 +58,7 @@ plugins:
     Monitor Metrics: 监控Metrics
     Scheduler: 调度器
     Offline Inference: 离线推理
-    Optimal Deployment: 最佳实践
+    Best Practices: 最佳实践
     ERNIE-4.5-0.3B: ERNIE-4.5-0.3B
     ERNIE-4.5-21B-A3B: ERNIE-4.5-21B-A3B
     ERNIE-4.5-300B-A47B: ERNIE-4.5-300B-A47B
@@ -89,7 +89,7 @@ plugins:
     Environment Variables: 环境变量

 nav:
-  - 'FastDeploy 2.0': index.md
+  - 'FastDeploy': index.md
   - 'Quick Start':
     - Installation:
       - 'Nvidia GPU': get_started/installation/nvidia_gpu.md
@@ -106,7 +106,7 @@ nav:
   - 'Monitor Metrics': online_serving/metrics.md
   - 'Scheduler': online_serving/scheduler.md
   - 'Offline Inference': offline_inference.md
-  - Optimal Deployment:
+  - Best Practices:
     - ERNIE-4.5-0.3B: best_practices/ERNIE-4.5-0.3B-Paddle.md
     - ERNIE-4.5-21B-A3B: best_practices/ERNIE-4.5-21B-A3B-Paddle.md
     - ERNIE-4.5-300B-A47B: best_practices/ERNIE-4.5-300B-A47B-Paddle.md
@@ -135,4 +135,3 @@ nav:
   - 'Log Description': usage/log.md
   - 'Code Overview': usage/code_overview.md
   - 'Environment Variables': usage/environment_variables.md
-

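To verify the renamed nav entries and the adjusted zh locale link, the docs site can be previewed locally; a minimal sketch, assuming an MkDocs setup with an i18n plugin such as mkdocs-static-i18n (the exact dependency list is an assumption, not taken from this commit):

```shell
# Install a documentation toolchain and preview the site; the plugin names below are assumptions.
pip install mkdocs mkdocs-material mkdocs-static-i18n
mkdocs serve   # then spot-check the English nav and the /zh/ pages in a browser
```
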