
Commit bca7d35

Merge branch 'PaddlePaddle:develop' into dev_20250110_update_fuse_for_Qwen2MoE
2 parents: 9acff15 + 54b8882

117 files changed: +3832 −1241 lines


README.md

Lines changed: 20 additions & 14 deletions
@@ -7,7 +7,7 @@
 ------------------------------------------------------------------------------------------

 <p align="center">
-<a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
+<a href="https://paddlenlp.readthedocs.io/en/latest/?badge=latest"><img src="https://readthedocs.org/projects/paddlenlp/badge/?version=latest">
 <a href="https://github.com/PaddlePaddle/PaddleNLP/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleNLP?color=ffa"></a>
 <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
 <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
@@ -16,6 +16,7 @@
 <a href="https://pypi.org/project/paddlenlp/"><img src="https://img.shields.io/pypi/dm/paddlenlp?color=9cf"></a>
 <a href="https://github.com/PaddlePaddle/PaddleNLP/issues"><img src="https://img.shields.io/github/issues/PaddlePaddle/PaddleNLP?color=9cc"></a>
 <a href="https://github.com/PaddlePaddle/PaddleNLP/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleNLP?color=ccf"></a>
+<a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
 </p>

 <h4 align="center">
@@ -69,6 +70,9 @@

 The toolkit's high-performance inference module for large models has built-in dynamic insertion and end-to-end operator fusion strategies, which greatly speed up parallel inference. The low-level implementation details are encapsulated, providing out-of-the-box high-performance parallel inference.

+## Documentation
+For more detailed documentation, please visit the [PaddleNLP Documentation](https://paddlenlp.readthedocs.io/).
+
 ------------------------------------------------------------------------------------------

 ## Model Support
@@ -90,6 +94,8 @@
 | [ChatGLM2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/chatglm2) | THUDM/chatglm2-6b |
 | [ChatGLM3](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/chatglm2) | THUDM/chatglm3-6b |
 | [DeepSeekV2](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/config/deepseek-v2) | deepseek-ai/DeepSeek-V2, deepseek-ai/DeepSeek-V2-Chat, deepseek-ai/DeepSeek-V2-Lite, deepseek-ai/DeepSeek-V2-Lite-Chat, deepseek-ai/DeepSeek-Coder-V2-Base, deepseek-ai/DeepSeek-Coder-V2-Instruct, deepseek-ai/DeepSeek-Coder-V2-Lite-Base, deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct |
+| [DeepSeekV3](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/config/deepseek-v2) | deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-V3-Base |
+| [DeepSeek-R1](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/config/deepseek-v2) | deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-R1-Zero, deepseek-ai/DeepSeek-R1-Distill-Llama-70B, deepseek-ai/DeepSeek-R1-Distill-Llama-8B, deepseek-ai/DeepSeek-R1-Distill-Qwen-14B, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, deepseek-ai/DeepSeek-R1-Distill-Qwen-32B, deepseek-ai/DeepSeek-R1-Distill-Qwen-7B |
 | [Gemma](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/gemma) | google/gemma-7b, google/gemma-7b-it, google/gemma-2b, google/gemma-2b-it |
 | [Mistral](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/mistral) | mistralai/Mistral-7B-Instruct-v0.3, mistralai/Mistral-7B-v0.1 |
 | [Mixtral](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/mixtral) | mistralai/Mixtral-8x7B-Instruct-v0.1 |
@@ -130,19 +136,19 @@


 | Model | Pretrain | SFT | LoRA | FlashMask | Prefix Tuning | DPO/SimPO/ORPO/KTO | RLHF | Mergekit | Quantization |
-|--------------------------------------------|:--------:|:---:|:----:|:---------:|:-------------:|:--------------:|:----:|:-----:|:------------:|
-| [Llama](./llm/config/llama) |||||||| ||
-| [Qwen](./llm/config/qwen) ||||||| 🚧 | | 🚧 |
-| [Mixtral](./llm/config/mixtral) |||| 🚧 | 🚧 || 🚧 | | 🚧 |
-| [Mistral](./llm/config/mistral) |||| 🚧 ||| 🚧 | | 🚧 |
-| [Baichuan/Baichuan2](./llm/config/llama) ||||||| 🚧 | ||
-| [ChatGLM-6B](./llm/config/chatglm) |||| 🚧 || 🚧 | 🚧 | ||
-| [ChatGLM2/ChatGLM3](./llm/config/chatglm2) |||| 🚧 ||| 🚧 | ||
-| [Bloom](./llm/config/bloom) |||| 🚧 || 🚧 | 🚧 | ||
-| [GPT-3](./llm/config/gpt-3) ||| 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | | 🚧 |
-| [OPT](./llm/config/opt) |||| 🚧 | 🚧 | 🚧 | 🚧 | | 🚧 |
-| [Gemma](./llm/config/gemma) |||| 🚧 | 🚧 || 🚧 | | 🚧 |
-| [Yuan](./llm/config/yuan) |||| 🚧 | 🚧 || 🚧 | | 🚧 |
+|--------------------------------------------|:--------:|:---:|:----:|:---------:|:-------------:|:------------------:|:----:|:--------:|:------------:|
+| [Llama](./llm/config/llama) |||||| || ||
+| [Qwen](./llm/config/qwen) |||||| | 🚧 | | 🚧 |
+| [Mixtral](./llm/config/mixtral) |||| 🚧 | 🚧 | | 🚧 | | 🚧 |
+| [Mistral](./llm/config/mistral) |||| 🚧 || | 🚧 | | 🚧 |
+| [Baichuan/Baichuan2](./llm/config/llama) |||||| | 🚧 | ||
+| [ChatGLM-6B](./llm/config/chatglm) |||| 🚧 || 🚧 | 🚧 | ||
+| [ChatGLM2/ChatGLM3](./llm/config/chatglm2) |||| 🚧 || | 🚧 | ||
+| [Bloom](./llm/config/bloom) |||| 🚧 || 🚧 | 🚧 | ||
+| [GPT-3](./llm/config/gpt-3) ||| 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | | 🚧 |
+| [OPT](./llm/config/opt) |||| 🚧 | 🚧 | 🚧 | 🚧 | | 🚧 |
+| [Gemma](./llm/config/gemma) |||| 🚧 | 🚧 | | 🚧 | | 🚧 |
+| [Yuan](./llm/config/yuan) |||| 🚧 | 🚧 | | 🚧 | | 🚧 |
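The LoRA column in the fine-tuning matrix above refers to low-rank adaptation. As an illustrative NumPy sketch of the underlying idea only (not PaddleNLP code): the frozen weight `W` is augmented with a trainable rank-`r` product `B @ A`, so the effective weight is `W + (alpha / r) * B @ A`.

```python
# Illustrative sketch only -- NOT PaddleNLP code.
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 16, 4, 8.0

W = rng.standard_normal((d, d)).astype(np.float32)           # frozen base weight
A = (0.01 * rng.standard_normal((r, d))).astype(np.float32)  # trainable
B = np.zeros((d, r), dtype=np.float32)                       # trainable, init 0

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Base projection plus the scaled low-rank update."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((2, d)).astype(np.float32)
# Because B starts at zero, the adapter is initially a no-op:
assert np.allclose(lora_forward(x), x @ W.T)
```

Only `A` and `B` (roughly `2 * r * d` parameters per layer) are trained, which is why LoRA appears alongside Prefix Tuning as a parameter-efficient option in the matrix.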
 * [Large model inference](./llm/docs/predict/inference.md) already supports the LLaMA, Qwen, Mistral, ChatGLM, Bloom, and Baichuan series, with Weight-Only INT8 and INT4 inference, as well as WAC (weights, activations, Cache KV) INT8/FP8 quantized inference. The LLM inference support matrix is as follows:

 | Model / supported quantization | FP16/BF16 | WINT8 | WINT4 | INT8-A8W8 | FP8-A8W8 | INT8-A8W8C8 |
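The WINT8 column above denotes weight-only INT8 inference. As a rough illustration of the concept only (a minimal NumPy sketch, not PaddleNLP's kernels): weights are stored as int8 with one absmax scale per output channel and dequantized on the fly, while activations stay in FP16/BF16.

```python
# Illustrative sketch only -- NOT PaddleNLP's implementation.
import numpy as np

def quantize_weight_int8(w: np.ndarray):
    """Per-column absmax quantization of an [in, out] weight matrix."""
    scale = np.abs(w).max(axis=0) / 127.0    # one FP scale per output channel
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 16)).astype(np.float32)
q, s = quantize_weight_int8(w)
w_hat = dequantize(q, s)
# Rounding error is bounded by half a quantization step per channel.
max_err = float(np.abs(w - w_hat).max())
```

Storing `q` instead of `w` roughly quarters the weight memory relative to FP32 (halves it relative to FP16), which is the main benefit weight-only quantization trades against the small reconstruction error `max_err`.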

README_en.md

Lines changed: 6 additions & 2 deletions
@@ -7,7 +7,7 @@
 ------------------------------------------------------------------------------------------

 <p align="center">
-<a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
+<a href="https://paddlenlp.readthedocs.io/en/latest/?badge=latest"><img src="https://readthedocs.org/projects/paddlenlp/badge/?version=latest">
 <a href="https://github.com/PaddlePaddle/PaddleNLP/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleNLP?color=ffa"></a>
 <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
 <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
@@ -16,6 +16,7 @@
 <a href="https://pypi.org/project/paddlenlp/"><img src="https://img.shields.io/pypi/dm/paddlenlp?color=9cf"></a>
 <a href="https://github.com/PaddlePaddle/PaddleNLP/issues"><img src="https://img.shields.io/github/issues/PaddlePaddle/PaddleNLP?color=9cc"></a>
 <a href="https://github.com/PaddlePaddle/PaddleNLP/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleNLP?color=ccf"></a>
+<a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
 </p>

 <h4 align="center">
@@ -52,6 +53,9 @@ The fine-tuning algorithms are deeply integrated with zero-padding data streams

 The high-performance inference module of the large model toolkit incorporates dynamic insertion and operator fusion strategies throughout the entire process, greatly accelerating parallel inference speed. The underlying implementation details are encapsulated, enabling out-of-the-box high-performance parallel inference capabilities.

+## Documentation
+For detailed documentation, visit the [PaddleNLP Documentation](https://paddlenlp.readthedocs.io/).
+
 ------------------------------------------------------------------------------------------

 ## Support Models
@@ -68,7 +72,7 @@ Detailed list 👉 [Supported Model List](https://github.com/PaddlePaddle/Paddle
 ### Pip Installation

 ```shell
-pip install --upgrade paddlenlp==3.0.0b2
+pip install --upgrade paddlenlp==3.0.0b3
 ```

 or you can install the latest develop branch code with the following command:

csrc/README.md

Lines changed: 5 additions & 2 deletions
@@ -1,6 +1,9 @@
-# PaddleNLP custom OPs
+# PaddleNLP high-performance custom inference operators for large models

-This document explains how to compile and install the PaddleNLP custom OPs.
+This document explains how to compile and install PaddleNLP's high-performance custom inference operators for large models.
+
+Using these high-performance operators can significantly speed up large model inference.
+See the large model inference tutorials [here](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/README.md#6-%E6%8E%A8%E7%90%86).

 ## Install C++ dependencies

0 commit comments
