Commit 16d121f

Support all Mcore Models Context Parallel Reinforcement Training via ChatLearn (#659)
* Support all Mcore Models Context Parallel Reinforcement Training via ChatLearn
1 parent de17272 commit 16d121f

3 files changed: 20 additions, 31 deletions
README.md

Lines changed: 9 additions & 15 deletions
@@ -3,28 +3,22 @@

 | | Megatron-Core | ChatLearn | verl |
 |:------------|:------------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:-----------:|
-| Qwen3 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen3/README.md#Megatron-Core模型训练流程) | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen3/README_grpo.md) | Coming Soon |
-| QwQ | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwq/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Qwen2.5-VL | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2_5_vl/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Moonlight | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/moonlight/README.md#Megatron-Core-MoE模型训练流程) | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/moonlight/README_grpo.md) | Coming Soon |
-| DeepSeek-V3 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v3/README.md#Megatron-Core-MoE模型训练流程) | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v3/README_grpo.md) | Coming Soon |
-| Qwen2-VL | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2_vl/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| LLaVA | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/llava_mcore/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Qwen2.5 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2_5/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| LLama3.1 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/llama3_1/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| LLama3 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/llama3/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| LLama2 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/llama2/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| Mistral | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/mistral/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Qwen2 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Qwen1.5 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen1_5/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| DeepSeek-V2 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v2/README.md#Megatron-Core-MoE模型训练流程) | N/A | N/A |
+|Qwen3 |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen3/README.md)|[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen3/README_grpo.md) | Coming Soon |
+|QwQ |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwq/README.md)| N/A | N/A |
+|Qwen2.5-VL |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2_5_vl/README.md)| N/A | N/A |
+|Moonlight |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/moonlight/README.md)|[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/moonlight/README_grpo.md)| N/A |
+|DeepSeek-V3 |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v3/README.md)| N/A | N/A |
+|DeepSeek-R1 | N/A |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v3/README_grpo.md)| Coming Soon |
+
+

 ## Introduction
 English | [简体中文](./README_zh-CN.md)

 Pai-Megatron-Patch (https://github.com/alibaba/Pai-Megatron-Patch) is a deep learning training toolkit built for developers to train and predict LLMs & VLMs by using Megatron framework easily. With the continuous development of LLMs, the model structure and scale are rapidly evolving. Although these models can be conveniently manufactured using Transformers or DeepSpeed training framework, the training efficiency is comparably low. This phenomenon becomes even severer when the model scale exceeds 10 billion. The primary objective of Pai-Megatron-Patch is to effectively utilize the computational power of GPUs for LLM. This tool allows convenient training of commonly used LLM with all the accelerating techniques provided by Megatron-LM.

 What's New:
+- **Support all Mcore Models Context Parallel Reinforcement Training via ChatLearn** [🔥🔥 2025.08.31]
 - **Support all Mcore Models GSPO Reinforcement Training via ChatLearn** [🔥🔥 2025.08.11]
 - **Support DeepSeek-V3-671B GRPO Reinforcement Training using Megatron-Core and ChatLearn** [🔥🔥 2025.07.31]
 - **Support Qwen3-235B GRPO Reinforcement Training using Megatron-Core and ChatLearn** [🔥🔥 2025.07.20]

README_zh-CN.md

Lines changed: 10 additions & 15 deletions
@@ -1,22 +1,16 @@
 ## Quick Start

+
 | | Megatron-Core | ChatLearn | verl |
 |:------------|:------------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:-----------:|
-| Qwen3 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen3/README.md#Megatron-Core模型训练流程) | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen3/README_grpo.md) | Coming Soon |
-| QwQ | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwq/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Qwen2.5-VL | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2_5_vl/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Moonlight | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/moonlight/README.md#Megatron-Core-MoE模型训练流程) | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/moonlight/README_grpo.md) | Coming Soon |
-| DeepSeek-V3 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v3/README.md#Megatron-Core-MoE模型训练流程) | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v3/README_grpo.md) | Coming Soon |
-| Qwen2-VL | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2_vl/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| LLaVA | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/llava_mcore/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Qwen2.5 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2_5/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| LLama3.1 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/llama3_1/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| LLama3 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/llama3/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| LLama2 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/llama2/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| Mistral | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/mistral/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Qwen2 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2/README.md#Megatron-Core模型训练流程) | N/A | N/A |
-| Qwen1.5 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen1_5/README.md#Megatron-Core-Dense模型训练流程) | N/A | N/A |
-| DeepSeek-V2 | [ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v2/README.md#Megatron-Core-MoE模型训练流程) | N/A | N/A |
+|Qwen3 |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen3/README.md)|[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen3/README_grpo.md) | Coming Soon |
+|QwQ |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwq/README.md)| N/A | N/A |
+|Qwen2.5-VL |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen2_5_vl/README.md)| N/A | N/A |
+|Moonlight |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/moonlight/README.md)|[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/moonlight/README_grpo.md)| N/A |
+|DeepSeek-V3 |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v3/README.md)| N/A | N/A |
+|DeepSeek-R1 | N/A |[ReadMe](https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/deepseek_v3/README_grpo.md)| Coming Soon |
+
+

 ## What is Pai-Megatron-Patch
 [English](./README.md) | 简体中文
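@@ -46,6 +40,7 @@ Pai-Megatron-Patch is what sits between the various open-source large models and the Megatron training acceleration engine
 - [Alibaba Cloud PAI wins a double championship in FewCLUE few-shot learning based on large models](https://developer.aliyun.com/article/788081?spm=a2c6h.12873639.article-detail.17.11c5383cHpFZks&tlog=yuekan_8)

 What's New:
+- **Support Context Parallel reinforcement learning training, which reduces GPU memory usage, for all Mcore models via ChatLearn** [🔥🔥 2025.08.31]
 - **Support GSPO reinforcement learning training for all Mcore models via ChatLearn** [🔥🔥 2025.08.11]
 - **Support GRPO reinforcement learning training of the DeepSeek-V3-671B model using Mcore + ChatLearn** [🔥🔥 2025.07.31]
 - **Support GRPO reinforcement learning training of the Qwen3-235B model using Mcore + ChatLearn** [🔥🔥 2025.07.20]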

backends/rl/ChatLearn

Submodule ChatLearn updated 105 files
