
Commit 8f38fe3

doc: update supported algorithms in readme (#411)
1 parent a80d33f commit 8f38fe3

File tree

1 file changed (+40, -43)

README.md
Lines changed: 40 additions & 43 deletions
@@ -69,31 +69,6 @@ state-of-the-art 7B and 32B models for mathematical reasoning. Check out our

 </details>

-## 🚀 Getting Started
-
-Our training scripts automatically download the required dataset (openai/gsm8k) and
-model (Qwen/Qwen2-1.5B-Instruct). To run on a single node:
-
-```bash
-python3 -m areal.launcher.local \
-  examples/math/gsm8k_grpo.py \
-  --config examples/math/gsm8k_grpo.yaml
-```
-
-To run on a Ray cluster with 2 nodes and 8 GPUs per node (remember to update paths in
-the YAML file to point to your shared storage):
-
-```bash
-python3 -m areal.launcher.ray \
-  examples/math/gsm8k_grpo.py \
-  --config examples/math/gsm8k_grpo.yaml \
-  cluster.n_nodes=2 \
-  cluster.n_gpus_per_node=8
-```
-
-For comprehensive setup instructions, see
-[our quickstart guide](https://inclusionai.github.io/AReaL/tutorial/quickstart.html).
-
 ## 📚 Examples

 | Task | Description | Performance |
@@ -108,15 +83,19 @@ For comprehensive setup instructions, see

 ## 🔧 Support Matrix

-### Algorithms
+### 🧠 Algorithms

-- **[GRPO](docs/algorithms/grpo.md)**
-- **[DAPO](docs/algorithms/dapo.md)**
-- **[LitePPO](docs/algorithms/litePPO.md)**
-- **[DrGRPO](docs/algorithms/dr.GRPO.md)**
-- **[RLHF Reward Modeling](examples/alignment/)**
-- **[PPO](examples/math/gsm8k_ppo.py)**
-- **[SFT](examples/math/gsm8k_sft.py)**
+| Algorithm | Documentation | Paper | Configuration |
+| ------------------------ | ------------------------------------- | ---------------------------------------------- | ------------------------------------------------------------ |
+| **GRPO** | [📖 Docs](docs/algorithms/grpo.md) | [📄 Paper](https://arxiv.org/pdf/2402.03300) | [🔗 GSM8K Example](examples/math/gsm8k_grpo.yaml) |
+| **PPO** | - | [📄 Paper](https://arxiv.org/pdf/2203.02155) | [🔗 GSM8K Example](examples/math/gsm8k_ppo.yaml) |
+| **DAPO** | [📖 Docs](docs/algorithms/dapo.md) | [📄 Paper](https://arxiv.org/abs/2503.14476) | [🔗 GSM8K Example](examples/experimental/dapo/gsm8k_dapo.py) |
+| **LitePPO** | [📖 Docs](docs/algorithms/litePPO.md) | [📄 Paper](https://arxiv.org/abs/2508.08221) | - |
+| **Dr.GRPO** | [📖 Docs](docs/algorithms/dr.GRPO.md) | [📄 Paper](https://arxiv.org/abs/2503.20783) | - |
+| **REINFORCE++** | - | [📄 Paper](https://arxiv.org/pdf/2501.03262) | [🔗 GSM8K Example](examples/math/gsm8k_reinforce.yaml) |
+| **RLOO** | [📖 Docs](docs/algorithms/rloo.md) | [📄 Paper](https://arxiv.org/pdf/2402.14740v1) | [🔗 GSM8K Example](examples/math/gsm8k_rloo.yaml) |
+| **RLHF Reward Modeling** | - | - | [🔗 RLHF Example](examples/alignment/) |
+| **SFT** | - | - | [🔗 GSM8K Example](examples/math/gsm8k_sft.py) |

 ### Models
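
The table above pairs each newly listed algorithm with its documentation, paper, and an example script or config. As a minimal sketch, assuming the PPO entry point `examples/math/gsm8k_ppo.py` from the old bullet list pairs with the new `examples/math/gsm8k_ppo.yaml` config the same way the GRPO example does, a PPO run on GSM8K would follow the launcher pattern shown in the Getting Started section below:

```bash
# Sketch only: assumes gsm8k_ppo.py accepts --config like the GRPO example.
# Both paths appear in this diff, but pairing them this way is an assumption.
python3 -m areal.launcher.local \
  examples/math/gsm8k_ppo.py \
  --config examples/math/gsm8k_ppo.yaml
```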

@@ -142,16 +121,39 @@ For comprehensive setup instructions, see
 | **vLLM** ||||||
 | **SGLang** ||||||

-## 📖 Resources
+## 🚀 Getting Started

-- [Documentation](https://inclusionai.github.io/AReaL/)
-- [CLI Configurations](https://inclusionai.github.io/AReaL/cli_reference.html)
-- [Contributing](https://inclusionai.github.io/AReaL/contrib.html)
+Our training scripts automatically download the required dataset (openai/gsm8k) and
+model (Qwen/Qwen2-1.5B-Instruct). To run on a single node:
+
+```bash
+python3 -m areal.launcher.local \
+  examples/math/gsm8k_grpo.py \
+  --config examples/math/gsm8k_grpo.yaml
+```
+
+To run on a Ray cluster with 2 nodes and 8 GPUs per node (remember to update paths in
+the YAML file to point to your shared storage):
+
+```bash
+python3 -m areal.launcher.ray \
+  examples/math/gsm8k_grpo.py \
+  --config examples/math/gsm8k_grpo.yaml \
+  cluster.n_nodes=2 \
+  cluster.n_gpus_per_node=8
+```
+
+For comprehensive setup instructions, see
+[our quickstart guide](https://inclusionai.github.io/AReaL/tutorial/quickstart.html).

-### Quickstart
+## 📖 Resources

 - [Installation](https://inclusionai.github.io/AReaL/tutorial/installation.html)
 - [Quickstart](https://inclusionai.github.io/AReaL/tutorial/quickstart.html)
+- [CLI Configurations](https://inclusionai.github.io/AReaL/cli_reference.html)
+- [Debugging Best Practices](https://inclusionai.github.io/AReaL/best_practices/debugging.html)
+- [Handling OOM Issues](https://inclusionai.github.io/AReaL/best_practices/handling_oom.html)
+- [Contributing](https://inclusionai.github.io/AReaL/contrib.html)

 ### Code Walkthrough
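
The Ray example above also illustrates the launcher's `key=value` override syntax (`cluster.n_nodes`, `cluster.n_gpus_per_node`). As a sketch under one assumption: if the shared-storage location is exposed as a config key in the same way, the path update could be passed on the command line rather than by editing the YAML. The key name `cluster.fileroot` below is an assumption; check the CLI Configurations reference listed under Resources for the actual field.

```bash
# Sketch only: cluster.fileroot is an assumed key name for the shared-storage
# path. The n_nodes / n_gpus_per_node overrides come from the README example.
python3 -m areal.launcher.ray \
  examples/math/gsm8k_grpo.py \
  --config examples/math/gsm8k_grpo.yaml \
  cluster.n_nodes=2 \
  cluster.n_gpus_per_node=8 \
  cluster.fileroot=/path/to/shared/storage
```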

@@ -163,11 +165,6 @@ For comprehensive setup instructions, see
 - [Customize Agentic/RVLR rollout workflows with AReaL-lite](https://inclusionai.github.io/AReaL/customization/agent.html)
 - [Customize algorithms with AReaL-lite](https://inclusionai.github.io/AReaL/customization/algorithm.html)

-### Advanced Usage
-
-- [Debugging Best Practices](https://inclusionai.github.io/AReaL/best_practices/debugging.html)
-- [Handling OOM Issues](https://inclusionai.github.io/AReaL/best_practices/handling_oom.html)
-
 ## 🗺️ Future Roadmap

 - [2025 Q3 Roadmap](https://github.com/inclusionAI/AReaL/issues/257)
