Commit 73127a2
chore: Move multi-turn math example outside the math folder and update README (#416)
* Move multi-turn math example outside the `math` folder and update README
* Update README.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1 parent 6138e3a commit 73127a2

5 files changed, with 10 additions and 9 deletions.

README.md

Lines changed: 10 additions & 9 deletions
@@ -71,15 +71,16 @@ state-of-the-art 7B and 32B models for mathematical reasoning. Check out our
 
 ## 📚 Examples
 
-| Task | Description | Performance |
-| ---------------------------------------------- | ------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------- |
-| **[Math](examples/math/)** | Mathematical problem solving (SFT, GRPO, or PPO) | TBA |
-| **[LoRA Math](examples/lora/)** | Math Agent Trained With LoRA | TBA |
-| **[VLM Math](examples/vlm/)** | CLEVR visual counting tasks | TBA |
-| **[Reasoning](examples/countdown/)** | Countdown numbers game with custom rewards | [Training Curve](/examples/countdown/countdown_training_curve.png) |
-| **[Search Agent](examples/search-agent/)** | An agent with end-to-end reasoning, search, browsing, and summarization capabilities | [ASearcher Repo](https://github.com/inclusionAI/ASearcher) |
-| **[Tool-Integrated Reasoning](examples/tir/)** | An agent that can invoke tools during reasoning | [TIR Example](https://github.com/inclusionAI/AReaL/tree/main/examples/tir) |
-| **[RLHF](examples/alignment/)** | RLHF for LLM Alignment | [RLHF Example](https://github.com/inclusionAI/AReaL/tree/main/examples/alignment) |
+| Task | Description | Performance |
+| ------------------------------------------------ | ------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------- |
+| **[Math](examples/math/)** | Mathematical problem solving (SFT, GRPO, or PPO) | TBA |
+| **[Multi-Turn Math](examples/multi-turn-math/)** | Iterative mathematical problem solving with self-correction | [Training Curve](examples/multi-turn-math/reward_curve.png) |
+| **[LoRA Math](examples/lora/)** | Math Agent Trained With LoRA | TBA |
+| **[VLM Math](examples/vlm/)** | CLEVR visual counting tasks | TBA |
+| **[Reasoning](examples/countdown/)** | Countdown numbers game with custom rewards | [Training Curve](/examples/countdown/countdown_training_curve.png) |
+| **[Search Agent](examples/search-agent/)** | An agent with end-to-end reasoning, search, browsing, and summarization capabilities | [ASearcher Repo](https://github.com/inclusionAI/ASearcher) |
+| **[Tool-Integrated Reasoning](examples/tir/)** | An agent that can invoke tools during reasoning | [TIR Example](https://github.com/inclusionAI/AReaL/tree/main/examples/tir) |
+| **[RLHF](examples/alignment/)** | RLHF for LLM Alignment | [RLHF Example](https://github.com/inclusionAI/AReaL/tree/main/examples/alignment) |
 
 ## 🔧 Support Matrix
 
File renamed without changes.
