Commit f3d77c9

Merge branch 'main' into murphy/dev-0813
2 parents b03486c + e75e0d7 commit f3d77c9

7 files changed: +641 -137 lines changed

.env.example

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+OPENAI_API_BASE=
+OPENAI_API_KEY=<your api key>
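
The commit adds the two variables above but does not show how they are consumed. As a minimal sketch, assuming a plain `KEY=VALUE` file named `.env` copied from `.env.example`, the values could be read with the standard library alone. The helper names `load_env_file` and `openai_settings` are hypothetical and are not part of the repository; the project may well use a different loader.

```python
from __future__ import annotations

import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from a dotenv-style file (hypothetical helper)."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def openai_settings(path: str = ".env") -> tuple[str | None, str]:
    """Return (api_base, api_key); real environment variables take precedence."""
    file_values = load_env_file(path)
    # An empty OPENAI_API_BASE is treated as "use the client's default endpoint".
    api_base = os.environ.get("OPENAI_API_BASE") or file_values.get("OPENAI_API_BASE") or None
    api_key = os.environ.get("OPENAI_API_KEY") or file_values.get("OPENAI_API_KEY", "")
    # Reject a missing key and the unedited placeholder from .env.example.
    if not api_key or api_key == "<your api key>":
        raise RuntimeError("OPENAI_API_KEY is not set; copy .env.example to .env and fill it in")
    return api_base, api_key
```

Calling `openai_settings()` from the repository root returns the pair to pass to whichever client is in use. Rejecting the unedited placeholder early avoids a less obvious authentication error later.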

README.md

Lines changed: 43 additions & 24 deletions
@@ -9,7 +9,7 @@ We are committed to regularly updating our exploration directions and results in
 
 We warmly welcome contributions from the broader community—join us in pushing the boundaries of agent reasoning and tool integration!
 
-Code and dataset coming soon! Stay tuned!
+Code and dataset are now available! The `verl` submodule has been integrated for enhanced RL training capabilities.
 
 <div style="display: flex; justify-content: center;">
 <div style="width: 100; transform: scale(1.0);">
@@ -59,7 +59,7 @@ Code and dataset coming soon! Stay tuned!
 
 
 ## Current Team Members
-[@Kunlun Zhu](https://github.com/Kunlun-Zhu)(Ulab-UIUC), [@Jiayi Zhang](https://github.com/didiforgithub)(MetaGPT), [@Xinbing Liang](https://github.com/mannaandpoem),[@Xiangxin Zhou](https://github.com/zhouxiangxin1998), [@Yanfei Zhang](https://github.com/yanfei-zhang-95), [@Yingxuan Yang](https://github.com/zoe-yyx), [@Zeping Chen](https://github.com/rxdaozhang),[@Weijia Zhang](https://github.com/CharlieDreemur), [@Muxin Tian](https://github.com/realtmxi), [@Haofei Yu](https://github.com/lwaekfjlk)(Ulab-UIUC), [@Jinyu Xiang](https://github.com/XiangJinyu), [@Yifan Wu](https://github.com/Evanwu50020), [@Bowen Jin](https://github.com/PeterGriffinJin), [@Blair Yang](https://github.com/blairyeung)
+[@Kunlun Zhu](https://github.com/Kunlun-Zhu)(Ulab-UIUC), [@Muxin Tian](https://github.com/realtmxi), [@Zijia Liu](https://m-serious.github.io/)(Ulab-UIUC), [@Yingxuan Yang](https://github.com/zoe-yyx),[@Jiayi Zhang](https://github.com/didiforgithub)(MetaGPT), [@Xinbing Liang](https://github.com/mannaandpoem), [@Weijia Zhang](https://github.com/CharlieDreemur), [@Haofei Yu](https://github.com/lwaekfjlk)(Ulab-UIUC), [@Cheng Qian](https://qiancheng0.github.io/),[@Bowen Jin](https://github.com/PeterGriffinJin),
 
 ---
 
@@ -146,11 +146,18 @@ Agents are equipped with action-space awareness, employing systematic exploratio
 ### Integration with RL Tuning Frameworks
 We integrate insights and methodologies from leading RL tuning frameworks, including:
 
-- **Verl**
+- **Verl** - **Integrated as Git Submodule** - Our primary RL framework, providing advanced training capabilities for agent optimization
 - **TinyZero**
 - **OpenR1**
 - **Trlx**
 
+### Verl Integration
+The `verl` submodule is fully integrated into OpenManus-RL, providing:
+- **Advanced RL Algorithms** - PPO, DPO, and custom reward modeling
+- **Efficient Training** - Optimized for large language model fine-tuning
+- **Flexible Configuration** - Easy customization of training parameters
+- **Production Ready** - Battle-tested framework from Bytedance
+
 Through these frameworks, agents can effectively balance exploration and exploitation, optimize reasoning processes, and adapt dynamically to novel environments.
 
 In summary, our method systematically integrates advanced reasoning paradigms, diverse rollout strategies, sophisticated reward modeling, and robust RL frameworks, significantly advancing the capability and adaptability of reasoning-enhanced LLM agents.
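
Because `verl` is pulled in as a git submodule, a clone made without `--recursive` leaves the `verl/` directory present but empty until the submodule is initialized. A minimal pre-flight check is sketched below, assuming the submodule lives at `verl/` under the repository root, as the project tree in this commit suggests. The function name `submodule_initialized` is hypothetical and not part of the repository.

```python
from pathlib import Path


def submodule_initialized(repo_root: str, name: str = "verl") -> bool:
    """Return True if the submodule directory exists and contains any entry.

    Assumption: an uninitialized submodule appears as a missing or empty
    directory, which is how git leaves it after a non-recursive clone.
    """
    submodule_dir = Path(repo_root) / name
    if not submodule_dir.is_dir():
        return False
    # An initialized submodule holds at least a `.git` entry and its files.
    return any(submodule_dir.iterdir())
```

A training script could call `submodule_initialized(".")` at startup and, when it returns `False`, stop with a message pointing to `git submodule update --init --recursive`.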
@@ -208,6 +215,18 @@ We are still laboriously developing this part, welcome feedback.
 
 ## Installation
 
+### Prerequisites
+This project uses git submodules. After cloning the repository, make sure to initialize and update the submodules:
+
+```bash
+# Clone the repository with submodules
+git clone --recursive https://github.com/OpenManus/OpenManus-RL.git
+
+# Or if already cloned, initialize and update submodules
+git submodule update --init --recursive
+```
+
+### Environment Setup
 First, create a conda environment and activate it:
 
 ```bash
@@ -248,6 +267,7 @@ conda activate agentenv_webshop
 # Setup the environment
 bash ./setup.sh -d all
 ```
+
 ### 2. ALFWorld
 
 ```bash
@@ -263,31 +283,17 @@ alfworld-download -f
 ```
 Use `--extra` to download pre-trained checkpoints and seq2seq data.
 
-### Launching the WebShop Server
+## Quick Start
 
-After setting up the environment, you can launch the WebShop server:
+### 1. Environment Setup
+Make sure you have the required environments set up (see Environment Setup section above).
 
-```bash
-# Make sure the webshop conda environment is activated
-conda activate webshop
-
-# Launch the server (default port: 36001)
-webshop --port 36001
-```
+### 2. Data Preparation
+Download the OpenManus-RL dataset from [Hugging Face](https://huggingface.co/datasets/CharlieDreemur/OpenManus-RL).
 
-Note: The WebShop environment requires specific versions of Python, PyTorch, Faiss, and Java. The setup script will handle these dependencies automatically.
+### 3. Training Examples
 
-## Quick start
-
-Train a reasoning + search LLM on NQ dataset with e5 as the retriever and wikipedia as the corpus.
-
-(1) Download the indexing and corpus.
-
-From https://huggingface.co/datasets/CharlieDreemur/OpenManus-RL
-
-(3) Launch a local AgentGym server.
-
-(4) Run RL training (PPO).
+#### ALFWorld RL Training (PPO)
 ```bash
 conda activate openmanus-rl
 bash scripts/ppo_train/train_alfworld.sh
@@ -379,6 +385,19 @@ Please cite the following paper if you find OpenManus helpful!
 </a>
 </p>
 
+## Project Structure
+
+```
+OpenManus-RL/
+├── verl/ # Verl RL framework submodule
+├── openmanus_rl/ # Main OpenManus-RL library
+├── scripts/ # Training and evaluation scripts
+├── configs/ # Configuration files
+├── environments/ # Agent environment implementations
+├── docs/ # Documentation
+└── examples/ # Usage examples
+```
+
 ## Documentation
 - [Development Guide (English)](docs/DEVELOPMENT_GUIDE_EN.md)
 - [Development Guide (Chinese)](docs/DEVELOPMENT_GUIDE_ZH.md)
